Open Source

gitgap is MIT-licensed. The full gap detection pipeline, lifecycle management, globe visualization, and API are open source and self-hostable.

License

MIT License: permissive, allowing institutional forks, commercial wrappers, and academic modifications. The only condition is attribution: the copyright and license notice must be kept in all copies or substantial portions of the software.

Repository

Source code is hosted on GitHub under the aybllc organization. The gap detection pipeline and the eaiou authoring layer live in separate repositories; eaiou is governed independently.

Self-hosting

Requirements

  • Python 3.11+
  • SQLite (default, zero config) or PostgreSQL
  • An OpenAI-compatible API key (for gap extraction LLM calls)
  • Optional: NCBI API key for higher PMC rate limits
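To illustrate what the optional NCBI key changes, here is a minimal sketch of a client-side throttle. It assumes the limits listed under Environment variables (10 requests/sec with a key, 3 without). The Throttle class and ncbi_throttle helper are hypothetical names used for illustration only; they are not taken from the gitgap codebase.

```python
import os
import time


class Throttle:
    """Spaces calls at least 1/rate seconds apart (hypothetical helper)."""

    def __init__(self, rate_per_sec, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / rate_per_sec
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call; None before the first

    def wait(self):
        """Block until the next request is allowed."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def ncbi_throttle(env=os.environ):
    # Assumed limits: 10 requests/sec with NCBI_API_KEY set, 3 without.
    rate = 10 if env.get("NCBI_API_KEY") else 3
    return Throttle(rate)
```

A harvester would call wait() before each PMC request. The clock and sleep parameters are injectable only so the spacing logic can be tested without real delays.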

Quickstart

git clone https://github.com/aybllc/gitgap
cd gitgap
cp .env.example .env
# Edit .env — add OPENAI_API_KEY at minimum
pip install -r requirements.txt
uvicorn app.main:app --reload

The database is created automatically on first run. Navigate to http://localhost:8000 to start.

Docker

docker-compose up

The default compose file includes the app and a data volume for SQLite persistence. For production, set DATABASE_URL to a PostgreSQL connection string in .env.

Environment variables

Variable               Required   Description
OPENAI_API_KEY         Yes        LLM API key for gap extraction and embedding
DATABASE_URL           No         SQLite (default) or PostgreSQL connection string
NCBI_API_KEY           No         Higher PMC rate limits (10/sec vs 3/sec)
GITGAP_API_KEY         No         Master API key for push endpoints (auto-generated if not set)
EAIOU_API_URL          No         eaiou instance URL for wheelhouse rescore webhook
EAIOU_MASTER_API_KEY   No         eaiou API key for internal webhooks
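As a rough sketch of how these variables could be read at startup, the following loader fails fast on the one required key and fills in defaults for the rest. The Settings class, load_settings function, and the default SQLite URL are assumptions made for illustration; gitgap's actual configuration code and default database path may differ.

```python
import os
import secrets
from dataclasses import dataclass
from typing import Mapping, Optional

# Assumed default; the real default SQLite path may differ.
DEFAULT_DATABASE_URL = "sqlite:///./gitgap.db"


@dataclass(frozen=True)
class Settings:
    openai_api_key: str
    database_url: str
    ncbi_api_key: Optional[str]
    gitgap_api_key: str
    eaiou_api_url: Optional[str]
    eaiou_master_api_key: Optional[str]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables (hypothetical loader)."""
    openai_key = env.get("OPENAI_API_KEY")
    if not openai_key:
        raise RuntimeError("OPENAI_API_KEY is required")
    return Settings(
        openai_api_key=openai_key,
        database_url=env.get("DATABASE_URL") or DEFAULT_DATABASE_URL,
        ncbi_api_key=env.get("NCBI_API_KEY") or None,
        # Generated when unset, mirroring the "auto-generated" note above.
        gitgap_api_key=env.get("GITGAP_API_KEY") or secrets.token_urlsafe(32),
        eaiou_api_url=env.get("EAIOU_API_URL") or None,
        eaiou_master_api_key=env.get("EAIOU_MASTER_API_KEY") or None,
    )
```

Passing a plain dict instead of os.environ makes the loader easy to exercise in tests, for example load_settings({"OPENAI_API_KEY": "sk-test"}).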

Contributing

Contributions are welcome. The highest-impact areas:

  • Keeper review quality — improving the gap extraction classifier reduces the keeper review burden. The classifier lives in app/ingest/classify.py.
  • OAI-PMH harvesting — the harvester in app/ingest/ needs per-format parsers for journals that use extended Dublin Core or custom OAI schemas.
  • Publisher API clients — PLOS, Springer Nature, Europe PMC clients in app/ingest/. Each source has a registered entry in api_sources — clients implement the corresponding slug.
  • Globe performance — Three.js rendering for 1K+ gaps. Globe code is in app/templates/globe.html.
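For contributors looking at the OAI-PMH item above, here is a minimal sketch of a parser for the baseline oai_dc (unqualified Dublin Core) format, using only the standard library. The namespaces are the standard OAI-PMH and Dublin Core ones; the function name parse_oai_dc and the returned dictionary shape are hypothetical and do not reflect the existing harvester's interface. Parsers for extended Dublin Core or custom schemas would need their own namespace maps and field handling.

```python
import xml.etree.ElementTree as ET

# Standard OAI-PMH and Dublin Core namespaces.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def parse_oai_dc(xml_text):
    """Extract identifier, title, and descriptions from each oai_dc record."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        dc = rec.find("oai:metadata/oai_dc:dc", NS)
        if dc is None:
            # Deleted records carry a header but no metadata; skip them.
            continue
        records.append({
            "identifier": rec.findtext(
                "oai:header/oai:identifier", default="", namespaces=NS
            ),
            "title": dc.findtext("dc:title", default="", namespaces=NS),
            "descriptions": [
                el.text or "" for el in dc.findall("dc:description", NS)
            ],
        })
    return records
```

The function takes the body of a ListRecords response as a string and returns one dictionary per non-deleted record.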

Pull request process

  1. Fork the repository
  2. Create a branch — feature/your-feature or fix/your-fix
  3. Write tests if adding pipeline logic
  4. Submit PR against main
  5. PRs are reviewed within 48 hours

What stays proprietary

Nothing in the gap detection pipeline. The eaiou authoring layer (author workspace, editorial workflow, Q scoring) is a separate project governed independently — but its API contract with gitgap is documented in API Ingest.

Code of conduct

gitgap uses the Contributor Covenant v2.1. The short version: be constructive, assume good intent, don't be hostile.