Open Source
gitgap is MIT-licensed. The full gap detection pipeline, lifecycle management, globe visualization, and API are open source and self-hostable.
License
MIT License: permissive, allowing institutional forks, commercial wrappers, and academic modifications. The only condition is attribution: the copyright and license notice must be kept in copies of the software.
Repository
Source code is hosted on GitHub under the aybllc organization. The gap detection pipeline and the eaiou authoring layer live in separate repositories; eaiou is governed independently.
Self-hosting
Requirements
- Python 3.11+
- SQLite (default, zero config) or PostgreSQL
- An OpenAI-compatible API key (for gap extraction LLM calls)
- Optional: NCBI API key for higher PMC rate limits
Quickstart
git clone https://github.com/aybllc/gitgap
cd gitgap
cp .env.example .env
# Edit .env — add OPENAI_API_KEY at minimum
pip install -r requirements.txt
uvicorn app.main:app --reload
The database is created automatically on first run. Navigate to http://localhost:8000 to start.
Docker
docker-compose up
The default compose file includes the app and a data volume for SQLite persistence.
For production, set DATABASE_URL to a PostgreSQL connection string in .env.
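As a rough illustration of how one connection string can select either backend, the sketch below classifies a DATABASE_URL by its scheme. The function name `database_backend` and the SQLAlchemy-style URL forms are assumptions made for this example; gitgap's own handling may differ.

```python
from urllib.parse import urlsplit


def database_backend(url: str) -> str:
    """Classify a DATABASE_URL as 'sqlite' or 'postgresql' by its scheme.

    Hypothetical helper for illustration, not part of gitgap.
    """
    # SQLAlchemy-style URLs may carry a driver suffix, e.g. "postgresql+psycopg2".
    scheme = urlsplit(url).scheme.split("+", 1)[0].lower()
    if scheme == "sqlite":
        return "sqlite"
    if scheme in ("postgresql", "postgres"):
        return "postgresql"
    raise ValueError(f"unsupported DATABASE_URL scheme: {scheme!r}")
```

A check like this at startup would surface a mistyped connection string before the app tries to open the database.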
Environment variables
| Variable | Required | Description |
|---|---|---|
| OPENAI_API_KEY | Yes | LLM API key for gap extraction and embedding |
| DATABASE_URL | No | SQLite (default) or PostgreSQL connection string |
| NCBI_API_KEY | No | Higher PMC rate limits (10 requests/sec with a key vs 3 without) |
| GITGAP_API_KEY | No | Master API key for push endpoints (auto-generated if not set) |
| EAIOU_API_URL | No | eaiou instance URL for wheelhouse rescore webhook |
| EAIOU_MASTER_API_KEY | No | eaiou API key for internal webhooks |
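The table can be read as a small configuration contract. The sketch below shows one way such settings might be loaded; it is not gitgap's actual loader. The `Settings` class, the `load_settings` function, and the default SQLite URL are assumptions made for illustration.

```python
import os
import secrets
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    openai_api_key: str
    database_url: str
    ncbi_api_key: Optional[str]
    gitgap_api_key: str
    eaiou_api_url: Optional[str]
    eaiou_master_api_key: Optional[str]

    @property
    def ncbi_requests_per_second(self) -> int:
        # 10/sec with an NCBI key, 3/sec without, as listed in the table.
        return 10 if self.ncbi_api_key else 3


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    # OPENAI_API_KEY is the only required variable.
    openai_key = env.get("OPENAI_API_KEY", "").strip()
    if not openai_key:
        raise RuntimeError("OPENAI_API_KEY is required")

    return Settings(
        openai_api_key=openai_key,
        # Assumed default path; the real SQLite location may differ.
        database_url=env.get("DATABASE_URL") or "sqlite:///./gitgap.db",
        ncbi_api_key=env.get("NCBI_API_KEY") or None,
        # Generate a key when none is supplied, mirroring the documented behavior.
        gitgap_api_key=env.get("GITGAP_API_KEY") or secrets.token_urlsafe(32),
        eaiou_api_url=env.get("EAIOU_API_URL") or None,
        eaiou_master_api_key=env.get("EAIOU_MASTER_API_KEY") or None,
    )
```

Passing the environment in as a mapping keeps the loader easy to test: `load_settings({"OPENAI_API_KEY": "sk-test"})` yields SQLite, a generated push key, and the 3 requests/sec NCBI limit.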
Contributing
Contributions are welcome. The highest-impact areas:
- Keeper review quality: improving the gap extraction classifier reduces the keeper review burden. The classifier lives in app/ingest/classify.py.
- OAI-PMH harvesting: the harvester in app/ingest/ needs per-format parsers for journals that use extended Dublin Core or custom OAI schemas.
- Publisher API clients: PLOS, Springer Nature, Europe PMC clients in app/ingest/. Each source has a registered entry in api_sources; clients implement the corresponding slug.
- Globe performance: Three.js rendering for 1K+ gaps. Globe code is in app/templates/globe.html.
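To make the per-format parser idea concrete, here is a minimal sketch of a parser registry keyed by OAI-PMH metadataPrefix, with one parser for plain Dublin Core (oai_dc). The namespaces are the standard OAI-PMH and Dublin Core ones. The names `PARSERS`, `register`, `parse_oai_dc`, and `parse_metadata`, along with the sample record, are hypothetical and do not reflect the harvester's actual interface.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, List

# Standard namespaces for unqualified Dublin Core in OAI-PMH.
NS = {
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

Record = Dict[str, List[str]]
# Hypothetical registry: metadataPrefix -> parser function.
PARSERS: Dict[str, Callable[[ET.Element], Record]] = {}


def register(prefix: str):
    """Decorator that registers a parser under a metadataPrefix."""
    def decorator(fn):
        PARSERS[prefix] = fn
        return fn
    return decorator


@register("oai_dc")
def parse_oai_dc(metadata: ET.Element) -> Record:
    dc = metadata.find("oai_dc:dc", NS)
    if dc is None:
        return {}
    record: Record = {}
    for child in dc:
        # Tags look like "{namespace}title"; keep only the local name.
        name = child.tag.split("}", 1)[-1]
        text = (child.text or "").strip()
        if text:
            record.setdefault(name, []).append(text)
    return record


def parse_metadata(prefix: str, metadata: ET.Element) -> Record:
    try:
        parser = PARSERS[prefix]
    except KeyError:
        raise ValueError(f"no parser registered for metadataPrefix {prefix!r}")
    return parser(metadata)


# Made-up sample record for illustration only.
SAMPLE = """<metadata xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
          xmlns:dc="http://purl.org/dc/elements/1.1/">
  <oai_dc:dc>
    <dc:title>An example title</dc:title>
    <dc:creator>Doe, J.</dc:creator>
    <dc:creator>Roe, R.</dc:creator>
  </oai_dc:dc>
</metadata>"""

record = parse_metadata("oai_dc", ET.fromstring(SAMPLE))
```

Under this layout, supporting an extended or custom schema would mean adding one more decorated function, leaving the harvesting loop unchanged.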
Pull request process
- Fork the repository
- Create a branch: feature/your-feature or fix/your-fix
- Write tests if adding pipeline logic
- Submit PR against main
- PRs are reviewed within 48 hours
What stays proprietary
Nothing in the gap detection pipeline. The eaiou authoring layer (author workspace, editorial workflow, Q scoring) is a separate project governed independently — but its API contract with gitgap is documented in API Ingest.
Code of conduct
gitgap uses the Contributor Covenant v2.1. The short version: be constructive, assume good intent, don't be hostile.