Roadmap
gitgap is in active development. The core lifecycle loop is operational. The next phase focuses on scale, intelligence depth, and open platform access.
Phase 1 — Lifecycle (Complete)
The full NAUGHT → CAUGHT → FOUND / REJECTED loop, keeper review, and gap index.
- Gap extraction pipeline — PMC ingest, JATS parsing, LLM classification
- Keeper review — individual and batch pass/fail
- Globe — 3D gap visualization with lifecycle state rendering
- FOUND state — gap sealing on publication with cosmoid + DOI
- REJECTED trail — rejection mode, notes, pickup instructions preserved
- Catch confidence — cosine similarity between submission and gap declaration
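The catch confidence item above can be illustrated with a minimal sketch of cosine similarity between two embedding vectors. The function name, the example vectors, and the choice to return 0.0 for a zero vector are assumptions for illustration; the roadmap does not specify how gitgap embeds submissions or gap declarations.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Assumption: a zero vector carries no signal, so report zero confidence.
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical embeddings for a submission and a gap declaration.
submission_vec = [0.2, 0.7, 0.1]
gap_vec = [0.25, 0.6, 0.15]
catch_confidence = cosine_similarity(submission_vec, gap_vec)
```

A value near 1.0 would indicate that the submission points in nearly the same direction as the gap declaration in embedding space; how gitgap thresholds or displays that value is not stated here.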
Phase 2 — Editorial Integration (Complete)
Deep integration with the eaiou authoring layer.
- Author submission → gap registration at seal time
- Revision workflow — round tracking, editor instructions
- Rejection reason capture — feeds pickup instructions into gitgap
- Status machine validation — enforced transition graph
- Author notifications on status change
- DOI assignment on publication — Zenodo receipt as submission capstone
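An enforced transition graph, as mentioned in the status machine item above, might look like the following sketch. The editorial statuses themselves are not listed in this roadmap, so the example borrows the gap lifecycle states from Phase 1 as a stand-in. Treating FOUND and REJECTED as terminal is an assumption; the real graph may allow a rejected gap to be picked up again.

```python
from enum import Enum


class GapState(Enum):
    NAUGHT = "NAUGHT"
    CAUGHT = "CAUGHT"
    FOUND = "FOUND"
    REJECTED = "REJECTED"


# Assumed transition graph: NAUGHT -> CAUGHT -> FOUND or REJECTED.
# FOUND and REJECTED are treated as terminal in this sketch only.
ALLOWED = {
    GapState.NAUGHT: {GapState.CAUGHT},
    GapState.CAUGHT: {GapState.FOUND, GapState.REJECTED},
    GapState.FOUND: set(),
    GapState.REJECTED: set(),
}


class InvalidTransition(Exception):
    """Raised when a requested status change is not in the graph."""


def transition(current: GapState, target: GapState) -> GapState:
    """Return the new state, or raise if the edge is not allowed."""
    if target not in ALLOWED[current]:
        raise InvalidTransition(f"{current.value} -> {target.value} is not allowed")
    return target


state = transition(GapState.NAUGHT, GapState.CAUGHT)
state = transition(state, GapState.FOUND)
```

Keeping the graph in one dictionary means every status change passes through a single check, which is the property "enforced" suggests.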
Phase 3 — Intelligence Depth (Complete)
Gap intelligence beyond extraction — clustering, scoring, enrichment.
- Discipline enrichment — source discipline, bridge potential, structural holes view
- Convergence clustering — agreed-upon gaps across ≥3 papers from ≥2 sources
- CAP score — Corpus-Appreciated Phenomenon score for gap ripeness
- Free-text ingest — non-PMC papers (Zenodo, preprints, DOI-only)
- Batch keeper review + CSV export
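The convergence rule above (a gap agreed upon across at least 3 papers from at least 2 sources) can be sketched as a simple threshold check. The `GapMention` type and its field names are hypothetical, and the sketch assumes mentions have already been grouped into one cluster; it does not show how gitgap decides that two papers describe the same gap.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class GapMention:
    paper_id: str  # hypothetical identifier of the paper declaring the gap
    source: str    # hypothetical identifier of the source, e.g. a journal


def is_convergent(
    mentions: Iterable[GapMention],
    min_papers: int = 3,
    min_sources: int = 2,
) -> bool:
    """True if the cluster spans enough distinct papers and sources."""
    items = list(mentions)  # materialize so generators can be read twice
    papers = {m.paper_id for m in items}
    sources = {m.source for m in items}
    return len(papers) >= min_papers and len(sources) >= min_sources


cluster = [
    GapMention("paper-1", "journal-a"),
    GapMention("paper-2", "journal-a"),
    GapMention("paper-3", "journal-b"),
]
convergent = is_convergent(cluster)
```

Counting distinct papers and sources with sets keeps duplicate mentions from the same paper from inflating the result.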
Phase 4 — Scale (In Progress)
Admin infrastructure and journal network expansion.
- Admin interface — API sources, journal registry, reconcile log, discovery probe
- OAI-PMH harvesting — per-journal endpoint, incremental sync
- Journal submission queue — public request form (submit yours)
- Globe sampling — server-side max 500 gaps prioritized by bridge potential
- Export API — GET /gaps/export?format=csv|jsonl
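The globe sampling item above (a server-side cap of 500 gaps, prioritized by bridge potential) could be implemented along these lines. This is a sketch under assumptions: the `bridge_potential` key and the dictionary shape of a gap record are hypothetical, and the actual server may break ties or weight other signals differently.

```python
import heapq
from typing import Any, Dict, Iterable, List

MAX_GLOBE_GAPS = 500  # cap stated in the roadmap


def sample_for_globe(
    gaps: Iterable[Dict[str, Any]],
    limit: int = MAX_GLOBE_GAPS,
) -> List[Dict[str, Any]]:
    """Return up to `limit` gaps, highest bridge potential first."""
    # heapq.nlargest avoids sorting the full collection when it is
    # much larger than the limit.
    return heapq.nlargest(limit, gaps, key=lambda g: g["bridge_potential"])


# Hypothetical records: 600 gaps with increasing bridge potential.
all_gaps = [{"id": i, "bridge_potential": i / 600} for i in range(600)]
globe_gaps = sample_for_globe(all_gaps)
```

Selecting on the server keeps the payload sent to the 3D view bounded regardless of how large the gap index grows.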
Phase 5 — Open Platform (Planned)
Infrastructure for a multi-source, multi-instance gap network.
- Discipline ontology admin — configurable discipline graph, not hardcoded markers
- Vector DB migration — pgvector or Qdrant for approximate nearest-neighbor similarity search at scale (roughly O(log n) per query with an HNSW index)
- OpenAlex, Crossref, PLOS, Springer Nature client implementations
- Field mapping engine — DB-driven parse-time field resolution
- Self-hosted packaging — Docker Compose, environment template, zero-config SQLite default
- Stable push API v1 — versioned, documented, external system compatible
Phase 6 — Federation (Proposed)
Multiple gitgap instances sharing a common gap index.
- Instance registry — institutions run their own gitgap, contribute to shared index
- Gap deduplication across instances — cosine-based merge with provenance preserved
- Convergence across instances — agreed-upon gaps from multiple independent deployments
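Since Phase 6 is only proposed, the following is a speculative sketch of what a cosine-based merge with provenance preserved could look like. The 0.9 threshold, the `MergedGap` type, the greedy first-match strategy, and the instance identifiers are all assumptions made for illustration, not a description of a planned implementation.

```python
import math
from dataclasses import dataclass, field
from typing import Iterable, List, Sequence, Tuple


@dataclass
class MergedGap:
    text: str
    embedding: List[float]
    provenance: List[str] = field(default_factory=list)  # contributing instances


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def merge_gaps(
    records: Iterable[Tuple[str, str, List[float]]],
    threshold: float = 0.9,  # assumed similarity cutoff
) -> List[MergedGap]:
    """Greedily fold each (instance_id, text, embedding) into the first
    sufficiently similar merged gap, recording which instance it came from."""
    merged: List[MergedGap] = []
    for instance_id, text, embedding in records:
        for gap in merged:
            if cosine(gap.embedding, embedding) >= threshold:
                if instance_id not in gap.provenance:
                    gap.provenance.append(instance_id)
                break
        else:
            # No existing gap was close enough, so start a new one.
            merged.append(MergedGap(text, list(embedding), [instance_id]))
    return merged


records = [
    ("instance-a", "gap one", [1.0, 0.0]),
    ("instance-b", "gap one, reworded", [0.99, 0.05]),
    ("instance-c", "unrelated gap", [0.0, 1.0]),
]
shared_index = merge_gaps(records)
```

The greedy pass compares each record against every merged gap, so it scales quadratically in the worst case; a shared index of real size would likely rely on the vector search planned in Phase 5 instead.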
How to influence the roadmap
Priority is driven by gap index size, journal coverage, and community need. The current focus is journal network expansion (Phase 4): more journals mean more gaps, more convergence detection, and more structural holes surfaced.
- Submit your journal for aggregation
- Report issues and feature requests via GitHub Issues
- See Open Source for how to contribute directly