Roadmap

gitgap is in active development. The core lifecycle loop is operational. The next phase focuses on scale, intelligence depth, and open platform access.

Phase 1 — Lifecycle (Complete)

The full NAUGHT → CAUGHT → FOUND / REJECTED loop, keeper review, and gap index.

  • Gap extraction pipeline — PMC ingest, JATS parsing, LLM classification
  • Keeper review — individual and batch pass/fail
  • Globe — 3D gap visualization with lifecycle state rendering
  • FOUND state — gap sealing on publication with cosmoid + DOI
  • REJECTED trail — rejection mode, notes, pickup instructions preserved
  • Catch confidence — cosine similarity between submission and gap declaration
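
The catch confidence item above can be illustrated with a minimal sketch. It assumes the submission and the gap declaration have each already been turned into an embedding vector; the function name, the example vectors, and the zero-vector handling are illustrative assumptions, not gitgap's actual implementation.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors, in [-1, 1]."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Assumption: an all-zero embedding is treated as "no match".
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical embeddings, for illustration only.
gap_declaration = [0.2, 0.7, 0.1]
submission = [0.25, 0.6, 0.05]

catch_confidence = cosine_similarity(submission, gap_declaration)
```

A value near 1 means the submission points in nearly the same direction as the gap declaration; how the score is thresholded or displayed is not specified here.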

Phase 2 — Editorial Integration (Complete)

Deep integration with the eaiou authoring layer.

  • Author submission → gap registration at seal time
  • Revision workflow — round tracking, editor instructions
  • Rejection reason capture — feeds pickup instructions into gitgap
  • Status machine validation — enforced transition graph
  • Author notifications on status change
  • DOI assignment on publication — Zenodo receipt as submission capstone
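
An enforced transition graph, as in the status machine item above, can be sketched as a lookup table of allowed moves. The eaiou submission statuses are not listed in this roadmap, so the sketch borrows the gap lifecycle states from Phase 1 as stand-ins; the edge set, the class names, and the choice to treat FOUND and REJECTED as terminal are all assumptions.

```python
from enum import Enum


class GapState(Enum):
    NAUGHT = "naught"
    CAUGHT = "caught"
    FOUND = "found"
    REJECTED = "rejected"


# Assumed transition graph: each state maps to the states it may move to.
ALLOWED = {
    GapState.NAUGHT: {GapState.CAUGHT},
    GapState.CAUGHT: {GapState.FOUND, GapState.REJECTED},
    GapState.FOUND: set(),     # assumption: sealed gaps are terminal
    GapState.REJECTED: set(),  # assumption: terminal in this sketch
}


class InvalidTransition(Exception):
    """Raised when a requested status change is not an edge in the graph."""


def transition(current: GapState, target: GapState) -> GapState:
    """Return the new state, or raise if the move is not allowed."""
    if target not in ALLOWED[current]:
        raise InvalidTransition(f"{current.name} -> {target.name} is not allowed")
    return target


state = transition(GapState.NAUGHT, GapState.CAUGHT)
state = transition(state, GapState.FOUND)
```

Keeping the graph as data rather than scattered conditionals makes it straightforward to validate every status change at a single choke point.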

Phase 3 — Intelligence Depth (Complete)

Gap intelligence beyond extraction — clustering, scoring, enrichment.

  • Discipline enrichment — source discipline, bridge potential, structural holes view
  • Convergence clustering — agreed-upon gaps across ≥3 papers from ≥2 sources
  • CAP score — Corpus-Appreciated Phenomenon score for gap ripeness
  • Free-text ingest — non-PMC papers (Zenodo, preprints, DOI-only)
  • Batch keeper review + CSV export
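
The convergence threshold above (at least 3 papers from at least 2 sources) can be expressed as a small predicate. This is a sketch under assumptions: "source" is read here as the ingest origin of a paper, and the `Mention` record and its field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Mention:
    """One paper declaring a given gap (hypothetical record shape)."""
    paper_id: str
    source: str  # assumed to be the ingest origin, e.g. "PMC" or "Zenodo"


def is_convergent(mentions: Iterable[Mention],
                  min_papers: int = 3,
                  min_sources: int = 2) -> bool:
    """True when distinct papers and distinct sources both meet the thresholds."""
    mentions = list(mentions)
    papers = {m.paper_id for m in mentions}
    sources = {m.source for m in mentions}
    return len(papers) >= min_papers and len(sources) >= min_sources


cluster = [
    Mention("paper-a", "PMC"),
    Mention("paper-b", "PMC"),
    Mention("paper-c", "Zenodo"),
]
convergent = is_convergent(cluster)
```

Counting distinct papers and distinct sources separately prevents a single source that repeats the same gap many times from qualifying on its own.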

Phase 4 — Scale (In Progress)

Admin infrastructure and journal network expansion.

  • Admin interface — API sources, journal registry, reconcile log, discovery probe
  • OAI-PMH harvesting — per-journal endpoint, incremental sync
  • Journal submission queue — public request form
  • Globe sampling — server-side max 500 gaps prioritized by bridge potential
  • Export API — GET /gaps/export?format=csv|jsonl
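
The globe sampling and export items above can be sketched together. Only the 500-gap cap, the bridge-potential priority, and the `csv` and `jsonl` format names come from the roadmap; the record fields, function names, and column ordering below are assumptions, and no HTTP layer is shown.

```python
import csv
import heapq
import io
import json
from typing import Dict, List

MAX_GLOBE_GAPS = 500  # cap stated in the roadmap


def sample_for_globe(gaps: List[Dict], limit: int = MAX_GLOBE_GAPS) -> List[Dict]:
    """Keep at most `limit` gaps, highest bridge potential first."""
    return heapq.nlargest(limit, gaps, key=lambda g: g["bridge_potential"])


def export_gaps(gaps: List[Dict], fmt: str) -> str:
    """Serialize gaps as CSV or JSON Lines (one JSON object per line)."""
    if fmt == "jsonl":
        return "".join(json.dumps(g, sort_keys=True) + "\n" for g in gaps)
    if fmt == "csv":
        buf = io.StringIO()
        # Assumption: columns are the sorted union of all keys present.
        fields = sorted({key for g in gaps for key in g})
        writer = csv.DictWriter(buf, fieldnames=fields, lineterminator="\n")
        writer.writeheader()
        writer.writerows(gaps)
        return buf.getvalue()
    raise ValueError(f"unsupported format: {fmt!r}")


# Hypothetical data: 1000 gaps with increasing bridge potential.
all_gaps = [{"id": i, "bridge_potential": i / 1000} for i in range(1000)]
globe_gaps = sample_for_globe(all_gaps)
```

`heapq.nlargest` avoids sorting the full index when only the top 500 entries are needed, which matters more as the gap index grows.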

Phase 5 — Open Platform (Planned)

Infrastructure for a multi-source, multi-instance gap network.

  • Discipline ontology admin — configurable discipline graph, not hardcoded markers
  • Vector DB migration — pgvector or Qdrant for O(log n) similarity search at scale
  • OpenAlex, Crossref, PLOS, Springer Nature client implementations
  • Field mapping engine — DB-driven parse-time field resolution
  • Self-hosted packaging — Docker Compose, environment template, zero-config SQLite default
  • Stable push API v1 — versioned, documented, external system compatible
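
A zero-config SQLite default, as in the self-hosted packaging item above, usually means the application runs with no environment set and switches databases only when a variable is present. The sketch below shows that pattern; the variable name `GITGAP_DATABASE_URL` and the default file path are hypothetical, not documented settings.

```python
import os
from typing import Mapping, Optional

# Hypothetical default: a local SQLite file, so nothing needs to be configured.
DEFAULT_DATABASE_URL = "sqlite:///gitgap.db"


def database_url(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the configured database URL, falling back to SQLite."""
    env = os.environ if env is None else env
    # An unset or empty variable falls through to the default.
    return env.get("GITGAP_DATABASE_URL") or DEFAULT_DATABASE_URL


local_url = database_url({})
server_url = database_url({"GITGAP_DATABASE_URL": "postgresql://localhost/gitgap"})
```

Passing the environment in as a mapping keeps the function easy to test without touching the real process environment.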

Phase 6 — Federation (Proposed)

Multiple gitgap instances sharing a common gap index.

  • Instance registry — institutions run their own gitgap, contribute to shared index
  • Gap deduplication across instances — cosine-based merge with provenance preserved
  • Convergence across instances — agreed-upon gaps from multiple independent deployments
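
Cosine-based merging with provenance preserved could look like the greedy sketch below: a gap from any instance joins the first existing entry it is similar enough to, and its origin is appended to that entry instead of being discarded. Since this phase is only proposed, everything here is an assumption, including the 0.9 threshold, the record shape, and the greedy first-match strategy.

```python
import math
from dataclasses import dataclass, field
from typing import Iterable, List, Sequence, Tuple


@dataclass
class MergedGap:
    text: str
    embedding: List[float]
    # Every (instance, local_id) pair that contributed to this entry.
    provenance: List[Tuple[str, str]] = field(default_factory=list)


def _cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def merge_gaps(records: Iterable[Tuple[str, str, str, Sequence[float]]],
               threshold: float = 0.9) -> List[MergedGap]:
    """Greedy merge of (instance, local_id, text, embedding) records."""
    merged: List[MergedGap] = []
    for instance, local_id, text, embedding in records:
        for entry in merged:
            if _cosine(entry.embedding, embedding) >= threshold:
                entry.provenance.append((instance, local_id))
                break
        else:
            # No sufficiently similar entry exists, so start a new one.
            merged.append(MergedGap(text, list(embedding), [(instance, local_id)]))
    return merged


# Hypothetical records from two instances.
shared_index = merge_gaps([
    ("instance-a", "1", "gap about X", [1.0, 0.0]),
    ("instance-b", "7", "gap about X, reworded", [0.99, 0.05]),
    ("instance-b", "8", "gap about Y", [0.0, 1.0]),
])
```

The first matching entry keeps its own text and embedding, so the result depends on input order; a production design would likely need a more principled representative and an indexed similarity search.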

How to influence the roadmap

Priority is driven by gap index size, journal coverage, and community need. The current focus is journal network expansion (Phase 4): more journals mean more gaps, more convergence detection, and more structural holes surfaced.