Monorepo with two Python packages:
- maturity_tools: tools to assess data maturity.
- data_viewer: Streamlit UI to visualize maturity results.
They are separate so maturity_tools can be used as a dependency without pulling UI dependencies.
Runs as a small stack (Postgres + API + Streamlit viewer) via Docker Compose.
- Docker + Docker Compose
- Secret files (not committed): see secrets/README.md
docker compose up -d --build
The scheduler refreshes cached data, writes metrics snapshots, and runs summaries after each refresh.
Start it as soon as the stack is up so the database populates quickly and summaries stay current.
It needs GITHUB_TOKEN, API_KEY, and OPENAI_API_KEY (or their _FILE variants) when summaries are enabled.
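As a rough illustration of how a value and its _FILE variant might be resolved, here is a minimal sketch. It assumes the _FILE variable holds a path to a file containing the secret and that the plain variable wins when both are set. Neither detail is confirmed by this README, and `read_secret` and `missing_secrets` are hypothetical helpers, not part of this repository.

```python
import os
from pathlib import Path
from typing import List, Mapping, Optional

# Names taken from the README; whether all three are always required is not specified.
REQUIRED_SECRETS = ("GITHUB_TOKEN", "API_KEY", "OPENAI_API_KEY")


def read_secret(name: str, env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the secret `name`, falling back to the file named by `<name>_FILE`.

    Assumption: the plain variable takes precedence over the _FILE variant.
    """
    env = os.environ if env is None else env
    value = env.get(name)
    if value:
        return value
    file_path = env.get(f"{name}_FILE")
    if file_path:
        # Secret files usually end with a newline, so strip surrounding whitespace.
        return Path(file_path).read_text(encoding="utf-8").strip()
    return None


def missing_secrets(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """List the required secrets that are set neither directly nor via _FILE."""
    return [name for name in REQUIRED_SECRETS if read_secret(name, env) is None]
```

Running `missing_secrets()` before starting the scheduler profile gives a quick check that nothing has been left unset.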
docker compose --profile scheduler up -d --build
- Viewer: http://localhost:8501
- API docs: http://localhost:8000/docs
- Database:
  - POSTGRES_DB (default: maturity)
  - POSTGRES_USER (default: maturity)
  - POSTGRES_PASSWORD (default: maturity)
- Scheduler (profile scheduler only):
  - REFRESH_OWNERS=owner1,owner2 (recommended; falls back to DISTINGUISHED_OWNERS if unset)
  - REFRESH_REPO (optional, single repo name)
  - REFRESH_INTERVAL_DAYS (default: 7)
  - REFRESH_INTERVAL_SECONDS (optional override; mainly for testing)
  - FORCE_REFRESH (default: false)
  - RUN_SUMMARIES (default: true)
  - SUMMARY_BASE_URL (default: http://api:8000)
  - SUMMARY_MODEL, SUMMARY_HISTORY, SUMMARY_MAX_AGE_DAYS, SUMMARY_FORCE, SUMMARY_NO_STORE (optional)
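The scheduler options above can be read as a small settings object. The sketch below is not the scheduler's actual code. It only mirrors the documented defaults and fallbacks. `SchedulerSettings` and `load_scheduler_settings` are hypothetical names, and the accepted boolean spellings are an assumption.

```python
import os
from dataclasses import dataclass
from typing import List, Mapping, Optional

SECONDS_PER_DAY = 86400


def _as_bool(raw: str) -> bool:
    # Assumption: these spellings count as true; the README only shows true/false.
    return raw.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class SchedulerSettings:
    owners: List[str]
    repo: Optional[str]
    interval_seconds: int
    force_refresh: bool
    run_summaries: bool
    summary_base_url: str


def load_scheduler_settings(env: Optional[Mapping[str, str]] = None) -> SchedulerSettings:
    """Build settings from environment variables using the documented defaults."""
    env = os.environ if env is None else env
    # REFRESH_OWNERS falls back to DISTINGUISHED_OWNERS if unset.
    raw_owners = env.get("REFRESH_OWNERS") or env.get("DISTINGUISHED_OWNERS", "")
    owners = [owner.strip() for owner in raw_owners.split(",") if owner.strip()]
    # REFRESH_INTERVAL_SECONDS overrides REFRESH_INTERVAL_DAYS when present.
    override = env.get("REFRESH_INTERVAL_SECONDS")
    if override:
        interval_seconds = int(override)
    else:
        interval_seconds = int(env.get("REFRESH_INTERVAL_DAYS", "7")) * SECONDS_PER_DAY
    return SchedulerSettings(
        owners=owners,
        repo=env.get("REFRESH_REPO") or None,
        interval_seconds=interval_seconds,
        force_refresh=_as_bool(env.get("FORCE_REFRESH", "false")),
        run_summaries=_as_bool(env.get("RUN_SUMMARIES", "true")),
        summary_base_url=env.get("SUMMARY_BASE_URL", "http://api:8000"),
    )
```

With only REFRESH_OWNERS set, this yields a seven-day interval, summaries enabled, and the in-stack API address as the summary base URL.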
- Cache and metrics snapshots are stored in Postgres only.
- DATABASE_URL is required for non-Docker runs (SQLAlchemy + psycopg).
- Refresh cache/metrics: python scripts/refresh_cache.py --owner <org>
- Summaries are generated by scripts/summarize.py (or by the scheduler when RUN_SUMMARIES=true).
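For non-Docker runs, one way to assemble a DATABASE_URL from the POSTGRES_* values is sketched below. It assumes Postgres is reachable on localhost:5432 and that the project uses psycopg 3, for which SQLAlchemy 2.x uses the `postgresql+psycopg` scheme. If the project uses psycopg2 instead, the scheme would be `postgresql+psycopg2`. `build_database_url` is a hypothetical helper, not part of this repository.

```python
import os
from typing import Mapping, Optional
from urllib.parse import quote


def build_database_url(
    env: Optional[Mapping[str, str]] = None,
    host: str = "localhost",  # assumption: Postgres port published on the host
    port: int = 5432,         # assumption: default Postgres port
) -> str:
    """Compose a SQLAlchemy URL from POSTGRES_* variables, using the documented defaults."""
    env = os.environ if env is None else env
    user = env.get("POSTGRES_USER", "maturity")
    password = env.get("POSTGRES_PASSWORD", "maturity")
    database = env.get("POSTGRES_DB", "maturity")
    # Percent-encode each part so characters such as '@' or '/' in a password stay valid.
    return (
        f"postgresql+psycopg://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(database, safe='')}"
    )


if __name__ == "__main__":
    print(build_database_url())
```

The printed value can be exported as DATABASE_URL before calling scripts/refresh_cache.py or scripts/summarize.py outside Docker.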
- MkDocs site: docs/ (config in mkdocs.yml)
- Scripts: docs/scripts.md
- Storage/cache: docs/storage.md
- API: docs/api.md