The Humanitarian Intelligence System is an AI-driven platform designed to support flood risk prediction and humanitarian information retrieval. It combines geospatial data, document embedding search, and machine learning inference to help responders generate operational insights.
This repository contains the core backend, data modeling, and ML components. The frontend/ and infrastructure/ folders are intentionally left in place but are not described in detail here.
- Flood risk prediction: A supervised ML model evaluates flood risk probability from structured inputs such as rainfall, river level, population density, elevation, soil moisture, and prior flood history.
- Humanitarian document ingestion: Text reports are embedded and stored in PostgreSQL with vector search support to enable similarity-based retrieval.
- Situational reporting: Retrieved documents can be used as context for a language model to generate operational summaries, supporting humanitarian decision-making.
- Geospatial support: PostGIS is enabled in the database to support spatial data and future GIS services.
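To make the similarity-based retrieval in the list above more concrete, here is a minimal in-memory sketch. It is not the repository's implementation: the project stores embeddings in PostgreSQL and searches them with pgvector, and the distance metric it uses is not stated here, so cosine similarity is an assumption. The function names `cosine_similarity` and `top_k` and the toy 3-dimensional vectors are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float],
    documents: List[Tuple[str, Sequence[float]]],
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Return up to k (title, score) pairs, most similar first."""
    scored = [(title, cosine_similarity(query, emb)) for title, emb in documents]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 3-dimensional embeddings (hypothetical data); stored embeddings
# in this project have 384 dimensions.
docs = [
    ("Flood assessment", [0.9, 0.1, 0.0]),
    ("Shelter capacity", [0.0, 1.0, 0.0]),
    ("River gauge report", [0.8, 0.0, 0.2]),
]
results = top_k([1.0, 0.0, 0.0], docs, k=2)
```

In a database-backed setup the ranking would be done by the vector index rather than by scanning every document in Python; the sketch only shows what "most similar" means.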
The backend is built with:
- FastAPI for HTTP API endpoints
- SQLAlchemy with async support for database access
- PostgreSQL with PostGIS and pgvector extensions for spatial and vector storage
- SentenceTransformers for document embedding generation
- joblib for loading a trained flood prediction model and preprocessing pipeline
- app/main.py
  - Defines the FastAPI application
  - Registers API routers
  - Runs init_db() during lifespan startup
- app/api/flood_prediction.py
  - Exposes /api/predict/flood
  - Accepts flood prediction inputs and returns a risk probability
- app/db/init_db.py
  - Creates required database extensions (vector, postgis)
  - Creates database tables from SQLAlchemy models
- app/db/session.py
  - Configures the async SQLAlchemy engine and session maker
  - Connects to PostgreSQL using postgresql+asyncpg
- app/models/
  - Defines domain entities such as humanitarian documents, disaster events, regions, and infrastructure
  - Example: HumanitarianDocument stores text, title, and a 384-dimensional vector embedding
- app/schemas/flood_prediction.py
  - Pydantic models for request/response validation
- app/ml/inference/flood_predictor.py
  - Loads flood_model.pkl and preprocessing.pkl
  - Transforms structured input features and predicts flood probability
- app/ml/features/preprocessing.py
  - Contains preprocessing logic used during model training
- app/ml/training/train_flood_model.py
  - Training script for the flood risk model
- app/ml/data/generate_dataset.py
  - Generates dataset artifacts for training or experimentation
- app/services/embedding.py
  - Builds text embeddings using all-MiniLM-L6-v2
- app/services/document_ingestion.py
  - Ingests documents into the database with embeddings
- app/services/retrieval.py
  - Executes vector similarity search against stored document embeddings
- app/services/reporting_service.py
  - Uses retrieved document context to build a prompt for the language model
- app/services/llm_service.py
  - Placeholder service for generating summaries from prompts
- app/services/gis_service.py
  - Converts region geometry to GeoJSON for GIS integration
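The reporting step described above (retrieved documents become context for a language model prompt) could look roughly like the following. This is a sketch under stated assumptions, not the code in app/services/reporting_service.py: the `RetrievedDocument` dataclass, the `build_report_prompt` function, the `max_chars_per_doc` parameter, and the prompt wording are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RetrievedDocument:
    """Hypothetical stand-in for a row returned by the retrieval service."""
    title: str
    text: str


def build_report_prompt(
    question: str,
    documents: List[RetrievedDocument],
    max_chars_per_doc: int = 500,
) -> str:
    """Assemble a prompt that places retrieved documents ahead of the task."""
    if not documents:
        context = "(no documents retrieved)"
    else:
        sections = []
        for index, doc in enumerate(documents, start=1):
            # Truncate each excerpt so the prompt stays bounded in size.
            excerpt = doc.text.strip()[:max_chars_per_doc]
            sections.append(f"[{index}] {doc.title}\n{excerpt}")
        context = "\n\n".join(sections)
    return (
        "You are assisting humanitarian responders.\n"
        "Use only the context below to answer.\n\n"
        f"Context:\n{context}\n\n"
        f"Task: {question}\n"
    )


prompt = build_report_prompt(
    "Summarize current flood impacts.",
    [RetrievedDocument("Field report", "Water levels rose overnight near the bridge.")],
)
```

Numbering the excerpts gives the model a way to refer back to individual sources, and the explicit empty-context marker avoids sending a prompt that silently lacks evidence.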
The project can be launched with Docker Compose:
docker-compose.yaml defines:
- postgres: kartoza/postgis:18-3.6 with postgis and pgvector
- backend: builds from ./backend and exposes port 8000
The backend depends on the PostgreSQL service and waits for it to become healthy before starting.
- Backend application with REST API
- Database setup and async SQLAlchemy configuration
- ML model inference pipeline
- Document embedding and retrieval components
- Placeholder LLM summarization service
The frontend/ and infrastructure/ folders remain in the repository but are not documented in this README. They are intentionally left as separate concerns from the core backend and AI architecture described above.
- The LLM service currently returns a placeholder response and can be extended to integrate with a real language model API.
- The ML inference pipeline assumes saved model files under backend/app/ml/models/.
- Database connection values are configured for Docker Compose service names and credentials in backend/app/db/session.py.
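Because the connection values are tied to the Compose setup, running the backend elsewhere means changing them. One possible approach, shown only as a sketch, is to assemble the postgresql+asyncpg URL from environment variables with Compose-friendly defaults. The function name `build_database_url`, the variable names (`DB_USER`, `DB_PASSWORD`, `DB_HOST`, `DB_PORT`, `DB_NAME`), and the default values are assumptions for illustration and are not taken from the repository.

```python
import os
from typing import Mapping, Optional
from urllib.parse import quote


def build_database_url(env: Optional[Mapping[str, str]] = None) -> str:
    """Build a postgresql+asyncpg URL from environment-style settings."""
    env = os.environ if env is None else env
    # Variable names and defaults are hypothetical, not the repository's values.
    user = env.get("DB_USER", "postgres")
    password = env.get("DB_PASSWORD", "postgres")
    host = env.get("DB_HOST", "postgres")  # a Compose service name acts as hostname
    port = env.get("DB_PORT", "5432")
    name = env.get("DB_NAME", "humanitarian")
    # Percent-encode credentials so characters like '@' or '/' cannot break the URL.
    return (
        f"postgresql+asyncpg://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{name}"
    )


# Passing a mapping instead of reading os.environ keeps the function easy to test.
url = build_database_url({"DB_PASSWORD": "p@ss/word", "DB_HOST": "localhost"})
```

The resulting string could then be handed to the async engine configuration in place of a hard-coded value.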