beki-6/humanitarian_intelligence_system

Humanitarian Intelligence System

Project Overview

The Humanitarian Intelligence System is an AI-driven platform designed to support flood risk prediction and humanitarian information retrieval. It combines geospatial data, document embedding search, and machine learning inference to help responders generate operational insights.

This repository contains the core backend, data modeling, and ML components. The frontend/ and infrastructure/ folders are intentionally left in place but are not described in detail here.

Key Concepts

  • Flood risk prediction: A supervised ML model evaluates flood risk probability from structured inputs such as rainfall, river level, population density, elevation, soil moisture, and prior flood history.
  • Humanitarian document ingestion: Text reports are embedded and stored in PostgreSQL with vector search support to enable similarity-based retrieval.
  • Situational reporting: Retrieved documents can be used as context for a language model to generate operational summaries, supporting humanitarian decision-making.
  • Geospatial support: PostGIS is enabled in the database to support spatial data and future GIS services.
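Similarity-based retrieval can be illustrated without a database. The sketch below ranks toy vectors by cosine similarity in pure Python. It is only an illustration of the idea: in this project the search runs inside PostgreSQL through pgvector, and the distance metric the repository actually uses is not stated here. The function names and sample documents are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def top_k(query, documents, k=3):
    """Return the k (title, score) pairs most similar to the query vector."""
    scored = [(title, cosine_similarity(query, vec)) for title, vec in documents.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 3-dimensional vectors (hypothetical); the stored embeddings are 384-dimensional.
docs = {
    "flood report": [0.9, 0.1, 0.0],
    "drought bulletin": [0.0, 0.2, 0.9],
    "river gauge update": [0.8, 0.3, 0.1],
}
results = top_k([1.0, 0.0, 0.0], docs, k=2)
print(results)
```

With the toy data, the two flood-related entries rank ahead of the drought bulletin because their vectors point in nearly the same direction as the query.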

Architecture

Backend

The backend is built with:

  • FastAPI for HTTP API endpoints
  • SQLAlchemy with async support for database access
  • PostgreSQL with PostGIS and pgvector extensions for spatial and vector storage
  • SentenceTransformers for document embedding generation
  • joblib for loading a trained flood prediction model and preprocessing pipeline

Services and Modules

  • app/main.py

    • Defines the FastAPI application
    • Registers API routers
    • Runs init_db() during lifespan startup
  • app/api/flood_prediction.py

    • Exposes /api/predict/flood
    • Accepts flood prediction inputs and returns a risk probability
  • app/db/init_db.py

    • Creates required database extensions (vector, postgis)
    • Creates database tables from SQLAlchemy models
  • app/db/session.py

    • Configures the async SQLAlchemy engine and session maker
    • Connects to PostgreSQL using postgresql+asyncpg
  • app/models/

    • Defines domain entities such as humanitarian documents, disaster events, regions, and infrastructure
    • Example: HumanitarianDocument stores text, title, and a 384-dimensional vector embedding (the output size of all-MiniLM-L6-v2)
  • app/schemas/flood_prediction.py

    • Pydantic models for request/response validation
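The request/response validation described above can be sketched as follows. The repository uses Pydantic models; this example swaps in standard-library dataclasses so it runs without dependencies. All field names, units, and value ranges are assumptions derived from the inputs listed under Key Concepts, not the actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FloodPredictionInput:
    """Hypothetical request shape; field names and units are assumptions."""
    rainfall_mm: float
    river_level_m: float
    population_density: float
    elevation_m: float
    soil_moisture: float      # assumed to be a fraction in [0, 1]
    prior_flood_count: int

    def __post_init__(self):
        # Reject values that cannot describe a physical measurement.
        if self.rainfall_mm < 0:
            raise ValueError("rainfall_mm must be non-negative")
        if self.population_density < 0:
            raise ValueError("population_density must be non-negative")
        if not 0.0 <= self.soil_moisture <= 1.0:
            raise ValueError("soil_moisture must be between 0 and 1")
        if self.prior_flood_count < 0:
            raise ValueError("prior_flood_count must be non-negative")


@dataclass(frozen=True)
class FloodPredictionResponse:
    """Hypothetical response shape carrying the predicted probability."""
    risk_probability: float

    def __post_init__(self):
        if not 0.0 <= self.risk_probability <= 1.0:
            raise ValueError("risk_probability must be between 0 and 1")


sample = FloodPredictionInput(
    rainfall_mm=120.0,
    river_level_m=4.2,
    population_density=850.0,
    elevation_m=35.0,
    soil_moisture=0.7,
    prior_flood_count=2,
)
print(sample)
```

Validating at the boundary keeps malformed inputs away from the model, which is the same role the Pydantic schemas play for /api/predict/flood.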

AI / ML Pipeline

  • app/ml/inference/flood_predictor.py

    • Loads flood_model.pkl and preprocessing.pkl
    • Transforms structured input features and predicts flood probability
  • app/ml/features/preprocessing.py

    • Contains preprocessing logic used during model training
  • app/ml/training/train_flood_model.py

    • Training script for the flood risk model
  • app/ml/data/generate_dataset.py

    • Generates dataset artifacts for training or experimentation
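The inference flow (transform the structured inputs, then predict a probability) might look roughly like the sketch below. It assumes the loaded objects follow the scikit-learn convention of `transform` and `predict_proba`, and that the flood class sits at index 1. The feature order, function name, and stub classes are hypothetical; the stubs stand in for the objects that would be loaded from flood_model.pkl and preprocessing.pkl.

```python
# Assumed feature order; the real order is defined by the training script.
FEATURE_ORDER = [
    "rainfall",
    "river_level",
    "population_density",
    "elevation",
    "soil_moisture",
    "prior_flood_history",
]


def predict_flood_probability(features, preprocessor, model):
    """Return the probability of the positive (flood) class for one sample."""
    row = [[features[name] for name in FEATURE_ORDER]]  # one-sample batch
    transformed = preprocessor.transform(row)
    probabilities = model.predict_proba(transformed)
    # Assumes two classes with the flood class at index 1.
    return float(probabilities[0][1])


class IdentityPreprocessor:
    """Stub for the saved preprocessing pipeline."""

    def transform(self, rows):
        return rows


class ConstantModel:
    """Stub for the saved classifier; always predicts a 75% flood risk."""

    def predict_proba(self, rows):
        return [[0.25, 0.75] for _ in rows]


sample = {
    "rainfall": 120.0,
    "river_level": 4.2,
    "population_density": 850.0,
    "elevation": 35.0,
    "soil_moisture": 0.7,
    "prior_flood_history": 1,
}
probability = predict_flood_probability(sample, IdentityPreprocessor(), ConstantModel())
print(probability)
```

In the real service, the two stubs would be replaced by the objects returned from `joblib.load(...)`; keeping the function independent of how they are loaded makes it easy to test.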

Document & Vector Services

  • app/services/embedding.py

    • Builds text embeddings using all-MiniLM-L6-v2
  • app/services/document_ingestion.py

    • Ingests documents into the database with embeddings
  • app/services/retrieval.py

    • Executes vector similarity search against stored document embeddings
  • app/services/reporting_service.py

    • Uses retrieved document context to build a prompt for the language model
  • app/services/llm_service.py

    • Placeholder service for generating summaries from prompts
  • app/services/gis_service.py

    • Converts region geometry to GeoJSON for GIS integration
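As a rough illustration of how the reporting and LLM services fit together, the sketch below builds a prompt from retrieved documents and passes it to a placeholder generator. The function names, prompt wording, and document dictionary shape are assumptions for illustration, not the repository's actual code.

```python
def build_sitrep_prompt(request, documents, max_chars=500):
    """Assemble a prompt from retrieved documents (each a dict with title and text)."""
    sections = []
    for index, doc in enumerate(documents, start=1):
        excerpt = doc["text"][:max_chars]  # cap each excerpt to bound prompt size
        sections.append(f"[{index}] {doc['title']}\n{excerpt}")
    context = "\n\n".join(sections) if sections else "(no documents retrieved)"
    return (
        "You are assisting humanitarian responders.\n"
        "Using only the context below, write a concise situation report.\n\n"
        f"Context:\n{context}\n\n"
        f"Request: {request}\n"
    )


def generate_summary(prompt):
    """Placeholder generator: returns a fixed-form string instead of calling a model."""
    return f"[placeholder summary for a prompt of {len(prompt)} characters]"


retrieved = [
    {"title": "Flood update", "text": "River levels rose sharply overnight."},
    {"title": "Shelter status", "text": "Two shelters are at capacity."},
]
prompt = build_sitrep_prompt("Summarize current flood impact.", retrieved)
print(generate_summary(prompt))
```

Numbering the excerpts gives a future language model integration a simple way to refer back to its sources.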

Deployment

The project can be launched with Docker Compose:

  • docker-compose.yaml defines:
    • postgres: kartoza/postgis:18-3.6 with postgis and pgvector
    • backend: builds from ./backend and exposes port 8000

The backend depends on the PostgreSQL service and waits for it to become healthy before starting.

What’s Included

  • Backend application with REST API
  • Database setup and async SQLAlchemy configuration
  • ML model inference pipeline
  • Document embedding and retrieval components
  • Placeholder LLM summarization service

What Is Not Covered Here

The frontend/ and infrastructure/ folders remain in the repository but are not documented in this README. They are intentionally left as separate concerns from the core backend and AI architecture described above.

Notes

  • The LLM service currently returns a placeholder response and can be extended to integrate with a real language model API.
  • The ML inference pipeline assumes saved model files under backend/app/ml/models/.
  • Database connection values are configured for Docker Compose service names and credentials in backend/app/db/session.py.
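For reference, a connection URL for the postgresql+asyncpg driver could be assembled as in the sketch below. The environment variable names, default credentials, and database name are hypothetical; only the driver prefix and the `postgres` service name come from this README. The repository itself configures these values directly in backend/app/db/session.py.

```python
import os
from urllib.parse import quote_plus


def build_database_url(env=None):
    """Build an async SQLAlchemy URL from environment-style settings."""
    env = os.environ if env is None else env
    user = env.get("POSTGRES_USER", "postgres")          # assumed default
    password = env.get("POSTGRES_PASSWORD", "postgres")  # assumed default
    host = env.get("POSTGRES_HOST", "postgres")          # Docker Compose service name
    port = env.get("POSTGRES_PORT", "5432")
    name = env.get("POSTGRES_DB", "humanitarian")        # assumed database name
    # Percent-encode credentials so characters such as "@" do not break the URL.
    return (
        f"postgresql+asyncpg://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{name}"
    )


url = build_database_url({"POSTGRES_PASSWORD": "p@ss"})
print(url)
```

Inside the Compose network, the host is the service name rather than localhost, which is why the default host above is `postgres`.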

About

A RandomForestClassifier model trained on synthetic climate data, built to demonstrate a full stack: scikit-learn (machine learning), PostgreSQL with PostGIS and pgvector (geospatial and vector data), LLMs and sentence-transformers (semantic search, RAG, and SitRep generation), and FastAPI with SQLAlchemy (ingestion server).
