Description
Task Details
Create a new top-level backend-ai/ service within the monorepo to isolate AI inference logic from the existing backend/ API service. The new service must have its own dependency graph, Dockerfile, and CI pipeline. It must not share internal modules with backend/ and must communicate strictly over HTTP using a defined API contract. The goals are to reduce the backend image size, isolate ML dependencies, and prepare for independent scaling and, where feasible, future GPU deployment.
Steps to Complete
- Create `backend-ai/` directory at repository root.
- Initialize separate `pyproject.toml` for the AI service.
- Implement minimal FastAPI application with health endpoint.
- Define initial inference endpoints (e.g., `/summarize`, `/compare`, `/embed`).
- Add independent Dockerfile for `backend-ai`.
- Update `docker-compose.yml` to include the `backend-ai` service.
- Configure internal networking between `backend` and `backend-ai`.
- Add environment variables for service URL configuration in `backend`.
- Implement async HTTP client in `backend` for AI service calls.
- Add CI workflow for `backend-ai` (lint, test, build).
- Ensure no cross-imports between `backend` and `backend-ai`.
- Validate container builds independently.
- Update documentation to describe service boundaries and deployment model.
Additional Notes
The AI service must remain stateless and should not directly access the primary database. All schema sharing should occur through explicit request/response models rather than shared internal modules. The architecture must allow future extraction into a separate repository with minimal refactoring.
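One way to keep schema sharing explicit is for each service to define its own Pydantic request/response models, so the JSON wire format, not shared Python code, is the contract. The field names below are illustrative, not a finalized schema.

```python
from pydantic import BaseModel


class SummarizeRequest(BaseModel):
    # Duplicated (not imported) in both services; only the JSON shape
    # is shared, never the Python module.
    text: str
    max_tokens: int = 256  # hypothetical tuning parameter


class SummarizeResponse(BaseModel):
    summary: str
```

Because neither service imports the other's models, either side can evolve independently as long as the serialized JSON stays compatible, which is exactly the property needed for a low-friction repository split later.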