AGENTS.md — Instructions for AI Coding Agents

Project: ARGUS — Automated Registry of Global User-accessible Streams

This file describes the rules and constraints that AI coding agents (Claude, Copilot, etc.) must follow when modifying this repository.

Architecture summary

argus/
├── backend/          Python 3.12, FastAPI, SQLite/PostgreSQL persistence
│   ├── config.py     Single source of truth for all tuneable values
│   ├── models/       camera.py (Camera, ScrapeProgress, ScrapeSourceConfig)
│   │                 ai.py     (QueueStatus, QueueProgressEvent)
│   ├── utils/        http.py (HTTP session factory with retry logic)
│   ├── db/           models.py (ORM tables), init_db.py, session.py
│   ├── ai/           Optional AI layer (disabled by default)
│   ├── sources/      Source plugins (see "Sources system" below)
│   └── api/          routes.py, store.py, app.py
└── frontend/         React 18, TypeScript, Vite, Leaflet, lucide-react
    └── src/
        ├── types/    Mirrors backend Pydantic models exactly
        ├── utils/    api.ts — all fetch calls; nothing else calls fetch
        ├── hooks/    useScrape.ts, useUtcClock.ts, useAI.ts, useAnalysisQueue.ts
        └── components/

Persistence

ARGUS uses SQLite by default (file argus.db, created automatically in the backend working directory). PostgreSQL is supported via DATABASE_URL. The CameraStore (api/store.py) maintains an in-memory mirror of the database for fast reads; all writes go to both memory and DB.

The store uses a two-phase init pattern:

  • CameraStore() — safe to construct without DB (import time, unit tests).
  • store.connect(engine) — called once in the FastAPI lifespan. Idempotent: if tests already connected the store, the lifespan call is a no-op.

Do not assume the store is in-memory only — it has a full SQLite/PostgreSQL persistence layer.
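The two-phase pattern above can be sketched as follows. This is a minimal illustration, not the code in api/store.py: the `connected` property, the boolean return value of `connect()`, and the internal attribute names are assumptions made for the example.

```python
from typing import Any, Optional


class CameraStore:
    """Illustrative two-phase store; the real class lives in api/store.py."""

    def __init__(self) -> None:
        # Phase 1: no database access, so construction is safe at import time.
        self._engine: Optional[Any] = None
        self._cameras: dict[str, dict[str, Any]] = {}

    @property
    def connected(self) -> bool:
        return self._engine is not None

    def connect(self, engine: Any) -> bool:
        """Phase 2: attach the engine once. Returns False on repeat calls."""
        if self._engine is not None:
            return False  # already connected (e.g. by a test fixture): no-op
        self._engine = engine
        # Assumption: the real store would load existing rows into memory here.
        return True


store = CameraStore()              # import time: nothing touches the DB
first = store.connect(object())    # lifespan startup attaches the engine
second = store.connect(object())   # a later call is ignored
```

Because the second call is ignored, a test fixture and the FastAPI lifespan can both call `connect()` without coordinating with each other.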

Sources system

ARGUS uses a plugin-style sources system. Each source is a self-contained Python module (or package) in backend/sources/ that knows how to fetch camera data from one origin and emit it in ARGUS's standard Camera format.

Autodiscovery

backend/sources/__init__.py scans the directory at runtime. Any .py file or package directory whose name does not start with _ is treated as a source. Files/directories starting with _ are always skipped (templates, private helpers).

Sources are loaded via discover_sources() (returns metadata) and load_source(id) (returns the module). A broken source file never crashes the API — it is skipped with a warning.
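As a rough sketch of the discovery rules described above (skip names starting with _, never let a broken file propagate an exception), something like the following would work. The function names `candidate_sources` and `load_or_skip` are hypothetical and are not the real `discover_sources()` / `load_source()` implementations; requiring `__init__.py` inside a package directory is also an assumption.

```python
import importlib.util
import logging
import tempfile
from pathlib import Path
from types import ModuleType
from typing import Optional


def candidate_sources(sources_dir: Path) -> list[tuple[str, Path]]:
    """Return (name, file) pairs, skipping entries whose name starts with '_'."""
    found: list[tuple[str, Path]] = []
    for entry in sorted(sources_dir.iterdir()):
        if entry.name.startswith("_"):
            continue  # __init__.py, _template.py, __pycache__, private helpers
        if entry.is_file() and entry.suffix == ".py":
            found.append((entry.stem, entry))
        elif entry.is_dir() and (entry / "__init__.py").is_file():
            found.append((entry.name, entry / "__init__.py"))
    return found


def load_or_skip(name: str, path: Path) -> Optional[ModuleType]:
    """Import one source file; log a warning and return None if it fails."""
    # Packages that use relative imports need extra setup, omitted here.
    spec = importlib.util.spec_from_file_location(name, path)
    if spec is None or spec.loader is None:
        return None
    module = importlib.util.module_from_spec(spec)
    try:
        spec.loader.exec_module(module)
    except Exception as exc:  # a broken source must not take the API down
        logging.warning("skipping source %s: %s", name, exc)
        return None
    return module


# Demo against a throwaway directory that mimics backend/sources/.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "__init__.py").write_text("")
    (root / "_template.py").write_text("SOURCE_ID = 'template'\n")
    (root / "demo.py").write_text("SOURCE_ID = 'demo'\n")
    (root / "broken.py").write_text("def (:\n")  # deliberate syntax error
    (root / "insecam").mkdir()
    (root / "insecam" / "__init__.py").write_text("SOURCE_ID = 'insecam'\n")

    pairs = candidate_sources(root)
    names = [name for name, _ in pairs]
    loaded = {name: load_or_skip(name, path) for name, path in pairs}
    loaded_ids = sorted(m.SOURCE_ID for m in loaded.values() if m is not None)
```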

Source contract

Every source module must define these module-level attributes:

| Attribute | Type | Description |
|---|---|---|
| SOURCE_ID | str | Unique slug (e.g. "insecam") |
| SOURCE_NAME | str | Human-readable display name |
| SOURCE_DESCRIPTION | str | One-line description shown in the UI |
| SOURCE_ENABLED_DEFAULT | bool | Pre-selected in the boot screen |
| REQUIRED_ENV_VARS | list[str] | Env vars that must be set for the source to function |

Every source module must also define:

def scrape(
    on_progress: Callable[[ScrapeProgress], None],
    on_camera: Callable[[Camera], None],
) -> None: ...

See backend/sources/_template.py for the full documented contract.
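For orientation, a source that satisfies the contract might look roughly like the sketch below. The `Camera` and `ScrapeProgress` classes here are stand-ins so the example runs on its own; the real models live in backend/models/camera.py and their fields may differ (in particular, the `message` field is an assumption). The stream URL is a placeholder.

```python
from dataclasses import dataclass
from typing import Callable


# Stand-ins for the shared models; field names other than id and
# stream_url are assumptions made for this sketch.
@dataclass
class Camera:
    id: str
    stream_url: str


@dataclass
class ScrapeProgress:
    message: str


# Required module-level metadata.
SOURCE_ID: str = "demo"
SOURCE_NAME: str = "Demo source"
SOURCE_DESCRIPTION: str = "Emits one hard-coded camera for illustration"
SOURCE_ENABLED_DEFAULT: bool = False
REQUIRED_ENV_VARS: list[str] = []


def scrape(
    on_progress: Callable[[ScrapeProgress], None],
    on_camera: Callable[[Camera], None],
) -> None:
    """Report progress and emit cameras through the two callbacks."""
    on_progress(ScrapeProgress(message="starting"))
    # Raw ID only: the orchestrator adds the "demo:" prefix later.
    on_camera(Camera(id="demo-001", stream_url="http://example.invalid/stream"))
    on_progress(ScrapeProgress(message="done"))


# Usage: collect whatever the source emits.
progress: list[ScrapeProgress] = []
cameras: list[Camera] = []
scrape(progress.append, cameras.append)
```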

Adding a source

  1. Copy backend/sources/_template.py to backend/sources/<name>.py (simple) or backend/sources/<name>/__init__.py (with helpers).
  2. Fill in the metadata attributes and implement scrape().
  3. If the source needs credentials, list the env var names in REQUIRED_ENV_VARS and document them in the source file. Users set them in .env.
  4. Restart the backend — the source appears automatically in the boot screen.

Self-contained sources

Each source is responsible for all of its own logic. Do not place source-specific parsing, HTTP helpers, utilities, or models in backend/utils/ or backend/models/. If a source needs helpers, put them in private modules alongside it (prefixed with _, e.g. sources/mysource/_parse.py). This ensures the source can be removed by deleting its file/directory without leaving orphaned code.

backend/utils/http.py is the only shared utility — it provides a generic HTTP session builder with retry logic that any source may use. backend/models/ contains only models shared across sources and the API (Camera, ScrapeProgress, ScrapeSourceConfig, QueueStatus, QueueProgressEvent). Source-specific data types (e.g. _CountryMeta in the Insecam source) belong in the source package.

Existing source: Insecam

backend/sources/insecam/ — scrapes insecam.org. No credentials required.

backend/sources/insecam/
├── __init__.py    Source entry point + scraping orchestration
└── _parse.py      Insecam-specific HTML parsing (private to this source)

_CountryMeta (a simple dataclass-style class representing one entry from the country list endpoint) is defined in sources/insecam/__init__.py — not in backend/models/.

Camera ID namespacing

Sources set camera.id to whatever identifier their origin uses (e.g. Insecam's own numeric ID "847392"). The orchestrator in routes.py wraps every on_camera callback to prepend the source ID before the camera enters the store:

stored id = "<source_id>:<raw_id>"   e.g. "insecam:847392"  "demo:demo-001"

This is enforced in _run_scrape via _make_camera_adder. Sources must never include their own namespace prefix — the orchestrator owns that. The frontend displays camera.id as-is, so users see the namespaced form.
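The wrapping step can be pictured with the following sketch. It is not the actual `_make_camera_adder` from routes.py; the signature, the frozen stand-in `Camera`, and the use of `dataclasses.replace` are assumptions chosen to keep the example self-contained.

```python
from dataclasses import dataclass, replace
from typing import Callable


@dataclass(frozen=True)
class Camera:  # stand-in for the shared Camera model
    id: str


def make_camera_adder(
    source_id: str, add: Callable[[Camera], None]
) -> Callable[[Camera], None]:
    """Wrap `add` so every camera is stored under '<source_id>:<raw_id>'."""

    def on_camera(camera: Camera) -> None:
        add(replace(camera, id=f"{source_id}:{camera.id}"))

    return on_camera


stored: list[Camera] = []
on_camera = make_camera_adder("insecam", stored.append)
on_camera(Camera(id="847392"))  # the source passes its raw ID only
```

If a source added its own prefix, the stored ID would come out as "insecam:insecam:847392", which is why the prefix belongs to the orchestrator alone.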

Stream proxy

GET /api/stream/proxy?url=<encoded-url> — pipes a camera's HTTP stream through the server so browsers connecting over HTTPS are never blocked by mixed-content rules.

Security: only URLs that already exist in the camera store (camera.stream_url) are accepted. Any other URL returns 403. This prevents the endpoint from being used as an open HTTP relay (SSRF).
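The allowlist rule amounts to an exact-match lookup before any outbound request is made. A minimal sketch, assuming a hypothetical helper `check_proxy_url` and a plain set of known stream URLs (the real route reads them from the camera store):

```python
from http import HTTPStatus


def check_proxy_url(url: str, known_stream_urls: set[str]) -> HTTPStatus:
    """Allow only URLs already stored as some camera's stream_url."""
    if url in known_stream_urls:
        return HTTPStatus.OK
    # Anything else is refused before the server opens a connection.
    return HTTPStatus.FORBIDDEN


known = {"http://203.0.113.7/mjpg/video.mjpg"}  # placeholder address
allowed = check_proxy_url("http://203.0.113.7/mjpg/video.mjpg", known)
refused = check_proxy_url("http://169.254.169.254/latest/meta-data", known)
```

Matching on the full stored URL rather than on host or prefix means a caller cannot reach other paths on the same host.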

Frontend layout

The frontend has two parallel layout trees switched by useIsMobile() (breakpoint: 1100 px):

  • Desktop (≥ 1100 px): 5-column CSS grid — CameraList | sep | MapView | sep | DetailPanel — with resizable drag handles.
  • Mobile (< 1100 px): MobileLayout — full-screen tabbed view (MAP / LIST / DETAIL) with a fixed bottom MobileTabBar and simplified MobileTopBar. The three panels are reused unchanged.

Both layouts share identical state from useScrape. The BootOverlay and RescrapeModal are position: fixed overlays shared by both.

frontend/src/components/mobile/ contains only the mobile shell components. Do not put business logic there.

AI system (backend/ai/)

ARGUS includes an optional, opt-in AI layer. When AI_ENABLED=false (the default), all /ai/* routes return 503 and the frontend hides all AI UI. No AI calls are made and no API keys are required.

Package layout

backend/ai/
├── __init__.py   Exports ai_service and analysis_queue singletons
├── client.py     Lazy AsyncOpenAI singleton factory (reads config)
├── prompts.py    All system prompts as module-level string constants
├── service.py    AIService — analyze_scene(), brief_camera()
└── queue.py      AnalysisQueue singleton — all scene analysis jobs

Design invariants

  • All AI calls go through ai_service — no route, hook, or utility instantiates an OpenAI client directly.
  • All scene analysis goes through analysis_queue — whether triggered for a single camera from the Detail Panel or in bulk. There is no direct scene-analysis SSE endpoint.
  • Intelligence briefs are streamed directly via POST /cameras/{id}/ai/intel (read-only; no queue needed).
  • Prompts live in prompts.py — never inlined in service methods. Auditable, diffable, easy to iterate.
  • AI-generated results are persisted on the Camera record (scene_description, scene_analyzed_at, intel_brief, intel_brief_generated_at) with ISO 8601 UTC timestamps.
  • Disabling AI hides action buttons, not data. When ai_enabled=false, the ANALYZE FEED / INTELLIGENCE BRIEF / BULK ANALYZE controls are hidden, but any previously generated scene descriptions and intelligence briefs remain visible in the Detail Panel. Never gate data display on ai_enabled.
  • fetch_frame rejects partial JPEGs. A buffer with SOI (\xff\xd8) but no EOI (\xff\xd9) is returned as None — incomplete frames must never be forwarded to the vision model.
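The partial-JPEG rule in the last bullet can be illustrated with a small marker check. This is a sketch, not the real `fetch_frame`: the helper name `complete_jpeg_or_none` is hypothetical, and it takes the first EOI after the SOI, which is a simplification (a JPEG with an embedded thumbnail contains an earlier EOI of its own).

```python
from typing import Optional

JPEG_SOI = b"\xff\xd8"  # start-of-image marker
JPEG_EOI = b"\xff\xd9"  # end-of-image marker


def complete_jpeg_or_none(buffer: bytes) -> Optional[bytes]:
    """Return the first SOI..EOI span in `buffer`, or None if incomplete."""
    start = buffer.find(JPEG_SOI)
    if start == -1:
        return None  # no frame start at all
    end = buffer.find(JPEG_EOI, start + len(JPEG_SOI))
    if end == -1:
        return None  # SOI without EOI: partial frame, reject
    return buffer[start : end + len(JPEG_EOI)]


whole = complete_jpeg_or_none(b"junk\xff\xd8pixels\xff\xd9tail")
partial = complete_jpeg_or_none(b"\xff\xd8pixels-but-no-end")
```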

AI config (backend/config.py)

| Variable | Default | Purpose |
|---|---|---|
| AI_ENABLED | false | Master switch; all AI routes return 503 when false |
| OPENAI_API_KEY | "" | API key for any OpenAI-compatible provider |
| OPENAI_BASE_URL | https://api.openai.com/v1 | Override for local models (Ollama, etc.) |
| AI_SCENE_MODEL | gpt-4o-mini | Vision model for scene analysis |
| AI_INTEL_MODEL | gpt-4.1-mini | Text model for intelligence briefs |
| AI_QUEUE_WORKERS | 3 | Concurrent worker threads for scene analysis |

Adding a new AI feature

  1. Add the method to AIService in service.py (async generator for streaming, or async for one-shot).
  2. Add the system prompt constant to prompts.py.
  3. Add any new model/config vars to config.py with a sane default.
  4. Wire the route in api/routes.py — use _require_ai() guard.
  5. Add the API function to frontend/src/utils/api.ts.
  6. If the feature needs bulk / queued processing, extend AnalysisQueue in queue.py.

No structural changes are needed — the central client, config, and SSE pattern handle everything.

AnalysisQueue design

  • Thread-safe in-memory queue backed by collections.deque + dict[str, _Entry].
  • Worker threads call asyncio.run() to run async AI calls; each thread owns its own event loop.
  • Re-adding a pending or processing camera is a no-op. Re-adding a done or failed camera re-queues it (retry/re-analysis semantics).
  • Every state change emits a QueueProgressEvent to all SSE subscribers via asyncio.Queue.put_nowait.
  • Workers are started and stopped gracefully — a stop signal lets each worker finish its current job before exiting.
  • ai.service and api.store are imported lazily inside _worker_loop to prevent circular imports at module load time.
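The re-add semantics and the thread-plus-`asyncio.run()` arrangement can be sketched together. `MiniQueue`, `fake_analyze`, and `worker` are hypothetical names for this example; the real AnalysisQueue also emits progress events and waits for a stop signal rather than exiting when the queue is empty.

```python
import asyncio
import threading
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class _Entry:
    camera_id: str
    status: str = "pending"  # pending | processing | done | failed


class MiniQueue:
    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._order: deque[str] = deque()
        self._entries: dict[str, _Entry] = {}

    def add(self, camera_id: str) -> bool:
        """Enqueue; no-op (False) if the camera is pending or processing."""
        with self._lock:
            entry = self._entries.get(camera_id)
            if entry is not None and entry.status in ("pending", "processing"):
                return False
            self._entries[camera_id] = _Entry(camera_id)  # new or re-queued
            self._order.append(camera_id)
            return True

    def take(self) -> Optional[str]:
        with self._lock:
            if not self._order:
                return None
            camera_id = self._order.popleft()
            self._entries[camera_id].status = "processing"
            return camera_id

    def finish(self, camera_id: str, ok: bool) -> None:
        with self._lock:
            self._entries[camera_id].status = "done" if ok else "failed"


async def fake_analyze(camera_id: str) -> str:
    await asyncio.sleep(0)  # stands in for the async AI call
    return f"scene for {camera_id}"


def worker(queue: MiniQueue, results: dict[str, str]) -> None:
    # asyncio.run() creates and closes an event loop owned by this thread.
    while (camera_id := queue.take()) is not None:
        results[camera_id] = asyncio.run(fake_analyze(camera_id))
        queue.finish(camera_id, ok=True)


queue = MiniQueue()
first_add = queue.add("insecam:847392")
duplicate_add = queue.add("insecam:847392")  # still pending: no-op

results: dict[str, str] = {}
thread = threading.Thread(target=worker, args=(queue, results))
thread.start()
thread.join()

re_add = queue.add("insecam:847392")  # done: re-queued for re-analysis
```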

Frontend AI hooks

| Hook | Location | Purpose |
|---|---|---|
| useAnalysisQueue | hooks/useAnalysisQueue.ts | Instantiated once in App.tsx; subscribes to queue SSE, tracks per-camera states, re-fetches updated camera records on completion |
| useAI | hooks/useAI.ts | Per-camera intel brief hook; resets on camera change; streams via fetch + ReadableStream |

useAnalysisQueue must be instantiated in App.tsx (not inside panels or modals) so the SSE connection persists across component tree changes.

addCamerasToQueue auto-starts workers after successfully enqueuing cameras (POST /ai/queue/start is idempotent). This ensures single-camera analysis from the Detail Panel works without the user opening BulkAnalysisModal. The explicit start/stop controls in BulkAnalysisModal are still the right place for manual worker management.


Invariants — never violate these

  1. No new dependencies without justification. Every new package requires a comment explaining why the existing stdlib or an already-present package cannot be used.
  2. No parsing, source-specific logic, or source-specific models in backend/utils/ or backend/models/. Source helpers and models belong in the source's own package.
  3. No fetch calls outside frontend/src/utils/api.ts. Components receive data via props or hooks.
  4. No magic numbers. All tuneable values belong in backend/config.py (backend) or as named constants at the top of the relevant module (frontend).
  5. Sources are read-only. They must never authenticate, submit forms, or write to any external service.
  6. Type annotations are required on all public Python functions.
  7. TypeScript strict mode is enabled. Do not use any or @ts-ignore.

Before writing code

  • Read the file you are about to modify. Do not guess at its current contents.
  • If the task requires a new file, identify whether it belongs in a source package, utils/, a component, a hook, or a test.
  • If the task changes the Camera, ScrapeProgress, or SourceMeta model, update the backend definition (backend/models/camera.py or backend/sources/__init__.py, whichever defines the model) and frontend/src/types/index.ts in the same change.

Testing requirements

  • All new backend logic must have a corresponding test in tests/unit/ or tests/integration/.
  • Tests must not make real HTTP calls. Patch utils.http.safe_get, or drive the app in-process through httpx's ASGITransport.
  • pytest.ini sets pythonpath = backend — do not add sys.path hacks in test files.
  • tests/conftest.py sets DATABASE_URL=sqlite:///:memory: — do not repeat this in individual test files.
  • Run python -m pytest tests/ -v from repo root and confirm 0 failures before finishing.
  • Run cd frontend && npm run build and confirm 0 TypeScript errors before finishing.

What not to do

  • Do not add a different database engine. SQLite and PostgreSQL are the two supported options.
  • Do not add authentication. This is a local OSINT tool.
  • Do not add React Router. The app is a single view.
  • Do not replace Leaflet with another map library.
  • Do not add a linter config — formatting is handled by consistent style, not tooling.
  • Do not remove the scraper_page_delay guard.
  • Do not put source-specific code or models in backend/utils/ or backend/models/.