A Python CLI tool that classifies photos in Immich using Vision Language Models (VLMs). It fetches images via the Immich API, sends them to any OpenAI-compatible VLM for structured classification, and stores results in a local SQLite database for querying and export.
- Universal VLM support - Works with vLLM, Ollama, OpenAI, and any OpenAI-compatible API via structured output (`response_format`)
- Customizable schema - Define your own classification fields (category, tags, quality, NSFW, etc.) by subclassing `BasePrompt`, or use built-in presets for common tasks
- AI-assisted prompt generation - Describe your goal in natural language and let a strong LLM generate the prompt config for you
- Async & concurrent - Built on `asyncio` + `httpx` with configurable concurrency via a semaphore
- Resumable tasks - Every result is persisted immediately; pause/resume without losing progress
- Graceful Ctrl+C - Interrupt a running task to pause it; resume later from where it stopped
- Flexible export - Query results with dynamic JSON field filtering; output as table, JSON, or CSV
- Type-safe - Full type annotations passing Pyright strict mode with zero errors
- Python 3.11+
- uv (recommended) or pip
- A running Immich server
- An OpenAI-compatible VLM API endpoint (vLLM, Ollama, OpenAI, etc.)
```shell
git clone https://github.com/your-username/immich-classify.git
cd immich-classify
uv sync
```

Copy the example environment file and fill in your values:
```shell
cp .env.example .env
```

| Variable | Description | Default |
|---|---|---|
| `IMMICH_API_URL` | Immich server URL | required |
| `IMMICH_API_KEY` | Immich API key | required |
| `VLM_API_URL` | OpenAI-compatible API base URL | `http://localhost:8000/v1` |
| `VLM_API_KEY` | VLM API key | `no-key` |
| `VLM_MODEL_NAME` | Model name (empty = server default) | |
| `CLASSIFY_DB_PATH` | SQLite database path | `./classify.db` |
| `CLASSIFY_CONCURRENCY` | Max concurrent image processing | `1` |
| `CLASSIFY_TIMEOUT` | VLM request timeout in seconds | `60` |
| `CLASSIFY_IMAGE_SIZE` | `thumbnail` or `original` | `thumbnail` |
| `CLASSIFY_DEFAULT_PROMPT` | Default prompt config `.py` file (see path resolution) | built-in `ClassificationPrompt` |
Environment variables override .env file values.
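For reference, a minimal `.env` might look like this (every value below is an illustrative placeholder, not a real endpoint or key):

```
IMMICH_API_URL=https://immich.example.com
IMMICH_API_KEY=your-immich-api-key
VLM_API_URL=http://localhost:8000/v1
VLM_API_KEY=no-key
CLASSIFY_CONCURRENCY=4
```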
```shell
# 1. List albums to find the target
immich-classify albums

# 2. Debug with a small batch first
immich-classify debug --album <album_id> --count 5

# 3. Run full classification
immich-classify classify --album <album_id>

# 4. Check progress
immich-classify status --task <task_id>

# 5. View results
immich-classify results --task <task_id> --filter category=people --format table

# 6. Export for review
immich-classify results --task <task_id> --format csv > results.csv
```

```shell
immich-classify albums
```
List all Immich albums with ID, name, and asset count.
```shell
immich-classify classify --album <id> [--album <id2>] [--prompt-config <file>] [--concurrency <n>]
```
Create and run a classification task. Supports multiple albums.
`<file>` is resolved against the current working directory first, then the built-in prompts directory.
```shell
immich-classify debug --album <id> [--count <n>] [--prompt-config <file>]
```
Run a small debug batch (default 10) and print results. No database writes.
```shell
immich-classify generate --goal <description> [--output <file.py>] [--api-url <url>] [--api-key <key>] [--model <name>]
```
AI-generate a prompt config from a natural language task description.
```shell
immich-classify status [--task <task_id>]
```
Show all tasks, or detailed progress for a specific task.
```shell
immich-classify results --task <id> [--filter <key=value>]... [--format json|csv|table]
```
Query classification results with optional field filtering.
```shell
immich-classify pause --task <id>    # Pause a running task
immich-classify resume --task <id>   # Resume a paused task
immich-classify cancel --task <id>   # Cancel a task (keeps existing results)
```
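Filtering works because results are stored as JSON and queried with SQLite's `json_extract()` (see the design decisions below). A minimal, self-contained sketch of that idea; the table layout and column names here are illustrative, not the tool's actual schema:

```python
import json
import sqlite3

# In-memory database standing in for classify.db (schema is illustrative)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (asset_id TEXT, data TEXT)")
rows = [
    ("a1", {"category": "people", "has_smile": True}),
    ("a2", {"category": "landscape", "has_smile": False}),
]
conn.executemany(
    "INSERT INTO results VALUES (?, ?)",
    [(aid, json.dumps(d)) for aid, d in rows],
)

# A CLI filter like `--filter category=people` becomes a json_extract() predicate,
# so new classification fields need no schema migration
matches = conn.execute(
    "SELECT asset_id FROM results WHERE json_extract(data, '$.category') = ?",
    ("people",),
).fetchall()
print(matches)  # [('a1',)]
```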
Create a Python file that subclasses `BasePrompt` with a `@register_prompt` decorator:
```python
# my_schema.py
from dataclasses import dataclass, field

from immich_classify.prompt_base import BasePrompt, SchemaField, register_prompt


@register_prompt
@dataclass
class MyPrompt(BasePrompt):
    name: str = "my_custom"
    system_prompt: str = (
        "You are a photo organizer. Classify the image into the given schema. "
        "Output ONLY valid JSON."
    )
    user_prompt: str = (
        "Classify this image according to the following schema:\n"
        "{schema_description}\n\n"
        "Output a JSON object with the specified fields."
    )
    schema: dict[str, SchemaField] = field(default_factory=lambda: {
        "scene": SchemaField(
            field_type="string",
            description="Scene type",
            enum=["indoor", "outdoor", "studio", "unknown"],
        ),
        "people_count": SchemaField(
            field_type="int",
            description="Number of people visible",
        ),
        "is_screenshot": SchemaField(
            field_type="bool",
            description="Whether the image is a screenshot",
        ),
        "tags": SchemaField(
            field_type="list[string]",
            description="Descriptive tags",
        ),
    })


prompt = MyPrompt()
```

Then use it:
```shell
immich-classify classify --album <id> --prompt-config my_schema.py
```

Two built-in prompts are provided under `src/immich_classify/prompts/`:
| Class | `name` | Fields | Use case |
|---|---|---|---|
| `ClassificationPrompt` | `classification` | category, quality, tags | General image classification (default) |
| `ForegroundPeoplePrompt` | `foreground_people` | foreground_count, detection_confidence, etc. | Count foreground people |
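Under the hood, these prompts are enforced through `response_format: { type: "json_schema" }`. A rough sketch of what such a request fragment could look like for the default classification fields; the field definitions here are illustrative and the tool's actual generated schema may differ:

```python
import json

# Illustrative JSON Schema properties (not the tool's real SchemaField output)
fields = {
    "category": {"type": "string", "description": "High-level category"},
    "quality": {"type": "string", "description": "Subjective quality rating"},
    "tags": {"type": "array", "items": {"type": "string"}, "description": "Descriptive tags"},
}

# OpenAI-compatible structured-output fragment sent alongside the chat request
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "classification",
        "schema": {
            "type": "object",
            "properties": fields,
            "required": list(fields),
        },
    },
}
print(json.dumps(response_format, indent=2))
```

Because the server validates against this schema, the model's reply is guaranteed to be parseable JSON rather than free-form text.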
When no `--prompt-config` is specified, the CLI uses `ClassificationPrompt` by default. You can change the default by setting `CLASSIFY_DEFAULT_PROMPT` in `.env`:
```shell
# Use a custom prompt as the default (no need for --prompt-config every time)
# Just the filename is enough — built-in prompts are found automatically
CLASSIFY_DEFAULT_PROMPT=foreground_people.py
```

Both `--prompt-config` and `CLASSIFY_DEFAULT_PROMPT` use the same search order:
- Current working directory — the path is tried as-is (absolute or relative to cwd).
- Built-in prompts directory — `src/immich_classify/prompts/` inside the package.
This means you can refer to built-in prompts by bare filename (e.g. `foreground_people.py`) without specifying the full path, while custom prompts in your project directory take priority if they share the same name.
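The two-step search order can be sketched like this (the function name is illustrative, and the built-in directory is passed in explicitly for clarity; the real CLI locates it inside the installed package):

```python
from pathlib import Path


def resolve_prompt_config(name: str, builtin_dir: Path) -> Path:
    """Resolve a prompt config path: cwd/absolute first, then the built-in dir.

    `builtin_dir` stands in for src/immich_classify/prompts/ in the package.
    """
    candidate = Path(name)
    if candidate.exists():
        # Custom prompts in the project directory take priority
        return candidate
    builtin = builtin_dir / Path(name).name
    if builtin.exists():
        return builtin
    raise FileNotFoundError(f"Prompt config not found: {name}")
```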
You can subclass `BasePrompt` to create your own prompt for any task:
```python
# smile_check.py
from dataclasses import dataclass, field

from immich_classify.prompt_base import BasePrompt, SchemaField, register_prompt


@register_prompt
@dataclass
class SmileDetectionPrompt(BasePrompt):
    name: str = "smile_detection"
    system_prompt: str = (
        "You are a facial expression analysis assistant. "
        "Analyze the given image for people and their expressions. "
        "Output ONLY valid JSON, no other text."
    )
    user_prompt: str = (
        "Analyze facial expressions in this image:\n"
        "{schema_description}\n\n"
        "Output a JSON object."
    )
    schema: dict[str, SchemaField] = field(default_factory=lambda: {
        "has_people": SchemaField(field_type="bool", description="Whether the image contains people"),
        "has_smile": SchemaField(field_type="bool", description="Whether anyone is smiling", default=False),
    })


prompt = SmileDetectionPrompt()
```

```shell
# Classify and then filter for smiling photos
immich-classify classify --album <id> --prompt-config smile_check.py
immich-classify results --task <id> --filter has_smile=true
```

Don't want to write a schema by hand? Use the `generate` command to let a strong LLM create one from a natural-language description:
```shell
# Generate and preview a prompt config
immich-classify generate --goal "determine whether the people in the photo are smiling"

# Generate and export to a file
immich-classify generate --goal "pick out landscape photos that contain no people" --output landscape_filter.py

# Use a different (stronger) model for generation
immich-classify generate \
  --goal "classify food photos by cuisine type and presentation quality" \
  --output food_classifier.py \
  --api-url https://api.openai.com/v1 \
  --api-key sk-... \
  --model gpt-4o
```

The typical workflow is generate → test → refine → run:
```shell
# 1. Generate a prompt config
immich-classify generate --goal "find photos with cats" --output cat_finder.py

# 2. Test with a small batch
immich-classify debug --album <id> --prompt-config cat_finder.py --count 5

# 3. Edit cat_finder.py if needed (it's a standard Python file)

# 4. Run full classification
immich-classify classify --album <id> --prompt-config cat_finder.py

# 5. Query results
immich-classify results --task <id> --filter has_cat=true
```

```
src/immich_classify/
├── config.py             # Config dataclass, .env loading, validation
├── prompt_base.py        # BasePrompt base class, SchemaField & prompt registry
├── prompts/
│   ├── classification.py     # ClassificationPrompt - general image classification (default)
│   └── foreground_people.py  # ForegroundPeoplePrompt - foreground people detection
├── prompt_generator.py   # AI-assisted prompt config generation & export
├── immich_client.py      # Async Immich API client (httpx)
├── vlm_client.py         # Async OpenAI-compatible VLM client (httpx)
├── database.py           # Async SQLite layer (aiosqlite)
├── engine.py             # Task execution engine (asyncio + semaphore)
├── cli.py                # CLI entry point and subcommand handlers
└── __main__.py           # python -m entry point
```
Key design decisions:
- Base / implementation separation - `BasePrompt` in `prompt_base.py` provides schema tooling, JSON (de)serialization, and the `register_prompt` decorator. Concrete prompts live under `prompts/` and are discovered via the registry, making it easy to add AI-generated prompts as new files.
- SQLite with `json_extract()` - Classification fields are fully dynamic. Results are stored as JSON and queried with SQLite's JSON functions, so no schema migration is needed when fields change.
- Structured Output - Uses `response_format: { type: "json_schema" }` to enforce valid JSON output from the VLM, rather than fragile regex parsing.
- Robust response parsing - Automatically handles models that wrap JSON in markdown code blocks or prepend chain-of-thought reasoning before the JSON payload (common with Qwen, LLaMA, and other local models).
- Per-asset persistence - Each image result is committed immediately. A crash or interrupt loses at most the in-flight images, not the entire batch.
- Asset deduplication - When classifying multiple albums, assets appearing in more than one album are automatically deduplicated.
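The engine's concurrency model (asyncio plus a semaphore, driven by `CLASSIFY_CONCURRENCY`) can be sketched roughly like this; the function names and the sleep placeholder are illustrative, not the engine's actual API:

```python
import asyncio


async def classify_one(asset_id: str, sem: asyncio.Semaphore) -> str:
    """Stand-in for the real fetch-image -> call-VLM -> persist-result pipeline."""
    async with sem:  # at most `concurrency` VLM calls in flight at once
        await asyncio.sleep(0.01)  # placeholder for the actual async I/O
        return f"{asset_id}: done"


async def run_task(asset_ids: list[str], concurrency: int) -> list[str]:
    # CLASSIFY_CONCURRENCY maps onto the semaphore's initial value
    sem = asyncio.Semaphore(concurrency)
    return await asyncio.gather(*(classify_one(a, sem) for a in asset_ids))


results = asyncio.run(run_task([f"asset-{i}" for i in range(5)], concurrency=2))
print(results)
```

Because each task commits its result as soon as it finishes, interrupting the loop loses at most the in-flight images.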
```shell
# Install dev dependencies
uv sync

# Run tests
uv run pytest

# Run type checker (strict mode)
uv run pyright src/immich_classify/
```

102 tests covering all modules:
| Module | Tests | Coverage |
|---|---|---|
| `config.py` | 8 | Validation, env loading, defaults, missing fields |
| `prompt_base.py` + `prompts/` | 20 | Schema generation, JSON schema, serialization roundtrip, registry |
| `prompt_generator.py` | 6 | Export to Python, AI generation with mock, error handling |
| `database.py` | 11 | CRUD, filtering with json_extract, deduplication |
| `immich_client.py` | 5 | Album listing, asset filtering, image download |
| `vlm_client.py` | 24 | Success, API errors, invalid JSON, structured output, markdown stripping, mixed-content extraction |
| `engine.py` | 9 | Concurrency, error continuation, pause/resume, dedup |
| `cli.py` | 19 | Argument parsing, filter parsing, multi-album |
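The markdown-stripping and mixed-content extraction behavior exercised by the `vlm_client.py` tests can be sketched like this (a deliberately simplified approach; the client's actual parser may differ):

```python
import json
import re
from typing import Any


def extract_json(text: str) -> dict[str, Any]:
    """Pull a JSON object out of a model reply that may wrap it in a
    markdown code fence or prepend free-form reasoning before the payload."""
    # 1. Prefer a fenced code block if one is present
    fence = re.search(r"```(?:json)?\s*(\{.*?\})\s*```", text, re.DOTALL)
    if fence:
        return json.loads(fence.group(1))
    # 2. Otherwise take the outermost {...} span in the text
    start, end = text.find("{"), text.rfind("}")
    if start != -1 and end > start:
        return json.loads(text[start : end + 1])
    raise ValueError("no JSON object found in model output")


print(extract_json('```json\n{"has_smile": true}\n```'))
print(extract_json('Let me think... the answer is {"has_smile": false}'))
```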
| Component | Choice | Rationale |
|---|---|---|
| Language | Python 3.11+ | Rapid iteration, rich async ecosystem |
| HTTP | httpx | Native async, connection pooling |
| Database | aiosqlite | Async SQLite, zero setup |
| Logging | loguru | Structured, colorful, zero config |
| CLI | argparse | Standard library, no extra dependency |
| Formatting | tabulate | Clean table output |
| Type checking | Pyright | Strict mode, zero errors |
| Package manager | uv | Fast, reliable, modern |
License: MIT — see the LICENSE file for details.
