A standalone Elixir service that runs text embedding inference via Bumblebee and serves an OpenAI-compatible API.
The `/v1/embeddings` endpoint is OpenAI-compatible and accepts single or batch input.

    {
      "input": "some text to embed",
      "model": "nomic-ai/nomic-embed-text-v1.5",
      "task_prefix": "search_document"
    }

The `task_prefix` field is an optional extension for Nomic models; it accepts `search_document` or `search_query`.
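
As a rough illustration of the request shape described above, the following Python sketch builds a request body for single or batch input. The helper name `build_embedding_request` is hypothetical, and sending a batch as a list of strings under `input` is an assumption based on the OpenAI embeddings convention; this README only states that batch input is accepted.

```python
import json
from typing import Optional, Sequence, Union

DEFAULT_MODEL = "nomic-ai/nomic-embed-text-v1.5"
# The two task prefixes this README names for Nomic models.
NOMIC_TASK_PREFIXES = {"search_document", "search_query"}


def build_embedding_request(
    text: Union[str, Sequence[str]],
    model: str = DEFAULT_MODEL,
    task_prefix: Optional[str] = None,
) -> dict:
    """Build the JSON body for the embeddings endpoint (hypothetical helper)."""
    # Assumption: a batch is sent as a list of strings under "input".
    payload = {"input": text if isinstance(text, str) else list(text), "model": model}
    if task_prefix is not None:
        # Client-side check against the documented values only.
        if task_prefix not in NOMIC_TASK_PREFIXES:
            raise ValueError(f"unsupported task_prefix: {task_prefix!r}")
        payload["task_prefix"] = task_prefix
    return payload


single = build_embedding_request("some text to embed", task_prefix="search_document")
batch = build_embedding_request(["first text", "second text"], task_prefix="search_query")
print(json.dumps(single, indent=2))
```

Omitting `task_prefix` leaves the field out of the body entirely, which matches its optional status.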
Lists available models.
Liveness check at `/health`. Returns `{"status": "ok"}`.
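
A monitoring script might interpret the health response roughly as follows. This is a minimal sketch: the function name `is_healthy` is hypothetical, and it assumes only the `{"status": "ok"}` body documented above.

```python
import json


def is_healthy(body: str) -> bool:
    """Return True only if the body is a JSON object whose "status" is "ok"."""
    try:
        parsed = json.loads(body)
    except json.JSONDecodeError:
        # Anything that is not valid JSON counts as unhealthy.
        return False
    return isinstance(parsed, dict) and parsed.get("status") == "ok"


print(is_healthy('{"status": "ok"}'))
```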
Requests require a bearer token when `EMBEDDING_API_TOKEN` is set:

    Authorization: Bearer <token>
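
The rule above can be sketched in Python as follows. This is not the service's actual implementation (the service is written in Elixir); the function name `is_authorized` is hypothetical, and treating an empty token as unset is an assumption.

```python
import hmac
from typing import Optional


def is_authorized(authorization_header: Optional[str], expected_token: Optional[str]) -> bool:
    """Decide whether a request passes the bearer-token check (illustrative only)."""
    # Authentication is disabled when no token is configured.
    # Assumption: an empty string is treated the same as unset.
    if not expected_token:
        return True
    if not authorization_header:
        return False
    scheme, _, presented = authorization_header.partition(" ")
    # HTTP authentication scheme names are case-insensitive.
    if scheme.lower() != "bearer":
        return False
    # Constant-time comparison avoids leaking token contents through timing.
    return hmac.compare_digest(presented.encode(), expected_token.encode())


print(is_authorized("Bearer s3cret", "s3cret"))
```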
| Variable | Default | Description |
|---|---|---|
| `PORT` | `8080` | HTTP listen port |
| `EMBEDDING_MODEL` | `nomic-ai/nomic-embed-text-v1.5` | HuggingFace model to load at boot |
| `EMBEDDING_API_TOKEN` | (none) | Bearer token for authentication (disabled when unset) |
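
To make the defaults in the table concrete, here is a small Python sketch of how they could be resolved from the environment. It only illustrates the documented semantics and is not the service's own configuration code; `Config` and `load_config` are hypothetical names.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Config:
    port: int
    model: str
    api_token: Optional[str]


def load_config(env: Mapping[str, str] = os.environ) -> Config:
    """Resolve settings from an environment mapping, applying the table's defaults."""
    return Config(
        port=int(env.get("PORT", "8080")),
        model=env.get("EMBEDDING_MODEL", "nomic-ai/nomic-embed-text-v1.5"),
        # No token means authentication is disabled.
        # Assumption: an empty value is treated as unset.
        api_token=env.get("EMBEDDING_API_TOKEN") or None,
    )


print(load_config({}))
```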

    mix deps.get
    mix run --no-halt

Test the API:

    curl -X POST http://localhost:8080/v1/embeddings \
      -H "Content-Type: application/json" \
      -d '{"input": "Hello world", "model": "nomic-ai/nomic-embed-text-v1.5"}'

Run tests:

    mix test

Launch the app on Fly.io:

    fly launch

Generate an API token:

    openssl rand -hex 32

Set it on the embedding service:

    fly secrets set EMBEDDING_API_TOKEN=<token> -a your-fly-app-name

Deploy:

    fly deploy

Test the deployed API:

    curl -X POST https://your-fly-app-name.fly.dev/v1/embeddings \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer $EMBEDDING_API_TOKEN" \
      -d '{"input": "How satisfied are you with this service?", "model": "nomic-ai/nomic-embed-text-v1.5", "task_prefix": "search_document"}'

Health check:

    curl https://your-fly-app-name.fly.dev/health
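
The authenticated request to the deployed service could also be scripted with Python's standard library, as in the sketch below. The helper names `make_embeddings_request` and `send` are hypothetical, `example-token` is a placeholder, and the response is assumed to be a JSON body; its exact fields are not described in this README.

```python
import json
import urllib.request
from typing import Optional


def make_embeddings_request(
    base_url: str, payload: dict, token: Optional[str] = None
) -> urllib.request.Request:
    """Build (but do not send) a POST request to the embeddings endpoint."""
    headers = {"Content-Type": "application/json"}
    if token:
        # Only needed when the service has EMBEDDING_API_TOKEN set.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and decode the body (assumed to be JSON)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


request = make_embeddings_request(
    "https://your-fly-app-name.fly.dev",
    {
        "input": "How satisfied are you with this service?",
        "model": "nomic-ai/nomic-embed-text-v1.5",
        "task_prefix": "search_document",
    },
    token="example-token",  # placeholder, not a real token
)
print(request.get_method(), request.full_url)
```

Building the request separately from sending it keeps the header and body logic easy to check without network access; call `send(request)` once the app is actually deployed.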