Communicate with any LLM provider through the Otari gateway.
Add to your Cargo.toml:
[dependencies]
otari = "0.1" # From crates.io (once published)
tokio = { version = "1", features = ["full"] }

Or install from GitHub directly:
[dependencies]
otari = { git = "https://github.com/mozilla-ai/otari-sdk-rust" }
tokio = { version = "1", features = ["full"] }

use otari::{completion, Message, CompletionOptions};

#[tokio::main]
async fn main() -> otari::Result<()> {
    let messages = vec![Message::user("Hello!")];
    let response = completion(
        "openai:gpt-4o-mini",
        messages,
        CompletionOptions::with_api_key("your-api-key")
            .api_base("http://localhost:8000"),
    ).await?;
    println!("{}", response.content().unwrap_or_default());
    Ok(())
}

- Rust 1.83 or newer
- A running Otari gateway instance
Set environment variables:
export OTARI_API_KEY="your-key-here"
export OTARI_API_BASE="http://localhost:8000"

Alternatively, pass the API key and base URL directly in your code:
let options = CompletionOptions::with_api_key("your-api-key")
    .api_base("http://localhost:8000");

The Otari gateway is a FastAPI-based proxy server that exposes an OpenAI-compatible API and routes requests to multiple upstream LLM providers. It adds enterprise-grade features:
- Budget Management - Enforce spending limits with automatic daily, weekly, or monthly resets
- API Key Management - Issue, revoke, and monitor virtual API keys without exposing provider credentials
- Usage Analytics - Track every request with full token counts, costs, and metadata
- Multi-tenant Support - Manage access and budgets across users and teams
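
To make the budget-reset behavior in the list above more concrete, here is a minimal, self-contained sketch of a spending limit that resets after a fixed period. It is not the gateway's implementation: `Budget`, `try_spend`, the integer-cents accounting, and the second-based timestamps are all assumptions made for illustration.

```rust
/// Hypothetical budget tracker; names and fields are illustrative only.
#[derive(Debug)]
struct Budget {
    limit_cents: u64,
    spent_cents: u64,
    period_secs: u64,  // e.g. 86_400 for a daily reset
    window_start: u64, // timestamp (seconds) at which the current window began
}

impl Budget {
    fn new(limit_cents: u64, period_secs: u64, now: u64) -> Self {
        Budget { limit_cents, spent_cents: 0, period_secs, window_start: now }
    }

    /// Records `cost_cents` if it fits in the current window's budget.
    /// Returns `false` (and records nothing) when the limit would be exceeded.
    fn try_spend(&mut self, now: u64, cost_cents: u64) -> bool {
        // Start a fresh window once the reset period has elapsed.
        if now.saturating_sub(self.window_start) >= self.period_secs {
            self.spent_cents = 0;
            self.window_start = now;
        }
        if self.spent_cents + cost_cents > self.limit_cents {
            return false;
        }
        self.spent_cents += cost_cents;
        true
    }
}

fn main() {
    const DAY: u64 = 86_400;
    let mut budget = Budget::new(500, DAY, 0); // limit of 5.00 per day
    println!("{}", budget.try_spend(10, 300)); // fits within the limit
    println!("{}", budget.try_spend(20, 300)); // rejected: would exceed the limit
    println!("{}", budget.try_spend(DAY + 1, 300)); // accepted: the window has reset
}
```

Weekly or monthly resets would only change `period_secs` in this sketch; a real gateway would likely align windows to calendar boundaries instead.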
docker run \
  -e GATEWAY_MASTER_KEY="your-secure-master-key" \
  -e OPENAI_API_KEY="your-api-key" \
  -p 8000:8000 \
  ghcr.io/mozilla-ai/any-llm/gateway:latest

Note: You can use a specific release version instead of latest (e.g., 1.2.0). See available versions.
Prefer a hosted experience? The Otari platform provides a managed control plane for keys, usage tracking, and cost visibility across providers, while still building on the same interfaces.
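
The configuration step earlier offers two ways to supply the gateway key and base URL: environment variables or explicit values. The sketch below shows one way an application might combine them before building its options. `resolve_setting` is a hypothetical helper that uses only the standard library, and the precedence order (explicit value, then environment variable, then default) is an assumption, not documented SDK behavior.

```rust
use std::env;

/// Hypothetical helper: prefer an explicit value, then a non-empty
/// environment variable, then an optional default.
fn resolve_setting(explicit: Option<&str>, env_name: &str, default: Option<&str>) -> Option<String> {
    explicit
        .map(str::to_owned)
        .or_else(|| env::var(env_name).ok().filter(|v| !v.is_empty()))
        .or_else(|| default.map(str::to_owned))
}

fn main() {
    // No explicit values here, so the environment (or the default) decides.
    let base = resolve_setting(None, "OTARI_API_BASE", Some("http://localhost:8000"));
    let key = resolve_setting(None, "OTARI_API_KEY", None);

    println!("base URL: {:?}", base);
    match key {
        Some(_) => println!("API key found"),
        None => eprintln!("OTARI_API_KEY is not set; pass the key explicitly instead"),
    }
}
```

The resolved strings could then be handed to `CompletionOptions::with_api_key(...)` and `.api_base(...)` as in the examples that follow.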
use otari::{completion, Message, CompletionOptions};
let messages = vec![
    Message::system("You are a helpful assistant."),
    Message::user("What is the capital of France?"),
];
let response = completion(
    "openai:gpt-4o-mini",
    messages,
    CompletionOptions::with_api_key("your-api-key")
        .api_base("http://localhost:8000"),
).await?;
println!("{}", response.content().unwrap_or_default());

Change the model string to route to different upstream providers through the gateway:
// OpenAI via gateway
let response = completion(
    "openai:gpt-4o", messages.clone(), options.clone()
).await?;
// Anthropic via gateway
let response = completion(
    "anthropic:claude-3-5-sonnet-latest", messages, options
).await?;

use otari::{completion_stream, Message, CompletionOptions, ChunkAccumulator};
use futures::StreamExt;
let messages = vec![Message::user("Tell me a story")];
let mut stream = completion_stream(
    "openai:gpt-4o-mini",
    messages,
    CompletionOptions::with_api_key("your-api-key")
        .api_base("http://localhost:8000"),
).await?;
let mut accumulator = ChunkAccumulator::new();
while let Some(chunk) = stream.next().await {
    let chunk = chunk?;
    if let Some(content) = chunk.content() {
        print!("{}", content);
    }
    accumulator.add(&chunk);
}
println!("\nTotal tokens: {:?}", accumulator.usage);

use otari::{completion, Message, CompletionOptions, Tool, ToolChoice};
use serde_json::json;
let weather_tool = Tool::function("get_weather", "Get the current weather")
    .parameters(json!({
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "City name"
            }
        },
        "required": ["location"]
    }))
    .build();
let messages = vec![Message::user("What's the weather in Paris?")];
let options = CompletionOptions::with_api_key("your-api-key")
    .api_base("http://localhost:8000")
    .tools(vec![weather_tool])
    .tool_choice(ToolChoice::auto());
let response = completion("openai:gpt-4o-mini", messages, options).await?;
if let Some(tool_calls) = &response.choices[0].message.tool_calls {
    for call in tool_calls {
        println!("Function: {}", call.function.name);
        println!("Arguments: {}", call.function.arguments);
    }
}

For models that support extended thinking:
use otari::{completion, Message, CompletionOptions, ReasoningEffort};
let messages = vec![Message::user("Solve this step by step: What is 15% of 240?")];
let options = CompletionOptions::with_api_key("your-api-key")
    .api_base("http://localhost:8000")
    .reasoning_effort(ReasoningEffort::Medium)
    .max_tokens(16000);
let response = completion(
    "anthropic:claude-sonnet-4-20250514",
    messages,
    options,
).await?;
// Access reasoning content
if let Some(reasoning) = &response.choices[0].message.reasoning {
    println!("Thinking: {}", reasoning.content);
}
println!("Answer: {}", response.content().unwrap_or_default());

The Otari client exposes a moderation method that calls POST /v1/moderations and returns an OpenAI-compatible response:
use otari::{Config, ModerationInput, ModerationParams, Otari, OtariError};
# async fn example() -> otari::Result<()> {
let client = Otari::from_config(Config::default())?;
let resp = client
    .moderation(
        ModerationParams::new(
            "openai:omni-moderation-latest",
            ModerationInput::Text("hurt someone".into()),
        )
        .with_user("user_123"),
    )
    .await?;
if resp.results[0].flagged {
    println!("unsafe input");
}
# Ok(())
# }

Only upstream providers with moderation support will succeed; others return OtariError::Unsupported { provider, operation: "moderation" } (or "multimodal_moderation" when the request used image parts).
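
Because unsupported providers fail with a structured error rather than a generic one, a caller can fall back to another provider. The sketch below shows that pattern with a local stand-in enum that only mirrors the `Unsupported { provider, operation }` shape described above. `SketchError`, `moderate_with_fallback`, `fake_attempt`, and the `example` provider name are hypothetical, and nothing here calls the otari crate.

```rust
/// Local stand-in for the error shape described above (not the crate's type).
#[derive(Debug, PartialEq)]
enum SketchError {
    Unsupported { provider: String, operation: &'static str },
    Other(String),
}

/// Tries each model in order, skipping providers that report the operation
/// as unsupported. Any other error stops the search immediately.
fn moderate_with_fallback<F>(models: &[&str], mut attempt: F) -> Result<(String, bool), SketchError>
where
    F: FnMut(&str) -> Result<bool, SketchError>,
{
    let mut last_unsupported = None;
    for &model in models {
        match attempt(model) {
            Ok(flagged) => return Ok((model.to_string(), flagged)),
            Err(e @ SketchError::Unsupported { .. }) => last_unsupported = Some(e),
            Err(other) => return Err(other),
        }
    }
    Err(last_unsupported.unwrap_or_else(|| SketchError::Other("no models given".into())))
}

/// Fake backend for the demo: only the "openai" provider supports moderation.
fn fake_attempt(model: &str) -> Result<bool, SketchError> {
    let provider = model.split(':').next().unwrap_or_default();
    if provider == "openai" {
        Ok(true)
    } else {
        Err(SketchError::Unsupported {
            provider: provider.to_string(),
            operation: "moderation",
        })
    }
}

fn main() {
    let models = ["example:some-model", "openai:omni-moderation-latest"];
    match moderate_with_fallback(&models, fake_attempt) {
        Ok((model, flagged)) => println!("{} answered, flagged = {}", model, flagged),
        Err(e) => eprintln!("moderation failed: {:?}", e),
    }
}
```

In real code the closure would wrap the client's moderation call and match on the crate's own error type instead of the stand-in.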
use otari::{Config, CreateBatchParams, Message, Otari};
# async fn example() -> otari::Result<()> {
let client = Otari::from_config(Config::default())?;
let params = CreateBatchParams::new(
    "openai:gpt-4o-mini",
    vec![
        ("req-1", vec![Message::user("Hello")]),
        ("req-2", vec![Message::user("World")]),
    ],
);
let batch = client.create_batch(params).await?;
println!("Batch ID: {}", batch.id);
# Ok(())
# }

use otari::{completion, OtariError};
match completion(model, messages, options).await {
    Ok(response) => println!("{}", response.content().unwrap_or_default()),
    Err(OtariError::RateLimit { provider, message }) => {
        eprintln!("Rate limited by {}: {}", provider, message);
    }
    Err(OtariError::Authentication { provider, message }) => {
        eprintln!("Auth failed for {}: {}", provider, message);
    }
    Err(e) => eprintln!("Error: {}", e),
}

The gateway supports all features through upstream providers:
| Feature | Supported |
|---|---|
| Completion | ✅ |
| Streaming | ✅ |
| Tools | ✅ |
| Images | ✅ |
| Reasoning | ✅ |
| ✅ | |
| Reranking | ✅ |
| Batch | ✅ |
| Moderation | ✅ |
- Simple, unified interface - Single function for all models, switch providers by changing the model string
- Developer friendly - Full Rust type safety with serde serialization and clear, actionable error messages
- Gateway-powered - Route to any upstream provider through a single gateway endpoint
- Async-first - Built on Tokio for high-performance async I/O
- Streaming support - Real-time token streaming with async streams
- Battle-tested - Based on the proven any-llm Python library
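
The first point above relies on model strings of the form `provider:model`, as used throughout the examples. As a rough illustration of that convention, the sketch below splits such a string at the first colon. `split_model` is a hypothetical helper and not part of the SDK; the gateway's actual parsing rules may differ.

```rust
/// Splits a "provider:model" string at the first colon.
/// Returns `None` when the colon or either part is missing.
fn split_model(spec: &str) -> Option<(&str, &str)> {
    let (provider, model) = spec.split_once(':')?;
    if provider.is_empty() || model.is_empty() {
        return None;
    }
    Some((provider, model))
}

fn main() {
    let specs = ["openai:gpt-4o-mini", "anthropic:claude-3-5-sonnet-latest", "gpt-4o"];
    for spec in specs {
        match split_model(spec) {
            Some((provider, model)) => println!("provider = {}, model = {}", provider, model),
            None => println!("{:?} is not in provider:model form", spec),
        }
    }
}
```

Validating the string up front like this lets an application reject a malformed model name before a request ever reaches the gateway.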
# Build
cargo build --all-features
# Run all checks
cargo fmt --check && cargo clippy --all-features -- -D warnings
# Run tests
cargo test --all-features
# Run the gateway example
cargo run --example gateway_completion
# Build docs
cargo doc --all-features --no-deps --open

- Full Documentation - Complete guides and API reference
- Gateway Documentation - Gateway setup and deployment
- Python SDK - The full Python SDK with direct provider access
- Otari Platform (Beta) - Hosted control plane for key management, usage tracking, and cost visibility
We welcome contributions from developers of all skill levels! Please see our Contributing Guide or open an issue to discuss changes.
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.