# Check health status
curl http://localhost:8080/health
# Proxy a request to Anthropic
curl -X POST http://localhost:8080/v1/anthropic/v1/messages \
-H "Content-Type: application/json" \
-H "anthropic-version: 2023-06-01" \
-d '{
"model": "claude-sonnet-4-20250514",
"messages": [{"role": "user", "content": "Hello!"}],
"max_tokens": 100
}'
# Proxy a request to OpenAI
curl -X POST http://localhost:8080/v1/openai/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4o-mini",
"messages": [{"role": "user", "content": "Hello!"}]
}'
# List all accounts
curl http://localhost:8080/api/accounts
# View dashboard
open http://localhost:8080ccflare provides a RESTful HTTP API for managing accounts, monitoring usage, and proxying requests to Anthropic and OpenAI. The API runs on port 8080 by default and requires no authentication.
http://localhost:8080
All API responses are in JSON format with Content-Type: application/json.
Check the health status of the ccflare service.
Response:
{
"status": "ok",
"accounts": 5,
"timestamp": "2024-12-17T10:30:45.123Z",
"strategy": "session",
"providers": ["anthropic", "openai"]
}Example:
curl http://localhost:8080/healthProxy requests to upstream provider APIs. The /v1/{provider} prefix is stripped exactly once before forwarding upstream. Requests are routed using the configured load balancing strategy across accounts matching the target provider.
Supported Providers:
/v1/anthropic/*→https://api.anthropic.com/*/v1/openai/*→https://api.openai.com/v1/*
Headers:
- All standard provider API headers are supported
Authorizationheader is managed by ccflare (no need to provide)
Request Body: Same as the upstream provider API requirements for the specific endpoint.
Response: Proxied response from the upstream provider API, including streaming responses.
Automatic Failover: If a request fails or an account is rate limited, ccflare automatically retries with the next available account according to the configured load balancing strategy. This ensures high availability and reliability.
Examples:
# Anthropic
curl -X POST http://localhost:8080/v1/anthropic/v1/messages \
-H "Content-Type: application/json" \
-H "anthropic-version: 2023-06-01" \
-d '{
"model": "claude-sonnet-4-20250514",
"messages": [{"role": "user", "content": "Hello!"}],
"max_tokens": 100
}'
# OpenAI chat completions
curl -X POST http://localhost:8080/v1/openai/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4o-mini",
"messages": [{"role": "user", "content": "Hello!"}]
}'
# OpenAI responses
curl -X POST http://localhost:8080/v1/openai/responses \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4o",
"input": "Hello!"
}'Compatibility routes keep the client-facing Anthropic/OpenAI schema while routing
through a connected provider family chosen from the model prefix.
Model Prefix Rules:
openai/<model-id>→ preferscodex, thenopenaianthropic/<model-id>→ prefersclaude-code, thenanthropic
Bare model names are rejected with 400.
Examples:
# Ask for an Anthropic model through the OpenAI chat schema
curl -X POST http://localhost:8080/v1/ccflare/openai/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "anthropic/claude-sonnet-4",
"messages": [{"role": "user", "content": "Hello!"}]
}'
# Ask for an OpenAI model through the Anthropic Messages schema
curl -X POST http://localhost:8080/v1/ccflare/anthropic/messages \
-H "Content-Type: application/json" \
-d '{
"model": "openai/gpt-4o-mini",
"max_tokens": 100,
"messages": [{"role": "user", "content": "Hello!"}]
}'Behavior:
- HTTP only; websocket upgrades are not supported on
/v1/ccflare/* - requests use the same load-balancing and failover logic as native provider routes
- responses are translated back into the requested client schema, including SSE streams
- if no usable accounts exist in the requested family, ccflare returns
503
List all configured accounts with their current status.
Response:
[
{
"id": "uuid-here",
"name": "account1",
"provider": "anthropic",
"auth_method": "api_key",
"base_url": null,
"requestCount": 150,
"totalRequests": 1500,
"lastUsed": "2024-12-17T10:25:30.123Z",
"created": "2024-12-01T08:00:00.000Z",
"weight": 1,
"paused": false,
"tokenStatus": "valid",
"tokenExpiresAt": null,
"rateLimitStatus": {
"code": "ok",
"isLimited": false,
"until": null
},
"rateLimitReset": "2024-12-17T10:30:00.000Z",
"rateLimitRemaining": 100,
"sessionInfo": {
"active": true,
"startedAt": "2024-12-17T10:00:00.000Z",
"requestCount": 25
}
}
]Example:
curl http://localhost:8080/api/accountsOAuth and auth endpoints are provider-scoped. The {provider} path segment determines which provider's OAuth flow is used (e.g., anthropic, openai, claude-code, codex).
Initialize an OAuth flow for adding a new account.
Request:
{
"name": "myaccount"
}Response:
{
"success": true,
"message": "OAuth flow initiated for 'myaccount'",
"data": {
"authUrl": "https://claude.ai/oauth/authorize?...",
"sessionId": "uuid-here",
"provider": "claude-code"
}
}Example:
curl -X POST http://localhost:8080/api/auth/claude-code/init \
-H "Content-Type: application/json" \
-d '{"name": "myaccount"}'Complete the OAuth flow after user authorization.
Request:
{
"sessionId": "uuid-from-init-response",
"code": "authorization-code-from-oauth"
}Response:
{
"success": true,
"message": "Account 'myaccount' added successfully!",
"data": {
"provider": "claude-code"
}
}Example:
curl -X POST http://localhost:8080/api/auth/claude-code/complete \
-H "Content-Type: application/json" \
-d '{"sessionId": "uuid-here", "code": "auth-code"}'Check the status of an in-progress OAuth session.
Example:
curl http://localhost:8080/api/auth/session/uuid-here/statusBrowser redirect target for the OAuth flow. This is the callback URL that the OAuth provider redirects to after the user authorizes. Not called directly by API consumers.
Remove an account.
Response:
{
"success": true,
"message": "Account 'account-name' removed successfully"
}Example:
curl -X DELETE http://localhost:8080/api/accounts/uuid-hereUpdate an account (e.g., rename or change base_url).
Request:
{
"name": "new-name",
"base_url": "https://custom-endpoint.example.com"
}Response:
{
"success": true,
"message": "Account updated"
}Example:
curl -X PATCH http://localhost:8080/api/accounts/uuid-here \
-H "Content-Type: application/json" \
-d '{"name": "new-name"}'Rename an account.
Request:
{
"name": "new-name"
}Example:
curl -X POST http://localhost:8080/api/accounts/uuid-here/rename \
-H "Content-Type: application/json" \
-d '{"name": "new-name"}'Pause an account temporarily.
Response:
{
"success": true,
"message": "Account 'myaccount' paused"
}Example:
curl -X POST http://localhost:8080/api/accounts/uuid-here/pauseResume a paused account.
Response:
{
"success": true,
"message": "Account 'myaccount' resumed"
}Example:
curl -X POST http://localhost:8080/api/accounts/uuid-here/resumeGet overall usage statistics.
Response:
{
"totalRequests": 5000,
"successRate": 98.5,
"activeAccounts": 4,
"avgResponseTime": 1250.5,
"totalTokens": 1500000,
"totalCostUsd": 125.50,
"avgTokensPerSecond": null,
"topModels": [
{"model": "claude-3-opus-20240229", "count": 3000},
{"model": "claude-3-sonnet-20240229", "count": 2000}
]
}Example:
curl http://localhost:8080/api/statsReset all usage statistics.
Response:
{
"success": true,
"message": "Statistics reset successfully"
}Example:
curl -X POST http://localhost:8080/api/stats/resetGet recent request summary.
Query Parameters:
limit- Number of requests to return (default: 50)
Response:
[
{
"id": "request-uuid",
"timestamp": "2024-12-17T10:30:45.123Z",
"method": "POST",
"path": "/v1/anthropic/v1/messages",
"accountUsed": "account1",
"statusCode": 200,
"success": true,
"errorMessage": null,
"responseTimeMs": 1234,
"failoverAttempts": 0,
"model": "claude-sonnet-4-20250514",
"promptTokens": 50,
"completionTokens": 100,
"totalTokens": 150,
"inputTokens": 50,
"outputTokens": 100,
"cacheReadInputTokens": 0,
"cacheCreationInputTokens": 0,
"costUsd": 0.0125,
"agentUsed": null,
"tokensPerSecond": null
}
]Example:
curl "http://localhost:8080/api/requests?limit=100"Get detailed request information including payloads. Request and response bodies are base64-encoded to handle binary data and special characters.
Query Parameters:
limit- Number of requests to return (default: 100)
Response:
[
{
"id": "request-uuid",
"timestamp": "2024-12-17T10:30:45.123Z",
"method": "POST",
"path": "/v1/anthropic/v1/messages",
"accountUsed": "account1",
"statusCode": 200,
"success": true,
"payload": {
"request": {
"headers": {...},
"body": "base64-encoded-body"
},
"response": {
"status": 200,
"headers": {...},
"body": "base64-encoded-body"
},
"meta": {
"accountId": "uuid",
"accountName": "account1",
"retry": 0,
"timestamp": 1234567890,
"success": true,
"rateLimited": false,
"accountsAttempted": 1
}
}
}
]Example:
curl "http://localhost:8080/api/requests/detail?limit=10"Stream real-time request events via Server-Sent Events (SSE).
Response: SSE stream with request events
Example:
curl -N http://localhost:8080/api/requests/streamGet current configuration.
Response:
{
"lbStrategy": "session",
"port": 8080,
"sessionDurationMs": 18000000
}Example:
curl http://localhost:8080/api/configGet current load balancing strategy.
Response:
{
"strategy": "session"
}Example:
curl http://localhost:8080/api/config/strategyUpdate load balancing strategy.
Request:
{
"strategy": "session"
}Response:
{
"success": true,
"strategy": "session"
}Available Strategies:
session- Session-based routing that maintains 5-hour sessions with individual accounts to avoid rate limits and account bans
Example:
curl -X POST http://localhost:8080/api/config/strategy \
-H "Content-Type: application/json" \
-d '{"strategy": "session"}'List all available load balancing strategies.
Response:
["session"]Example:
curl http://localhost:8080/api/strategiesGet detailed analytics data.
Query Parameters:
range- Time range:1h,6h,24h,7d,30d(default:24h)accounts- Filter by account names (comma-separated list)models- Filter by model names (comma-separated list)status- Filter by request status:all,success,error(default:all)mode- Display mode:normal,cumulative(default:normal). Cumulative mode shows running totals over timemodelBreakdown- Include per-model time series data:true,false(default:false)
Response:
{
"meta": {
"range": "24h",
"bucket": "1h",
"cumulative": false
},
"totals": {
"requests": 5000,
"successRate": 98.5,
"activeAccounts": 4,
"avgResponseTime": 1250.5,
"totalTokens": 1500000,
"totalCostUsd": 125.50,
"avgTokensPerSecond": null
},
"timeSeries": [
{
"ts": 1734430800000,
"requests": 100,
"tokens": 15000,
"costUsd": 1.25,
"successRate": 98,
"errorRate": 2,
"cacheHitRate": 15,
"avgResponseTime": 1200,
"avgTokensPerSecond": null
}
],
"tokenBreakdown": {
"inputTokens": 500000,
"cacheReadInputTokens": 100000,
"cacheCreationInputTokens": 50000,
"outputTokens": 850000
},
"modelDistribution": [
{"model": "claude-3-opus-20240229", "count": 3000}
],
"accountPerformance": [
{"name": "account1", "requests": 2500, "successRate": 99}
],
"costByModel": [
{"model": "claude-3-opus-20240229", "costUsd": 100.50, "requests": 3000, "totalTokens": 1200000}
],
"modelPerformance": [
{
"model": "claude-3-opus-20240229",
"avgResponseTime": 1300,
"p95ResponseTime": 2500,
"errorRate": 1.5,
"avgTokensPerSecond": null,
"minTokensPerSecond": null,
"maxTokensPerSecond": null
}
]
}Examples:
# Basic analytics for last 7 days
curl "http://localhost:8080/api/analytics?range=7d"
# Analytics filtered by specific accounts
curl "http://localhost:8080/api/analytics?range=24h&accounts=account1,account2"
# Analytics for specific models with success status only
curl "http://localhost:8080/api/analytics?range=24h&models=claude-3-opus-20240229,claude-3-sonnet-20240229&status=success"
# Combined filters
curl "http://localhost:8080/api/analytics?range=7d&accounts=premium1,premium2&models=claude-3-opus-20240229&status=error"Run data cleanup based on configured retention settings.
Example:
curl -X POST http://localhost:8080/api/maintenance/cleanupCompact the database to reclaim disk space.
Example:
curl -X POST http://localhost:8080/api/maintenance/compactGet current data retention settings.
Response:
{
"payloadDays": 7,
"requestDays": 30
}Example:
curl http://localhost:8080/api/config/retentionUpdate data retention settings.
Request:
{
"payloadDays": 14,
"requestDays": 90
}Example:
curl -X POST http://localhost:8080/api/config/retention \
-H "Content-Type: application/json" \
-d '{"payloadDays": 14, "requestDays": 90}'Stream real-time logs via Server-Sent Events (SSE).
Response: SSE stream with log events
Example:
curl -N http://localhost:8080/api/logs/streamGet historical logs.
Response:
[
{
"timestamp": "2024-12-17T10:30:45.123Z",
"level": "info",
"component": "proxy",
"message": "Request completed",
"metadata": {...}
}
]Example:
curl http://localhost:8080/api/logs/historyAll API errors follow a consistent format:
{
"error": "Error message",
"details": {
// Optional additional error details
}
}- 200 OK - Request successful
- 400 Bad Request - Invalid request parameters
- 404 Not Found - Resource not found
- 429 Too Many Requests - Rate limited
- 500 Internal Server Error - Server error
- 502 Bad Gateway - Upstream provider error
- 503 Service Unavailable - All accounts failed
When an account hits rate limits, ccflare automatically fails over to the next available account. If all accounts are rate limited, a 503 error is returned.
Rate limit information is included in account responses:
rateLimitStatus- Current status (e.g., "allowed", "allowed_warning", "rate_limited")rateLimitReset- When the rate limit resetsrateLimitRemaining- Remaining requests (if available)
The proxy endpoints support streaming responses for compatible Claude API calls. When making a streaming request:
- Include
"stream": truein your request body - The response will be
Content-Type: text/event-stream - Each chunk is delivered as a Server-Sent Event
Streaming Response Capture: ccflare automatically captures streaming response bodies for analytics and debugging purposes:
- Captured data is limited to
CF_STREAM_BODY_MAX_BYTES(default: 256KB) - The capture process doesn't interfere with the client's stream
- Captured bodies are stored base64-encoded in the request history
- If the response exceeds the size limit, it's marked as truncated in metadata
Example:
curl -X POST http://localhost:8080/v1/anthropic/v1/messages \
-H "Content-Type: application/json" \
-H "anthropic-version: 2023-06-01" \
-d '{
"model": "claude-sonnet-4-20250514",
"messages": [{"role": "user", "content": "Write a poem"}],
"max_tokens": 100,
"stream": true
}'A web dashboard is available at:
http://localhost:8080/ # Dashboard
The dashboard provides a visual interface for:
- Monitoring account status and usage
- Viewing real-time analytics
- Managing configuration
- Examining request history
ccflare can be configured using the following environment variables:
PORT- Server port (default: 8080)LB_STRATEGY- Load balancing strategy (default: session)SESSION_DURATION_MS- Session duration in milliseconds (default: 18000000 / 5 hours)CLIENT_ID- OAuth client ID for Anthropic authentication (default: 9d1c250a-e61b-44d9-88ed-5944d1962f5e)CF_STREAM_BODY_MAX_BYTES- Maximum bytes to capture from streaming responses (default: 262144 / 256KB)RETRY_ATTEMPTS- Number of retry attempts for failed requests (default: 3)RETRY_DELAY_MS- Initial delay between retries in milliseconds (default: 1000)RETRY_BACKOFF- Exponential backoff multiplier for retries (default: 2)
In addition to environment variables, ccflare supports configuration through a JSON file. The config file location varies by platform:
- macOS/Linux:
~/.config/ccflare/ccflare.json(or$XDG_CONFIG_HOME/ccflare/ccflare.json) - Windows:
%LOCALAPPDATA%\ccflare\ccflare.json(or%APPDATA%\ccflare\ccflare.json)
Supported Configuration Keys:
{
"lb_strategy": "session",
"client_id": "your-oauth-client-id",
"retry_attempts": 3,
"retry_delay_ms": 1000,
"retry_backoff": 2,
"session_duration_ms": 18000000,
"port": 8080,
"stream_body_max_bytes": 262144
}Note: Environment variables take precedence over config file settings.
The following strategy is available:
session- Session-based routing that maintains 5-hour sessions with individual accounts
-
No Authentication: The API endpoints do not require authentication. ccflare manages the OAuth tokens internally for proxying to Claude.
-
Automatic Failover: When a request fails or an account is rate limited, ccflare automatically tries the next available account. If no accounts are available, requests are forwarded without authentication as a fallback.
-
Token Refresh: Access tokens are automatically refreshed when they expire.
-
Request Logging: All requests are logged with detailed metrics including tokens used, cost, and response times. Database writes are performed asynchronously to avoid blocking request processing.
-
Session Affinity: The "session" strategy maintains sticky sessions for consistent routing within a time window.
-
Rate Limit Tracking: Rate limit information is automatically extracted from responses and stored for each account, including reset times and remaining requests.
-
Provider Filtering: Accounts are automatically filtered by provider when selecting for requests, ensuring compatibility.