
[Bug]: Cannot Access Billing/Usage Info for GPT-4O-Transcribe series. #16270

@smbus2kw


What happened?

Description:
Unable to retrieve usage information and billing details for the GPT-4O-Transcribe series, which is built on an LLM and billed per input audio token.

Conditions and Context:
- Docker image: ghcr.io/berriai/litellm-database:main-latest (pulled 11-02)
- The "input_cost_per_audio_token" parameter is set in the configuration file, although it should be optional (right?).
- The gpt-4o-mini-transcribe and gpt-4o-transcribe series are speech transcription models built on LLMs. Pricing is not based on traditional audio duration; it is calculated from input audio tokens and output tokens, similar to the billing model of an LLM.

Test result details:
- No usage information is returned in the final result.
- No billing or usage information in the LiteLLM database; all values are zero.
- The raw_response output in the LiteLLM debug log does contain usage information (see the log output below).

Investigation result:
It seems that LiteLLM does not track usage information or billing for audio transcription.
file: litellm/litellm_core_utils/llm_response_utils/convert_dict_to_response.py
function: convert_to_model_response_object
branch handled: elif response_type == "audio_transcription" and ...
Processing logic: unlike the other branches, this one never extracts usage information from response_object (a possible extraction sketch follows the excerpt below).

        elif response_type == "audio_transcription" and (
            model_response_object is None
            or isinstance(model_response_object, TranscriptionResponse)
        ):
            if response_object is None:
                raise Exception("Error in response object format")

            if model_response_object is None:
                model_response_object = TranscriptionResponse()

            if "text" in response_object:
                model_response_object.text = response_object["text"]

            optional_keys = ["language", "task", "duration", "words", "segments"]
            for key in optional_keys:  # not guaranteed to be in response
                if key in response_object:
                    setattr(model_response_object, key, response_object[key])

            if hidden_params is not None:
                model_response_object._hidden_params = hidden_params

            if _response_headers is not None:
                model_response_object._response_headers = _response_headers

            return model_response_object
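
For illustration only, a minimal sketch of the missing step, assuming the raw usage block shown in the log below and mapping input_tokens/output_tokens onto the prompt_tokens/completion_tokens fields used by spend tracking; the function name and where it would be called from are hypothetical, not LiteLLM's API:

from typing import Any, Dict, Optional


def extract_transcription_usage(response_object: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Map the raw transcription usage block onto the token fields spend tracking expects."""
    raw_usage = response_object.get("usage")
    if not raw_usage:
        return None
    return {
        "prompt_tokens": raw_usage.get("input_tokens", 0),
        "completion_tokens": raw_usage.get("output_tokens", 0),
        "total_tokens": raw_usage.get("total_tokens", 0),
        # keep the audio/text split so per-audio-token pricing can still be applied
        "prompt_tokens_details": raw_usage.get("input_token_details"),
    }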

Relevant log output

# RAW RESPONSE:
10:24:03 - LiteLLM:DEBUG: litellm_logging.py:1034 - RAW RESPONSE:
{"text": "===Omit here===", "logprobs": null, "usage": {"input_tokens": 784, "output_tokens": 235, "total_tokens": 1019, "type": "tokens", "input_token_details": {"audio_tokens": 784, "text_tokens": 0}}}

# spend_tracking info:
10:24:03 - LiteLLM Proxy:DEBUG: spend_tracking_utils.py:416 - SpendTable: created payload - payload: {
    "request_id": "b570d307-127b-431b-999b-732f646bd156",
    "call_type": "atranscription",
    "api_key": "==omit here==",
    "cache_hit": "None",
    "startTime": "2025-11-05 10:23:48.110804+00:00",
    "endTime": "2025-11-05 10:24:03.413610+00:00",
    "completionStartTime": "2025-11-05 10:24:03.413610+00:00",
    "model": "gpt-4o-mini-transcribe",
    "user": "default_user_id",
    "team_id": "",
    "metadata": "{\"user_api_key\": \"==omit here==\", \"user_api_key_alias\": null, \"user_api_key_team_id\": null, \"user_api_key_org_id\": null, \"user_api_key_user_id\": \"default_user_id\", \"user_api_key_team_alias\": null, \"requester_ip_address\": \"\", \"applied_guardrails\": [], \"batch_models\": null, \"mcp_tool_call_metadata\": null, \"vector_store_request_metadata\": null, \"guardrail_information\": null, \"usage_object\": {\"completion_tokens\": 0, \"prompt_tokens\": 0, \"total_tokens\": 0, \"completion_tokens_details\": null, \"prompt_tokens_details\": null}, \"model_map_information\": {\"model_map_key\": \"gpt-4o-mini-transcribe\", \"model_map_value\": {\"key\": \"azure/gpt-4o-mini-transcribe\", \"max_tokens\": null, \"max_input_tokens\": 16000, \"max_output_tokens\": 2000, \"input_cost_per_token\": 1.25e-06, \"input_cost_per_token_flex\": null, \"input_cost_per_token_priority\": null, \"cache_creation_input_token_cost\": null, \"cache_read_input_token_cost\": null, \"cache_read_input_token_cost_flex\": null, \"cache_read_input_token_cost_priority\": null, \"cache_creation_input_token_cost_above_1hr\": null, \"input_cost_per_character\": null, \"input_cost_per_token_above_128k_tokens\": null, \"input_cost_per_token_above_200k_tokens\": null, \"input_cost_per_query\": null, \"input_cost_per_second\": null, \"input_cost_per_audio_token\": 3e-06, \"input_cost_per_token_batches\": null, \"output_cost_per_token_batches\": null, \"output_cost_per_token\": 5e-06, \"output_cost_per_token_flex\": null, \"output_cost_per_token_priority\": null, \"output_cost_per_audio_token\": null, \"output_cost_per_character\": null, \"output_cost_per_reasoning_token\": null, \"output_cost_per_token_above_128k_tokens\": null, \"output_cost_per_character_above_128k_tokens\": null, \"output_cost_per_token_above_200k_tokens\": null, \"output_cost_per_second\": null, \"output_cost_per_video_per_second\": null, \"output_cost_per_image\": null, \"output_vector_size\": null, \"citation_cost_per_token\": null, \"tiered_pricing\": null, \"litellm_provider\": \"azure\", \"mode\": \"audio_transcription\", \"supports_system_messages\": null, \"supports_response_schema\": null, \"supports_vision\": null, \"supports_function_calling\": null, \"supports_tool_choice\": null, \"supports_assistant_prefill\": null, \"supports_prompt_caching\": null, \"supports_audio_input\": null, \"supports_audio_output\": null, \"supports_pdf_input\": null, \"supports_embedding_image_input\": null, \"supports_native_streaming\": null, \"supports_web_search\": null, \"supports_url_context\": null, \"supports_reasoning\": null, \"supports_computer_use\": null, \"search_context_cost_per_query\": null, \"tpm\": null, \"rpm\": null, \"ocr_cost_per_page\": null, \"annotation_cost_per_page\": null, \"supported_openai_params\": [\"temperature\", \"n\", \"stream\", \"stream_options\", \"stop\", \"max_tokens\", \"max_completion_tokens\", \"tools\", \"tool_choice\", \"presence_penalty\", \"frequency_penalty\", \"logit_bias\", \"user\", \"function_call\", \"functions\", \"tools\", \"tool_choice\", \"top_p\", \"logprobs\", \"top_logprobs\", \"response_format\", \"seed\", \"extra_headers\", \"parallel_tool_calls\", \"prediction\", \"modalities\", \"audio\", \"web_search_options\"]}}, \"cold_storage_object_key\": null, \"additional_usage_values\": {}}",
    "cache_key": "Cache OFF",
    "spend": 0.0,
    "total_tokens": 0,
    "prompt_tokens": 0,
    "completion_tokens": 0,
    "request_tags": "[\"User-Agent: python-requests\", \"User-Agent: python-requests/2.32.3\"]",
    "end_user": "==omit here==",
    "api_base": "https://==omit here==/openai/",
    "model_group": "azure/gpt-4o-mini-transcribe",
    "model_id": "d78bb8ace1a641d937acbc010e3174c9ff935f17adb313a5f888747bd28844ed",
    "mcp_namespaced_tool_name": null,
    "requester_ip_address": "",
    "custom_llm_provider": "azure",
    "messages": "{}",
    "response": "{\"text\": \"\"}",
    "proxy_server_request": "{\"model\": \"azure/gpt-4o-mini-transcribe\"}",
    "session_id": "af7d9117-5326-402c-8cfa-fe4479bff2eb",
    "status": "success"
}
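
For reference, a rough cost sanity check using the raw usage above and the per-token rates from the model_map_value in the payload (assuming input audio tokens are billed at input_cost_per_audio_token and output tokens at output_cost_per_token; that split is an assumption, not confirmed LiteLLM pricing logic):

# Token counts from the RAW RESPONSE above, rates from model_map_value above.
input_audio_tokens = 784          # usage.input_token_details.audio_tokens
output_tokens = 235               # usage.output_tokens
input_cost_per_audio_token = 3e-06
output_cost_per_token = 5e-06

expected_spend = (
    input_audio_tokens * input_cost_per_audio_token
    + output_tokens * output_cost_per_token
)
print(expected_spend)  # 0.003527 -- yet the logged "spend" above is 0.0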

Are you a ML Ops Team?

No

What LiteLLM version are you on?

v1.79.1.rc.2

Twitter / LinkedIn details

No response
