
Incorrect token count (usage_metadata) in streaming mode #30429

Open
5 tasks done
andrePankraz opened this issue Mar 22, 2025 · 8 comments
Labels
🤖:bug (Related to a bug, vulnerability, unexpected error with an existing feature) · investigate (Flagged for investigation)

Comments

@andrePankraz

Checked other resources

  • I added a very descriptive title to this issue.
  • I searched the LangChain documentation with the integrated search.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

Any LLM call with streaming.

The aggregated token usage is totally wrong and much too high.

See this method:

def add_ai_message_chunks(left, *others):  # excerpt from langchain_core.messages.ai
    ...
    # Token usage
    if left.usage_metadata or any(o.usage_metadata is not None for o in others):
        usage_metadata: Optional[UsageMetadata] = left.usage_metadata
        for other in others:
            usage_metadata = add_usage(usage_metadata, other.usage_metadata)
    else:
        usage_metadata = None

For streaming we get usage_metadata for each token, e.g.

'input_tokens' = 713
'output_tokens' = 1
'total_tokens' = 714

output_tokens is always 1, so it sums up correctly.
input_tokens is always 713 for every chunk of the LLM token stream, so it sums up to input_tokens * count(chunks) (and total_tokens, starting at 714, behaves the same way).

This aggregation just adds the counts up to huge (totally useless) numbers.
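
To make the arithmetic concrete, here is a minimal illustration (plain dicts rather than the actual LangChain UsageMetadata type) of what happens when every chunk repeats the full input_tokens count and the chunks are simply summed:

# Illustrative only: mimics adding per-chunk usage the way the stream chunks are merged.
chunks = [{"input_tokens": 713, "output_tokens": 1, "total_tokens": 714}] * 10

aggregated = {"input_tokens": 0, "output_tokens": 0, "total_tokens": 0}
for usage in chunks:
    for key in aggregated:
        aggregated[key] += usage[key]

print(aggregated)
# {'input_tokens': 7130, 'output_tokens': 10, 'total_tokens': 7140}
# The correct values for this stream would be input_tokens=713, total_tokens=723.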

What is the strategy here? Should the LLM not report per-token usage_metadata and only report it in the final chunk? In that case langchain-openai would have to change this for that call:

def _create_usage_metadata(oai_token_usage: dict) -> UsageMetadata:
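
For context, a rough sketch of what this helper does conceptually (the field names below are the standard OpenAI usage keys; this is not the verbatim implementation):

# Sketch only: maps the provider's usage block onto LangChain's UsageMetadata shape.
def create_usage_metadata_sketch(oai_token_usage: dict) -> dict:
    input_tokens = oai_token_usage.get("prompt_tokens", 0)
    output_tokens = oai_token_usage.get("completion_tokens", 0)
    return {
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "total_tokens": oai_token_usage.get("total_tokens", input_tokens + output_tokens),
    }

The open question is whether this conversion should be attached to every streamed chunk or only to the final one.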

Error Message and Stack Trace (if applicable)

No response

Description

  • I'm trying to get sane token usage numbers for streaming with usage_metadata.
  • I get hugely inflated total_tokens and input_tokens (because they are multiplied by the number of output chunks).
  • Define a strategy and either adapt the token aggregation in langchain_core.messages.add_ai_message_chunks or report usage only in the final chunk in langchain_openai.chat_models.base._create_usage_metadata.

System Info

totally not relevant

@langcarl langcarl bot added the investigate Flagged for investigation. label Mar 22, 2025
@dosubot dosubot bot added the 🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature label Mar 22, 2025
@le-codeur-rapide

Hello @andrePankraz,
When I run a basic stream from an OpenAI chat model while tracking the stream usage:

from langchain_openai import ChatOpenAI

model = ChatOpenAI()
prompt = "write the recipe of tiramisu"
response = model.stream(prompt, stream_usage=True)
for s in response:
    print(s)

I get the expected usage_metadata at the end of the stream.

[...] previous stream chunks
content=' Enjoy' additional_kwargs={} response_metadata={} id='run-2466888b-00dc-4720-8572-00056703fc67'
content='!' additional_kwargs={} response_metadata={} id='run-2466888b-00dc-4720-8572-00056703fc67'
content='' additional_kwargs={} response_metadata={'finish_reason': 'stop', 'model_name': 'gpt-3.5-turbo-0125'} id='run-2466888b-00dc-4720-8572-00056703fc67'
content='' additional_kwargs={} response_metadata={} id='run-2466888b-00dc-4720-8572-00056703fc67' usage_metadata={'input_tokens': 14, 'output_tokens': 293, 'total_tokens': 307, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}}

Can you please share your code?

@Yogesh-Dubey-Ayesavi

I'm facing the same issue. In my case I don't get output_tokens or total_tokens in the usage metadata, and with ChatOpenAI not even input_tokens. This happens for:

  1. ChatOpenAI.
  2. ChatGoogleGenerativeAI.

Note

I have enabled stream_usage and streaming in runnable configurations.

Code

const stream = await this.graph?.stream(graphInput, {
  configurable: this.config?.configurable,
  streamMode: "messages",
});

for await (const [msg] of stream!) {
  // Write message to JSON file
  const logPath = `./logs/stream_${this.config?.configurable.thread_id}.json`;
  const logData = {
    timestamp: new Date().toISOString(),
    message: msg,
  };

  fs.mkdirSync("./logs", { recursive: true });

  let existingData = [];
  if (fs.existsSync(logPath)) {
    existingData = JSON.parse(fs.readFileSync(logPath, "utf8"));
  }

  existingData.push(logData);
  fs.writeFileSync(logPath, JSON.stringify(existingData, null, 2));
}

Logs for ChatOpenAI

[
  {
    "timestamp": "2025-03-22T11:45:17.791Z",
    "message": {
      "lc": 1,
      "type": "constructor",
      "id": [
        "langchain_core",
        "messages",
        "AIMessageChunk"
      ],
      "kwargs": {
        "content": "",
        "tool_call_chunks": [],
        "additional_kwargs": {},
        "id": "chatcmpl-BDrZtqqSJCZyyZjuY4eL8CtcBwGW6",
        "response_metadata": {
          "usage": {}
        },
        "tool_calls": [],
        "invalid_tool_calls": []
      }
    }
  },
  {
    "timestamp": "2025-03-22T11:45:17.794Z",
    "message": {
      "lc": 1,
      "type": "constructor",
      "id": [
        "langchain_core",
        "messages",
        "AIMessageChunk"
      ],
      "kwargs": {
        "content": "Hi",
        "tool_call_chunks": [],
        "additional_kwargs": {},
        "id": "chatcmpl-BDrZtqqSJCZyyZjuY4eL8CtcBwGW6",
        "response_metadata": {
          "usage": {}
        },
        "tool_calls": [],
        "invalid_tool_calls": []
      }
    }
  },
  {
    "timestamp": "2025-03-22T11:45:17.814Z",
    "message": {
      "lc": 1,
      "type": "constructor",
      "id": [
        "langchain_core",
        "messages",
        "AIMessageChunk"
      ],
      "kwargs": {
        "content": " there",
        "tool_call_chunks": [],
        "additional_kwargs": {},
        "id": "chatcmpl-BDrZtqqSJCZyyZjuY4eL8CtcBwGW6",
        "response_metadata": {
          "usage": {}
        },
        "tool_calls": [],
        "invalid_tool_calls": []
      }
    }
  },
  {
    "timestamp": "2025-03-22T11:45:17.815Z",
    "message": {
      "lc": 1,
      "type": "constructor",
      "id": [
        "langchain_core",
        "messages",
        "AIMessageChunk"
      ],
      "kwargs": {
        "content": "!",
        "tool_call_chunks": [],
        "additional_kwargs": {},
        "id": "chatcmpl-BDrZtqqSJCZyyZjuY4eL8CtcBwGW6",
        "response_metadata": {
          "usage": {}
        },
        "tool_calls": [],
        "invalid_tool_calls": []
      }
    }
  },
  {
    "timestamp": "2025-03-22T11:45:17.843Z",
    "message": {
      "lc": 1,
      "type": "constructor",
      "id": [
        "langchain_core",
        "messages",
        "AIMessageChunk"
      ],
      "kwargs": {
        "content": " What's",
        "tool_call_chunks": [],
        "additional_kwargs": {},
        "id": "chatcmpl-BDrZtqqSJCZyyZjuY4eL8CtcBwGW6",
        "response_metadata": {
          "usage": {}
        },
        "tool_calls": [],
        "invalid_tool_calls": []
      }
    }
  },
  {
    "timestamp": "2025-03-22T11:45:17.844Z",
    "message": {
      "lc": 1,
      "type": "constructor",
      "id": [
        "langchain_core",
        "messages",
        "AIMessageChunk"
      ],
      "kwargs": {
        "content": " your",
        "tool_call_chunks": [],
        "additional_kwargs": {},
        "id": "chatcmpl-BDrZtqqSJCZyyZjuY4eL8CtcBwGW6",
        "response_metadata": {
          "usage": {}
        },
        "tool_calls": [],
        "invalid_tool_calls": []
      }
    }
  },
  {
    "timestamp": "2025-03-22T11:45:17.854Z",
    "message": {
      "lc": 1,
      "type": "constructor",
      "id": [
        "langchain_core",
        "messages",
        "AIMessageChunk"
      ],
      "kwargs": {
        "content": " name",
        "tool_call_chunks": [],
        "additional_kwargs": {},
        "id": "chatcmpl-BDrZtqqSJCZyyZjuY4eL8CtcBwGW6",
        "response_metadata": {
          "usage": {}
        },
        "tool_calls": [],
        "invalid_tool_calls": []
      }
    }
  },
  {
    "timestamp": "2025-03-22T11:45:17.855Z",
    "message": {
      "lc": 1,
      "type": "constructor",
      "id": [
        "langchain_core",
        "messages",
        "AIMessageChunk"
      ],
      "kwargs": {
        "content": "?",
        "tool_call_chunks": [],
        "additional_kwargs": {},
        "id": "chatcmpl-BDrZtqqSJCZyyZjuY4eL8CtcBwGW6",
        "response_metadata": {
          "usage": {}
        },
        "tool_calls": [],
        "invalid_tool_calls": []
      }
    }
  },
  {
    "timestamp": "2025-03-22T11:45:17.859Z",
    "message": {
      "lc": 1,
      "type": "constructor",
      "id": [
        "langchain_core",
        "messages",
        "AIMessageChunk"
      ],
      "kwargs": {
        "content": "",
        "tool_call_chunks": [],
        "additional_kwargs": {},
        "id": "chatcmpl-BDrZtqqSJCZyyZjuY4eL8CtcBwGW6",
        "response_metadata": {
          "usage": {}
        },
        "tool_calls": [],
        "invalid_tool_calls": []
      }
    }
  }
]

Logs for ChatGoogleGenerativeAI

[
  {
    "timestamp": "2025-03-22T11:39:34.645Z",
    "message": {
      "lc": 1,
      "type": "constructor",
      "id": [
        "langchain_core",
        "messages",
        "AIMessageChunk"
      ],
      "kwargs": {
        "content": "Hi",
        "tool_calls": [],
        "invalid_tool_calls": [],
        "tool_call_chunks": [],
        "additional_kwargs": {},
        "response_metadata": {},
        "id": "c5020587-7cb7-437c-b835-462eb5831e79"
      }
    }
  },
  {
    "timestamp": "2025-03-22T11:39:34.649Z",
    "message": {
      "lc": 1,
      "type": "constructor",
      "id": [
        "langchain_core",
        "messages",
        "AIMessageChunk"
      ],
      "kwargs": {
        "content": " there! I",
        "tool_calls": [],
        "invalid_tool_calls": [],
        "tool_call_chunks": [],
        "additional_kwargs": {},
        "response_metadata": {},
        "id": "3ddd7076-4b61-4eb4-a4bf-016943cb252b"
      }
    }
  },
  {
    "timestamp": "2025-03-22T11:39:34.699Z",
    "message": {
      "lc": 1,
      "type": "constructor",
      "id": [
        "langchain_core",
        "messages",
        "AIMessageChunk"
      ],
      "kwargs": {
        "content": "'m Mira, your AI career advisor. I can offer guidance on choosing a",
        "tool_calls": [],
        "invalid_tool_calls": [],
        "tool_call_chunks": [],
        "additional_kwargs": {},
        "response_metadata": {},
        "id": "2e9d873f-6db0-4591-bc25-ad6fdd17a7a1"
      }
    }
  },
  {
    "timestamp": "2025-03-22T11:39:34.785Z",
    "message": {
      "lc": 1,
      "type": "constructor",
      "id": [
        "langchain_core",
        "messages",
        "AIMessageChunk"
      ],
      "kwargs": {
        "content": " tech career, suggest skills to learn, give job search tips, and share industry",
        "tool_calls": [],
        "invalid_tool_calls": [],
        "tool_call_chunks": [],
        "additional_kwargs": {},
        "response_metadata": {},
        "id": "7ba57b54-1e4f-4bfd-a2ed-4241d3c2cd64"
      }
    }
  },
  {
    "timestamp": "2025-03-22T11:39:34.870Z",
    "message": {
      "lc": 1,
      "type": "constructor",
      "id": [
        "langchain_core",
        "messages",
        "AIMessageChunk"
      ],
      "kwargs": {
        "content": " insights. What kind of tech work excites you?\n",
        "tool_calls": [],
        "invalid_tool_calls": [],
        "tool_call_chunks": [],
        "additional_kwargs": {},
        "response_metadata": {},
        "id": "68e76c02-7b86-4401-b655-80a46e29b670"
      }
    }
  },
  {
    "timestamp": "2025-03-22T11:39:34.885Z",
    "message": {
      "lc": 1,
      "type": "constructor",
      "id": [
        "langchain_core",
        "messages",
        "AIMessageChunk"
      ],
      "kwargs": {
        "content": "Hi there! I'm Mira, your AI career advisor. I can offer guidance on choosing a tech career, suggest skills to learn, give job search tips, and share industry insights. What kind of tech work excites you?\n",
        "additional_kwargs": {},
        "response_metadata": {},
        "tool_call_chunks": [],
        "id": "run-8fca304b-c25b-4efa-9e89-71601a744918",
        "usage_metadata": {
          "input_tokens": 590,
          "output_tokens": null,
          "total_tokens": null
        },
        "tool_calls": [],
        "invalid_tool_calls": []
      }
    }
  }
]

@ccurme
Collaborator

ccurme commented Mar 22, 2025

@Yogesh-Dubey-Ayesavi could you open a separate issue in https://github.com/langchain-ai/langchainjs?

@Yogesh-Dubey-Ayesavi

Yes @ccurme Sure.

@andrePankraz
Author

Hi, thank you for looking into this.
I've got a minimal example that shows the problem.

It only happens in astream mode with a callback handler, a pattern many chatbots use for step tracking when the LLM call is embedded into bigger agent graph structures.

The final usage_metadata is wrong:

import asyncio

from langchain_core.callbacks import AsyncCallbackHandler
from langchain_core.messages import HumanMessage, SystemMessage
from langchain_core.runnables import RunnableConfig

# llm is a chat model instance with streaming usage enabled (stream_usage=True)


async def astream():

    class MyCallbackHandler(AsyncCallbackHandler):
        async def on_llm_end(self, response, **kwargs):
            # This method is called when the LLM stream ends and contains the wrong final usage_metadata
            print("LLM ended:", response)

    async for token in llm.astream(
        [
            SystemMessage(content="Answer briefly and concisely, follow the instructions exactly."),
            HumanMessage(content="Repeat the word 'TEST' 10 times"),
        ],
        config=RunnableConfig(callbacks=[MyCallbackHandler()]),
    ):
        # Correct usage data per token
        print(token)


asyncio.run(astream())

The problematic parts are called as described in the original issue description; the usage_metadata aggregation in those methods works incorrectly.

But it only shows up in the final on_llm_end callback (with the aggregated generation message); otherwise astream() delivers the original usage_metadata per token (yield chunk.message).
See here:

https://github.com/langchain-ai/langchain/blob/master/libs/core/langchain_core/language_models/chat_models.py#L456
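
For reference, a minimal sketch of how the (mis)aggregated numbers can be inspected in that callback; the class name is illustrative and the attribute path follows the LLMResult structure printed by the handler above:

from langchain_core.callbacks import AsyncCallbackHandler

class UsageLoggingHandler(AsyncCallbackHandler):
    async def on_llm_end(self, response, **kwargs):
        # response.generations holds one list per prompt; its first entry carries
        # the merged AIMessageChunk with the aggregated usage_metadata.
        message = response.generations[0][0].message
        print("aggregated usage_metadata:", message.usage_metadata)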

@le-codeur-rapide

Hello @andrePankraz
That is weird, I tested your minimal code example and still get the expected behaviour:
usage_metadata={'input_tokens': 32, 'output_tokens': 11, 'total_tokens': 43,

content=' TEST' additional_kwargs={} response_metadata={} id='run-ba481c1a-e135-4d3a-b860-a3659694c58a'
content=' TEST' additional_kwargs={} response_metadata={} id='run-ba481c1a-e135-4d3a-b860-a3659694c58a'
content=' TEST' additional_kwargs={} response_metadata={} id='run-ba481c1a-e135-4d3a-b860-a3659694c58a'
content='' additional_kwargs={} response_metadata={'finish_reason': 'stop', 'model_name': 'gpt-4o-2024-08-06', 'system_fingerprint': 'fp_6ec83003ad'} id='run-ba481c1a-e135-4d3a-b860-a3659694c58a'
content='' additional_kwargs={} response_metadata={} id='run-ba481c1a-e135-4d3a-b860-a3659694c58a' usage_metadata={'input_tokens': 32, 'output_tokens': 11, 'total_tokens': 43, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}}
LLM ended: generations=[[ChatGenerationChunk(text='TEST TEST TEST TEST TEST TEST TEST TEST TEST TEST', generation_info={'finish_reason': 'stop', 'model_name': 'gpt-4o-2024-08-06', 'system_fingerprint': 'fp_6ec83003ad'}, message=AIMessageChunk(content='TEST TEST TEST TEST TEST TEST TEST TEST TEST TEST', additional_kwargs={}, response_metadata={'finish_reason': 'stop', 'model_name': 'gpt-4o-2024-08-06', 'system_fingerprint': 'fp_6ec83003ad'}, id='run-ba481c1a-e135-4d3a-b860-a3659694c58a', usage_metadata={'input_tokens': 32, 'output_tokens': 11, 'total_tokens': 43, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}}))]] llm_output=None run=None type='LLMResult'

Note that to get this usage metadata I had to pass the stream_usage=True argument to the llm.astream() call, so I wonder how you get usage metadata without it.

Maybe it is due to the LLM then; what model are you using?

@andrePankraz
Author

I have passed stream_usage=True at LLM initialization, and that works too.

But comparing with your log I can see the problem; here is mine:

content='' additional_kwargs={} response_metadata={} id='run-6c4cfc6e-191d-46fb-8be9-2d2f7acb9d0d' usage_metadata={'input_tokens': 35, 'output_tokens': 0, 'total_tokens': 35, 'input_token_details': {}, 'output_token_details': {}}
content='TEST' additional_kwargs={} response_metadata={} id='run-6c4cfc6e-191d-46fb-8be9-2d2f7acb9d0d' usage_metadata={'input_tokens': 35, 'output_tokens': 1, 'total_tokens': 36, 'input_token_details': {}, 'output_token_details': {}}
content=' TEST' additional_kwargs={} response_metadata={} id='run-6c4cfc6e-191d-46fb-8be9-2d2f7acb9d0d' usage_metadata={'input_tokens': 35, 'output_tokens': 1, 'total_tokens': 36, 'input_token_details': {}, 'output_token_details': {}}
content=' TEST' additional_kwargs={} response_metadata={} id='run-6c4cfc6e-191d-46fb-8be9-2d2f7acb9d0d' usage_metadata={'input_tokens': 35, 'output_tokens': 1, 'total_tokens': 36, 'input_token_details': {}, 'output_token_details': {}}
content=' TEST' additional_kwargs={} response_metadata={} id='run-6c4cfc6e-191d-46fb-8be9-2d2f7acb9d0d' usage_metadata={'input_tokens': 35, 'output_tokens': 1, 'total_tokens': 36, 'input_token_details': {}, 'output_token_details': {}}
content=' TEST' additional_kwargs={} response_metadata={} id='run-6c4cfc6e-191d-46fb-8be9-2d2f7acb9d0d' usage_metadata={'input_tokens': 35, 'output_tokens': 1, 'total_tokens': 36, 'input_token_details': {}, 'output_token_details': {}}
content=' TEST' additional_kwargs={} response_metadata={} id='run-6c4cfc6e-191d-46fb-8be9-2d2f7acb9d0d' usage_metadata={'input_tokens': 35, 'output_tokens': 1, 'total_tokens': 36, 'input_token_details': {}, 'output_token_details': {}}
content=' TEST' additional_kwargs={} response_metadata={} id='run-6c4cfc6e-191d-46fb-8be9-2d2f7acb9d0d' usage_metadata={'input_tokens': 35, 'output_tokens': 1, 'total_tokens': 36, 'input_token_details': {}, 'output_token_details': {}}
content=' TEST' additional_kwargs={} response_metadata={} id='run-6c4cfc6e-191d-46fb-8be9-2d2f7acb9d0d' usage_metadata={'input_tokens': 35, 'output_tokens': 1, 'total_tokens': 36, 'input_token_details': {}, 'output_token_details': {}}
content=' TEST' additional_kwargs={} response_metadata={} id='run-6c4cfc6e-191d-46fb-8be9-2d2f7acb9d0d' usage_metadata={'input_tokens': 35, 'output_tokens': 1, 'total_tokens': 36, 'input_token_details': {}, 'output_token_details': {}}
content=' TEST' additional_kwargs={} response_metadata={} id='run-6c4cfc6e-191d-46fb-8be9-2d2f7acb9d0d' usage_metadata={'input_tokens': 35, 'output_tokens': 1, 'total_tokens': 36, 'input_token_details': {}, 'output_token_details': {}}
content='' additional_kwargs={} response_metadata={'finish_reason': 'stop', 'model_name': 'Qwen/Qwen2.5-72B-Instruct'} id='run-6c4cfc6e-191d-46fb-8be9-2d2f7acb9d0d' usage_metadata={'input_tokens': 35, 'output_tokens': 1, 'total_tokens': 36, 'input_token_details': {}, 'output_token_details': {}}
content='' additional_kwargs={} response_metadata={} id='run-6c4cfc6e-191d-46fb-8be9-2d2f7acb9d0d' usage_metadata={'input_tokens': 35, 'output_tokens': 11, 'total_tokens': 46, 'input_token_details': {}, 'output_token_details': {}}
LLM ended: generations=[[ChatGenerationChunk(text='TEST TEST TEST TEST TEST TEST TEST TEST TEST TEST', generation_info={'finish_reason': 'stop', 'model_name': 'Qwen/Qwen2.5-72B-Instruct'}, message=AIMessageChunk(content='TEST TEST TEST TEST TEST TEST TEST TEST TEST TEST', additional_kwargs={}, response_metadata={'finish_reason': 'stop', 'model_name': 'Qwen/Qwen2.5-72B-Instruct'}, id='run-6c4cfc6e-191d-46fb-8be9-2d2f7acb9d0d', usage_metadata={'input_tokens': 455, 'output_tokens': 22, 'total_tokens': 477, 'input_token_details': {}, 'output_token_details': {}}))]] llm_output=None run=None type='LLMResult'

As you can see, my OpenAI-compatible (!) API reports usage_metadata per token (!) with output_tokens = 1:
usage_metadata={'input_tokens': 35, 'output_tokens': 1, 'total_tokens': 36, 'input_token_details': {}, 'output_token_details': {}}

I use the vLLM inference server with its OpenAI-compatible API.
So I could try a different version and see whether they have changed this slightly different behaviour. They are not wrong to send usage_metadata with output_tokens=1 and the already-known input_tokens on every chunk, but judging by your log it is a bit different from what OpenAI does.

In the end, the usage_metadata aggregation code in

def add_ai_message_chunks(

does not make sense in general: you cannot aggregate input_tokens and total_tokens like this in a stream. It kind of works for output_tokens, but even then the final message roughly doubles the expected output_tokens number through this addition.

It only works for the original OpenAI API because they forward usage_metadata once, in the "final output token" (which is more of a synthetic technical final message than a real token; the stop token comes earlier in the stream).
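
A possible consumer-side workaround, just a sketch and not a LangChain API: if the backend repeats cumulative usage on every chunk, ignore the additive merge and keep only the last usage_metadata it reports, which in the log above already holds the per-request totals.

import asyncio

async def last_reported_usage(llm, messages):
    # Keep only the most recent usage report instead of summing them;
    # assumes the backend's final report is cumulative for the whole request.
    final_usage = None
    async for chunk in llm.astream(messages):
        if chunk.usage_metadata is not None:
            final_usage = chunk.usage_metadata  # last report wins
    return final_usage

# usage = asyncio.run(last_reported_usage(llm, messages))  # llm/messages as in the example above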

@le-codeur-rapide

Ahh ok, I see. Yes, you are right, this add_ai_message_chunks does not look like it works in the general case.
