Inworld TTSTextFrame word timestamps strip punctuation and poison assistant context across turns #4261

@timofey-TK

Description

pipecat version

0.0.108

Python version

3.13

Operating System

Ubuntu 24.04

Issue description

Summary

We are seeing a reproducible issue with InworldTTSService + post-TTS assistant aggregation in pipecat-ai==0.0.108.

When Inworld returns word timestamps, Pipecat emits word-level TTSTextFrames without punctuation. Because the assistant context is built downstream from TTS, those punctuation-less tokens become the canonical assistant message stored in LLMContext.

That flattened assistant text is then reused in later LLM prompts, and the model starts imitating the punctuation-less style on subsequent turns.
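The mechanism can be shown with a minimal sketch. The payload shape below is illustrative, not Inworld's actual schema: word timestamps carry bare words, so joining them reconstructs the assistant text without any punctuation.

```python
# Minimal sketch of the failure mode: word-level timestamp tokens
# (bare words, punctuation already stripped) are joined back into
# the assistant message downstream of TTS.
original = "Hey, welcome back! It's good to have you again."

# Illustrative word-timestamp payload, not Inworld's real schema.
word_timestamps = [
    ("Hey", 0.00), ("welcome", 0.35), ("back", 0.70),
    ("It's", 1.10), ("good", 1.35), ("to", 1.55),
    ("have", 1.70), ("you", 1.90), ("again", 2.10),
]

# Downstream aggregation effectively does this:
aggregated = " ".join(word for word, _ in word_timestamps)
print(aggregated)  # Hey welcome back It's good to have you again
```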

This also shows up in frontend transcript streams: interim bot transcript buffers are built word-by-word without punctuation.

Environment

  • pipecat-ai==0.0.108
  • pipecat-ai-flows>=0.0.22
  • InworldTTSService
  • WebRTC call flow

Pipeline shape:

transport.input() -> stt -> context_aggregator.user() -> llm -> tts -> transport.output() -> context_aggregator.assistant()

Reproduction steps

  1. Use InworldTTSService with assistant aggregation after TTS.
  2. Let the LLM produce a punctuated response with multiple clauses/sentences.
  3. Let Inworld return word timestamps.
  4. Observe that assistant history stored in context is punctuation-less.
  5. Trigger the next LLM turn.
  6. Observe that the next prompt already contains punctuation-less assistant messages and that the model starts imitating that style.
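Steps 4 through 6 can be sketched in plain Python, with a list standing in for Pipecat's `LLMContext` message history:

```python
# Sketch of how the flattened text poisons the next turn.
# A plain list stands in for the LLMContext message history.
flattened = "Hey welcome back It's good to have you again"

messages = [
    {"role": "assistant", "content": flattened},  # stored after TTS (step 4)
    {"role": "user", "content": "Yes, I'm ready."},
]

# Steps 5-6: the next LLM request is built from this history verbatim,
# so the model only ever sees the punctuation-less assistant style.
next_prompt = list(messages)
print(next_prompt[0]["content"])
```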

Expected behavior

  • Assistant memory stored in LLMContext should preserve the original assistant text punctuation.
  • Future LLM prompts should not be degraded by punctuation-less TTS alignment text.
  • Frontend transcript consumers should not receive a final transcript that is effectively a flattened run-on sentence with punctuation separated into its own final event.
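One possible mitigation, sketched below as a hypothetical helper (not an existing Pipecat API): when the word sequence of the TTS-aligned text matches the original LLM response, store the punctuated original in context instead of the flattened reconstruction.

```python
import re


def prefer_punctuated(aligned: str, original: str) -> str:
    """Hypothetical helper: keep the original punctuated text when the
    TTS-aligned text is just a punctuation-stripped copy of it."""
    def words(s: str) -> list[str]:
        return re.findall(r"[\w']+", s.lower())

    return original if words(aligned) == words(original) else aligned


aligned = "Hey welcome back It's good to have you again"
original = "Hey, welcome back! It's good to have you again."
print(prefer_punctuated(aligned, original))
# Hey, welcome back! It's good to have you again.
```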

Actual behavior

When Inworld timestamps are present:

  1. The spoken assistant text is reconstructed from word timestamps.
  2. Those timestamps contain bare words without punctuation.
  3. LLMAssistantAggregator stores that punctuation-less text in assistant context.
  4. The next LLM request includes assistant history like:
Hey welcome back It’s good to have you again We’ll just pick up where we left off and continue with the screening interview Are you ready to get started with the next set of questions

instead of the original punctuated text.

  5. The LLM then starts replying in the same flattened style.

Logs

From our local logs, `OpenAILLMService` receives assistant history like this:


{'role': 'assistant', 'content': 'Hey welcome back It’s good to have you again We’ll just pick up where we left off and continue with the screening interview Are you ready to get started with the next set of questions'}
{'role': 'assistant', 'content': 'Hey are you still there Just wanted to check in real quick'}
{'role': 'assistant', 'content': 'Hey just checking in one more time are you ready to continue If I don’t hear back I’ll have to go ahead and end the call on my side'}
{'role': 'assistant', 'content': 'Hey uh this is virtual assistant actually thanks for jumping back in Are you ready to continue with the screening questions'}
{'role': 'assistant', 'content': 'Great thanks So just so I understand your current situation are you working right now or are you between roles'}


Those turns were originally generated as natural punctuated speech, but the history fed back into the LLM is flattened.
