Locally hosted L3.3 works fine, other models do not #15271
Unanswered
frenzybiscuit asked this question in Q&A
I'm using TabbyAPI behind LiteLLM (OWUI -> LiteLLM -> TabbyAPI), and when running Llama 3.3 the output is fine.
When I use a larger model, such as Mistral Large (for personal use), the output is wrong: when I send a follow-up message, the model repeats its previous response word for word.
I turned on debugging in TabbyAPI and the sampling parameters (temperature, top-p, etc.) are identical in both cases, so my assumption is that the problem is the chat template LiteLLM is using.
If I remove LiteLLM from the equation and just use OWUI -> TabbyAPI, Mistral Large works fine.
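To illustrate the LiteLLM -> TabbyAPI leg I mean, here is a rough sketch of calling TabbyAPI through LiteLLM's OpenAI-compatible passthrough (the model name, port, and API key below are placeholders, not my exact setup):

```python
# Sketch of the LiteLLM -> TabbyAPI call path.
# Model name, port, and API key are placeholders; TabbyAPI exposes an
# OpenAI-compatible /v1 endpoint.
import litellm

response = litellm.completion(
    # The "openai/" prefix tells LiteLLM to treat the backend as a generic
    # OpenAI-compatible server and forward the chat messages to it.
    model="openai/Mistral-Large-Instruct",   # placeholder model name
    api_base="http://localhost:5000/v1",     # assumed TabbyAPI address/port
    api_key="placeholder-key",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello, can you hear me?"},
    ],
    temperature=0.7,
    top_p=0.9,
)
print(response.choices[0].message.content)
```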
How would I fix this?