Locally hosted L3.3 works fine, other models do not #15271
Unanswered
frenzybiscuit asked this question in Q&A
I'm using TabbyAPI behind LiteLLM (OWUI -> LiteLLM -> TabbyAPI), and when running Llama 3.3 the output is fine.
When I use a larger model, such as Mistral Large (for personal use), the output is wrong: when I send a follow-up message, the model repeats its previous response word for word.
I turned on debugging in TabbyAPI and the sampling parameters (temperature, top-p, etc.) are identical in both cases, so my assumption is that the problem is the chat template LiteLLM is using.
If I remove LiteLLM from the equation and just use OWUI -> TabbyAPI, Mistral Large works fine.
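To illustrate the LiteLLM -> TabbyAPI leg I mean, here is a rough sketch of calling TabbyAPI through LiteLLM's OpenAI-compatible passthrough (the model name, port, and API key below are placeholders, not my exact setup):

```python
# Sketch of the LiteLLM -> TabbyAPI call path.
# Model name, port, and API key are placeholders; TabbyAPI exposes an
# OpenAI-compatible /v1 endpoint.
import litellm

response = litellm.completion(
    # The "openai/" prefix tells LiteLLM to treat the backend as a generic
    # OpenAI-compatible server and forward the chat messages to it.
    model="openai/Mistral-Large-Instruct",   # placeholder model name
    api_base="http://localhost:5000/v1",     # assumed TabbyAPI address/port
    api_key="placeholder-key",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello, can you hear me?"},
    ],
    temperature=0.7,
    top_p=0.9,
)
print(response.choices[0].message.content)
```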
How would I fix this?