gptel-model and gptel-backend, why do we need these two? #704
Comments
Yeah, this is true.
It's been like this for as long as gptel has supported more than just ChatGPT, so going back to Nov 2023 or earlier. Setting both independently is required for reasons I explain below.
The inability to customize
You don't have to use
Please suggest an alternative based on the information below.

The reason that the model and the backend need to be set independently is because they are independent settings -- the same model can be supported by multiple backends, and each backend can support multiple models. For instance, The model

The reason that the backend cannot be specified as a string is because there is no guarantee that the live data structure it represents will exist. Someone needs to create this live data structure, and it can't be gptel.

What I mean is that if you want to just do (setq gptel-backend "Claude") and have it work, you would still need (gptel-make-anthropic "Claude" :key ... :stream t) in your configuration. So it's actually more configuration now -- two statements instead of one.

gptel cannot ship with predefined backends for all possible providers, because there are too many of them and too many ways to configure each of them. (Look at the

Even under these limitations it can be made more user-friendly, but I'd need to know what that means exactly.
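For readers following along, the two-settings configuration described above might look like the following. This is a minimal sketch, not a recommended setup: the model symbol is borrowed from later in this thread, and the key string is a placeholder (a function or auth-source lookup would normally go there).

```elisp
;; Sketch only: the model symbol and the key string are placeholders.
;; The backend has to be created by the user's own configuration;
;; gptel-make-anthropic returns the backend object, so it can be
;; assigned to gptel-backend directly.
(setq gptel-model 'claude-3-7-sonnet-20250219
      gptel-backend (gptel-make-anthropic "Claude"
                      :stream t
                      :key "your-api-key"))
```

Setting the backend from the return value keeps the model and the backend it belongs to in one form, which is the pairing the comment above says cannot be avoided.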
First off, thanks for writing such a great response to my issue. I really appreciate it!

I think the way I would approach this is to work backwards from the ideal user experience, then identify how it would work. I see your point about models being accessible via multiple backends. When I use gptel-menu I see my model choice as being essentially "provider:model-name". Extending that to the gptel-model variable makes sense to me.

I'd probably want to avoid the use of strings since it's a bit of an emacs anti-pattern, symbols preferred. But we really want a multi-level symbol here, so we could do something like

(set-default 'gptel-model '(claude . claude-3-7-sonnet-20250219))

Basically, if gptel-model is a symbol, do what the current code does. If it's a cons, then use the car as the backend and the cdr as the model within said backend. We could keep gptel-model backwards compatible and also extend it in a trivial manner that would make sense to a lot of people.

If this makes sense to you, I'd be more than happy to work on it.
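The symbol-or-cons dispatch proposed above could be sketched roughly as follows. The helper name my/split-model-spec is hypothetical and is not part of gptel; it only splits the value and leaves open how a backend name would be resolved to a live backend object.

```elisp
;; Hypothetical helper, not part of gptel: normalize a model spec into
;; (BACKEND-NAME . MODEL), with BACKEND-NAME nil for a bare symbol.
(defun my/split-model-spec (spec)
  "Return (BACKEND-NAME . MODEL) for SPEC.
SPEC is either a model symbol or a cons (BACKEND-NAME . MODEL)."
  (if (consp spec)
      ;; Cons form: the car names the backend, the cdr names the model.
      (cons (car spec) (cdr spec))
    ;; Bare symbol: keep the current behavior, backend chosen elsewhere.
    (cons nil spec)))

;; Bare symbol: the backend name comes back as nil.
(my/split-model-spec 'claude-3-7-sonnet-20250219)

;; Cons: the backend name comes back as `claude'.
(my/split-model-spec '(claude . claude-3-7-sonnet-20250219))
```

The sketch shows only the backwards-compatible branching; the backend named in the car would still have to be defined somewhere in the user's configuration.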
One interesting thing I noticed while working on #706: if I have gpt-4o defined as a model provided by GitHub, it will inherit all the model parameters from openai, which is obviously confusing since I don't think the prices are identical. If however I override the values like this:

(gptel-make-openai "Github"
  :host "models.inference.ai.azure.com"
  :endpoint "/chat/completions?api-version=2024-05-01-preview"
  :stream t
  :key #'gptel-api-key-from-auth-source
  :models '((gpt-4o :description "gpt-4o as provided by GH" :output-cost 0.0) deepseek-r1))

then the parameters will be overridden in the openai (ChatGPT:gpt-4o) model as well. Basically, you currently cannot have different pricing for the same model from different providers, which isn't ideal. Any ideas if this could be improved?

This does not affect actually using the model, but it would still be nice if the tab completion in
I'm not against using symbols for backend names, but why are they an anti-pattern? We don't do enough comparisons for the
As I see it, the ideal user experience would be the following:

(setq gptel-model 'claude-3-sonnet-20240229
      gptel-backend "Claude")

with no other configuration required. But I don't think that's possible.
Instead of setting two variables (

But the main issue is that this does not fix the main problem, which is that you still need to define the claude backend yourself. You'll still need this in your configuration:

(gptel-make-claude 'claude :stream t :key ...) ;Assuming backend name is a symbol here

with much more code required to define other backends like Ollama or Groq. (This is annoying, but at least the configuration for each backend is encapsulated in the structure. It's less annoying than having about eighty individual user options like
Going back to this, you'd still need to match the backend and the model if you made the model a cons. I see two actionable improvements here:
Yes, this is because model data is stored as symbol properties, and you can't have two symbols with the same name. I was aware of this limitation when I implemented it, and just punted the problem into the future.

This can be fixed quite easily since the model property interface is abstracted. We use

For future reference, the easiest way to fix it is to store the model name as a symbol property instead of calling
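The limitation described above can be reproduced with plain Emacs Lisp, independent of gptel. In this sketch the symbol my-demo-model and the cost values are made up for illustration. The second half shows one general way to get separate property lists (uninterned symbols); it is not necessarily the fix the maintainer has in mind.

```elisp
;; Interned symbols with the same name are the same object, so a
;; property set by one "definition" is overwritten by the next.
(put 'my-demo-model :output-cost 10.0) ; placeholder value, not a real price
(put 'my-demo-model :output-cost 0.0)  ; second definition, same symbol
(get 'my-demo-model :output-cost)      ; → 0.0, the first value is gone

;; Uninterned symbols are distinct objects even when their names match,
;; so each one carries its own property list.
(let ((a (make-symbol "my-demo-model"))
      (b (make-symbol "my-demo-model")))
  (put a :output-cost 10.0)
  (put b :output-cost 0.0)
  (list (get a :output-cost) (get b :output-cost))) ; → (10.0 0.0)
```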
The README.md notes that gptel-model is how you set the default model, but that isn't entirely true.
If you set gptel-model to a model that isn't hosted by the default gptel-backend, it doesn't really do anything.
The code didn't use to be like this.
Now you have to set gptel-model and gptel-backend in concert with each other. gptel-backend also can't be customized, since it has to be a complex record that comes from another live data structure.
As a user interface, this is confusing and non-ideal: configuring a gptel-backend to match the model is an extra step.
I ran into this because I want my default model to be sonnet 3.7, and I don't want to fiddle with gptel-menu, cancel that, then run the 'gptel' command to switch my model. I've adjusted my init.el now, but these are internal details of gptel's workings leaking into end-user configuration.