[Feature request] Show tokens used and costs #314
Comments
Lol. Sorry for all the renames... I'm not sure how reliable we can make the calculation. I'd be worried about giving a misleading cost. At the moment, the context + token calculation is only a very rough approximation. We also need to consider the added complexity per model/provider to implement this logic. Open to suggestions though. |
There is a Python library that can calculate token costs that I've used in scripts. It would require an external dependency, but we could perhaps code the feature so that it is disabled if the Python library is missing. |
The complexity per provider is why I wanted to do it in a generic way based on token size. |
It might also make more sense to use something like this: https://github.com/AgentOps-AI/tokencost. It estimates token costs directly given a query. It looks like it doesn't support reasoning models though. |
Just for counting tokens, https://github.com/openai/tiktoken was what I used. |
Thanks for these. I'd like to avoid a python dependency if possible, even optional ones. Having said that, I'm open to adding integration points needed if you're keen to implement the cost functionality as a separate package. |
It looks like there is an elisp implementation for counting tokens here: https://elpa.gnu.org/packages/llm.html |
It might not be a big deal to just do this through the API. OpenAI gives the number of tokens used in each response according to this. I'll check for others. |
I've checked for OpenAI, Anthropic, DeepSeek, OpenRouter and Google now. All of them have a usage key in the JSON object that is returned, which contains full information on the tokens used. For Google, the key has a different name, but it still contains the same information (just formatted a little differently). It looks like it wouldn't be hard to do just based on the information that the API returns and knowing the token costs for each model. I'm thinking then it should just show the cost so far, since there's no way to predict ahead of time how many output tokens the model will produce in response to a prompt anyway. |
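A minimal sketch of extracting those usage fields, assuming the response has already been parsed into an alist (e.g. via `json-parse-string` with `:object-type 'alist`); the field names follow each provider's public response format, and `my/token-usage` is a hypothetical name, not part of chatgpt-shell:

```elisp
;; Sketch: pull (INPUT . OUTPUT) token counts out of a parsed response.
;; Field names per the providers' public API docs; `my/token-usage' is
;; hypothetical and not part of chatgpt-shell.
(defun my/token-usage (provider response)
  "Return (INPUT-TOKENS . OUTPUT-TOKENS) from RESPONSE for PROVIDER."
  (pcase provider
    ;; OpenAI, DeepSeek, and OpenRouter share the same `usage' shape.
    ((or 'openai 'deepseek 'openrouter)
     (let ((usage (alist-get 'usage response)))
       (cons (alist-get 'prompt_tokens usage)
             (alist-get 'completion_tokens usage))))
    ;; Anthropic reports input_tokens/output_tokens.
    ('anthropic
     (let ((usage (alist-get 'usage response)))
       (cons (alist-get 'input_tokens usage)
             (alist-get 'output_tokens usage))))
    ;; Google's Gemini API nests the counts under `usageMetadata'.
    ('google
     (let ((usage (alist-get 'usageMetadata response)))
       (cons (alist-get 'promptTokenCount usage)
             (alist-get 'candidatesTokenCount usage))))))
```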
Thanks for looking into it. I'm not super sure I see the real benefit of the feature (given constraints) vs the added complexity. How should we tally up cost (per buffer, per provider)? Where do we show it? How do we convey to the user what the value means? Cost per buffer? Do we save across Emacs sessions? Etc. While the number of tokens used may be available in responses, what is the main goal of the feature? Determine how much has been spent? Avoid surprise charges? AFAIK, providers typically have guardrails for this? On a somewhat tangential note, I'm a little worried about unintentionally misrepresenting cost. |
Other session cases we'd have to consider:
- Swapping models.
- Swapping providers.
- Clearing buffer.
Given the frequency in which either of these can happen within a shell session (do we reset the counter each time?), I'm not super sure I'm seeing the usefulness (in practical terms) of surfacing token details to users (or approximate cost). I'm possibly not the target audience for the feature, so maybe missing the general usefulness. On the other hand, to date it doesn't seem to be a popular feature request? |
My thought was to calculate the additional cost every time a response was received from the API. I would do it on a per-buffer basis via a buffer-local variable and would not persist it across sessions. Switching models would then work nicely, as that would switch the token costs, and so they would be automatically used for future queries without any additional code. My thought was to display the cost so far in the mode line.
When clearing the buffer, I think I would be inclined to keep the current cost, though you could make an argument for resetting it. It could be user configurable.
The idea was to avoid surprises with regard to costs (particularly with expensive models such as o1 and gpt-4.5).
|
Lemme have a look... Currently, there is no generic way to expose anything other than the text output (for shell output). Maybe I can start by exposing a hook with the parsed responses, which would enable you to implement either as a separate package or in your config. |
Exposing that hook sounds like the way to go if you don't want it in this package. |
I can submit a pull request to add the hook if you like. I guess it will have to be run from the filter functions unless we decide to refactor things. |
This may need additional changes in shell-maker to expose as an event. I can look more into it later this week. |
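A hypothetical shape for such a hook (the variable name and call site are invented for illustration; the real integration point would live in shell-maker's filter functions):

```elisp
;; Hypothetical: neither this variable nor its call site exist yet.
(defvar chatgpt-shell-response-functions nil
  "Abnormal hook run with each parsed API response.")

;; The filter function would run it after parsing a response:
;;   (run-hook-with-args 'chatgpt-shell-response-functions parsed-response)

;; A consumer could then accrue cost without patching the package,
;; reusing the sketches above:
(add-hook 'chatgpt-shell-response-functions
          (lambda (response)
            (pcase-let ((`(,in . ,out) (my/token-usage 'openai response)))
              (my/accrue-cost "gpt-4o" in out))))
```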
I've been thinking for a bit that it would be useful if the tokens used so far and the total cost for the current conversation were shown. It would be a pain to query the API for each provider, so I'm thinking a better way would be to add the costs to each model and then calculate the number of tokens locally.
This wouldn't work for reasoning models, since their billed reasoning tokens aren't visible in the conversation text, so I guess we would have to query the API for those. That might be more trouble than this feature is worth.
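For the local-counting idea, the crudest possible sketch is the common ~4-characters-per-token heuristic (a real tokenizer such as tiktoken would be needed for accuracy, and this says nothing about hidden reasoning tokens):

```elisp
;; Very rough token estimate: ~4 characters per token for English text.
(defun my/approximate-token-count (text)
  (max 1 (/ (length text) 4)))
```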