
Why was there a number of tokens reduction for these chronos models compared to the t5 models? #124

Answered by abdulfatir
CoCoNuTeK asked this question in Q&A

@CoCoNuTeK Please follow the issue guidelines in the repo and use Discussions for Q&A. Issues are intended for actual problems (such as bugs) in the code.

In the context of Chronos, the vocab size determines the quantization precision. 4096 was a reasonable choice. While larger values may improve precision, you don't want the bins to be too fine: very few values may fall into each bin, which can prevent the model from learning the distribution properly. Please check out the paper for a discussion of such design choices: https://arxiv.org/abs/2403.07815
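To make the precision trade-off concrete, here is a minimal sketch of uniform binning, where the vocab size sets the number of bins. This is illustrative only: the bounds, bin placement, and function names are assumptions, not the exact scheme described in the Chronos paper (which applies mean scaling before quantization).

```python
import numpy as np

def quantize(values, n_bins=4096, low=-15.0, high=15.0):
    """Map real values to integer token IDs via uniform binning.
    Illustrative sketch only; bounds and bin placement are assumptions."""
    edges = np.linspace(low, high, n_bins - 1)  # interior bin edges
    return np.digitize(values, edges)  # token IDs in [0, n_bins - 1]

def dequantize(tokens, n_bins=4096, low=-15.0, high=15.0):
    """Map token IDs back to representative bin values (lossy inverse)."""
    edges = np.linspace(low, high, n_bins - 1)
    centers = np.concatenate(([low], (edges[:-1] + edges[1:]) / 2, [high]))
    return centers[tokens]

x = np.array([-0.5, 0.0, 0.37, 2.0])
coarse = dequantize(quantize(x, n_bins=16), n_bins=16)
fine = dequantize(quantize(x, n_bins=4096), n_bins=4096)
# A larger vocab gives a smaller round-trip error per value...
assert np.abs(fine - x).max() < np.abs(coarse - x).max()
# ...but with very fine bins, few training values land in any single bin,
# which is the learning problem mentioned above.
```

With 4096 bins the round-trip error above is bounded by half a bin width, whereas with 16 bins it is two orders of magnitude larger; the cost of more bins is sparser per-bin training signal.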

Answer selected by CoCoNuTeK
This discussion was converted from issue #123 on June 15, 2024 16:29.