"Unrecognized keys in `rope_scaling` for 'rope_type'='yarn': {'original_max_position_embeddings'}"
And, when LLM's input length is shorter than original_position_embedding_len, the response is OK. However, if input's len is larger than 32768(original_position_embedding_len), the model's output will be something confusing, similar to a kind of repetition of the input.
this error happened in the version of 0.6.3.post1, but when I switch to v0.6.0, everything is OK.
I find that transformers's repo from recent versions don't accept "original_max_position_embeddings", but vllm need it. Maybe this is a confict between transformers and vllm ?
Does anyone know how to correctly enable the long context feature? Thanks ^_^
PS: I can't run collect_env.py script in v0.6.0's docker image, but 0.6.3.post1's docker image is OK.
PPS: I just search issues about "original_max_position_embeddings", but got nothing releated.
Before submitting a new issue...
Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
The text was updated successfully, but these errors were encountered:
Your current environment
The output of `python collect_env.py`
Model Input Dumps
No response
🐛 Describe the bug
When I add a rope scaling config to Qwen/Qwen2-72B-Instruct/config.json to enable long context, I get this transformers warning:

"Unrecognized keys in `rope_scaling` for 'rope_type'='yarn': {'original_max_position_embeddings'}"
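For reference, the rope_scaling block I am adding is along these lines. This is only a sketch following the commonly documented YaRN recipe for Qwen2 (scaling the original 32768 positions); the exact factor is illustrative, and some transformers versions spell the key `rope_type` instead of `type`:

```json
"rope_scaling": {
  "type": "yarn",
  "factor": 4.0,
  "original_max_position_embeddings": 32768
}
```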
When the input is shorter than original_max_position_embeddings (32768 tokens), the response is fine. However, once the input exceeds 32768 tokens, the model's output becomes confusing, essentially a repetition of the input.
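A minimal sketch of the kind of call that shows the behavior (the prompt is a placeholder and the exact max_model_len / parallelism values are illustrative, not my precise setup):

```python
from vllm import LLM, SamplingParams

# Load Qwen2-72B-Instruct with a context window beyond the original 32768 positions.
llm = LLM(
    model="Qwen/Qwen2-72B-Instruct",
    max_model_len=131072,
    tensor_parallel_size=4,
)

# Any prompt that tokenizes to more than 32768 tokens triggers the behavior.
long_prompt = "..."
outputs = llm.generate([long_prompt], SamplingParams(temperature=0, max_tokens=256))

# On 0.6.3.post1 the output degenerates into repetition of the input;
# on v0.6.0 the same call behaves normally.
print(outputs[0].outputs[0].text)
```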
This happens on v0.6.3.post1; when I switch back to v0.6.0, everything works.
I find that recent versions of the transformers repo no longer accept "original_max_position_embeddings" in the YaRN rope_scaling config, but vLLM needs it. Maybe this is a conflict between transformers and vLLM?
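I am also not sure whether editing config.json is the intended route. Below is a sketch of the alternative I am considering, assuming the LLM constructor accepts a rope_scaling dict and forwards it to the engine arguments; I have not confirmed that this avoids the transformers warning on 0.6.3.post1:

```python
from vllm import LLM

# Assumption: rope_scaling passed here is forwarded to the engine arguments,
# so config.json can stay untouched. Whether this sidesteps the transformers
# validation of "original_max_position_embeddings" is exactly what I am unsure about.
llm = LLM(
    model="Qwen/Qwen2-72B-Instruct",
    max_model_len=131072,
    rope_scaling={
        "type": "yarn",
        "factor": 4.0,
        "original_max_position_embeddings": 32768,
    },
)
```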
Does anyone know how to correctly enable the long context feature? Thanks ^_^
PS: I can't run the collect_env.py script in the v0.6.0 docker image, but the v0.6.3.post1 image works fine.
PPS: I searched existing issues for "original_max_position_embeddings" but found nothing related.