Replies: 3 comments 2 replies
- I can see two ways to do this, but neither is that appealing:
  Neither option seems to fit with the existing codebase very well, though. :/
  - I tried to outline how the second option could be done in this post:
- Does this mean that DeepSeek V3.2 support is coming to llama.cpp Soon™?
- DeepSeek's new sparse-attention model enables efficient very-long-context inference. What would be an acceptable design for the lightning indexer?
  For reference, this is the attention mechanism design from the tech report,
  and here is the Hugging Face reference implementation: https://huggingface.co/deepseek-ai/DeepSeek-V3.2-Exp/blob/main/inference/model.py#L435
  Technical report here
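  For readers new to the mechanism: per the tech report, the lightning indexer gives each query token a score for every cached token as a weighted sum, over a few small indexer heads, of ReLU(q·k); the main attention then attends only to the top-k scoring tokens. A minimal NumPy sketch of that scoring and selection step, assuming toy shapes and hypothetical function names of my own (not taken from the reference implementation):

  ```python
  import numpy as np

  def lightning_index_scores(q_idx, k_idx, w_idx):
      """Index scores for one query token against all cached tokens.

      Hypothetical shapes, following the tech report's description:
        q_idx: (H_idx, D_idx)  indexer queries, one per indexer head
        k_idx: (S, D_idx)      indexer keys, one per cached token
        w_idx: (H_idx,)        per-head weights for this query token
      Returns a (S,) score per cached token:
        I[s] = sum_j w[j] * relu(q[j] . k[s])
      """
      logits = q_idx @ k_idx.T                 # (H_idx, S) dot products
      return w_idx @ np.maximum(logits, 0.0)  # weighted sum over heads

  def select_top_k(scores, k):
      """Indices of the k highest-scoring tokens; the main attention
      restricts its key/value set to these."""
      k = min(k, scores.shape[0])
      return np.argpartition(scores, -k)[-k:]

  # Toy usage: 4 indexer heads, head dim 8, 32 cached tokens, keep 8.
  rng = np.random.default_rng(0)
  scores = lightning_index_scores(rng.normal(size=(4, 8)),
                                  rng.normal(size=(32, 8)),
                                  rng.normal(size=4))
  selected = select_top_k(scores, 8)
  ```

  The point of the design is that scoring is cheap (a few tiny heads, no softmax), so it can run over the whole context while the expensive main attention only touches the selected top-k tokens.
  
  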