XQUANT - cache post-norm X, rematerialize K/V on decode #15400

FlorianZimmer started this conversation in Ideas
Replies: 2 comments 4 replies

Category: Ideas
Participants: 3