Speculative decoding potential for running big LLMs on consumer-grade GPUs efficiently #10466

steampunque started this conversation in Ideas
Replies: 8 comments 26 replies

5 participants