Pull requests: huggingface/text-generation-inference

Pull requests list

Add support for compressed-tensors w8a8 int checkpoints
#2745 opened Nov 14, 2024 by danieldk
5 tasks
fix response type of document for Text Generation Inference
#2743 opened Nov 13, 2024 by jitokim
3 of 5 tasks
Upgrade outlines to 0.1.1
#2742 opened Nov 12, 2024 by aW3st
1 of 4 tasks
benchmark: fix prefill throughput
#2741 opened Nov 12, 2024 by danieldk
5 tasks
Fix: Change model_type from ssm to mamba
#2740 opened Nov 10, 2024 by mokeddembillel
2 of 5 tasks
Fix: Change embeddings to embedding
#2738 opened Nov 10, 2024 by mokeddembillel
2 of 5 tasks
Support continue final message
#2733 opened Nov 8, 2024 by drbh
feat: add payload limit
#2726 opened Nov 5, 2024 by OlivierDehaene
Add llama.cpp backend
#2723 opened Nov 4, 2024 by mfuntowicz Draft
feat: support flash attention 2 in qwen2 vl vision blocks
#2721 opened Nov 4, 2024 by drbh
Update to moe-kernels 0.7.0
#2720 opened Nov 4, 2024 by danieldk
5 tasks
fix: improve find_segments via numpy diff
#2686 opened Oct 24, 2024 by drbh
[WIP] Add gfx1100 support to AMD pytorch build
#2642 opened Oct 13, 2024 by cazlo Draft
1 of 5 tasks
feat: Add automatic nightly benchmarks
#2591 opened Sep 30, 2024 by Hugoch
1 of 5 tasks
Improve vlm support (add idefics3 support)
#2437 opened Aug 20, 2024 by drbh Draft
4 tasks
Add model_load_time metric
#2311 opened Jul 26, 2024 by Edwinhr716
2 of 5 tasks