huggingface / text-generation-inference Public

Pull requests: huggingface/text-generation-inference

37 Open 1,665 Closed

Pull requests list

feat: support max_image_fetch_size to limit

#3339 opened Nov 13, 2025 by drbh

fix: bump flake and update grammar logit processor

#3338 opened Nov 6, 2025 by drbh • Draft

Remove once_cell dependency from multiple Cargo.toml files and update usage in validation.rs to use std::sync::LazyLock instead of once_cell::sync::Lazy.

#3334 opened Sep 28, 2025 by htiennv

5 tasks

feat: expose GPU energy consumption (mJ) in responses

#3315 opened Aug 28, 2025 by JulienDelavande

2 of 5 tasks

**Add dedicated CPU-only Dockerfile and update documentation for CPU/…

#3310 opened Aug 7, 2025 by jakubgajski

2 of 5 tasks

support qwen3 on nvidia

#3302 opened Jul 23, 2025 by icyxp

Retrieve the correct cached model batch size in Neuron config checker for Neuron Backend

#3300 opened Jul 19, 2025 by jimburtoft

3 tasks

Attempt to fix CI errors

#3292 opened Jul 8, 2025 by danieldk

5 tasks

fix: enable defs references in tool calls

#3291 opened Jul 7, 2025 by drbh

Update quantization kernels

#3288 opened Jul 7, 2025 by danieldk • Draft

5 tasks

feat: allow json_schema in response format and add test

#3276 opened Jun 25, 2025 by drbh

Disable mamba in CPU platform

#3266 opened Jun 13, 2025 by casassg

3 of 5 tasks

feat: improve llava next pooling for granite vision

#3255 opened Jun 4, 2025 by drbh

Trtllm backend improvements

#3231 opened May 17, 2025 by leejuyuu

1 of 5 tasks

Fix typos

#3210 opened May 6, 2025 by omahs

1 of 5 tasks

feat: lock updated kernel versions

#3201 opened Apr 29, 2025 by drbh

Set uv UV_PYTHON_INSTALL_DIR explicitly

#3197 opened Apr 27, 2025 by sebastianliebscher

1 of 5 tasks

README: minimum Python version is 3.10

#3194 opened Apr 25, 2025 by Frenzie

1 of 5 tasks

feat: support logit bias in chat request

#3186 opened Apr 22, 2025 by drbh

Fix flashinfer plan call to use positional arguments for #3165

#3166 opened Apr 11, 2025 by ruckc

2 of 5 tasks

Update to flashinfer 0.2.5

#3164 opened Apr 11, 2025 by danieldk • Draft

5 tasks

Add chunked attn for L4

#3162 opened Apr 10, 2025 by mht-sharma • Draft

2 of 7 tasks

Update links Inferentia refer docs

#3154 opened Apr 9, 2025 by guspan-tanadi

1 of 5 tasks

feat: align function id with tool call response

#3111 opened Mar 13, 2025 by drbh

wip: comment out prepend full_text

#3079 opened Mar 7, 2025 by jrc2139 • Draft

1 of 5 tasks
