[Compressor][NVFP4] Support FP4 Compression #311

Merged
dsikka merged 4 commits into main from support_fp4_compression on May 9, 2025

Conversation

dsikka
Collaborator

@dsikka dsikka commented May 7, 2025

Summary

  • Adds an NVFP4Compressor that compresses FP4 weights by packing them into uint8
  • Compatible with the ModelOpt integration in vLLM
  • Runs end-to-end (e2e) using the FP4 CT emulation: NVFP4 Emulation vllm#59

Shoutout to @mgoin for helping speed up the packing functionality
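
For readers unfamiliar with the idea, here is a minimal sketch of what packing FP4 values into uint8 can look like. It is not the code from this PR: the function names, the nibble order (first value in the low 4 bits), and the pure-Python approach are all assumptions chosen for illustration. It only relies on the fact that FP4 (E2M1) has 1 sign bit and 8 magnitudes, so two 4-bit codes fit in one byte.

```python
import math

# Magnitudes representable in FP4 E2M1 (2 exponent bits, 1 mantissa bit).
FP4_E2M1_MAGNITUDES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]


def fp4_code(value: float) -> int:
    """Return the 4-bit code for a value: sign bit (bit 3) | magnitude index (bits 0-2)."""
    sign = 1 if math.copysign(1.0, value) < 0 else 0
    # Raises ValueError if the value is not exactly representable in FP4.
    index = FP4_E2M1_MAGNITUDES.index(abs(value))
    return (sign << 3) | index


def pack_fp4(values: list[float]) -> bytes:
    """Pack pairs of FP4 values into bytes (hypothetical helper, not the PR's API).

    Assumption: the first value of each pair goes in the low nibble.
    """
    if len(values) % 2 != 0:
        raise ValueError("need an even number of values to pack two per byte")
    packed = bytearray()
    for i in range(0, len(values), 2):
        low = fp4_code(values[i])
        high = fp4_code(values[i + 1])
        packed.append(low | (high << 4))
    return bytes(packed)


def unpack_fp4(packed: bytes) -> list[float]:
    """Inverse of pack_fp4: recover two FP4 values from each byte."""
    values = []
    for byte in packed:
        for code in (byte & 0x0F, byte >> 4):
            magnitude = FP4_E2M1_MAGNITUDES[code & 0x7]
            values.append(-magnitude if code >> 3 else magnitude)
    return values
```

Under these assumptions, `pack_fp4([0.5, -6.0])` yields a single byte, and `unpack_fp4` recovers the original pair. A real implementation would operate on whole tensors rather than Python lists.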

@dsikka dsikka requested review from rahul-tuli and mgoin May 7, 2025 17:12
Contributor

@brian-dellabetta brian-dellabetta left a comment

Looks good!

kylesayrs
kylesayrs previously approved these changes May 8, 2025
Contributor

@kylesayrs kylesayrs left a comment

Really clean and easy to understand, awesome work

@dsikka dsikka dismissed stale reviews from kylesayrs and brian-dellabetta via f756107 May 9, 2025 02:35
@dsikka dsikka enabled auto-merge (squash) May 9, 2025 02:36
Member

@rahul-tuli rahul-tuli left a comment


@dsikka dsikka merged commit 5c6fd5d into main May 9, 2025
1 check passed
@dsikka dsikka deleted the support_fp4_compression branch May 9, 2025 14:20
Member

@mgoin mgoin left a comment

Nice work!

5 participants