
Support for FP8 Matmuls #275

Open
maktukmak opened this issue Aug 9, 2024 · 3 comments

Comments

@maktukmak
Contributor

Int8 matrix multiplication kernels are currently called on CUDA and CPU devices when activations and weights are quantized to int8. However, FP8 matmuls are not used when activations and weights are quantized to float8; if I am not mistaken, the matmul is performed in full precision in that case. What is the current status and roadmap for using float8 matrix multiplications, for instance through _scaled_mm?
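
For reference, a minimal sketch of what a direct call to `torch._scaled_mm` could look like. This is purely illustrative and not optimum-quanto's implementation: it assumes PyTorch >= 2.4 (the `_scaled_mm` signature has changed across releases) and an FP8-capable GPU (compute capability >= 8.9), and the per-tensor scaling scheme, shapes, and layout handling here are my own assumptions.

```python
import torch

# Illustrative sketch of a float8 matmul via torch._scaled_mm
# (assumes PyTorch >= 2.4 and an FP8-capable GPU, e.g. H100 or Ada, cc >= 8.9).
device = "cuda"
a = torch.randn(16, 32, device=device)   # activations
b = torch.randn(32, 64, device=device)   # weights

# Placeholder per-tensor scales so values fit the float8_e4m3fn range.
f8_max = torch.finfo(torch.float8_e4m3fn).max
scale_a = a.abs().max().float() / f8_max
scale_b = b.abs().max().float() / f8_max

a_fp8 = (a / scale_a).to(torch.float8_e4m3fn)
# _scaled_mm expects the second operand in column-major layout, so quantize
# a contiguous b^T and pass back its transposed view.
b_fp8 = (b.t() / scale_b).contiguous().to(torch.float8_e4m3fn).t()

# The kernel applies both scales to dequantize and returns the requested dtype.
out = torch._scaled_mm(a_fp8, b_fp8, scale_a=scale_a, scale_b=scale_b,
                       out_dtype=torch.bfloat16)
```

Note that `_scaled_mm` is a private PyTorch API and also has shape constraints (the inner dimension must be a multiple of 16), so a library integration would presumably need to handle fallbacks.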


github-actions bot commented Sep 9, 2024

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

@github-actions github-actions bot added the Stale label Sep 9, 2024
@dacorvo dacorvo removed the Stale label Sep 9, 2024
github-actions bot commented Oct 10, 2024

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

@github-actions github-actions bot added the Stale label Oct 10, 2024
@dacorvo dacorvo removed the Stale label Oct 10, 2024
github-actions bot commented Nov 10, 2024

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

@github-actions github-actions bot added the Stale label Nov 10, 2024
@dacorvo dacorvo removed the Stale label Nov 10, 2024