Cpu C++ kernel #1789
Signed-off-by: jiqing-feng <[email protected]>
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
    if (BUILD_CPU)
        target_link_libraries(bitsandbytes PRIVATE OpenMP::OpenMP_CXX)
        include(CheckCXXCompilerFlag)

        check_cxx_compiler_flag(-mavx512f HAS_AVX512F)
        check_cxx_compiler_flag(-mavx512bf16 HAS_AVX512BF16)

        if(HAS_AVX512F)
            target_compile_options(bitsandbytes PRIVATE -mavx512f)
        endif()

        if(HAS_AVX512BF16)
            target_compile_options(bitsandbytes PRIVATE -mavx512bf16)
        endif()
    endif()
Hi @jiqing-feng we still have some build issues with this.
A few things need consideration here:
- We build for Linux aarch64
- We also build for macOS arm64. I'm not sure how to use OpenMP on that platform - maybe we can skip for now?
- On Windows x86-64 we build with MSVC. Apart from /arch:AVX512, I don't really know if there is a flag for AVX512-BF16.
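One way to keep a single source file building on all of the platforms listed above is to guard the vector and threading paths with the predefined macros that the compiler sets when the corresponding flags are active. The following is only a sketch under that assumption, not the code in this PR: `compiled_simd_path` and `scale_in_place` are hypothetical names introduced for illustration. `__AVX512F__` is defined by GCC and Clang under `-mavx512f` and by MSVC under `/arch:AVX512`; `__AVX512BF16__` is defined by GCC and Clang under `-mavx512bf16`; `_OPENMP` is defined whenever OpenMP is enabled.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: reports which code path this translation unit was
// compiled for, based only on macros predefined by the compiler.
inline const char* compiled_simd_path() {
#if defined(__AVX512F__) && defined(__AVX512BF16__)
    return "avx512f+avx512bf16";
#elif defined(__AVX512F__)
    return "avx512f";
#else
    return "scalar";  // e.g. aarch64 / arm64 builds, or x86-64 without AVX-512
#endif
}

// Hypothetical kernel body: multiplies every element by s. The loop is
// parallelized only when OpenMP is enabled, so a build without OpenMP
// still compiles and runs the same loop serially.
inline void scale_in_place(std::vector<float>& x, float s) {
    const long long n = static_cast<long long>(x.size());
#if defined(_OPENMP)
#pragma omp parallel for
#endif
    for (long long i = 0; i < n; ++i) {
        x[static_cast<std::size_t>(i)] *= s;
    }
}
```

With guards like these, the build system can skip the AVX-512 flags or OpenMP on a platform where they are unavailable without breaking compilation; whether that is the right trade-off for this project is for the maintainers to decide.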
I have updated the cmake file, please check it. Thanks.
The C++ kernels.
cmake -DCOMPUTE_BACKEND=cpu -S . && make
Hi @matthewdouglas. I've implemented the CPU dequantize op for nf4/fp4. It brings a 10x+ speedup on the end-to-end text-generation task with the llama3-8B model, compared with the original Python kernel. Would you please review this PR? Thanks!
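For readers unfamiliar with the op, a scalar reference for blockwise NF4 dequantization might look roughly like the sketch below. It is not the kernel from this PR. The function name `dequantize_nf4_sketch` is hypothetical, the code book values are quoted from memory of the NF4 data type and should be checked against the library source, and the layout (two 4-bit codes per byte with the high nibble first, one `absmax` scale per `blocksize` values) is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// NF4 code book: 16 levels in [-1, 1]. Values quoted from memory;
// verify against the library before relying on them.
static const float kNf4Table[16] = {
    -1.0f, -0.6961928009986877f, -0.5250730514526367f, -0.39491748809814453f,
    -0.28444138169288635f, -0.18477343022823334f, -0.09105003625154495f, 0.0f,
    0.07958029955625534f, 0.16093020141124725f, 0.24611230194568634f,
    0.33791524171829224f, 0.44070982933044434f, 0.5626170039176941f,
    0.7229568362236023f, 1.0f};

// Hypothetical scalar reference. Assumed layout: each byte packs two codes,
// high nibble first, and absmax holds one scale per `blocksize` output values.
std::vector<float> dequantize_nf4_sketch(const std::vector<std::uint8_t>& packed,
                                         const std::vector<float>& absmax,
                                         std::size_t blocksize) {
    // An even block size keeps both codes of a byte inside the same block.
    assert(blocksize > 0 && blocksize % 2 == 0);
    const std::size_t n = packed.size() * 2;
    assert(absmax.size() * blocksize >= n);

    std::vector<float> out(n);
    for (std::size_t i = 0; i < packed.size(); ++i) {
        const float scale = absmax[(2 * i) / blocksize];
        out[2 * i] = kNf4Table[packed[i] >> 4] * scale;        // high nibble
        out[2 * i + 1] = kNf4Table[packed[i] & 0x0F] * scale;  // low nibble
    }
    return out;
}
```

Each output depends on only one input byte and one scale, so the loop is trivially parallel, which is the kind of workload where OpenMP threading and AVX-512 lookups can plausibly help.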