Enable ipex and other optimizations #1628


Open · wants to merge 25 commits into base: main

Conversation

@jiqing-feng (Contributor) commented May 8, 2025

This PR enables IPEX and other optimizations, including:

  1. IPEX fused ops
  2. FP4 support on CPU
  3. `has_rem` handling in 4-bit quantize/dequantize
  4. a simple 8-bit matmul path to make fine-tuning faster on CPU

It also fixes the parameter patch for CPU.

It passes all transformers tests.

After this PR is merged, I will update the installation guide.
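To illustrate the idea behind item 4, here is a minimal pure-Python sketch of a "simple 8-bit matmul": quantize each row of A to int8 with a per-row absmax scale, multiply-accumulate in integers, then rescale. The names and the quantization scheme are illustrative assumptions, not bitsandbytes' actual kernel.

```python
# Hypothetical sketch of a simple 8-bit matmul (illustrative only; not
# bitsandbytes' real implementation).

def quantize_rows(a):
    """Per-row absmax int8 quantization: returns (int8 rows, per-row scales)."""
    q, scales = [], []
    for row in a:
        # Scale so the largest magnitude maps to 127; guard against all-zero rows.
        s = max(abs(x) for x in row) / 127 or 1.0
        q.append([round(x / s) for x in row])
        scales.append(s)
    return q, scales

def matmul_8bit(a, b):
    """C = A @ B with A quantized to int8 and rescaled afterwards."""
    qa, scales = quantize_rows(a)
    n = len(b[0])
    out = []
    for row, s in zip(qa, scales):
        out.append([s * sum(qi * b[k][j] for k, qi in enumerate(row))
                    for j in range(n)])
    return out
```

Because the accumulation happens on small integers, a real backend can hand this inner loop to a fused int8 kernel; the rescale step restores the floating-point result up to quantization error.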

@matthewdouglas @Titus-von-Koeller

Signed-off-by: jiqing-feng <[email protected]>
@jiqing-feng marked this pull request as ready for review May 8, 2025 07:25

github-actions bot commented May 8, 2025

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@jiqing-feng (Contributor, Author)

I am cleaning up the CPU and XPU tests; progress is at 50%.

```
    quant_state.blocksize,
    quant_state.shape,
    quant_state.dtype,
)
```
Is there a reason why this change can't be in bitsandbytes/backends/cpu/ops.py?
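For context, the fields quoted above (blocksize, shape, dtype) are exactly what a blockwise 4-bit dequantize needs: each block of `blocksize` values shares one absmax scale, and each 4-bit code indexes a small codebook. A pure-Python sketch, with a toy codebook and hypothetical names rather than bitsandbytes' real kernel:

```python
# Hypothetical sketch of blockwise 4-bit dequantization (illustrative only).
# A real FP4/NF4 codebook has 16 entries; this toy version stores sign
# separately and uses 8 magnitudes.

FP4_CODE = [0.0, 0.0625, 0.125, 0.25, 0.5, 0.75, 1.0, 1.5]  # toy codebook

def dequantize_blockwise(codes, absmax, blocksize):
    """codes: list of (sign, index) pairs; absmax: one scale per block."""
    out = []
    for i, (sign, idx) in enumerate(codes):
        scale = absmax[i // blocksize]   # each block shares one absmax scale
        out.append(sign * FP4_CODE[idx] * scale)
    return out
```

A backend-specific version of this loop (fused, vectorized) is the kind of op that could live in bitsandbytes/backends/cpu/ops.py.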


Signed-off-by: jiqing-feng <[email protected]>
@matthewdouglas added this to the v0.47.0 milestone May 9, 2025
@jiqing-feng (Contributor, Author)

```
pytest --ignore test_optim.py --ignore test_triton.py --ignore test_cuda_setup_evaluator.py
```

CPU previous: 378 passed, 1537 failed, 1638 skipped, 197 xfailed, 153 warnings in 613.27s
CPU current: 2079 passed, 1498 skipped, 153 deselected, 9 xfailed, 59 warnings in 1192.94s

XPU previous: not enabled
XPU current: 2093 passed, 1493 skipped, 153 deselected, 63 warnings in 562.25s

It also passes all transformers tests.

I also updated the installation guide.

Hi @matthewdouglas. Please take the next round of review.

Signed-off-by: jiqing-feng <[email protected]>
@@ -316,15 +316,29 @@ pip install -e . # `-e` for "editable" install, when developing BNB (otherwise
> [!TIP]
> Intel CPU/XPU backend only supports building from source; for now, please follow the instructions below.
- It does not need to compile C++ code; all required ops are in [intel_extension_for_pytorch](https://pytorch-extension.intel.com/); please follow the instructions to install IPEX.
+ It requires [intel_extension_for_pytorch](https://pytorch-extension.intel.com/); please follow the instructions to install IPEX.
@matthewdouglas (Member) commented May 12, 2025

I would expect IPEX to be optional. Especially so for CPU on Windows or for Linux/macOS on aarch64.
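One common way to keep such a dependency optional is a guarded import with a generic fallback, so the package still imports on platforms where IPEX doesn't exist (e.g. Windows CPU, aarch64 Linux/macOS). A sketch of the pattern, not the PR's actual code:

```python
# Hypothetical optional-dependency guard (illustrative; names are not
# bitsandbytes' real API). Importing fails cleanly where IPEX is unavailable.
try:
    import intel_extension_for_pytorch  # noqa: F401
    HAS_IPEX = True
except ImportError:
    HAS_IPEX = False

def use_fused_path() -> bool:
    """Select fused IPEX kernels only when the extension is importable;
    callers fall back to the generic implementation otherwise."""
    return HAS_IPEX
```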

Signed-off-by: jiqing-feng <[email protected]>

3 participants