
enable cpu offloading for Bark on xpu #37599


Merged
merged 13 commits from yao-matrix:bark-xpu into huggingface:main on Apr 23, 2025

Conversation

yao-matrix
Contributor

Command:

pytest -rA tests/models/bark/test_modeling_bark.py::BarkModelIntegrationTests::test_generate_end_to_end_with_offload

after this PR: PASSED
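
For context, a hedged sketch of the path this test exercises: loading Bark, enabling CPU offloading of its sub-models, and generating on whichever accelerator is present. The suno/bark-small checkpoint is an illustrative choice, not necessarily what the test uses; an accelerator device and the accelerate package are assumed.

import torch
from transformers import AutoProcessor, BarkModel

processor = AutoProcessor.from_pretrained("suno/bark-small")
model = BarkModel.from_pretrained("suno/bark-small")

# Requires accelerate. Idle sub-models (semantic, coarse, fine) stay on CPU
# and are moved to the accelerator only while they are running.
model.enable_cpu_offload()

inputs = processor("In the light of the moon, a little egg lay on a leaf")
audio = model.generate(**inputs)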

Signed-off-by: YAO Matrix <[email protected]>
@github-actions github-actions bot marked this pull request as draft April 18, 2025 07:33
Contributor

Hi 👋, thank you for opening this pull request! The pull request is converted to draft by default. The CI will be paused while the PR is in draft mode. When it is ready for review, please click the Ready for review button (at the bottom of the PR page). This will assign reviewers and trigger CI.

@@ -1056,7 +1057,8 @@ def processor(self):
     def inputs(self):
         input_ids = self.processor("In the light of the moon, a little egg lay on a leaf", voice_preset="en_speaker_6")

-        input_ids = input_ids.to(torch_device)
+        for k, v in input_ids.items():
+            input_ids[k] = v.to(torch_device)

Contributor Author

I changed this because input_ids is a dict, and the prior code just used to(torch_device) to move all of its items to the device. In my environment, both XPU and A100 failed, complaining that one tensor was on CPU (history_prompt in the input_ids dict) while another was on cuda:0 or xpu:0 (the embedding table). The to() in the original code only moved some of the items to the device while others stayed on CPU, so I changed the code to move the items one by one, and it then passes on both XPU and CUDA.
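
A self-contained illustration of the fix, with made-up dict contents (in the test, input_ids comes from BarkProcessor):

import torch

# Pick whatever device is around; falls back to CPU so this runs anywhere.
torch_device = (
    "xpu" if (hasattr(torch, "xpu") and torch.xpu.is_available())
    else "cuda" if torch.cuda.is_available() else "cpu"
)

# Made-up stand-ins for the processor output; the shapes are arbitrary.
inputs = {
    "input_ids": torch.ones(1, 4, dtype=torch.long),
    "history_prompt": torch.zeros(1, 8),  # the entry that used to stay on CPU
}

# Move every entry explicitly, item by item, so none is left behind.
for k, v in inputs.items():
    inputs[k] = v.to(torch_device)

assert all(v.device.type == torch.device(torch_device).type for v in inputs.values())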

device_type = "cuda"
if is_torch_accelerator_available():
    device_type = torch.accelerator.current_accelerator().type
device = torch.device(f"{device_type}:{gpu_id}")
Contributor Author

I use torch.accelerator to detect the device at runtime when it's available, and otherwise fall back to the old value, which is "cuda".
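
A self-contained version of that logic; hasattr stands in here for transformers' is_torch_accelerator_available helper, and torch.accelerator needs a recent PyTorch (2.6 or later):

import torch

gpu_id = 0  # as in the snippet above

device_type = "cuda"  # old default, kept as the fallback
if hasattr(torch, "accelerator") and torch.accelerator.is_available():
    device_type = torch.accelerator.current_accelerator().type
device = torch.device(f"{device_type}:{gpu_id}")
print(device)  # cuda:0 on NVIDIA, xpu:0 on Intel GPUs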

@yao-matrix yao-matrix marked this pull request as ready for review April 18, 2025 07:40
Contributor

@MekkCyber MekkCyber left a comment

Thanks a lot! Sounds good, left a few nits.

@yao-matrix
Contributor Author

@ydshieh, added, please help review, thanks.

@ydshieh ydshieh merged commit 12f65ee into huggingface:main Apr 23, 2025
14 of 18 checks passed
@ydshieh
Collaborator

ydshieh commented Apr 23, 2025

Thanks. I updated with 2 commits 1a4557c and 0911c14

@yao-matrix
Contributor Author

> Thanks. I updated with 2 commits 1a4557c and 0911c14

Thanks, now I know how to handle the deprecation cases next time. And your usage of getattr is simpler, great to learn along the journey, thanks.
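
The two commits themselves aren't shown here, but the getattr pattern being referred to is presumably along these lines: look up the per-backend module on torch by device type, with torch.cuda as the fallback.

import torch

device_type = "xpu"  # hypothetical; in practice this comes from the detected device
torch_accelerator_module = getattr(torch, device_type, torch.cuda)

# e.g. torch.xpu.empty_cache() on XPU, torch.cuda.empty_cache() on CUDA
if torch_accelerator_module.is_available():
    torch_accelerator_module.empty_cache()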

@yao-matrix yao-matrix deleted the bark-xpu branch April 23, 2025 22:34
zucchini-nlp pushed a commit to zucchini-nlp/transformers that referenced this pull request May 14, 2025
* enable cpu offloading of bark modeling on XPU

Signed-off-by: YAO Matrix <[email protected]>

* remove debug print

Signed-off-by: YAO Matrix <[email protected]>

* fix style

Signed-off-by: YAO Matrix <[email protected]>

* fix review comments

Signed-off-by: YAO Matrix <[email protected]>

* enhance test

Signed-off-by: YAO Matrix <[email protected]>

* update

* add deprecate message

Signed-off-by: YAO Matrix <[email protected]>

* update

* update

* trigger CI

---------

Signed-off-by: YAO Matrix <[email protected]>
Co-authored-by: ydshieh <[email protected]>