Add Nemotron GGUF Loading Support #34725

farrosalferro · 2024-11-14T04:26:47Z

What does this PR do?

Add Nemotron GGUF loading support

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline, Pull Request section?
Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case. Link: Community contribution: Adding GGUF support for more architectures #33260
Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Did you write any new necessary tests?

Who can review?

@SunMarc @LysandreJik @ArthurZucker

VladOS95-cyber · 2024-11-14T07:10:40Z

src/transformers/modeling_gguf_pytorch_utils.py

@@ -129,6 +129,9 @@ def load_gguf_checkpoint(gguf_checkpoint_path, return_tensors=False):
            )
        model_size = m.group().strip("-")  # only keeps `7b`

+    if "nemotron" in architecture:


Why do you explicitly assign the architecture to the updated one if it is the same?

farrosalferro · 2024-11-14

I'm sorry if my answer seems obvious, but isn't this meant to handle cases where the architecture string contains more than just "nemotron"? I used what you did for qwen2moe as a reference, so I thought it would be better to do the same for nemotron. However, I tested it without these lines and it still passes. What do you think? And thank you for reviewing! As this is my first time contributing, please let me know if anything seems odd or if there is a better implementation. Thank you!

VladOS95-cyber · 2024-11-14 (edited)

No. For qwen2moe, I explicitly assigned a different architecture name because the GGUF file contains qwen2moe, but the rest of the execution chain expects qwen2_moe for the config, model processing, and so on. You provided the same name, "nemotron". So there is no reason to explicitly assign the updated architecture to the same name, or even to mention nemotron, because GGUF processing takes it from the config by default.
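The point of this exchange can be illustrated with a minimal sketch. It is not the code in `modeling_gguf_pytorch_utils.py`: the table `GGUF_ARCHITECTURE_RENAMES` and the function `resolve_architecture` are hypothetical names introduced here for illustration. The only assumption taken from the thread is that a GGUF file reports `qwen2moe` while downstream code expects `qwen2_moe`, and that `nemotron` is already the expected name.

```python
# Hypothetical sketch; names below are illustrative, not the library's API.
# Only architectures whose GGUF name differs from the name the rest of the
# loading chain expects need an entry here.
GGUF_ARCHITECTURE_RENAMES = {
    "qwen2moe": "qwen2_moe",  # rename described in the review thread
}


def resolve_architecture(architecture: str) -> str:
    """Return the architecture name that downstream code expects."""
    for gguf_name, expected_name in GGUF_ARCHITECTURE_RENAMES.items():
        # Substring check, mirroring the `"..." in architecture` test in the diff.
        if gguf_name in architecture:
            return expected_name
    # No rename registered: the GGUF name is used unchanged.
    return architecture


print(resolve_architecture("qwen2moe"))
print(resolve_architecture("nemotron"))
```

Under these assumptions, `nemotron` falls through to the default branch, which is why a dedicated `if "nemotron" in architecture:` branch that assigns the same name would add nothing.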

Add Nemotron GGUF Loading Support

fb26745

farrosalferro mentioned this pull request Nov 14, 2024

Community contribution: Adding GGUF support for more architectures #33260
