feat(transformers): add VaultGemma (v4.57.1) #1450

alien-0119 · 2025-12-04T07:08:30Z

What does this PR do?

Adds the VaultGemma model and fast unit tests (UT).

Usage Example:

from transformers import AutoTokenizer
from mindone.transformers import VaultGemmaForCausalLM
import mindspore as ms

model_id = "google/vaultgemma-1b"
model = VaultGemmaForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

prompt = "Tell me an unknown interesting biology fact about the brain."
inputs = tokenizer(prompt, return_tensors="np")
inputs = {k: ms.tensor(v) for k, v in inputs.items()}

# Generate up to 32 new tokens and decode the first sequence
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0]))
# Sample output:
# I'm not sure if this is the right place to ask this, but I'm curious about the

Performance:
Experiments were run on Ascend Atlas 800T A2 machines with MindSpore 2.7.0 in PyNative mode.

model	precision	weight load (s)	s/step
google/vaultgemma-1b	fp32	48.753	0.090
google/vaultgemma-1b	fp16	46.820	0.089
google/vaultgemma-1b	bf16	35.317	0.102
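As a rough illustration of how a seconds-per-step figure like the ones above can be collected, the sketch below averages wall-clock time over repeated calls after a few warm-up calls. The PR does not show its benchmarking script, so the helper name `seconds_per_step` and the dummy workload are hypothetical; on an NPU the timed step would presumably also need to wait for the device to finish before the clock is read.

```python
import time
from typing import Callable


def seconds_per_step(step: Callable[[], object], warmup: int = 2, iters: int = 10) -> float:
    """Return the average wall-clock seconds per call of `step`, excluding warm-up calls."""
    for _ in range(warmup):
        step()  # warm-up calls absorb one-off costs such as compilation or caching
    start = time.perf_counter()
    for _ in range(iters):
        step()
    return (time.perf_counter() - start) / iters


if __name__ == "__main__":
    # Dummy CPU workload standing in for one generation step (hypothetical).
    avg = seconds_per_step(lambda: sum(i * i for i in range(10_000)))
    print(f"{avg:.6f} s/step")
```

Weight-load time can be measured the same way by wrapping a single `from_pretrained` call between two `time.perf_counter()` readings.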

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline?
Did you make sure to update the documentation with your changes? E.g. record bug fixes or new features in What's New. Here are the documentation guidelines.
Did you build and run the code without any errors?
Did you report the running environment (NPU type/MS version) and performance in the doc? (better record it for data loading, model inference, or training tasks)
Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.

@xxx

gemini-code-assist · 2025-12-04T07:08:48Z

Summary of Changes

Hello @alien-0119, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request adds the VaultGemma model to the mindone/transformers library, giving users a new causal language model within the MindSpore ecosystem. The changes add the model's core architecture, update the auto-loading mechanisms, and add unit tests.

Highlights

New Model Integration: The VaultGemma model (v4.57.1) has been successfully integrated into the mindone/transformers library, expanding the range of available transformer models.
Core Model Implementation: Full MindSpore implementations for key VaultGemma components, including VaultGemmaRMSNorm, VaultGemmaMLP, VaultGemmaAttention, VaultGemmaDecoderLayer, VaultGemmaRotaryEmbedding, VaultGemmaPreTrainedModel, VaultGemmaModel, and VaultGemmaForCausalLM, have been added.
Auto-Configuration and Auto-Modeling Support: The library's auto-configuration and auto-modeling utilities have been updated to seamlessly recognize and load VaultGemma models and their configurations.
Comprehensive Testing: New unit tests have been introduced to validate the correctness and precision of the VaultGemma model's MindSpore implementation against its PyTorch counterpart.
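The precision check described in the last highlight can be pictured as a tolerance comparison between two sets of outputs. The sketch below is a minimal illustration under stated assumptions, not the PR's test code: the thresholds in `TOLERANCE`, the function names, and the choice of a max-error metric are all hypothetical, and plain Python lists stand in for framework tensors.

```python
from typing import Sequence

# Hypothetical per-precision thresholds; the PR's unit tests may use different values.
TOLERANCE = {"fp32": 5e-4, "fp16": 5e-3, "bf16": 5e-2}


def relative_error(actual: Sequence[float], expected: Sequence[float], eps: float = 1e-12) -> float:
    """Largest absolute difference, scaled by the largest reference magnitude."""
    if len(actual) != len(expected):
        raise ValueError("outputs must have the same number of elements")
    max_diff = max(abs(a - e) for a, e in zip(actual, expected))
    scale = max(abs(e) for e in expected)
    return max_diff / max(scale, eps)  # eps guards against an all-zero reference


def outputs_match(actual: Sequence[float], expected: Sequence[float], precision: str) -> bool:
    """True when the two outputs agree within the threshold for `precision`."""
    return relative_error(actual, expected) <= TOLERANCE[precision]


if __name__ == "__main__":
    reference = [1.0, 2.0, 3.0001]  # stand-in for the reference implementation's output
    candidate = [1.0, 2.0, 3.0]     # stand-in for the ported implementation's output
    print(outputs_match(candidate, reference, "fp32"))
```

Looser thresholds for fp16 and bf16 reflect their lower numeric precision; the exact values are a judgment call for each test suite.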

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

gemini-code-assist

Code Review

This pull request introduces the VaultGemma model, which appears to be a port from the Hugging Face Transformers library. The implementation is comprehensive, including the model architecture and corresponding tests. My review focuses on ensuring code quality and correctness. I've identified a couple of areas for improvement: one is an incorrect type hint in the attention mechanism, and the other concerns the use of a wildcard import, which could be made more explicit for better code clarity. Overall, the changes are well-structured and the inclusion of tests is a great practice.

gemini-code-assist · 2025-12-04T07:10:31Z

mindone/transformers/models/vaultgemma/modeling_vaultgemma.py

+        past_key_values: Optional[Cache] = None,
+        cache_position: Optional[Tensor] = None,
+        **kwargs: Unpack[FlashAttentionKwargs],
+    ) -> tuple[Tensor, Optional[Tensor], Optional[tuple[Tensor]]]:


The return type hint of the construct method is incorrect. It's specified as tuple[Tensor, Optional[Tensor], Optional[tuple[Tensor]]] but the function returns a tuple of two elements: attn_output and attn_weights. The third element, presumably for past_key_values, is not returned. Since the cache is updated in-place, this is fine, but the type hint should be corrected to match the actual return signature for code clarity and to aid static analysis.

Suggested change

-    ) -> tuple[Tensor, Optional[Tensor], Optional[tuple[Tensor]]]:
+    ) -> tuple[Tensor, Optional[Tensor]]:
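To make the mismatch concrete, the sketch below uses a toy function (a hypothetical stand-in, not the VaultGemma attention code) whose annotation declares a two-element tuple, and checks that the annotated length agrees with what the function actually returns. `get_type_hints` and `get_args` come from the standard `typing` module.

```python
from typing import Optional, get_args, get_type_hints


def attention_forward(
    hidden: list[float], output_attentions: bool = False
) -> tuple[list[float], Optional[list[float]]]:
    """Toy stand-in for an attention `construct` method (hypothetical)."""
    attn_output = [2.0 * h for h in hidden]
    attn_weights = [1.0 for _ in hidden] if output_attentions else None
    # Two values are returned, so the annotation declares exactly two elements.
    return attn_output, attn_weights


def annotated_tuple_length(func) -> int:
    """Number of elements declared in a function's tuple return annotation."""
    return len(get_args(get_type_hints(func)["return"]))


if __name__ == "__main__":
    result = attention_forward([1.0, 2.0])
    print(len(result), annotated_tuple_length(attention_forward))
```

A static type checker would typically flag a two-element return against a three-element annotation, which is the kind of inconsistency the review comment points out.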

gemini-code-assist · 2025-12-04T07:10:31Z

mindone/transformers/models/vaultgemma/__init__.py

+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+from .modeling_vaultgemma import *


Using a wildcard import (*) is generally discouraged by PEP 8 as it can lead to namespace pollution and makes it difficult to track where names are coming from. While I see this pattern is used elsewhere in the repository, it's a good practice to explicitly import the public API. This improves code readability and helps static analysis tools.

Suggested change

-from .modeling_vaultgemma import *
+from .modeling_vaultgemma import (
+    VaultGemmaForCausalLM,
+    VaultGemmaModel,
+    VaultGemmaPreTrainedModel,
+)
+__all__ = ["VaultGemmaForCausalLM", "VaultGemmaModel", "VaultGemmaPreTrainedModel"]
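For readers unfamiliar with how `__all__` interacts with wildcard imports, the following self-contained sketch builds a throwaway in-memory module (a hypothetical stand-in, not `modeling_vaultgemma`) and shows that `from module import *` only brings in the names listed in `__all__`. Whether the real modeling file already defines `__all__` is not shown in this excerpt.

```python
import sys
import types

# Build a throwaway module in memory; names and contents are purely illustrative.
demo = types.ModuleType("demo_modeling")
exec(
    "__all__ = ['PublicModel']\n"
    "class PublicModel: pass\n"
    "class InternalHelper: pass\n",
    demo.__dict__,
)
sys.modules["demo_modeling"] = demo  # register it so the import statement can find it

# Run a wildcard import in a fresh namespace and inspect what was brought in.
namespace: dict = {}
exec("from demo_modeling import *", namespace)
exported = sorted(name for name in namespace if not name.startswith("__"))
print(exported)  # ['PublicModel']
```

With an explicit import list, the same information is visible directly at the import site, which is the readability benefit the review comment describes.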

alien-0119 requested a review from vigo999 as a code owner December 4, 2025 07:08

gemini-code-assist bot reviewed Dec 4, 2025

alien-0119 force-pushed the vaultgemma_master branch from d66122e to f6ad07e on December 4, 2025 07:19

add vaultgemma

284e3f0

alien-0119 force-pushed the vaultgemma_master branch from f6ad07e to 284e3f0 on December 4, 2025 07:29

alien-0119 self-assigned this Dec 4, 2025

alien-0119 added the new model add new model to mindone label Dec 4, 2025
