Conversation


@omkar-334 omkar-334 commented Dec 9, 2025

Description of the change

Added presets for Devstral Small 1.1

Reference

GitHub Issue - #2333
Model HF - https://huggingface.co/mistralai/Devstral-Small-2507
The Devstral HF repo contains only tekken.json, but we need the tokenizer.json format. There are two solutions for this -

  1. Use the conversion script from huggingface/transformers - this converts the tekken.json so the model loads with AutoTokenizer
  2. Use the tokenizer from mistralai/Mistral-Small-24B-Base-2501 - referencing this issue from Unsloth. Since Devstral is just a finetune, we can use the earlier model's tokenizer (yes, that model does ship a tokenizer.json)

I've gone ahead with Option 2 and implemented it in the code.
I've updated presets in mistral_presets.py, convert_mistral.py, and convert_mistral_checkpoints.py.

Colab Notebook

I could not load it in Colab since the model is 24B and the runtime crashes, but I will try it on Modal/Lambda and attach the results here.

Doubts

  1. This is the Kaggle link for the model - https://www.kaggle.com/models/mistral-ai/devstral-small-2507, but I'm unsure of the kaggle_handle format.
  2. cc @sachinprasadhs - let me know if there are any changes to make.

@gemini-code-assist
Contributor

Summary of Changes

Hello @omkar-334, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request integrates the Devstral Small 1.1 model into the system by adding its configuration and adapting the tokenizer conversion process. The primary challenge addressed was the absence of a tokenizers.json file for the Devstral model, which was resolved by leveraging the tokenizer from a compatible Mistral base model. This ensures the new model can be properly utilized within the existing framework.

Highlights

  • New Model Integration: Presets for the Devstral Small 1.1 model have been added, which is a 24B parameter model finetuned from Mistral-Small-3.1.
  • Tokenizer Resolution: A specific solution was implemented to use the tokenizer from "mistralai/Mistral-Small-24B-Base-2501" for Devstral Small 1.1, addressing the issue where the original Devstral HF model lacked a tokenizers.json file.
  • Configuration Updates: Relevant configuration files, including mistral_presets.py, convert_mistral.py, and convert_mistral_checkpoints.py, were updated to support the new model and its unique tokenizer loading requirements.


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request adds presets for the Devstral Small 1.1 model. The changes correctly handle the tokenizer issue by using the tokenizer from a compatible base model. However, there's a minor typo in the model description within the preset file. Additionally, the code in convert_mistral.py and convert_mistral_checkpoints.py for handling the special case of the 'devstral' model can be improved by using a more robust check and avoiding hardcoded strings to enhance maintainability and readability. I've provided suggestions to address these points.
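The "more robust check, avoiding hardcoded strings" suggestion could be sketched as a lookup table, so that future tokenizer overrides only need a new dict entry. This is a hypothetical illustration, not the repo's actual code: the helper name `resolve_tokenizer_handle` is made up, while the preset name and HF handle come from this PR.

```python
# Hypothetical sketch: replace the inline string comparison with a
# lookup table mapping preset names to tokenizer sources.
TOKENIZER_OVERRIDES = {
    # Devstral Small 1.1 ships only tekken.json, so we borrow the
    # base model's tokenizer (Option 2 from the PR description).
    "devstral_small_1_1": "mistralai/Mistral-Small-24B-Base-2501",
}

def resolve_tokenizer_handle(preset, hf_preset):
    """Return the HF handle the tokenizer should be loaded from."""
    return TOKENIZER_OVERRIDES.get(preset, hf_preset)
```

The call site then collapses to a single line, e.g. `AutoTokenizer.from_pretrained(resolve_tokenizer_handle(preset, hf_preset))`, with no branch per special-cased model.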

omkar-334 and others added 4 commits December 9, 2025 17:54
@sachinprasadhs sachinprasadhs self-requested a review December 9, 2025 19:04

@sachinprasadhs sachinprasadhs left a comment


Thanks for the PR. Please attach screenshots showing numerics matching, parameter count, tokenizer matching, and output matching.
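For reference, the comparison step in such a verification notebook usually boils down to checks like the following. This is a minimal sketch: the converted Keras model, the HF model, and both tokenizers are assumed to be loaded elsewhere, and the helper names are made up for illustration.

```python
import numpy as np

def tokenizers_match(keras_ids, hf_ids):
    """Token ID sequences from both tokenizers should be identical."""
    return list(keras_ids) == list(hf_ids)

def numerics_match(keras_logits, hf_logits, atol=1e-3):
    """Output logits should agree within a small absolute tolerance."""
    return np.allclose(np.asarray(keras_logits), np.asarray(hf_logits), atol=atol)
```

Parameter counts are usually compared directly (e.g. `keras_model.count_params()` against the HF model's parameter total).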

Comment on lines +224 to +230

if preset == "devstral_small_1_1":
    hf_tokenizer = AutoTokenizer.from_pretrained(
        "mistralai/Mistral-Small-24B-Base-2501"
    )
else:
    hf_tokenizer = AutoTokenizer.from_pretrained(hf_preset)
Collaborator


Can't we use tekken.json, since they have mentioned "Tokenizer: Utilizes a Tekken tokenizer with a 131k vocabulary size"?

Author


We would need to add a dependency on https://github.com/mistralai/mistral-common, since the transformers AutoTokenizer does not support tekken.json.

Collaborator


Got it. They have mentioned that going forward they will only use tekken.json. What is the difference between the base model's tokenizer.json and Devstral's tekken.json?

As I observed, they also included tokenizer.json in today's release of the Devstral 2 model.

Collaborator


Let's first observe the other Mistral models; if they also ship only tekken.json like this model does, then we can think about adding the dependency.

Author

@omkar-334 omkar-334 Dec 10, 2025


> Got it. They have mentioned that going forward they will only use tekken.json. What is the difference between the base model's tokenizer.json and Devstral's tekken.json?
>
> As I observed, they also included tokenizer.json in today's release of the Devstral 2 model.

I think they are including tokenizer.json so that people can continue using it until frameworks support tekken.json.

This is the current state of their tokenizer formats for newer models:

  1. mistralai/Devstral-Small-2507 - tekken.json (Add Devstral Small 1.1 #2333)
  2. mistralai/Devstral-Small-2-24B-Instruct-2512 - tekken.json, tokenizer.json
  3. mistralai/Mistral-Small-24B-Base-2501 - tekken.json, tokenizer.json
  4. mistralai/Mistral-Small-3.1-24B-Base-2503 - tekken.json, tokenizer.json (Add Mistral-Small-3.1 #2334)
  5. mistralai/Ministral-3-8B-Base-2512 - tekken.json, tokenizer.json
  6. mistralai/Magistral-Small-2509 - tekken.json (Add Magistral to Keras-Hub #2314)
  7. mistralai/Voxtral-Mini-3B-2507 - tekken.json (Add Voxtral #2349)

Older Models -

  1. All of the Mistral and Mixtral models implemented in keras-hub include tokenizer.model and tokenizer.json.
  2. Hence, the keras-hub implementation loads the tokenizer from the tokenizer.model file.

My earlier changes do not work, since we don't use the tokenizer.json format. Going forward, we need to use tekken.json.

transformers has started supporting the Tekken tokenizer, using mistral-common as the backend for its Mistral models (https://github.com/huggingface/transformers/blob/471d7ce9abbb3bc1b3bab673367378f9dbc3caac/src/transformers/tokenization_mistral_common.py).
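Given the file availability listed above (newer releases ship tekken.json, sometimes alongside tokenizer.json), a converter could pick the tokenizer artifact with a simple preference order. This is a sketch under the assumption that recent transformers versions can load tekken.json through their mistral-common-backed tokenizer; `pick_tokenizer_file` is a hypothetical helper name, not code from this PR.

```python
def pick_tokenizer_file(available_files):
    """Prefer tokenizer.json (loadable by a plain AutoTokenizer) and
    fall back to tekken.json for Tekken-only releases."""
    if "tokenizer.json" in available_files:
        return "tokenizer.json"
    if "tekken.json" in available_files:
        return "tekken.json"
    raise ValueError("No supported tokenizer file found")
```

With such a check, transitional models like Mistral-Small-24B-Base-2501 keep using tokenizer.json, while Tekken-only models like Devstral-Small-2507 would route to the tekken.json path.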

Author


> We would need to add a dependency on https://github.com/mistralai/mistral-common, since the transformers AutoTokenizer does not support tekken.json.

Apparently, the latest version does support it. My bad.
