[tx] Add support for GLM-4.7 Flash#1023

Merged
pcmoritz merged 2 commits into NovaSky-AI:main from pcmoritz:tx-glm-4.7-lite on Feb 4, 2026

Conversation

pcmoritz (Collaborator) commented on Feb 4, 2026

The architecture is the same as DeepseekV3ForCausalLM, which we already support via #889.

pcmoritz added the tx label on Feb 4, 2026
gemini-code-assist (bot, Contributor) left a comment

Code Review

This pull request adds support for GLM-4.7 Flash by creating an alias to the existing DeepseekV3ForCausalLM implementation, as they share the same architecture. The change is simple and effective. However, for long-term maintainability, I've suggested a structural improvement to better separate the concerns of different model families.

hidden_states=outputs.hidden_states,
)


medium

For better maintainability and code organization, it's preferable to avoid aliasing models from different families within the same file. While GLM-4 and DeepseekV3 currently share the same architecture, placing the Glm4MoeLiteForCausalLM alias here couples them together. If future versions of GLM-4 diverge, this file will become harder to maintain as it will contain logic for two distinct models. A better approach would be to create a separate glm4.py file that imports and aliases DeepseekV3ForCausalLM. This would keep the model implementations modular.
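
To make the suggestion concrete, here is a minimal sketch of what a separate glm4.py could contain. The file layout is an assumption rather than the repository's actual structure, and DeepseekV3ForCausalLM is replaced by a stand-in class so the snippet runs on its own; in the real module it would be imported from wherever the existing implementation lives. The subclass name at the end is hypothetical and only illustrates the alternative.

```python
# Hypothetical contents of a separate glm4.py (sketch only, not the merged change).


class DeepseekV3ForCausalLM:
    """Stand-in for the existing implementation so this example is self-contained."""

    def __init__(self, config):
        self.config = config


# Plain alias: both names refer to the very same class object.
Glm4MoeLiteForCausalLM = DeepseekV3ForCausalLM


# Alternative (hypothetical name): an empty subclass has its own __name__
# and gives GLM-specific overrides a place to live if the models diverge.
class Glm4MoeLiteSubclassSketch(DeepseekV3ForCausalLM):
    pass


model = Glm4MoeLiteForCausalLM(config={"num_hidden_layers": 2})
print(type(model).__name__)
print(Glm4MoeLiteSubclassSketch.__name__)
```

With a plain alias the class still reports the name DeepseekV3ForCausalLM, which may matter if anything looks models up by class name; the subclass variant avoids that at the cost of one extra class definition.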

pcmoritz merged commit 0b16386 into NovaSky-AI:main on Feb 4, 2026
4 of 6 checks passed