fix lite module for transformers>=5.0 #4488

Open
43758726 wants to merge 4 commits into InternLM:main from 43758726:transformers_compatible

Conversation

Collaborator

@43758726 43758726 commented Apr 2, 2026

Thanks for your contribution; we appreciate it a lot. The following instructions will make your pull request healthier and help it receive feedback more easily. If you do not understand some items, don't worry: just open the pull request and ask the maintainers for help.

Motivation

The lmdeploy.lite module fails to quantize/calibrate some models when running with transformers >= 5.0.

Modification

lmdeploy/lite/quantization/calibration.py: Added fallback logic in _guess_num_heads() to unwrap nested config objects by checking for text_config and llm_config attributes before accessing head count parameters.
lmdeploy/lite/quantization/awq.py: Cast scales.max() and scales.min() to float32 before multiplication to prevent float16/bfloat16 overflow that produces inf.
lmdeploy/lite/apis/auto_awq.py: Changed the import of LAYER_TYPE_MAP and calibrate from a relative import to an absolute import to avoid potential circular import issues.

Checklist

  1. Pre-commit or other linting tools are used to fix potential lint issues.
  2. The modification is covered by complete unit tests. If not, please add more unit tests to ensure correctness.
  3. If the modification depends on a newer version of a downstream project, this PR should be tested with all supported versions of that project.
  4. The documentation has been updated accordingly, e.g. docstrings or example tutorials.

Copilot AI review requested due to automatic review settings April 2, 2026 13:02
Contributor

Copilot AI left a comment


Pull request overview

Fixes lmdeploy.lite quantization/calibration regressions when used with transformers>=5.0, focusing on newer nested config wrappers and numerical stability in AWQ smoothing.

Changes:

  • Add fallback logic in calibration to unwrap nested HF config objects before reading head-count fields.
  • Prevent potential overflow in AWQ scale normalization by computing extrema in float32.
  • Switch auto_awq to absolute imports for calibrate/LAYER_TYPE_MAP.
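The overflow in the second bullet can be demonstrated in isolation. The mechanism is shown here with NumPy half-precision scalars rather than the actual awq.py tensors; the variable names and values are illustrative.

```python
import numpy as np

# float16 tops out at ~65504, so a product of two moderate scale
# extrema overflows to inf in half precision.
scales = np.array([300.0, 400.0], dtype=np.float16)

with np.errstate(over='ignore'):
    overflow = scales.max() * scales.min()  # float16: 120000 -> inf

# The fix: cast the extrema to float32 before multiplying, as the PR
# does for scales.max() and scales.min() in awq.py.
safe = scales.max().astype(np.float32) * scales.min().astype(np.float32)
```

With the cast, the product is computed as 120000.0 in float32 and stays finite; any downstream result can still be cast back to the model's dtype afterwards.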

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File descriptions:

  • lmdeploy/lite/quantization/calibration.py: Unwraps nested config objects in _guess_num_heads; also includes newly commented-out debug prints in the wrapped forward.
  • lmdeploy/lite/quantization/awq.py: Adjusts the AWQ smooth_fc_fcs normalization to avoid float16/bfloat16 overflow.
  • lmdeploy/lite/apis/auto_awq.py: Changes the relative import of calibrate/LAYER_TYPE_MAP to an absolute import.


Comment on lines +83 to +86

    if hasattr(model.config, 'text_config'):
        model.config = model.config.text_config
    if hasattr(model.config, 'llm_config'):
        model.config = model.config.llm_config

Copilot AI Apr 2, 2026


_guess_num_heads() mutates model.config by reassigning it to text_config / llm_config. This has side effects for the rest of calibration (e.g., later code uses model.config.hidden_size, use_cache, and config updates/saving) and can break models whose wrapper config contains fields not present on the nested config. Use a local variable (e.g., cfg = model.config and unwrap cfg), and leave model.config unchanged.
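The non-mutating variant the reviewer suggests could look like the following sketch. The helper name is hypothetical; only the text_config/llm_config attribute checks come from the PR.

```python
from types import SimpleNamespace


def _resolve_text_config(model):
    """Unwrap nested configs into a local variable, leaving model.config
    untouched so later calibration code (hidden_size, use_cache, config
    updates/saving) still sees the original wrapper config."""
    cfg = model.config
    if hasattr(cfg, 'text_config'):
        cfg = cfg.text_config
    if hasattr(cfg, 'llm_config'):
        cfg = cfg.llm_config
    return cfg


# Demo: the wrapper config stays intact after unwrapping.
inner = SimpleNamespace(num_attention_heads=8)
model = SimpleNamespace(config=SimpleNamespace(llm_config=inner))
cfg = _resolve_text_config(model)
```

Reading head counts from the returned cfg gives the same values as the reassignment approach, without the side effects on the rest of calibration.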

@43758726 43758726 requested a review from lvhan028 April 2, 2026 13:07