LoRA Model Separate Export During HF Conversion #5204

@cxz1418

Description

I am exporting a LoRA model to HF format with MS-SWIFT.
Is there an option to export the base model and the LoRA adapter separately?
The merged model is too large, so I plan to build the service by swapping only the LoRA adapter.
However, even with merge set to false during export, no separate LoRA file appears to be generated.
Could you confirm whether separate export is currently supported? (_ _)

Train Parameters

Memory usage: 8 * 78GiB

Training speed: 9.5s/it

PYTORCH_CUDA_ALLOC_CONF='expandable_segments:True' \
NPROC_PER_NODE=8 \
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 \
megatron sft \
    --load Qwen3-235B-A22B-Instruct-2507-mcore \
    --dataset 'swift/Chinese-Qwen3-235B-2507-Distill-data-110k-SFT#2000' \
              'swift/self-cognition#1000' \
    --optimizer_cpu_offload true \
    --use_precision_aware_optimizer true \
    --train_type lora \
    --lora_rank 8 \
    --lora_alpha 32 \
    --target_modules all-linear \
    --split_dataset_ratio 0.01 \
    --expert_model_parallel_size 2 \
    --pipeline_model_parallel_size 4 \
    --decoder_first_pipeline_num_layers 23 \
    --decoder_last_pipeline_num_layers 23 \
    --moe_grouped_gemm true \
    --moe_shared_expert_overlap true \
    --moe_aux_loss_coeff 1e-3 \
    --micro_batch_size 8 \
    --global_batch_size 16 \
    --recompute_granularity full \
    --recompute_method uniform \
    --recompute_num_layers 1 \
    --max_epochs 1 \
    --finetune true \
    --cross_entropy_loss_fusion true \
    --lr 1e-4 \
    --lr_warmup_fraction 0.05 \
    --min_lr 1e-5 \
    --save megatron_output/Qwen3-235B-A22B-Instruct-2507 \
    --eval_interval 200 \
    --save_interval 200 \
    --max_length 2048 \
    --num_workers 8 \
    --dataset_num_proc 8 \
    --no_save_optim true \
    --no_save_rng true \
    --sequence_parallel true \
    --attention_backend flash \
    --model_author swift \
    --model_name swift-robot
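
For readers checking the parallelism and LoRA settings above, here is a minimal sketch of the quantities they imply. It is plain arithmetic, not MS-SWIFT or Megatron code. The helper names are hypothetical, and it assumes tensor and context parallel sizes of 1 (neither is set in the command) and the standard LoRA scaling of alpha divided by rank.

```python
def data_parallel_size(world_size, tensor_parallel=1, pipeline_parallel=1,
                       context_parallel=1):
    """Data-parallel replicas left after model parallelism is carved out."""
    model_parallel = tensor_parallel * pipeline_parallel * context_parallel
    if world_size % model_parallel != 0:
        raise ValueError("world_size must be divisible by TP * PP * CP")
    return world_size // model_parallel


def num_microbatches(global_batch_size, micro_batch_size, dp_size):
    """Micro-batches each replica processes per optimizer step."""
    samples_per_pass = micro_batch_size * dp_size
    if global_batch_size % samples_per_pass != 0:
        raise ValueError("global_batch_size must be divisible by micro_batch_size * DP")
    return global_batch_size // samples_per_pass


def lora_scale(alpha, rank):
    """Standard LoRA scaling factor applied to the low-rank update."""
    return alpha / rank


# Values taken from the command above; tensor_parallel=1 is an assumption.
dp = data_parallel_size(world_size=8, tensor_parallel=1, pipeline_parallel=4)
steps = num_microbatches(global_batch_size=16, micro_batch_size=8, dp_size=dp)
scale = lora_scale(alpha=32, rank=8)

print(f"data parallel size: {dp}")        # 2
print(f"micro-batches per step: {steps}")  # 1
print(f"LoRA scale: {scale}")              # 4.0
```

Under these assumptions the run uses two data-parallel replicas, one micro-batch per optimizer step, and a LoRA scale of 4.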
