Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug] Lora微调后模型加载失败 #966

Open
3 tasks
Littlew69 opened this issue Mar 26, 2025 · 0 comments
Open
3 tasks

[Bug] Lora微调后模型加载失败 #966

Littlew69 opened this issue Mar 26, 2025 · 0 comments

Comments

@Littlew69
Copy link

Checklist

  • 1. I have searched related issues but cannot get the expected help.
  • 2. The bug has not been fixed in the latest version.
  • 3. Please note that if the bug-related issue you submitted lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback.

Describe the bug

运行sh shell/internvl2.5/2nd_finetune/internvl2_5_4b_dynamic_res_2nd_finetune_lora.sh对模型微调,不能直接按照
path = './InternVL2_5-4B-lora' model = AutoModel.from_pretrained( path, torch_dtype=torch.bfloat16, low_cpu_mem_usage=True, use_flash_attn=True, trust_remote_code=True).eval().cuda() tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True, use_fast=False)
加载模型,报错NotImplementedError: Cannot copy out of meta tensor; no data!。我按照Enhancing InternVL2 on COCO Caption Using LoRA Fine-Tuning中的内容将Lora参数合并,也有同样的报错。

我在模型合并中,加载模型的InternVLChatModel.from_pretrained参数中添加device_map="auto"可以merge,但是推理速度远低于未微调的模型

Reproduction

set -x

GPUS=${GPUS:-4}
BATCH_SIZE=${BATCH_SIZE:-4}
PER_DEVICE_BATCH_SIZE=${PER_DEVICE_BATCH_SIZE:-1}
GRADIENT_ACC=$((BATCH_SIZE / PER_DEVICE_BATCH_SIZE / GPUS))

export PYTHONPATH="${PYTHONPATH}:$(pwd)"
export MASTER_PORT=34229
export TF_CPP_MIN_LOG_LEVEL=3
export LAUNCHER=pytorch

OUTPUT_DIR='./internvl_checkpoints/internvl_chat_v2_5/internvl2_5_4b_dynamic_res_2nd_finetune_lora'

if [ ! -d "$OUTPUT_DIR" ]; then
mkdir -p "$OUTPUT_DIR"
fi

number of gpus: 2

batch size per gpu: 4

gradient accumulation steps: 2

total batch size: 16

epoch: 1

torchrun
--nnodes=1
--node_rank=0
--master_addr=127.0.0.1
--nproc_per_node=${GPUS}
--master_port=${MASTER_PORT}
internvl/train/internvl_chat_finetune.py
--model_name_or_path "OpenGVLab/InternVL2_5-4B"
--conv_style "internvl2_5"
--use_fast_tokenizer False
--output_dir ${OUTPUT_DIR}
--meta_path "./InternVL_process/finetune_lora_json/ng-4-1-ok-4-1-bbox.json"
--overwrite_output_dir True
--force_image_size 448
--max_dynamic_patch 6
--down_sample_ratio 0.5
--drop_path_rate 0.0
--freeze_llm True
--freeze_mlp True
--freeze_backbone True
--use_llm_lora 16
--vision_select_layer -1
--dataloader_num_workers 4
--bf16 True
--num_train_epochs 1
--per_device_train_batch_size ${PER_DEVICE_BATCH_SIZE}
--gradient_accumulation_steps ${GRADIENT_ACC}
--evaluation_strategy "no"
--save_strategy "steps"
--save_steps 200
--save_total_limit 1
--learning_rate 4e-5
--weight_decay 0.01
--warmup_ratio 0.03
--lr_scheduler_type "cosine"
--logging_steps 1
--max_seq_length 8192
--do_train True
--grad_checkpoint True
--group_by_length True
--dynamic_image_size True
--use_thumbnail True
--ps_version 'v2'
--deepspeed "zero_stage1_config.json"
--report_to "tensorboard"
2>&1 | tee -a "${OUTPUT_DIR}/training_log.txt"

Environment

使用文档中的Python==3.9的环境

Error traceback

You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Traceback (most recent call last):
  File "~/InternVL_process/inference_with_transformers.py", line 160, in <module>
    model = AutoModel.from_pretrained(
  File "~/miniconda3/envs/internvl/lib/python3.9/site-packages/transformers/modeling_utils.py", line 2567, in cuda
    return super().cuda(*args, **kwargs)
  File "~/miniconda3/envs/internvl/lib/python3.9/site-packages/torch/nn/modules/module.py", line 918, in cuda
    return self._apply(lambda t: t.cuda(device))
  File "~/miniconda3/envs/internvl/lib/python3.9/site-packages/torch/nn/modules/module.py", line 810, in _apply
    module._apply(fn)
  File "~/miniconda3/envs/internvl/lib/python3.9/site-packages/torch/nn/modules/module.py", line 810, in _apply
    module._apply(fn)
  File "~/miniconda3/envs/internvl/lib/python3.9/site-packages/torch/nn/modules/module.py", line 810, in _apply
    module._apply(fn)
  [Previous line repeated 3 more times]
  File "~/miniconda3/envs/internvl/lib/python3.9/site-packages/torch/nn/modules/module.py", line 833, in _apply
    param_applied = fn(param)
  File "~/miniconda3/envs/internvl/lib/python3.9/site-packages/torch/nn/modules/module.py", line 918, in <lambda>
    return self._apply(lambda t: t.cuda(device))
NotImplementedError: Cannot copy out of meta tensor; no data!
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant