pack_image_features RuntimeError when vision_feature_select_strategy="full" #39839

@llnnnnnn

System Info

transformers 4.54.0

Who can help?

@zucchini-nlp

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

from transformers.models.llava_next import LlavaNextForConditionalGeneration, LlavaNextProcessor
from PIL import Image
import requests
import torch

model = LlavaNextForConditionalGeneration.from_pretrained(
    "llava-hf/llava-v1.6-vicuna-7b-hf",
    vision_feature_select_strategy="full",
    torch_dtype=torch.float16,
    device_map="auto",
)
processor = LlavaNextProcessor.from_pretrained("llava-hf/llava-v1.6-vicuna-7b-hf")

image = Image.open("/data/coco/train2017/000000000009.jpg")
prompt = "USER: <image>\nWhat is shown in this image? ASSISTANT:"
inputs = processor(images=image, text=prompt, truncation=True, return_tensors="pt", vision_feature_select_strategy="full").to("cuda")

input_embeds = model(inputs.input_ids, pixel_values=inputs.pixel_values, image_sizes=inputs.image_sizes, vision_feature_select_strategy="full")

Expected behavior

The forward pass fails on this line:
input_embeds = model(inputs.input_ids, pixel_values=inputs.pixel_values, image_sizes=inputs.image_sizes, vision_feature_select_strategy="full")
with the following error:

 in pack_image_features
    image_feature = image_feature.view(num_patch_height, num_patch_width, height, width, -1)
RuntimeError: shape '[2, 2, 24, 24, -1]' is invalid for input of size 9453568

At that point, image_feature has shape [4, 577, 4096]. How can I fix this?
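
For anyone triaging, here is a quick arithmetic check of the numbers in the traceback. It is only a sketch in plain Python, using the values reported above, and does not reproduce the library code. It assumes the extra token per patch (577 instead of 24 * 24 = 576) is the CLS token that the "full" strategy keeps; that is my reading of the shapes, not something confirmed in the source.

```python
import math

# Values copied from the error message and the reported tensor shape.
num_patch_height, num_patch_width = 2, 2
height = width = 24
feature_shape = (4, 577, 4096)  # (patches, tokens per patch, hidden size)

total = math.prod(feature_shape)
grid = num_patch_height * num_patch_width * height * width

print(total)         # matches the size in the RuntimeError
print(total % grid)  # non-zero, so view(2, 2, 24, 24, -1) cannot succeed

# Assumption: one token per patch is the CLS token; drop it and re-check.
tokens_without_cls = feature_shape[1] - 1
trimmed_total = feature_shape[0] * tokens_without_cls * feature_shape[2]

print(trimmed_total % grid)   # 0, the view would now be valid
print(trimmed_total // grid)  # size of the inferred last dimension
```

If that assumption holds, the reshape only works when each patch contributes exactly 24 * 24 tokens, which would explain why the same script runs with the "default" strategy but not with "full".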
