Description
Feature request
Hi all, first of all, if this feature already exists I apologise!

With the rise of multimodal LLMs, it would be great if we could add extra outputs to `GenerationMixin.generate` results. For instance, if we implement a model like Janus from DeepSeek, there are two output heads: one `lm_head` and one `image_head`. The outputs of the `forward` method have extra attributes that can't be passed through to the `generate` results.

I know these multimodal models are not common within this repo, so this is pretty bleeding edge, but I'm working on research in this domain and it would be great if we could forward all model outputs to the `generate` result. Maybe through an attribute like `kwarg_outputs` in classes like `GenerateDecoderOnlyOutput`?
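To make the proposal concrete, here is a minimal sketch of what such an output class could look like. The stand-in dataclasses below are simplified stand-ins for the real `transformers` types, and `kwarg_outputs` is the hypothetical attribute this issue proposes, not an existing API:

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional, Tuple

# Stand-in for torch.Tensor, just for illustration.
Tensor = Any

@dataclass
class GenerateDecoderOnlyOutput:
    """Simplified stand-in for the existing transformers output class."""
    sequences: Tuple[int, ...] = ()
    scores: Optional[Tuple[Tensor, ...]] = None

@dataclass
class GenerateDecoderOnlyOutputWithExtras(GenerateDecoderOnlyOutput):
    # Hypothetical attribute proposed in this issue: any extra head
    # outputs (e.g. image_head logits), collected per generation step.
    kwarg_outputs: Dict[str, Tuple[Tensor, ...]] = field(default_factory=dict)

# A Janus-style model could then stash its image head outputs here:
out = GenerateDecoderOnlyOutputWithExtras(
    sequences=(1, 2, 3),
    kwarg_outputs={"image_head": ("step0_logits", "step1_logits")},
)
```

Existing users would be unaffected since `kwarg_outputs` defaults to an empty dict.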
Motivation
As far as I understand, it's possible to feed extra outputs through the autoregressive loop via `prepare_inputs_for_generation` and `_update_model_kwargs_for_generation`, where we can forward model outputs to the next `forward` call. But when it comes to forwarding these outputs to the result of `generate`, it doesn't seem possible. I know the generation mixin is geared towards text generation, but it would be great to be able to forward extra model outputs.
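To illustrate the gap, here is a toy generation loop (no `transformers` dependency; `toy_forward`, `toy_generate`, and the `image_head` key are all made up for this sketch). The accumulation step plays the role that `_update_model_kwargs_for_generation` could play if, in addition to feeding extras into the next step, it also collected them for the final result:

```python
from typing import Any, Dict, List

def toy_forward(token: int) -> Dict[str, Any]:
    # Stand-in for model.forward: returns the next token plus an extra
    # output from a hypothetical second head.
    return {"next_token": token + 1, "image_head": f"img_feat_{token}"}

def toy_generate(start: int, steps: int) -> Dict[str, Any]:
    sequence: List[int] = [start]
    extra_outputs: Dict[str, List[Any]] = {}  # what kwarg_outputs could hold
    token = start
    for _ in range(steps):
        outputs = toy_forward(token)
        token = outputs.pop("next_token")
        sequence.append(token)
        # Analogue of _update_model_kwargs_for_generation: besides passing
        # extras to the next step, also accumulate them for the result.
        for name, value in outputs.items():
            extra_outputs.setdefault(name, []).append(value)
    return {"sequences": sequence, "kwarg_outputs": extra_outputs}

result = toy_generate(start=0, steps=3)
# result["sequences"]                 -> [0, 1, 2, 3]
# result["kwarg_outputs"]["image_head"] -> one entry per generation step
```

Today the real loop effectively drops everything outside the known output fields when it builds the `Generate*Output` object, which is the part this request would change.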
Your contribution
Happy to have a try, but I'm not sure how big of a PR it would be, especially if it touches the PyTorch / TF / Flax implementations.