different original_max_position_embeddings #5

@xionghao132

Description

Thank you for your amazing work on the Qwen3 models.

I have a question about the configuration of the Qwen3-32B model. In its config.json, original_max_position_embeddings is set to 40960. However, MiroThinker was trained with an original_max_position_embeddings of 32k tokens using YaRN.

Will this setting affect the model's performance?
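For context on why the value matters: in a standard YaRN setup, the RoPE interpolation factor is derived from how far the target context exceeds original_max_position_embeddings, so a 40960-vs-32768 mismatch changes the scaling applied at inference. A minimal sketch (the 131072 target context here is hypothetical, purely to illustrate the arithmetic):

```python
def yarn_factor(target_context: int, original_max_position_embeddings: int) -> float:
    """Scaling factor YaRN applies to stretch RoPE out to target_context."""
    return target_context / original_max_position_embeddings

# Hypothetical 128k target context with the two values in question:
factor_config = yarn_factor(131072, 40960)   # 40960 from config.json
factor_train = yarn_factor(131072, 32768)    # 32k used during training
print(factor_config, factor_train)  # → 3.2 4.0
```

If inference uses the factor implied by 40960 while the model was trained against 32k, the positional interpolation at long contexts would differ from what the model saw during training.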
