Integration with Hugging Face transformers library  #30

@SunMarc

Hi neuralmagic team!

Very nice work with AutoFP8! We are thinking of integrating AutoFP8 into transformers so that users can run your checkpoints directly with transformers. We would simply replace the linear layers with their quantized versions, so we would only support inference. Let us know if you agree with this! The goal would be to expose the quantized linear layer class from this repo (I see that you have several quantized linear classes) and import it in transformers.
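To make the idea concrete, here is a minimal sketch of the layer swap, assuming PyTorch. `PlaceholderQuantLinear` and `replace_linear_layers` are hypothetical names used only for illustration; they are not classes or functions from AutoFP8 or transformers, and the placeholder does no real FP8 quantization. The actual integration would import the quantized linear class exposed by this repo instead.

```python
import torch
from torch import nn


class PlaceholderQuantLinear(nn.Module):
    """Hypothetical stand-in for a quantized linear layer (not AutoFP8's class)."""

    def __init__(self, linear: nn.Linear):
        super().__init__()
        # A real FP8 layer would hold FP8 weights plus scales. This placeholder
        # keeps the original tensors so the sketch stays runnable anywhere.
        self.weight = nn.Parameter(linear.weight.detach().clone(), requires_grad=False)
        if linear.bias is None:
            self.register_parameter("bias", None)
        else:
            self.bias = nn.Parameter(linear.bias.detach().clone(), requires_grad=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return nn.functional.linear(x, self.weight, self.bias)


def replace_linear_layers(module: nn.Module, skip: tuple = ()) -> nn.Module:
    """Recursively swap every nn.Linear child for the placeholder, in place.

    `skip` is an assumed convenience: child names listed there are left untouched.
    """
    for name, child in list(module.named_children()):
        if isinstance(child, nn.Linear) and name not in skip:
            setattr(module, name, PlaceholderQuantLinear(child))
        else:
            replace_linear_layers(child, skip)
    return module


# Usage: swap the linear layers of a toy model (inference only).
toy = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
replace_linear_layers(toy)
with torch.no_grad():
    out = toy(torch.randn(3, 4))
```

Since only the module type changes, the rest of the model code stays untouched, which is why inference-only support would be the natural first step.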

I will be leading the integration, so any help is appreciated! Also, are there any big blockers that I might have missed?

Thanks in advance!
