Integration with Hugging Face transformers library #30

Hi neuralmagic team!

Very nice work with AutoFP8! We are thinking of integrating AutoFP8 into transformers so that users can run your checkpoints directly with transformers. We would simply replace the linear layers with their quantized versions, so we would only support inference. Let us know if you agree with this! The goal would be to expose the quantized linear layer class in this repo (I see that you have several quantized linear classes) and import it in transformers.

I will be leading the integration, so any help is appreciated! Also, are there any big blockers that I might have missed?

Thanks in advance!
