Hello, I want to train a ViT model with RoPE instead of absolute positional embedding. I noticed the following item in the list of supported models in the README, but I couldn't find out how to access such a model: "ROPE-ViT - https://arxiv.org/abs/2403.13298" There doesn't seem to be an option for it in the Thank you very much.
Update: I just noticed that there is a
@sinahmr yes, currently all of the ViT models with ROPE embeddings in timm are based on the EVA model (eva.py). It was the first to include ROPE embeddings and it's been extended to support several variants. It's essentially a ViT with ROPE support (with an absolute position embedding option) and a SwiGLU option. The comments in the file have references for the model sources, papers, etc. There are also several timm definitions that I trained with registers, like:
vit_base_patch16_rope_reg1_gap_256.sbb_in1k
vit_betwixt_patch16_rope_reg4_gap_256.sbb_in1k
vit_medium_patch16_rope_reg1_gap_256.sbb_in1k
vit_mediumd_patch16_rope_reg1_gap_256.sbb_in1k
All of the models are a bit different, even at the same 'size' range. …
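For anyone landing here with the same question, here is a rough sketch of how the names above could be looked up and instantiated. It assumes timm (and PyTorch) are installed; the helper names (`arch_and_tag`, `find_rope_models`, `build_rope_vit`) and the `"*rope*"` filter pattern are made up for illustration and are not part of timm. Only `timm.list_models` and `timm.create_model` are actual timm calls.

```python
# Model names quoted in the reply above, in timm's "<architecture>.<pretrained_tag>" form.
ROPE_VIT_NAMES = [
    "vit_base_patch16_rope_reg1_gap_256.sbb_in1k",
    "vit_betwixt_patch16_rope_reg4_gap_256.sbb_in1k",
    "vit_medium_patch16_rope_reg1_gap_256.sbb_in1k",
    "vit_mediumd_patch16_rope_reg1_gap_256.sbb_in1k",
]


def arch_and_tag(name):
    """Hypothetical helper: split a timm model name into (architecture, pretrained tag)."""
    arch, _, tag = name.partition(".")
    return arch, tag


def find_rope_models(pattern="*rope*"):
    """Hypothetical helper: list registered architectures matching a wildcard pattern.

    The pattern is an assumption; it only matches names that contain "rope".
    """
    import timm  # imported lazily so the rest of this sketch works without timm

    return timm.list_models(pattern)


def build_rope_vit(name=ROPE_VIT_NAMES[0], pretrained=False):
    """Hypothetical helper: create one of the ROPE ViTs by name.

    pretrained=False gives random weights, which is what you want when
    training from scratch; pretrained=True would download weights.
    """
    import timm

    return timm.create_model(name, pretrained=pretrained)
```

Calling `find_rope_models()` should show which ROPE architectures your installed timm version knows about, and `build_rope_vit()` returns a regular `torch.nn.Module` that can be dropped into an existing training loop. Passing only the architecture part of the name (without the `.sbb_in1k` tag) also works when you don't need the pretrained weights.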