Shenyuan Gao, Siyuan Zhou, Yilun Du, Jun Zhang, Chuang Gan
TL;DR: AdaWorld is a highly adaptable world model pretrained with continuous latent actions from thousands of environments, enabling zero-shot action transfer, fast adaptation, and new skill acquisition with minimal finetuning.
We introduce latent actions as a unified condition for action-aware pretraining from videos. AdaWorld can readily transfer actions across contexts without additional training. By initializing the control interface with the corresponding latent actions, AdaWorld can also be efficiently adapted into specialized world models, achieving significantly better planning results.
- Action transfer (source video → target scene)
- Visual planning (action-agnostic vs. AdaWorld)
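For concreteness, below is a minimal, self-contained sketch of the latent-action conditioning behind the action transfer demo above. All class, function, and tensor names here are illustrative placeholders, not the actual AdaWorld API: a latent action encoder infers what happened between two consecutive source frames, and the world model replays that latent action in an unrelated target scene.

```python
import torch
import torch.nn as nn


class LatentActionEncoder(nn.Module):
    """Toy stand-in: infers a continuous latent action from two consecutive frames."""
    def __init__(self, action_dim: int = 32):
        super().__init__()
        self.net = nn.Sequential(nn.Flatten(), nn.LazyLinear(action_dim))

    def forward(self, frame_t: torch.Tensor, frame_next: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([frame_t, frame_next], dim=1))


class LatentActionWorldModel(nn.Module):
    """Toy stand-in: predicts the next frame from the current frame and a latent action."""
    def __init__(self, action_dim: int = 32, resolution: int = 64):
        super().__init__()
        self.resolution = resolution
        self.action_to_map = nn.Linear(action_dim, resolution * resolution)
        self.predictor = nn.LazyConv2d(3, kernel_size=3, padding=1)

    def forward(self, frame: torch.Tensor, latent_action: torch.Tensor) -> torch.Tensor:
        # Broadcast the latent action into a spatial map and fuse it with the frame.
        act_map = self.action_to_map(latent_action).view(-1, 1, self.resolution, self.resolution)
        return self.predictor(torch.cat([frame, act_map], dim=1))


encoder = LatentActionEncoder()
world_model = LatentActionWorldModel()

# Source clip (where the action is demonstrated) and an unrelated target scene.
source_t = torch.rand(1, 3, 64, 64)
source_next = torch.rand(1, 3, 64, 64)
target_frame = torch.rand(1, 3, 64, 64)

# Zero-shot action transfer: read off the latent action from the source pair,
# then condition the world model on it to roll the target scene forward.
latent_action = encoder(source_t, source_next)
predicted_next_target = world_model(target_frame, latent_action)
print(predicted_next_target.shape)  # torch.Size([1, 3, 64, 64])
```

In AdaWorld itself, this conditioning is learned during action-aware pretraining on videos from thousands of environments; adapting to a new, explicit action space then amounts to initializing the control interface with the latent actions that correspond to each control command and finetuning lightly.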
Our implementation builds on Vista and Jafar. Thanks for their great open-source work!
If any part of our paper or code helps your research, please consider citing us and giving our repository a star.
```bibtex
@inproceedings{gao2025adaworld,
  title={AdaWorld: Learning Adaptable World Models with Latent Actions},
  author={Gao, Shenyuan and Zhou, Siyuan and Du, Yilun and Zhang, Jun and Gan, Chuang},
  booktitle={International Conference on Machine Learning (ICML)},
  year={2025}
}
```
If you have any questions or comments, feel free to contact me via email ([email protected]). Suggestions and collaborations are also highly welcome!