DreamX-World is a general-purpose world model for interactive world simulation. It generates diverse, high-fidelity worlds that users can explore, control, and transform with event prompts.
The model is trained with a scalable data engine on Unreal Engine data, gameplay footage, and real-world videos, combined with camera estimation and strict data filtering to learn realistic dynamics and interactions. It follows a progressive training pipeline: it first learns fine-grained action control, then open-ended event response, and finally uses reinforcement learning to improve action following, interaction consistency, and visual fidelity. Through forcing and distillation, DreamX-World then achieves efficient inference, making interactive generation practical at scale.
- 2026.05.11: We open-sourced DreamX-World-5B-Cam and the inference code.
- ✔️ DreamX-World-5B-Cam Model.
- DreamX-World-14B-Cam Model.
- Autoregressive Video Generation Model.
- Audio-Video Joint Generation Model.
- Real-Time, Interactive, Long-horizon DreamX-World Model.
- Release Technical Report.
- Install dependencies:

      pip install -r requirements.txt

- Download the Wan2.2-5B-TI2V checkpoints from https://huggingface.co/Wan-AI

To generate videos, run the following script:

    sh inference_5b.sh

Please check out inference_README.md for detailed instructions.
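Before launching the inference script, it can help to confirm that the setup steps above actually completed. The following is a minimal sketch of such a check, not part of the released code: the function name `missing_prerequisites` and the checkpoint path `./Wan2.2-5B-TI2V` are hypothetical, and it assumes the downloaded checkpoints sit in a single non-empty directory. The location the script really expects is described in inference_README.md.

```python
from pathlib import Path

# Files named in the setup steps; assumed to live at the repository root.
REQUIRED_FILES = ("requirements.txt", "inference_5b.sh")


def missing_prerequisites(repo_root, checkpoint_dir):
    """Return a list of human-readable problems; an empty list means ready."""
    root = Path(repo_root)
    problems = [
        f"missing file: {name}"
        for name in REQUIRED_FILES
        if not (root / name).is_file()
    ]
    ckpt = Path(checkpoint_dir)
    # Assumption: the downloaded checkpoints sit in one non-empty directory.
    if not ckpt.is_dir() or not any(ckpt.iterdir()):
        problems.append(f"checkpoint directory is missing or empty: {ckpt}")
    return problems


if __name__ == "__main__":
    # "./Wan2.2-5B-TI2V" is a placeholder path, not one the project prescribes.
    for problem in missing_prerequisites(".", "./Wan2.2-5B-TI2V"):
        print(problem)
```

Run from the repository root, the script prints nothing when both files and the checkpoint directory are in place.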
| Model | Download Link | Details | Instructions |
|---|---|---|---|
| DreamX-World-5B-Cam | Huggingface, ModelScope | With PRoPE camera control | inference_README.md |
DreamX-World enables high-fidelity, controllable exploration across diverse realistic environments, including indoor, urban, natural, and architectural scenes.
Demo videos: 01.mp4, 02.mp4, 03.mp4, 04.mp4, 05.mp4, 06.mp4, 07.mp4, 08.mp4
Beyond realistic scenes, DreamX-World also generates fantasy, game-like, sci-fi, and stylized worlds.
Demo videos: 01.mp4, 02.mp4, 03.mp4, 04.mp4, 06.mp4, 07.mp4, 08.mp4, 09.mp4
DreamX-World supports both first-person interaction and coherent third-person generation. It keeps camera-follow behavior stable while preserving controllable agent motion and scene consistency.
Demo videos: 01.mp4, 02.mp4, 03.mp4, 04.mp4, 05.mp4, 07.mp4, 08.mp4, 10.mp4
DreamX-World supports prompt-driven world events that dynamically change the environment, including flexible and compositional event generation with consistent temporal evolution.
- Single Event: A single event prompt triggers a specific world-changing interaction.
- Compositional Events: Multiple events compose together to create complex, multi-step world transformations.
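One way to picture the difference between the two modes is as a timeline of event prompts. The sketch below is purely illustrative and does not reflect DreamX-World's actual prompt interface: the `EventPrompt` class, the `prompts_issued_by` helper, the timestamps, and the example prompt texts are all made up for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EventPrompt:
    start_s: float  # hypothetical: time in seconds at which the event is issued
    text: str       # natural-language event prompt


def prompts_issued_by(schedule, t_s):
    """Return the texts of all events issued at or before t_s, in time order."""
    ordered = sorted(schedule, key=lambda e: e.start_s)
    return [e.text for e in ordered if e.start_s <= t_s]


# A single event: one prompt changes the world once.
single = [EventPrompt(2.0, "it starts to rain")]

# Compositional events: later prompts build on the state left by earlier ones.
compositional = [
    EventPrompt(2.0, "it starts to rain"),
    EventPrompt(6.0, "the street floods"),
]
```

Under this framing, a single event is a schedule of length one, while a compositional sequence accumulates prompts, so the world at any moment reflects every event issued so far.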
Demo videos: 01.mp4, 02.mp4, 03.mp4, 04.mp4, 05.mp4, 06.mp4, 07.mp4, 08.mp4
Join our WeChat group for discussion:
Contact: 📧 ally.sl@alibaba-inc.com | hongxi.wjh@alibaba-inc.com
This project is licensed under Apache 2.0. See LICENSE for details.
We thank the Wan Team for open-sourcing their code and models.

