Infinite-World
Scaling Interactive World Models to 1000-Frame Horizons via Pose-Free Hierarchical Memory
Ruiqi Wu1,2,3*, Xuanhua He4,2*, Meng Cheng2*, Tianyu Yang2, Yong Zhang2‡, Chunle Guo1,3†, Chongyi Li1,3, Ming-Ming Cheng1,3
1Nankai University 2Meituan 3NKIARI 4HKUST
*Equal Contribution †Corresponding Author ‡Project Leader
Highlights
Infinite-World is a robust interactive world model with:
- Real-World Training — Trained on real-world videos without requiring perfect pose annotations or synthetic data
- 1000+ Frame Memory — Maintains coherent visual memory over 1000+ frames via Hierarchical Pose-free Memory Compressor (HPMC)
- Robust Action Control — Uncertainty-aware action labeling ensures accurate action-response learning from noisy trajectories
Installation
Environment: Python 3.10, CUDA 12.4 recommended.
1. Create conda environment
conda create -n infworld python=3.10
conda activate infworld
2. Install PyTorch with CUDA 12.4
Install from the official PyTorch index (no local wheel file is needed):
pip install torch==2.6.0 torchvision==0.21.0 --index-url https://download.pytorch.org/whl/cu124
3. Install Python dependencies
pip install -r requirements.txt
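After the three steps above, a quick check can confirm that the pinned versions were actually installed. The following is a minimal sketch, not part of the repository: the helper names `check_versions` and `cuda_available` are hypothetical, and the expected versions are simply copied from the `pip install` command in step 2.

```python
from importlib import metadata

# Versions pinned in the install command above.
EXPECTED = {"torch": "2.6.0", "torchvision": "0.21.0"}


def check_versions(expected=EXPECTED):
    """Return {package: (installed_version_or_None, matches_expected)}."""
    report = {}
    for name, want in expected.items():
        try:
            have = metadata.version(name)
        except metadata.PackageNotFoundError:
            have = None
        # Wheels from the cu124 index may report e.g. "2.6.0+cu124",
        # so compare only the part before the "+" suffix.
        ok = have is not None and have.split("+")[0] == want
        report[name] = (have, ok)
    return report


def cuda_available():
    """Return True/False if torch is importable, None if it is not installed."""
    try:
        import torch
    except ImportError:
        return None
    return torch.cuda.is_available()


if __name__ == "__main__":
    for name, (have, ok) in check_versions().items():
        print(f"{name}: installed={have} matches_pin={ok}")
```

Run it inside the activated `infworld` environment; calling `cuda_available()` afterwards shows whether PyTorch can see a GPU.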
Checkpoint Configuration
All model paths are configured in configs/infworld_config.yaml. Paths are relative to the project root unless absolute.
Download checkpoints
Download from Wan-AI/Wan2.1-T2V-1.3B and place files under checkpoints/:
| File / directory | Config key | Description |
|---|---|---|
| models/Wan2.1_VAE.pth | vae_cfg.vae_pth | VAE weights |
| models/models_t5_umt5-xxl-enc-bf16.pth | text_encoder_cfg.checkpoint_path | T5 text encoder |
| models/google/umt5-xxl (folder) | text_encoder_cfg.tokenizer_path | T5 tokenizer |
| infinite_world_model.ckpt | checkpoint_path | DiT model weights |
- DiT checkpoint (infinite_world_model.ckpt): download location TBD.
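Before running the model, it can help to verify that every file from the table is in place. The sketch below assumes the table's paths sit directly under `checkpoints/` (for example `checkpoints/models/Wan2.1_VAE.pth`); that layout is an interpretation of the instructions above, and `missing_checkpoints` is a hypothetical helper, not something shipped with the repository. If `configs/infworld_config.yaml` points elsewhere, adjust the mapping accordingly.

```python
from pathlib import Path

# Config key -> path relative to checkpoints/, copied from the table above.
# The directory layout is an assumption; the config file is authoritative.
EXPECTED_FILES = {
    "vae_cfg.vae_pth": "models/Wan2.1_VAE.pth",
    "text_encoder_cfg.checkpoint_path": "models/models_t5_umt5-xxl-enc-bf16.pth",
    "text_encoder_cfg.tokenizer_path": "models/google/umt5-xxl",
    "checkpoint_path": "infinite_world_model.ckpt",
}


def missing_checkpoints(checkpoint_dir="checkpoints"):
    """Return {config_key: expected_path} for every entry not found on disk."""
    root = Path(checkpoint_dir)
    return {
        key: str(root / rel)
        for key, rel in EXPECTED_FILES.items()
        if not (root / rel).exists()
    }


if __name__ == "__main__":
    missing = missing_checkpoints()
    if missing:
        for key, path in missing.items():
            print(f"missing: {path} (config key: {key})")
    else:
        print("all expected checkpoint files found")
```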
Upload to Hugging Face (including checkpoints)
To upload this repo to Hugging Face Hub (code + checkpoints/):
Login
pip install huggingface_hub
huggingface-cli login
Use a token from https://huggingface.co/settings/tokens (write permission required).
Upload
From the project root (infinite-world/):
python scripts/upload_to_hf.py YOUR_USERNAME/infinite-world
Or set the repo and run:
export HF_REPO_ID=YOUR_USERNAME/infinite-world
python scripts/upload_to_hf.py
The script uploads the whole directory (including checkpoints/) and skips __pycache__, outputs, .git, etc. Large checkpoint files are uploaded via the Hub API; the first run may take a while depending on size and network.
Create repo manually (optional)
You can create the model repo first at https://huggingface.co/new (type: Model), then run the script with that repo_id.
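For readers curious how such an upload could work, here is a rough sketch of the behavior described above. It is not the contents of `scripts/upload_to_hf.py`: the function names and the exact ignore patterns are assumptions chosen for illustration. It uses `HfApi.create_repo` and `HfApi.upload_folder` from `huggingface_hub`.

```python
# Patterns for the paths the text says are skipped.
# The real script's list may differ; these are illustrative.
IGNORE_PATTERNS = ["*__pycache__*", "outputs/*", ".git/*"]


def resolve_repo_id(argv, env):
    """Prefer the first CLI argument, then fall back to HF_REPO_ID."""
    if len(argv) > 1 and argv[1]:
        return argv[1]
    repo_id = env.get("HF_REPO_ID")
    if not repo_id:
        raise SystemExit("pass USERNAME/REPO as an argument or set HF_REPO_ID")
    return repo_id


def upload(repo_id, folder="."):
    """Create the model repo if needed, then upload the folder to it."""
    # Imported lazily so resolve_repo_id works without huggingface_hub.
    from huggingface_hub import HfApi

    api = HfApi()
    api.create_repo(repo_id=repo_id, repo_type="model", exist_ok=True)
    api.upload_folder(
        folder_path=folder,
        repo_id=repo_id,
        repo_type="model",
        ignore_patterns=IGNORE_PATTERNS,
    )
```

A caller would typically run `upload(resolve_repo_id(sys.argv, os.environ))` from the project root after logging in.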
Results
Quantitative Comparison
| Model | Motion Smoothness↑ | Dynamic Degree↑ | Aesthetic Quality↑ | Imaging Quality↑ | Avg. Score↑ | Memory↓ | Fidelity↓ | Action↓ | ELO Rating↑ |
|---|---|---|---|---|---|---|---|---|---|
| Hunyuan-GameCraft | 0.9855 | 0.9896 | 0.5380 | 0.6010 | 0.7785 | 2.67 | 2.49 | 2.56 | 1311 |
| Matrix-Game 2.0 | 0.9788 | 1.0000 | 0.5267 | 0.7215 | 0.8068 | 2.98 | 2.91 | 1.78 | 1432 |
| Yume 1.5 | 0.9861 | 0.9896 | 0.5840 | 0.6969 | 0.8141 | 2.43 | 1.91 | 2.47 | 1495 |
| HY-World-1.5 | 0.9905 | 1.0000 | 0.5280 | 0.6611 | 0.7949 | 2.59 | 2.78 | 1.50 | 1542 |
| Infinite-World | 0.9876 | 1.0000 | 0.5440 | 0.7159 | 0.8119 | 1.92 | 1.67 | 1.54 | 1719 |
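The Avg. Score column appears to be the unweighted mean of the four preceding quality metrics. The source does not state this; it is inferred from the numbers. The short check below recomputes the mean for each row of the table and compares it with the reported value, using a tolerance of 1e-4 to allow for rounding.

```python
# model -> (four quality metrics from the table, reported Avg. Score)
ROWS = {
    "Hunyuan-GameCraft": ((0.9855, 0.9896, 0.5380, 0.6010), 0.7785),
    "Matrix-Game 2.0": ((0.9788, 1.0000, 0.5267, 0.7215), 0.8068),
    "Yume 1.5": ((0.9861, 0.9896, 0.5840, 0.6969), 0.8141),
    "HY-World-1.5": ((0.9905, 1.0000, 0.5280, 0.6611), 0.7949),
    "Infinite-World": ((0.9876, 1.0000, 0.5440, 0.7159), 0.8119),
}


def mean_score(metrics):
    """Unweighted mean of the per-metric scores."""
    return sum(metrics) / len(metrics)


def max_deviation(rows=ROWS):
    """Largest absolute gap between the recomputed mean and the reported average."""
    return max(abs(mean_score(m) - reported) for m, reported in rows.values())


if __name__ == "__main__":
    for model, (metrics, reported) in ROWS.items():
        print(f"{model}: recomputed={mean_score(metrics):.5f} reported={reported}")
```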
Citation
If you find this work useful, please consider citing:
@article{wu2026infiniteworld,
title={Infinite-World: Scaling Interactive World Models to 1000-Frame Horizons via Pose-Free Hierarchical Memory},
author={Wu, Ruiqi and He, Xuanhua and Cheng, Meng and Yang, Tianyu and Zhang, Yong and Kang, Zhuoliang and Cai, Xunliang and Wei, Xiaoming and Guo, Chunle and Li, Chongyi and Cheng, Ming-Ming},
journal={arXiv preprint arXiv:2602.02393},
year={2026}
}
License
This project is released under the MIT License.