
Infinite-World

Scaling Interactive World Models to 1000-Frame Horizons via Pose-Free Hierarchical Memory

arXiv Project Page

Ruiqi Wu1,2,3*, Xuanhua He4,2*, Meng Cheng2*, Tianyu Yang2, Yong Zhang2‡, Chunle Guo1,3†, Chongyi Li1,3, Ming-Ming Cheng1,3

1Nankai University   2Meituan   3NKIARI   4HKUST

*Equal Contribution   †Corresponding Author   ‡Project Leader


Highlights

Infinite-World is a robust interactive world model with:

  • Real-World Training — Trained on real-world videos without requiring perfect pose annotations or synthetic data
  • 1000+ Frame Memory — Maintains coherent visual memory over 1000+ frames via Hierarchical Pose-free Memory Compressor (HPMC)
  • Robust Action Control — Uncertainty-aware action labeling ensures accurate action-response learning from noisy trajectories

Infinite-World Framework

Installation

Environment: Python 3.10, CUDA 12.4 recommended.

1. Create conda environment

conda create -n infworld python=3.10
conda activate infworld

2. Install PyTorch with CUDA 12.4

Install from the official PyTorch index (no local whl):

pip install torch==2.6.0 torchvision==0.21.0 --index-url https://download.pytorch.org/whl/cu124
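
To confirm the CUDA 12.4 build is active, a quick sanity check (not part of the official setup, just an optional test):

import torch

print(torch.__version__)           # should report 2.6.0+cu124
print(torch.cuda.is_available())   # should print True on a machine with a compatible GPU driver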

3. Install Python dependencies

pip install -r requirements.txt

Checkpoint Configuration

All model paths are configured in configs/infworld_config.yaml. Paths are relative to the project root unless absolute.

Download checkpoints

Download from Wan-AI/Wan2.1-T2V-1.3B and place files under checkpoints/:

| File / directory | Config key | Description |
|---|---|---|
| models/Wan2.1_VAE.pth | vae_cfg.vae_pth | VAE weights |
| models/models_t5_umt5-xxl-enc-bf16.pth | text_encoder_cfg.checkpoint_path | T5 text encoder |
| models/google/umt5-xxl (folder) | text_encoder_cfg.tokenizer_path | T5 tokenizer |
| infinite_world_model.ckpt | checkpoint_path | DiT model weights |

  • DiT checkpoint: can be downloaded from TBD.
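
As a rough sketch of how these entries map to configs/infworld_config.yaml, the snippet below loads the config and checks that each path resolves. The nested key layout is assumed from the config keys in the table, not copied from the released file, and it requires PyYAML:

# Hypothetical path check; adjust the key nesting to match the actual config file.
import yaml
from pathlib import Path

ROOT = Path(".")  # project root; config paths are relative to it unless absolute

with open(ROOT / "configs/infworld_config.yaml") as f:
    cfg = yaml.safe_load(f)

# Assumed nesting based on the table above.
paths = {
    "VAE weights":       cfg["vae_cfg"]["vae_pth"],
    "T5 text encoder":   cfg["text_encoder_cfg"]["checkpoint_path"],
    "T5 tokenizer":      cfg["text_encoder_cfg"]["tokenizer_path"],
    "DiT model weights": cfg["checkpoint_path"],
}

for name, p in paths.items():
    p = Path(p)
    full = p if p.is_absolute() else ROOT / p
    status = "ok" if full.exists() else "missing"
    print(f"{name}: {status} ({full})")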

Upload to Hugging Face (including checkpoints)

To upload this repo to Hugging Face Hub (code + checkpoints/):

  1. Login

    pip install huggingface_hub
    huggingface-cli login
    

    Use a token from https://huggingface.co/settings/tokens (write access required).

  2. Upload. From the project root (infinite-world/):

    python scripts/upload_to_hf.py YOUR_USERNAME/infinite-world
    

    Or set the repo and run:

    export HF_REPO_ID=YOUR_USERNAME/infinite-world
    python scripts/upload_to_hf.py
    

    The script uploads the whole directory (including checkpoints/) and skips __pycache__, outputs, .git, etc. Large checkpoint files are uploaded via the Hub API; the first run may take a while depending on file size and network speed. An API-level sketch of the same upload flow is shown after this list.

  3. Create repo manually (optional)
    You can create the model repo first at https://huggingface.co/new (type: Model), then run the script with that repo_id.
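
If you would rather call the Hub API directly instead of scripts/upload_to_hf.py (see step 2 above), huggingface_hub covers the same flow. This is a minimal sketch, not the provided script; the ignore patterns below mirror the directories the script is described as skipping, and it assumes you have already logged in:

from huggingface_hub import HfApi

repo_id = "YOUR_USERNAME/infinite-world"

api = HfApi()
api.create_repo(repo_id, repo_type="model", exist_ok=True)  # no-op if the repo already exists
api.upload_folder(
    folder_path=".",                                        # project root, including checkpoints/
    repo_id=repo_id,
    repo_type="model",
    ignore_patterns=["__pycache__/**", "outputs/**", ".git/**"],
)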


Results

Quantitative Comparison

| Model | Mot. Smo.↑ | Dyn. Deg.↑ | Aes. Qual.↑ | Img. Qual.↑ | Avg. Score↑ | Memory↓ | Fidelity↓ | Action↓ | ELO Rating↑ |
|---|---|---|---|---|---|---|---|---|---|
| Hunyuan-GameCraft | 0.9855 | 0.9896 | 0.5380 | 0.6010 | 0.7785 | 2.67 | 2.49 | 2.56 | 1311 |
| Matrix-Game 2.0 | 0.9788 | 1.0000 | 0.5267 | 0.7215 | 0.8068 | 2.98 | 2.91 | 1.78 | 1432 |
| Yume 1.5 | 0.9861 | 0.9896 | 0.5840 | 0.6969 | 0.8141 | 2.43 | 1.91 | 2.47 | 1495 |
| HY-World-1.5 | 0.9905 | 1.0000 | 0.5280 | 0.6611 | 0.7949 | 2.59 | 2.78 | 1.50 | 1542 |
| Infinite-World | 0.9876 | 1.0000 | 0.5440 | 0.7159 | 0.8119 | 1.92 | 1.67 | 1.54 | 1719 |

Citation

If you find this work useful, please consider citing:

@article{wu2026infiniteworld,
  title={Infinite-World: Scaling Interactive World Models to 1000-Frame Horizons via Pose-Free Hierarchical Memory},
  author={Wu, Ruiqi and He, Xuanhua and Cheng, Meng and Yang, Tianyu and Zhang, Yong and Kang, Zhuoliang and Cai, Xunliang and Wei, Xiaoming and Guo, Chunle and Li, Chongyi and Cheng, Ming-Ming},
  journal={arXiv preprint arXiv:2602.02393},
  year={2026}
}

License

This project is released under the MIT License.
