MegaFlow: Zero-Shot Large Displacement Optical Flow
Paper • 2603.25739 • Published
Dingxi Zhang · Fangjinhua Wang · Marc Pollefeys · Haofei Xu
ETH Zurich · Microsoft · University of Tübingen, Tübingen AI Center
MegaFlow is a simple, powerful, and unified model for zero-shot large displacement optical flow and point tracking.
MegaFlow uses pre-trained Vision Transformer features to capture extreme motion, then applies lightweight iterative refinement for sub-pixel accuracy. It achieves state-of-the-art zero-shot performance on the major optical flow benchmarks (Sintel, KITTI, Spring) and generalizes competitively, also zero-shot, to long-range point tracking benchmarks.
| Model ID | Task | Description |
|---|---|---|
| `megaflow-flow` | Optical flow | Full training curriculum (default) |
| `megaflow-chairs-things` | Optical flow | Trained on FlyingThings + FlyingChairs only |
| `megaflow-track` | Point tracking | Fine-tuned on Kubric |
pip install git+https://github.com/cvg/megaflow.git
Requirements: Python ≥ 3.12, PyTorch ≥ 2.7, CUDA recommended.
import torch
from megaflow import MegaFlow
device = "cuda" if torch.cuda.is_available() else "cpu"
# video: float32 tensor [1, T, 3, H, W], pixel values in [0, 255]
video = ...
model = MegaFlow.from_pretrained("megaflow-flow").eval().to(device)
with torch.inference_mode():
    with torch.autocast(device_type=device, dtype=torch.bfloat16):
        # Returns flow for consecutive pairs: (0→1, 1→2, ...)
        # Shape: [1, T-1, 2, H, W]
        flow = model(video, num_reg_refine=8)["flow_preds"][-1]
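The snippet above leaves `video` as a placeholder and returns a raw flow tensor. As a minimal sketch of both ends of that call, the helpers below build an input in the documented layout (`float32`, `[1, T, 3, H, W]`, values in `[0, 255]`) from a list of RGB frames and reduce a flow tensor of shape `[1, T-1, 2, H, W]` to per-pixel displacement magnitudes. `frames_to_video` and `flow_magnitude` are hypothetical helper names, not part of the `megaflow` package, and the synthetic frames and flow stand in for a real video and real model output. The flow channels are assumed to be ordered (x, y).

```python
import numpy as np
import torch


def frames_to_video(frames):
    """Stack T uint8 RGB frames of shape [H, W, 3] into a float32 tensor [1, T, 3, H, W]."""
    video = torch.from_numpy(np.stack(frames))   # [T, H, W, 3], uint8
    video = video.permute(0, 3, 1, 2).float()    # [T, 3, H, W], values stay in [0, 255]
    return video.unsqueeze(0)                    # [1, T, 3, H, W]


def flow_magnitude(flow):
    """Per-pixel displacement length in pixels: [1, T-1, 2, H, W] -> [1, T-1, H, W]."""
    # Cast to float32 first, since autocast may return a lower-precision tensor.
    return torch.linalg.norm(flow.float(), dim=2)


# Synthetic stand-ins so the sketch runs without the model or a real video.
frames = [np.random.randint(0, 256, size=(48, 64, 3), dtype=np.uint8) for _ in range(4)]
video = frames_to_video(frames)

fake_flow = torch.zeros(1, 3, 2, 48, 64)
fake_flow[:, :, 0] = 3.0  # assumed x component: 3 px
fake_flow[:, :, 1] = 4.0  # assumed y component: 4 px
mag = flow_magnitude(fake_flow)

print(video.shape, video.dtype, mag.shape)
```

With a real model, `video` would be passed to `model(...)` as in the snippet above, and `flow_magnitude` would be applied to the returned `flow`. The magnitude map is a quick way to check whether large displacements are present before visualizing the full field.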
import torch
from megaflow import MegaFlow
from megaflow.utils.basic import gridcloud2d
device = "cuda" if torch.cuda.is_available() else "cpu"
# video: float32 tensor [1, T, 3, H, W], pixel values in [0, 255]
video = ...
model = MegaFlow.from_pretrained("megaflow-track").eval().to(device)
with torch.inference_mode():
    with torch.autocast(device_type=device, dtype=torch.bfloat16):
        # Returns dense offsets from frame 0 to each frame t
        flows_e = model.forward_track(video, num_reg_refine=8)["flow_final"]

# Convert offsets to absolute coordinates
H, W = video.shape[-2:]  # spatial size of the input frames
grid_xy = gridcloud2d(1, H, W, norm=False, device=device).float()
grid_xy = grid_xy.permute(0, 2, 1).reshape(1, 1, 2, H, W)
tracks = flows_e + grid_xy  # [1, T, 2, H, W]
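Dense tracks store one trajectory per frame-0 pixel, so following a handful of points is a matter of indexing. The sketch below assumes `tracks` has the shape `[1, T, 2, H, W]` given above with channels ordered (x, y). `sample_tracks` is a hypothetical helper, not part of the `megaflow` package, and the synthetic tracks (every pixel drifting by `(t, 2t)` pixels at frame `t`) stand in for real model output. The stride of 8 pixels for the query grid is an arbitrary choice for illustration.

```python
import torch


def sample_tracks(tracks, query_xy):
    """Trajectories of selected frame-0 pixels.

    tracks:   [1, T, 2, H, W] absolute (x, y) position of every frame-0 pixel
    query_xy: [N, 2] integer (x, y) pixel coordinates in frame 0
    returns:  [T, N, 2]
    """
    x, y = query_xy[:, 0], query_xy[:, 1]
    # Advanced indexing over the two spatial dims yields [T, 2, N].
    return tracks[0][:, :, y, x].permute(0, 2, 1)


# Synthetic dense tracks: each pixel moves by (t, 2t) pixels at frame t.
T, H, W = 5, 32, 40
ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
grid_xy = torch.stack([xs, ys]).float().reshape(1, 1, 2, H, W)
t = torch.arange(T).float().reshape(1, T, 1, 1, 1)
offsets = torch.cat([t.expand(1, T, 1, H, W), 2 * t.expand(1, T, 1, H, W)], dim=2)
tracks = grid_xy + offsets  # [1, T, 2, H, W]

# Query points on a regular grid with a stride of 8 pixels.
qy, qx = torch.meshgrid(torch.arange(0, H, 8), torch.arange(0, W, 8), indexing="ij")
query_xy = torch.stack([qx.flatten(), qy.flatten()], dim=1)  # [N, 2]

traj = sample_tracks(tracks, query_xy)  # [T, N, 2]
print(traj.shape)
```

Each `traj[:, n]` is then the (x, y) path of query point `n` across all frames, which is the form most plotting and evaluation code for sparse point tracking expects.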
# Clone the repo and run demos
git clone https://github.com/cvg/megaflow.git
cd megaflow
# Optical flow on a video
python demo_flow.py --input assets/longboard.mp4 --output output/longboard_flow.mp4
# Dense point tracking
python demo_track.py --input assets/apple.mp4 --grid_size 8
# Gradio web UI
python demo_gradio.py
Or try the Colab notebook directly in the browser.
@article{zhang2026megaflow,
title = {MegaFlow: Zero-Shot Large Displacement Optical Flow},
author = {Zhang, Dingxi and Wang, Fangjinhua and Pollefeys, Marc and Xu, Haofei},
journal = {arXiv preprint arXiv:2603.25739},
year = {2026}
}