Native diffusers implementation of NiT (Native-resolution Image Transformer). Each variant folder is self-contained:
- pipeline.py: NiTPipeline
- scheduler/scheduler_config.json: FlowMatchEulerDiscreteScheduler config (the class ships with Diffusers)
- transformer/nit_transformer_2d.py: NiTTransformer2DModel
- vae/: AutoencoderDC weights and config
No separate NiT-diffusers package is needed at inference time, only PyPI diffusers plus the local custom code in the variant directory.
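Because loading depends on every piece of the variant folder being present, a quick layout check before calling from_pretrained can save a confusing error. The following is a minimal sketch, not part of the repository: the helper name missing_variant_entries and the EXPECTED_ENTRIES tuple are hypothetical, and the entry list is assumed from the files named in this README (model_index.json is the file this README says holds id2label).

```python
from pathlib import Path

# Entries a variant folder is expected to contain, as named in this README.
# This tuple is an assumption for illustration, not an official manifest.
EXPECTED_ENTRIES = (
    "pipeline.py",
    "model_index.json",
    "scheduler/scheduler_config.json",
    "transformer/nit_transformer_2d.py",
    "vae",
)


def missing_variant_entries(variant_dir):
    """Return the expected entries that are absent from variant_dir (hypothetical helper)."""
    root = Path(variant_dir)
    return [entry for entry in EXPECTED_ENTRIES if not (root / entry).exists()]
```

Calling missing_variant_entries("./NiT-XL") and getting an empty list back suggests the folder is complete; a non-empty list names what to re-download.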
| Checkpoint | Path | Resolution | Recommended settings |
|---|---|---|---|
| NiT-XL | ./NiT-XL | 512×512 | 250 steps, CFG 2.05, interval (0.0, 0.7) |
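The "interval (0.0, 0.7)" setting refers to applying classifier-free guidance (CFG) during only part of the sampling trajectory. As a rough illustration of the idea, the sketch below applies the CFG extrapolation when a normalized time t falls inside the interval and otherwise returns the conditional prediction unchanged. The function name apply_guidance, the use of plain float lists, and the convention that t lies in [0, 1] with inclusive bounds are all assumptions made here; the pipeline's actual time convention may differ.

```python
def apply_guidance(cond, uncond, scale, t, interval=(0.0, 0.7)):
    """Classifier-free guidance restricted to a time interval (illustrative sketch).

    cond / uncond: model predictions with and without the class label,
    given here as equal-length lists of floats.
    t: normalized time in [0, 1]; how the real pipeline defines it is an assumption.
    """
    lo, hi = interval
    if not (lo <= t <= hi):
        # Outside the interval: skip guidance and use the conditional prediction.
        return list(cond)
    # Inside the interval: extrapolate away from the unconditional prediction.
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]
```

With scale set to 1.0 the guided branch reduces to the conditional prediction, so the interval only matters when the scale is above 1.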
Each variant keeps an English id2label map directly in its own model_index.json (DiT-style).
- pipe.id2label: inspect the id → English label correspondence
- pipe.labels: reverse map (English synonym → id), sorted for browsing
- pipe.get_label_ids("golden retriever")
- pipe(class_labels="golden retriever", ...): string labels are resolved automatically
Run the bundled demo script from the repo root:
python demo_inference.py
This writes demo.png using NiT-XL with the settings below.
from pathlib import Path

import torch
from diffusers import DiffusionPipeline

# Load a variant folder together with the pipeline code bundled inside it.
model_dir = Path("./NiT-XL").resolve()
pipe = DiffusionPipeline.from_pretrained(
    str(model_dir),
    local_files_only=True,
    custom_pipeline=str(model_dir / "pipeline.py"),
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
)
pipe.to("cuda")

# Inspect the label map and resolve a label string to class ids.
print(pipe.id2label[207])
print(pipe.get_label_ids("golden retriever"))

# Fixed seed so the sample is reproducible.
generator = torch.Generator(device="cuda").manual_seed(42)
image = pipe(
    class_labels="golden retriever",
    height=512,
    width=512,
    num_inference_steps=250,
    guidance_scale=2.05,
    guidance_interval=(0.0, 0.7),
    generator=generator,
).images[0]
image.save("demo.png")
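The string lookup used in the demo, pipe.get_label_ids("golden retriever"), can be pictured as a reverse index built from id2label. The sketch below is illustrative only and does not reproduce the pipeline's code: it assumes each id2label value is a comma-separated list of English synonyms (as in DiT-style ImageNet maps), that matching is case-insensitive, and that keys arrive as strings as they would from JSON. The names build_label_index, toy_id2label, and toy_index are hypothetical.

```python
def build_label_index(id2label):
    """Map each lower-cased synonym to the sorted list of class ids that use it."""
    index = {}
    for class_id, names in id2label.items():
        for name in names.split(","):
            key = name.strip().lower()
            if key:
                index.setdefault(key, []).append(int(class_id))
    # Sort keys so the index is easy to browse alphabetically.
    return {key: sorted(ids) for key, ids in sorted(index.items())}


def get_label_ids(index, label):
    """Resolve a label string to class ids; raise KeyError if the label is unknown."""
    key = label.strip().lower()
    if key not in index:
        raise KeyError(f"unknown label: {label!r}")
    return index[key]


# Toy excerpt for illustration only; the real map lives in model_index.json.
toy_id2label = {"207": "golden retriever", "281": "tabby, tabby cat"}
toy_index = build_label_index(toy_id2label)
```

Raising on unknown labels, rather than returning an empty list, is a design choice made in this sketch so that typos surface before a long sampling run starts.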
Load a variant subfolder (e.g. ./NiT-XL), not the repo root.
Hub usage follows Hugging Face model-id style (UserID/RepoID):
import torch
from diffusers import DiffusionPipeline

# Load the NiT-XL variant and its custom pipeline code from the Hub repo.
pipe = DiffusionPipeline.from_pretrained(
    "BiliSakura/NiT-diffusers",
    subfolder="NiT-XL",
    custom_pipeline="pipeline.py",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
)
pipe.to("cuda")
@article{wang2025native,
  title={Native-Resolution Image Synthesis},
  author={Wang, Zidong and Bai, Lei and Yue, Xiangyu and Ouyang, Wanli and Zhang, Yiyuan},
  year={2025},
  eprint={2506.03131},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}