Submitted by Jingfeng Yao: Towards Scalable Pre-training of Visual Tokenizers for Generation (MiniMax)