Submitted by Jingfeng Yao | 90 | Towards Scalable Pre-training of Visual Tokenizers for Generation | MiniMax | 280 | 4