Running on CPU Upgrade • 195 • The Synthetic Data Playbook: Generating Trillions of the Finest Tokens • Explore synthetic data experiments as an interactive bookshelf
RAE Collection Collection for Diffusion Transformers with Representation Autoencoders • 7 items • Updated 25 days ago • 13
Article Tokenization in Transformers v5: Simpler, Clearer, and More Modular +4 Dec 18, 2025 • 123
Pre-training Dataset Samples Collection A collection of pre-training dataset samples in sizes of 10M, 100M, and 1B tokens, ideal for quick experimentation and ablations. • 18 items • Updated 13 days ago • 18
Article A Review on the Evolvement of Load Balancing Strategy in MoE LLMs: Pitfalls and Lessons Feb 4, 2025 • 31
AERIS: Argonne Earth Systems Model for Reliable and Skillful Predictions Paper • 2509.13523 • Published Sep 16, 2025 • 7
Nemotron-Pre-Training-Datasets Collection Large-scale pre-training datasets used in the Nemotron family of models. • 12 items • Updated 3 days ago • 126