tttLRM: Test-Time Training for Long Context and Autoregressive 3D Reconstruction Paper • 2602.20160 • Published 12 days ago • 10
Llama-3.1-FoundationAI-SecurityLLM-Reasoning-8B Technical Report Paper • 2601.21051 • Published Jan 28 • 14
Creative Writing Datasets Collection • High-quality creative writing and storytelling data. • 35 items • Updated 12 days ago • 4
SpargeAttention2: Trainable Sparse Attention via Hybrid Top-k+Top-p Masking and Distillation Fine-Tuning Paper • 2602.13515 • Published 22 days ago • 43
DiffusionRenderer: Neural Inverse and Forward Rendering with Video Diffusion Models Paper • 2501.18590 • Published Jan 30, 2025 • 1
REAP the Experts: Why Pruning Prevails for One-Shot MoE compression Paper • 2510.13999 • Published Oct 15, 2025 • 14
LLaDA2.1: Speeding Up Text Diffusion via Token Editing Paper • 2602.08676 • Published 27 days ago • 68
AudioSAE: Towards Understanding of Audio-Processing Models with Sparse AutoEncoders Paper • 2602.05027 • Published Feb 4 • 60
DINO-SAE: DINO Spherical Autoencoder for High-Fidelity Image Reconstruction and Generation Paper • 2601.22904 • Published Jan 30 • 15
PixelGen: Pixel Diffusion Beats Latent Diffusion with Perceptual Loss Paper • 2602.02493 • Published Feb 2 • 44
Beyond Output Critique: Self-Correction via Task Distillation Paper • 2602.00871 • Published Jan 31 • 2
Self-Improving Pretraining: using post-trained models to pretrain better models Paper • 2601.21343 • Published Jan 29 • 17
Late-to-Early Training: LET LLMs Learn Earlier, So Faster and Better Paper • 2602.05393 • Published about 1 month ago • 8
Chronicals: A High-Performance Framework for LLM Fine-Tuning with 3.51x Speedup over Unsloth Paper • 2601.02609 • Published Jan 6 • 2
Reinforcement Learning from Meta-Evaluation: Aligning Language Models Without Ground-Truth Labels Paper • 2601.21268 • Published Jan 29 • 4
Golden Goose: A Simple Trick to Synthesize Unlimited RLVR Tasks from Unverifiable Internet Text Paper • 2601.22975 • Published Jan 30 • 109