oguzhanercan's Collections: Image Generation
Causal Diffusion Transformers for Generative Modeling
Paper
• 2412.12095
• Published
• 23
SnapGen: Taming High-Resolution Text-to-Image Models for Mobile Devices with Efficient Architectures and Training
Paper
• 2412.09619
• Published
• 30
DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for Customized Manga Generation
Paper
• 2412.07589
• Published
• 48
Flowing from Words to Pixels: A Framework for Cross-Modality Evolution
Paper
• 2412.15213
• Published
• 28
CLEAR: Conv-Like Linearization Revs Pre-Trained Diffusion Transformers Up
Paper
• 2412.16112
• Published
• 23
Parallelized Autoregressive Visual Generation
Paper
• 2412.15119
• Published
• 53
Democratizing Text-to-Image Masked Generative Models with Compact Text-Aware One-Dimensional Tokens
Paper
• 2501.07730
• Published
• 18
SANA 1.5: Efficient Scaling of Training-Time and Inference-Time Compute in Linear Diffusion Transformer
Paper
• 2501.18427
• Published
• 24
Improved Training Technique for Latent Consistency Models
Paper
• 2502.01441
• Published
• 8
I Think, Therefore I Diffuse: Enabling Multimodal In-Context Reasoning in Diffusion Models
Paper
• 2502.10458
• Published
• 38
EQ-VAE: Equivariance Regularized Latent Space for Improved Generative Image Modeling
Paper
• 2502.09509
• Published
• 8
LDGen: Enhancing Text-to-Image Synthesis via Large Language Model-Driven Language Representation
Paper
• 2502.18302
• Published
• 5
How far can we go with ImageNet for Text-to-Image generation?
Paper
• 2502.21318
• Published
• 26
RectifiedHR: Enable Efficient High-Resolution Image Generation via Energy Rectification
Paper
• 2503.02537
• Published
• 12
Inductive Moment Matching
Paper
• 2503.07565
• Published
• 6
Autoregressive Image Generation with Randomized Parallel Decoding
Paper
• 2503.10568
• Published
• 9
DiT-Air: Revisiting the Efficiency of Diffusion Model Architecture Design in Text to Image Generation
Paper
• 2503.10618
• Published
• 19
Neighboring Autoregressive Modeling for Efficient Visual Generation
Paper
• 2503.10696
• Published
• 8
Paper
• 2503.16425
• Published
• 16
Ultra-Resolution Adaptation with Ease
Paper
• 2503.16322
• Published
• 13
When Less is Enough: Adaptive Token Reduction for Efficient Image Representation
Paper
• 2503.16660
• Published
• 72
Equivariant Image Modeling
Paper
• 2503.18948
• Published
• 15
Diffusion-4K: Ultra-High-Resolution Image Synthesis with Latent Diffusion Models
Paper
• 2503.18352
• Published
• 6
TextCrafter: Accurately Rendering Multiple Texts in Complex Visual Scenes
Paper
• 2503.23461
• Published
• 94
HiFlow: Training-free High-Resolution Image Generation with Flow-Aligned Guidance
Paper
• 2504.06232
• Published
• 13
VisualCloze: A Universal Image Generation Framework via Visual In-Context Learning
Paper
• 2504.07960
• Published
• 50
PixelFlow: Pixel-Space Generative Models with Flow
Paper
• 2504.07963
• Published
• 18
GigaTok: Scaling Visual Tokenizers to 3 Billion Parameters for Autoregressive Image Generation
Paper
• 2504.08736
• Published
• 46
SimpleAR: Pushing the Frontier of Autoregressive Visual Generation through Pretraining, SFT, and RL
Paper
• 2504.11455
• Published
• 14
REPA-E: Unlocking VAE for End-to-End Tuning with Latent Diffusion Transformers
Paper
• 2504.10483
• Published
• 22
T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT
Paper
• 2505.00703
• Published
• 44
End-to-End Vision Tokenizer Tuning
Paper
• 2505.10562
• Published
• 22
STARFlow: Scaling Latent Normalizing Flows for High-resolution Image Synthesis
Paper
• 2506.06276
• Published
• 26
Improving Progressive Generation with Decomposable Flow Matching
Paper
• 2506.19839
• Published
• 8
Qwen-Image Technical Report
Paper
• 2508.02324
• Published
• 272
PixNerd: Pixel Neural Field Diffusion
Paper
• 2507.23268
• Published
• 52
DiT360: High-Fidelity Panoramic Image Generation via Hybrid Training
Paper
• 2510.11712
• Published
• 31
Generating an Image From 1,000 Words: Enhancing Text-to-Image With Structured Captions
Paper
• 2511.06876
• Published
• 28
FARMER: Flow AutoRegressive Transformer over Pixels
Paper
• 2510.23588
• Published
• 59
Kandinsky 5.0: A Family of Foundation Models for Image and Video Generation
Paper
• 2511.14993
• Published
• 231
DeCo: Frequency-Decoupled Pixel Diffusion for End-to-End Image Generation
Paper
• 2511.19365
• Published
• 64
Terminal Velocity Matching
Paper
• 2511.19797
• Published
• 12
OmniAlpha: A Sequence-to-Sequence Framework for Unified Multi-Task RGBA Generation
Paper
• 2511.20211
• Published
• 12
PixelGen: Pixel Diffusion Beats Latent Diffusion with Perceptual Loss
Paper
• 2602.02493
• Published
• 44
Image Generation with a Sphere Encoder
Paper
• 2602.15030
• Published
• 15