Condition Errors Refinement in Autoregressive Image Generation with Diffusion Loss
Abstract
Autoregressive models with diffusion loss outperform traditional diffusion models by effectively mitigating condition errors through patch denoising optimization and condition refinement using Optimal Transport theory.
Recent studies have explored autoregressive models for image generation with promising results, and have combined diffusion models with autoregressive frameworks to optimize image generation via diffusion losses. In this study, we present a theoretical comparison of conditional diffusion models and autoregressive models with diffusion loss, highlighting the latter's advantages. We demonstrate that patch denoising optimization in autoregressive models effectively mitigates condition errors and leads to a stable condition distribution. Our analysis also reveals that autoregressive condition generation refines the condition, causing the influence of condition errors to decay exponentially. In addition, we introduce a novel condition refinement approach based on Optimal Transport (OT) theory to address "condition inconsistency". We theoretically demonstrate that formulating condition refinement as a Wasserstein Gradient Flow ensures convergence toward the ideal condition distribution, effectively mitigating condition inconsistency. Experiments demonstrate the superiority of our method over both diffusion models and autoregressive models with diffusion loss.
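To make the Wasserstein Gradient Flow idea more concrete, the following is a minimal toy sketch and not the paper's method. It assumes a one-dimensional setting with equally weighted particles, where the optimal transport plan simply matches sorted samples, and it follows the flow of the functional 0.5 * W2^2(current, target) with explicit Euler steps. The function names (`w2_distance_1d`, `wgf_step`, `run_flow`), the Gaussian "current" and "target" samples, and the step size are all hypothetical choices made for illustration; the abstract does not specify how the condition distributions are represented.

```python
import math
import random


def w2_distance_1d(xs, ys):
    """2-Wasserstein distance between two equal-size 1D empirical distributions.

    In 1D the optimal coupling pairs the i-th smallest point of one sample
    with the i-th smallest point of the other.
    """
    xs_sorted, ys_sorted = sorted(xs), sorted(ys)
    mean_sq = sum((x - y) ** 2 for x, y in zip(xs_sorted, ys_sorted)) / len(xs_sorted)
    return math.sqrt(mean_sq)


def wgf_step(xs, ys, step_size):
    """One explicit Euler step of the flow of 0.5 * W2^2(., target).

    Each particle moves toward its optimally matched target point.
    """
    xs_sorted, ys_sorted = sorted(xs), sorted(ys)
    return [x - step_size * (x - y) for x, y in zip(xs_sorted, ys_sorted)]


def run_flow(num_particles=256, num_steps=20, step_size=0.2, seed=0):
    """Run the toy flow and return the W2 distance before and after each step."""
    rng = random.Random(seed)
    # Hypothetical stand-ins for a "current" and an "ideal" distribution.
    current = [rng.gauss(3.0, 2.0) for _ in range(num_particles)]
    target = [rng.gauss(0.0, 1.0) for _ in range(num_particles)]
    distances = [w2_distance_1d(current, target)]
    for _ in range(num_steps):
        current = wgf_step(current, target, step_size)
        distances.append(w2_distance_1d(current, target))
    return distances


distances = run_flow()
```

In this toy setting each step shrinks the W2 distance by the factor `1 - step_size`, so the particles converge geometrically to the target. This only illustrates the general notion of convergence under a Wasserstein gradient flow; it makes no claim about the rates or objectives analyzed in the paper.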
Community
This study presents a theoretical analysis of autoregressive image generation with diffusion loss, demonstrating that patch denoising optimization effectively mitigates condition errors and leads to a stable condition distribution. To further address condition inconsistency, we introduce a novel condition refinement approach based on Optimal Transport theory, which outperforms existing diffusion and autoregressive baselines in experiments.