A Closer Look at Time Steps is Worthy of Triple Speed-Up for Diffusion Model Training


Training diffusion models is always a computation-intensive task. In this paper, we introduce a novel speed-up method for diffusion model training, called SpeeD, which is based on a closer look at time steps. Our key findings are: i) Time steps can be empirically divided into acceleration, deceleration, and convergence areas based on the process increment. ii) These time steps are imbalanced, with many concentrated in the convergence area. iii) The steps concentrated in the convergence area provide limited benefits for diffusion training. To address this, we design an asymmetric sampling strategy that reduces the frequency of steps from the convergence area while increasing the sampling probability for steps from the other areas. Additionally, we propose a weighting strategy that emphasizes the importance of time steps with rapidly changing process increments. As a plug-and-play and architecture-agnostic approach, SpeeD consistently achieves a 3× speed-up across various diffusion architectures, datasets, and tasks. Notably, thanks to its simple design, our approach adds minimal overhead while substantially reducing the cost of diffusion model training. Our research enables more researchers to train diffusion models at a lower cost.
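To make the idea concrete, here is a minimal sketch (not the authors' implementation) of what asymmetric time-step sampling plus step weighting could look like in a standard denoising training loop. The boundary `tau` between the acceleration/deceleration areas and the convergence area, the probability split `p_early`, and the `loss_weights` form are assumptions for illustration only; the paper derives its sampling and weighting from the process increment of the forward diffusion.

```python
import torch

def sample_timesteps(batch_size, num_steps=1000, tau=700, p_early=0.85, device="cpu"):
    """Draw time steps non-uniformly: steps t < tau (outside the convergence area)
    are sampled with total probability p_early; convergence-area steps share the rest.
    tau and p_early are hypothetical values, not taken from the paper."""
    probs = torch.full((num_steps,), (1.0 - p_early) / (num_steps - tau), device=device)
    probs[:tau] = p_early / tau
    return torch.multinomial(probs, batch_size, replacement=True)

def loss_weights(t, num_steps=1000, scale=1.0):
    """Toy weighting that up-weights earlier steps, standing in for the paper's
    emphasis on rapidly changing process increments; the actual weighting differs."""
    return 1.0 + scale * (1.0 - t.float() / num_steps)

# Usage inside a typical noise-prediction training step (sketch):
# t = sample_timesteps(x.shape[0], device=x.device)
# w = loss_weights(t)
# loss = (w * ((eps_pred - eps) ** 2).mean(dim=[1, 2, 3])).mean()
```

Because the strategy only changes how time steps are drawn and weighted, it slots into existing training loops without touching the model architecture, which is what makes it plug-and-play.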

https://arxiv.org/abs/2405.17403

https://arxiv.org/pdf/2405.17403.pdf
