Implementation of DDPM and DDIM Sampling
To implement DDPM (Denoising Diffusion Probabilistic Models), you first define a noise schedule that gradually adds Gaussian noise to data over a fixed number of steps (e.g., 1,000). A neural network (typically a U-Net) is trained to predict the noise at each step. During sampling, you start with pure noise and iteratively denoise it by applying the trained model. Each step uses the predicted noise to update the sample, following a Markov chain process (each step depends only on the previous state). In PyTorch, for example, the sampling loop iterates backward from step T to 0, computing the mean and variance of each reverse diffusion step from the model's predictions.
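The loop described above can be sketched as follows in NumPy. This is a minimal illustration, not a production implementation: `predict_noise` is a hypothetical placeholder for the trained U-Net, and the linear beta schedule values are the commonly used defaults, chosen here for illustration.

```python
import numpy as np

# Hypothetical stand-in for a trained noise-prediction network (e.g., a U-Net).
def predict_noise(x, t):
    return np.zeros_like(x)  # placeholder: a real model returns predicted noise

T = 1000
betas = np.linspace(1e-4, 0.02, T)   # linear noise schedule (illustrative values)
alphas = 1.0 - betas
alphas_bar = np.cumprod(alphas)      # cumulative product, i.e. alpha-bar_t

def ddpm_sample(shape, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(shape)              # start from pure Gaussian noise
    for t in range(T - 1, -1, -1):              # iterate backward from T-1 to 0
        eps = predict_noise(x, t)
        # Posterior mean of the reverse step, computed from the predicted noise.
        mean = (x - betas[t] / np.sqrt(1.0 - alphas_bar[t]) * eps) / np.sqrt(alphas[t])
        if t > 0:
            x = mean + np.sqrt(betas[t]) * rng.standard_normal(shape)  # add fresh noise
        else:
            x = mean                            # final step is noise-free
    return x

sample = ddpm_sample((4, 4))
print(sample.shape)
```

Note that fresh Gaussian noise is injected at every step except the last, which is what makes the DDPM reverse process stochastic.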
For DDIM (Denoising Diffusion Implicit Models), the setup is similar, but the sampling process is reworked to remove the Markovian assumption. DDIM defines a non-Markovian forward process, which permits deterministic sampling along a trajectory that skips intermediate steps. To implement this, you modify the update rule during sampling: instead of traversing the full Markov chain, you use a subset of steps (e.g., 50–100) and adjust the noise prediction with a deterministic correction term. For instance, you might sample every k steps from the original DDPM noise schedule and compute the update as a weighted combination of the current sample and the model's prediction, controlled by a hyperparameter η (set to 0 for fully deterministic behavior).
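A minimal NumPy sketch of this update rule, assuming the same linear schedule as before and the same hypothetical `predict_noise` placeholder for the trained model:

```python
import numpy as np

# Hypothetical stand-in for a trained noise-prediction network.
def predict_noise(x, t):
    return np.zeros_like(x)  # placeholder: a real model returns predicted noise

T = 1000
betas = np.linspace(1e-4, 0.02, T)      # same schedule the model was trained with
alphas_bar = np.cumprod(1.0 - betas)

def ddim_sample(shape, num_steps=50, eta=0.0, seed=0):
    rng = np.random.default_rng(seed)
    # Evenly spaced subsequence of the original timesteps, traversed high to low.
    timesteps = np.linspace(0, T - 1, num_steps, dtype=int)[::-1]
    x = rng.standard_normal(shape)
    for i, t in enumerate(timesteps):
        t_prev = timesteps[i + 1] if i + 1 < len(timesteps) else -1
        a_t = alphas_bar[t]
        a_prev = alphas_bar[t_prev] if t_prev >= 0 else 1.0
        eps = predict_noise(x, t)
        x0_pred = (x - np.sqrt(1.0 - a_t) * eps) / np.sqrt(a_t)  # predicted clean sample
        # eta = 0 gives a fully deterministic update; eta = 1 approaches DDPM's variance.
        sigma = eta * np.sqrt((1 - a_prev) / (1 - a_t) * (1 - a_t / a_prev))
        direction = np.sqrt(max(1.0 - a_prev - sigma**2, 0.0)) * eps
        x = np.sqrt(a_prev) * x0_pred + direction + sigma * rng.standard_normal(shape)
    return x

out = ddim_sample((4, 4), num_steps=50, eta=0.0)
print(out.shape)
```

With `eta=0.0` the loop adds no noise after the initial sample, so repeated calls with the same seed return identical outputs.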
Key Differences in Sampling
The primary difference lies in sampling speed and determinism. DDPM requires every step (e.g., all 1,000) to generate a sample, making it computationally intensive. DDIM decouples training from sampling, enabling faster generation with fewer steps (e.g., 50) while maintaining quality: it leverages the learned noise predictions to approximate intermediate states, reducing inference time by 10–20×. Additionally, DDIM allows deterministic sampling (with η=0), so the same noise input always produces the same output, which is useful for reproducibility. In contrast, DDPM's Markovian process is inherently stochastic, even with fixed initial noise.
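The step-skipping can be made concrete by subsampling the timestep schedule; the step counts below are illustrative:

```python
import numpy as np

T = 1000  # length of the full DDPM schedule
for num_steps in (1000, 100, 50):
    # Evenly spaced subsequence of the original timesteps, traversed high to low.
    ts = np.linspace(0, T - 1, num_steps, dtype=int)[::-1]
    print(f"{num_steps} network calls, timesteps {ts[0]} -> {ts[-1]}")
```

With 50 steps, the network is evaluated 20× fewer times than in the full 1,000-step DDPM loop, which is where the 10–20× wall-clock speedup comes from.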
When to Use Each Method
Use DDPM when you prioritize sample quality over speed, such as generating high-fidelity images for offline applications. Its step-by-step denoising aligns closely with the training objective, often producing more consistent results. For real-time applications (e.g., interactive tools), however, DDIM is preferable due to its faster sampling: generating a 256×256 image with DDIM in 50 steps might take 2 seconds on a GPU, whereas DDPM could require 20 seconds. Developers can also tune DDIM's η parameter to balance speed and sample diversity: higher η introduces stochasticity, mimicking DDPM's behavior with fewer steps. Both methods share the same training pipeline, so switching between them often requires only modifying the sampling loop.
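The effect of η on the per-step noise scale can be read directly off the schedule; the timestep indices below are arbitrary, chosen just to illustrate the trend:

```python
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)   # illustrative linear schedule
alphas_bar = np.cumprod(1.0 - betas)

def ddim_sigma(t, t_prev, eta):
    # Per-step noise scale in the DDIM update: exactly 0 when eta = 0
    # (deterministic), growing with eta toward DDPM-like stochasticity.
    a_t, a_prev = alphas_bar[t], alphas_bar[t_prev]
    return eta * np.sqrt((1 - a_prev) / (1 - a_t) * (1 - a_t / a_prev))

for eta in (0.0, 0.5, 1.0):
    print(f"eta={eta}: sigma={ddim_sigma(500, 480, eta):.4f}")
```

Since σ scales linearly with η, sweeping η between 0 and 1 smoothly interpolates between deterministic DDIM sampling and DDPM-like stochastic behavior.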