Diffusion models handle label imbalance in conditional settings by adjusting training strategies, modifying loss functions, and leveraging sampling techniques. In conditional generation, the model uses labels to guide the denoising process, but imbalanced data can lead to poor performance on underrepresented classes. To mitigate this, diffusion models often incorporate methods like weighted loss functions, data augmentation, and adjustments to the guidance scale during sampling. These approaches ensure that minority classes receive sufficient attention during both training and inference, even when their examples are scarce in the dataset.
One common strategy is to reweight the loss function to prioritize underrepresented labels. For example, if a dataset has 1,000 “cat” images but only 100 “dog” images, the training loss for “dog” examples can be multiplied by a factor (e.g., 10x) to amplify their impact on gradient updates. This forces the model to focus more on learning the patterns of the minority class. Additionally, classifier-free guidance—a technique where the model learns to generate samples with and without explicit labels—can be adjusted during inference. By increasing the guidance scale for underrepresented classes, the model emphasizes the conditional signal for those labels, improving their generation quality. For instance, generating rare medical conditions in X-rays might require a higher guidance scale to ensure the model respects the specific features of those cases.
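The two ideas above, inverse-frequency loss weighting and classifier-free guidance, can be sketched in a few lines. This is a minimal NumPy illustration, not any particular library's API: `class_weights`, `weighted_denoising_loss`, and `cfg_combine` are hypothetical helper names, and the per-example loss is the usual MSE on predicted noise.

```python
import numpy as np

def class_weights(label_counts):
    """Inverse-frequency weights, normalized so the most common class gets weight 1."""
    counts = np.asarray(label_counts, dtype=float)
    return counts.max() / counts

def weighted_denoising_loss(pred_noise, true_noise, labels, weights):
    """Per-example MSE on the predicted noise, scaled by each example's class weight."""
    per_example = ((pred_noise - true_noise) ** 2).mean(axis=1)
    return float((weights[labels] * per_example).mean())

def cfg_combine(eps_uncond, eps_cond, guidance_scale):
    """Classifier-free guidance: push the noise prediction toward the conditional signal.

    A larger guidance_scale emphasizes the label, which can help rare classes."""
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)

# 1,000 "cat" (class 0) images vs. 100 "dog" (class 1) images:
# "dog" examples get a 10x weight, matching the reweighting described above.
w = class_weights([1000, 100])  # -> array([ 1., 10.])
```

At inference time, the same model can be run with a larger `guidance_scale` only for underrepresented labels, leaving common classes at a default value.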
Architectural adjustments and data augmentation also play a role. Some implementations oversample minority classes during training or apply domain-specific augmentations (e.g., rotating or cropping rare images) to artificially increase their effective count. In medical imaging, where a rare disease might appear in only 2% of scans, the model could use separate embedding layers for rare and common classes to better capture their distinct features. Another approach involves fine-tuning the model on a balanced subset of data after initial training on the full imbalanced dataset. For example, a diffusion model trained on a biased face dataset might later be fine-tuned on equal numbers of underrepresented demographics to reduce bias in generated images. These methods collectively help diffusion models address label imbalance without requiring fundamental changes to their core architecture.
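The oversampling idea can be made concrete by drawing training indices with probability inversely proportional to class frequency, so each class contributes roughly equally per batch (the same effect PyTorch's `WeightedRandomSampler` provides). A minimal NumPy sketch, with `oversample_indices` as a hypothetical helper name:

```python
import numpy as np

def oversample_indices(labels, n_draws, seed=0):
    """Sample training indices with replacement, weighting each example by the
    inverse of its class frequency so minority classes are drawn as often as
    majority classes."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    classes, counts = np.unique(labels, return_counts=True)
    freq = dict(zip(classes.tolist(), counts.tolist()))
    p = np.array([1.0 / freq[l] for l in labels.tolist()])
    p /= p.sum()  # normalize to a probability distribution over examples
    return rng.choice(len(labels), size=n_draws, p=p)

# A 98%/2% split (like the rare-disease example above): after oversampling,
# each class accounts for about half of the drawn training examples.
labels = [0] * 98 + [1] * 2
idx = oversample_indices(labels, n_draws=10_000)
```

The same weighting could instead drive a balanced fine-tuning pass: train on the full imbalanced dataset first, then continue training on indices drawn this way.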