Diffusion Model Based on Reverse Guidance of Regional Samples
Abstract
In imbalanced datasets, the scarcity of minority class samples biases traditional classification methods toward the majority class and reduces minority class recognition. This article takes a data-level approach: it expands the minority class with a generative model to improve classification accuracy and reduce misclassification cost. Motivated by the complex distribution of the minority class and the strengths of diffusion models, this article proposes the Local Regional Samples Guidance Denoising Diffusion Probabilistic Model (LReDDPM). The method first categorizes the minority class samples by type, takes the gradient information of the regional samples as the condition, and then uses a denoising diffusion probabilistic model to generate minority class examples. The generated examples are added to the training set, expanding the sample size and enriching the local sample density of the minority class. In addition, we explore diffusion models guided by gradients derived from samples in different regions. The experimental results show that examples generated under guidance from different regions improve classification performance to varying degrees, with the largest improvement observed for the safety and boundary regions. This further indicates that the complex distribution of the minority class plays a crucial role in the classification results. We conduct experiments on ten datasets and compare LReDDPM with five other methods to evaluate its effectiveness. The results show that the proposed method significantly improves classification performance.
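The abstract does not say how minority samples are assigned to regions, so the following is only an illustrative sketch of one common heuristic, not the authors' procedure. It labels each minority sample by the fraction of its k nearest neighbours that are also minority samples. The function name `label_minority_regions`, the region labels, and the thresholds 0.7 and 0.3 are all assumptions made for this example.

```python
import numpy as np


def label_minority_regions(X, y, minority_label=1, k=5):
    """Label each minority sample by the share of minority points among its k neighbours.

    Hypothetical helper: the thresholds below are an assumed convention,
    not values taken from the article.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    regions = {}
    for i in np.flatnonzero(y == minority_label):
        dist = np.linalg.norm(X - X[i], axis=1)  # Euclidean distance to every sample
        dist[i] = np.inf                         # exclude the sample itself
        neighbours = np.argsort(dist)[:k]        # indices of the k nearest samples
        share = np.mean(y[neighbours] == minority_label)
        if share >= 0.7:
            regions[int(i)] = "safe"
        elif share >= 0.3:
            regions[int(i)] = "boundary"
        elif share > 0.0:
            regions[int(i)] = "rare"
        else:
            regions[int(i)] = "outlier"
    return regions


if __name__ == "__main__":
    # Toy data: a tight minority cluster, a majority grid, and one stray minority point.
    minority = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (0.5, 0)]
    majority = [(10 + i, 10 + j) for i in range(3) for j in range(3)]
    X = np.array(minority + majority + [(11.1, 11.2)])
    y = np.array([1] * 6 + [0] * 9 + [1])
    print(label_minority_regions(X, y))
```

In a pipeline of the kind the abstract describes, labels like these could be used to choose which regional samples supply the guidance signal for the diffusion model. That guidance step is not shown here, because the abstract gives no details of it.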