Few-shot imbalanced classification based on data augmentation
Article by: Chao, Xuewei; Zhang, Lixin
Abstract: Few-shot imbalanced classification tasks are common in real-world applications because data distributions are unbalanced and rare classes have few samples. Traditional machine learning algorithms are known to perform poorly on imbalanced classification: they tend to ignore the few minority-class samples in order to achieve good overall accuracy. To address this few-shot problem, this study proposes a novel data augmentation method, H-SMOTE, which rebalances the original imbalanced data in a stable and reasonable way. Extensive experiments were carried out on 12 open datasets covering imbalance ratios from 3.8 to 16.4. Two typical classifiers, SVM and Random Forest, were selected to test the performance and generalization of the proposed H-SMOTE, and the typical oversampling algorithm SMOTE was adopted as the comparison baseline. Averaged over the experiments, H-SMOTE outperforms SMOTE in accuracy (by 2.58%), recall (0.67%), F-measure (2.33%), G-mean (2.58%), and AUC (2.5%). In addition, the distribution of the dataset augmented by H-SMOTE is more uniform and stable. This work thus provides a useful data augmentation method for few-shot imbalanced classification, which can also be generalized to many areas in multimedia systems.
Language:
English
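
Illustrative note: the abstract does not describe how H-SMOTE works, so the sketch below does not attempt it. It only illustrates three ideas the abstract relies on: an imbalance ratio (assumed here to mean majority count divided by minority count), baseline SMOTE-style oversampling by interpolating between a minority sample and one of its nearest minority neighbours, and the G-mean metric. The function names `imbalance_ratio`, `smote_like`, and `g_mean` are hypothetical and written with the standard library only.

```python
import math
import random


def imbalance_ratio(labels, minority=1):
    """Majority count divided by minority count (assumed definition)."""
    n_min = sum(1 for y in labels if y == minority)
    n_maj = len(labels) - n_min
    return n_maj / n_min


def smote_like(minority_points, n_new, k=3, seed=0):
    """Baseline SMOTE-style oversampling (not H-SMOTE).

    Each synthetic point lies on the segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    Assumes at least two minority points.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority_points)
        # k nearest minority neighbours of base, excluding base itself
        neighbours = sorted(
            (p for p in minority_points if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity.

    Assumes both classes occur in y_true.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return math.sqrt(sensitivity * specificity)


if __name__ == "__main__":
    labels = [0] * 38 + [1] * 10           # toy labels, not from the paper
    print(imbalance_ratio(labels))
    minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
    print(smote_like(minority, n_new=5, k=2))
    print(g_mean([1, 1, 0, 0], [1, 0, 0, 0]))
```

G-mean is useful in this setting because a classifier that ignores the minority class has zero sensitivity and therefore a G-mean of zero, even when its overall accuracy is high.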