This article provides an overview of methods for addressing class imbalance in deep learning-based natural language processing (NLP). In supervised NLP, manually labeled datasets typically have a highly skewed distribution of samples across classes, and this imbalance poses a challenge for NLP models, whose training is dominated by majority-class data, often at the expense of performance on rare classes. The article then reviews class imbalance methods for deep learning-based NLP, organized by method type.
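As a concrete illustration of the skew the article refers to, one very common remedy is to weight the loss for each class inversely to its frequency, so minority classes contribute more per sample. The helper below is a minimal sketch of that idea (the function name and toy labels are illustrative, not from the article):

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """Per-class weights inversely proportional to class frequency.

    Rare classes receive larger weights, so a weighted loss keeps the
    majority class from dominating training.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    num_classes = len(counts)
    # weight_c = total / (num_classes * count_c); majority classes get weights < 1
    return {c: total / (num_classes * n) for c, n in counts.items()}

# A skewed toy dataset: 8 "neutral" samples vs. 2 "toxic" samples
labels = ["neutral"] * 8 + ["toxic"] * 2
weights = inverse_frequency_weights(labels)
# The minority class "toxic" gets a weight 4x that of "neutral"
```

These weights can be passed directly to a weighted cross-entropy loss (e.g. the `weight` argument of PyTorch's `CrossEntropyLoss`) when training a classifier on skewed data.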
