Audio classification has advanced with the adoption of deep learning models, particularly transformer-based architectures. These models offer improved performance and can handle a variety of tasks with a single, unified architecture. However, their computational cost remains a challenge: the cost of self-attention grows quadratically with input sequence length, which is expensive for long audio inputs. This has motivated the exploration of alternatives such as state space models, which scale more favorably with sequence length.
