Google will present over two dozen research papers at the 24th edition of the Conference of the International Speech Communication Association (INTERSPEECH 2023). Two of them are DeePMOS, a deep neural network approach for estimating speech signal quality, and LanSER, a method for enhancing Speech Emotion Recognition (SER) models. DeePMOS outputs a distribution over the mean opinion score (MOS), described by its average and spread, while LanSER leverages language models to capture the contextual information of an utterance. Both papers report performance comparable to existing methods, with fewer parameters and lower memory usage.
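
To make the "average and spread" of opinion scores concrete, here is a minimal sketch that summarizes a set of listener ratings for one utterance by their mean (the MOS) and sample standard deviation. It is not the DeePMOS model, which predicts these quantities with a neural network; the function name `summarize_ratings` and the ratings themselves are hypothetical and chosen only for illustration.

```python
import statistics


def summarize_ratings(ratings):
    """Return (mean, sample standard deviation) of listener ratings."""
    if len(ratings) < 2:
        raise ValueError("need at least two ratings to estimate spread")
    return statistics.mean(ratings), statistics.stdev(ratings)


# Hypothetical 1-5 quality ratings from five listeners for one utterance.
ratings = [4, 5, 3, 4, 4]
mos, spread = summarize_ratings(ratings)
print(f"MOS = {mos:.2f}, spread = {spread:.2f}")  # MOS = 4.00, spread = 0.71
```

A larger spread indicates that listeners disagreed more about the quality of the utterance, which a single averaged score would hide.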
