Spectrograms and Beyond!

It is a pleasure to start the year (even though we are already in February) by celebrating the publication of a new article from our laboratory: “Beyond Spectrograms: Rethinking Audio Classification from EnCodec’s Latent Space”. In this work, we ask: what if the latent representation of a neural audio codec were used in an audio classification pipeline? What we found may surprise you. Take a look at the paper and let us know your thoughts.

This paper is one of the results of the thesis being developed by Jorge Perianez Pascual, a member of our lab. Interestingly, the concept of using EnCodec’s latent space for classification was initially described as borderline unconventional, yet our results demonstrate its effectiveness. By stepping beyond traditional spectrogram-based methods, we uncover new possibilities for efficient and accurate audio classification.

This achievement would not have been possible without the collaboration of Álvaro Rubio Largo and Laura Escobar Encinas, two colleagues from the Universidad de Extremadura who contributed their multidisciplinary expertise both during the development and experimentation process and in the writing of the article.

Beyond Spectrograms is one of the results of musicgenia, a project funded by Grant CPP2021-008491 from MICIU/AEI/10.13039/501100011033 and by the European Union through NextGenerationEU/PRTR. The main goal of musicgenia is to develop a cloud-based platform that offers AI-generated production music as a service for content creators and media, both online (live music generation) and offline (pre-recorded music generation). The direct benefits of this platform include: (1) royalty-free music, (2) original music, (3) an easy way to find music that fits your content, and (4) streaming music, with a flexible consumption model where you pay per second rather than per song.

Juan D. Gutiérrez
Assistant Professor

Assistant Professor at Universidade de Santiago de Compostela. I enjoy computing but, above all, learning new things.