
Speech, Music and Hearing (TMH)

Research at the Division of Speech, Music and Hearing (TMH) is multidisciplinary, spanning linguistics, phonetics, auditory perception, vision and experimental psychology. Rooted in an engineering modelling approach, our research forms a solid base for developing multimodal human-computer interaction systems in which speech, music, sound and gestures combine to create human-like communication.

Research Area

Latest Publications

Wennberg, U., Henter, G. E. (2024). Exploring Internal Numeracy in Language Models: A Case Study on ALBERT. In MathNLP 2024: 2nd Workshop on Mathematical Natural Language Processing at LREC-COLING 2024 - Workshop Proceedings. (pp. 35-40). European Language Resources Association (ELRA).
Esfandiari-Baiat, G., Edlund, J. (2024). The MEET Corpus: Collocated, Distant and Hybrid Three-party Meetings with a Ranking Task. In ISA 2024: 20th Joint ACL - ISO Workshop on Interoperable Semantic Annotation at LREC-COLING 2024, Workshop Proceedings. (pp. 1-7). European Language Resources Association (ELRA).
Müller, M., Dixon, S., Volk, A., Sturm, B., Rao, P., Gotham, M. (2024). Introducing the TISMIR Education Track: What, Why, How? Transactions of the International Society for Music Information Retrieval, 7(1), 85-98.
Casini, L., Jonason, N., Sturm, B. (2024). Investigating the Viability of Masked Language Modeling for Symbolic Music Generation in abc-notation. In Artificial Intelligence in Music, Sound, Art and Design, EvoMUSART 2024. (pp. 84-96). Springer Nature.
Dalmazzo, D., Deguernel, K., Sturm, B. (2024). The Chordinator: Modeling Music Harmony by Implementing Transformer Networks and Token Strategies. In Artificial Intelligence in Music, Sound, Art and Design, EvoMUSART 2024. (pp. 52-66). Springer Nature.
Full list in the KTH publications portal