Speech, Music and Hearing (TMH)

Research at the Division of Speech, Music and Hearing (TMH) is multidisciplinary, spanning linguistics, phonetics, auditory perception, vision and experimental psychology. Rooted in an engineering modelling approach, our research forms a solid base for developing multimodal human-computer interaction systems in which speech, music, sound and gestures combine to create human-like communication.

Research Area

Latest Publications

[1]
Grouwels, J., Jonason, N., Sturm, B. (2025). Exploring the Expressive Space of an Articulatory Vocal Modal using Quality-Diversity Optimization with Multimodal Embeddings. In GECCO 2025 - Proceedings of the 2025 Genetic and Evolutionary Computation Conference. (pp. 1362-1370). Association for Computing Machinery (ACM).
[2]
Cavalcanti, J. C., Skantze, G. (2025). "Dyadosyncrasy", Idiosyncrasy and Demographic Factors in Turn-Taking. In Proceedings of the Interspeech 2025. Rotterdam, The Netherlands: ISCA.
[4]
Mehta, S. (2025). Probabilistic Speech & Motion Synthesis: Towards More Expressive and Multimodal Generative Models (Doctoral thesis, KTH Royal Institute of Technology, TRITA-EECS-AVL 2025:76). Retrieved from https://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-368342.
[5]
Mehta, S., Gamper, H., Jojic, N. (2025). Make Some Noise: Towards LLM audio reasoning and generation using sound tokens. In ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). (pp. 1-5). Institute of Electrical and Electronics Engineers (IEEE).
Full list in the KTH publication portal
