Speech, Music and Hearing (TMH)

Research at the Division of Speech, Music and Hearing (TMH) is truly multidisciplinary, spanning linguistics, phonetics, auditory perception, vision and experimental psychology. Rooted in an engineering modelling approach, our research forms a solid base for developing multimodal human-computer interaction systems in which speech, music, sound and gestures combine to create human-like communication.

Research Area

The division is part of the Department of Intelligent Systems at the School of Electrical Engineering and Computer Science.

Conversational Systems

Human Speech and Communication

Music Informatics and Auditory Perception

Speech and Language Technologies

Social Robotics

Voice Science and Technical Vocology

Latest Publications

[1]

Moëll, B. & Sand Aronsson, F. (2025). Harm Reduction Strategies for Thoughtful Use of Large Language Models in the Medical Domain: Perspectives for Patients and Clinicians. Journal of Medical Internet Research, 27.

[2]

Mehta, S. (2025). Probabilistic Speech & Motion Synthesis: Towards More Expressive and Multimodal Generative Models (Doctoral thesis, KTH Royal Institute of Technology, TRITA-EECS-AVL 2025:76). Retrieved from https://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-368342.

[3]

Mehta, S., Gamper, H. & Jojic, N. (2025). Make Some Noise: Towards LLM audio reasoning and generation using sound tokens. In ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 1-5). Institute of Electrical and Electronics Engineers (IEEE).

[4]

Best, P., Araya-Salas, M., Ekström, A. G., Freitas, B., Jensen, F. H., Kershenbaum, A. ... Marxer, R. (2025). Bioacoustic fundamental frequency estimation: a cross-species dataset and deep learning baseline. Bioacoustics, 34(4), 419-446.

[5]

Cros Vila, L., Sturm, B., Casini, L. & Dalmazzo, D. (2025). The AI Music Arms Race: On the Detection of AI-Generated Music. Transactions of the International Society for Music Information Retrieval, 8(1), 179-194.

Full list in the KTH publications portal

At TMH we regularly host seminars given by talented researchers. You can check our calendar for upcoming seminars or register as a speaker.

TMH Seminar Speaker Registration

Events

Probabilistic Speech & Motion Synthesis

12 Sep

Public defences of doctoral theses

Friday 2025-09-12, 13:07

Location: Kollegiesalen, Brinellvägen 8, Stockholm

Doctoral student: Shivam Mehta, Speech, Music and Hearing (TMH)


https://www.kth.se/is/tmh/calendar

News

Can robots truly engage in meaningful conversations with humans?
4 Apr 2025

Researchers at KTH Royal Institute of Technology are breaking new ground in Human-Robot Interaction. Dr. Bahar Irfan, Assist. Prof. Sanna Kuoppamäki, and Prof. Gabriel Skantze's latest studies tackle ...

How to predict a conversation
26 Sep 2022

The SIGDIAL best paper award went to Erik Ekstedt and Gabriel Skantze from Speech, Music and Hearing (TMH). Their model learns to predict what will happen in the next two seconds of the conversation. ...

Research on generating a faster iteration and a more personal voice for digital assistants
24 Jan 2022

Congratulations to Shivam Mehta, doctoral student at the Division of Speech, Music and Hearing, on winning the poster exhibition at the EECS Winter Conference.

ICMI 2021 Best Paper Award Nomination!
21 Oct 2021

TMH gets Jury award at IVA Gala 2021
7 Oct 2021
