Speech, Music and Hearing (TMH)

Research at the Division of Speech, Music and Hearing (TMH) is multidisciplinary, spanning linguistics, phonetics, auditory perception, vision and experimental psychology. Rooted in an engineering modelling approach, our research forms a solid base for developing multimodal human-computer interaction systems in which speech, music, sound and gestures combine to create human-like communication.

Research Area

The division is part of the Department of Intelligent Systems at the School of Electrical Engineering and Computer Science.

Conversational Systems

Human Speech and Communication

Music Informatics and Auditory Perception

Speech and Language Technologies

Social Robotics

Voice Science and Technical Vocology

Latest Publications

[1] Malmberg, F., Klezovich, A., Mesch, J., Beskow, J. (2024). Exploring Latent Sign Language Representations with Isolated Signs, Sentences and In-the-Wild Data. In 11th Workshop on the Representation and Processing of Sign Languages: Evaluation of Sign Language Resources, sign-lang@LREC-COLING 2024 (pp. 219-224). Association for Computational Linguistics (ACL).

[2] Mehta, S., Tu, R., Beskow, J., Székely, É., Henter, G. E. (2024). Matcha-TTS: A Fast TTS Architecture with Conditional Flow Matching. In 2024 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2024 - Proceedings (pp. 11341-11345). Institute of Electrical and Electronics Engineers (IEEE).

[3] Amerotti, M., Sturm, B., Benford, S., Maruri-Aguilar, H., Vear, C. (2024). Evaluation of an Interactive Music Performance System in the Context of Irish Traditional Dance Music. In Proceedings New Interfaces for Musical Expression NIME’24.

[4] Jonason, N., Wang, X., Cooper, E., Juvela, L., Sturm, B., Yamagishi, J. (2024). DDSP-based Neural Waveform Synthesis of Polyphonic Guitar Performance from String-wise MIDI Input. In Proceedings of the 27th International Conference on Digital Audio Effects (DAFx24).

[5] Tånnander, C., O'Regan, J., House, D., Edlund, J., Beskow, J. (2024). Prosodic characteristics of English-accented Swedish neural TTS. In Proceedings of Speech Prosody 2024 (pp. 1035-1039). Leiden, The Netherlands: International Speech Communication Association.

Full list in the KTH publication portal

At TMH we regularly hold seminars given by talented researchers. Check our calendar for upcoming seminars, or register as a speaker.

TMH Seminar Speaker Registration

Events

No upcoming calendar events at the moment.

https://www.kth.se/is/tmh/calendar

News

Erik Ekstedt and Gabriel Skantze from the Division of Speech, Music and Hearing

How to predict a conversation
26 Sep 2022

The SIGDIAL best paper award went to Erik Ekstedt and Gabriel Skantze from Speech, Music and Hearing (TMH). Their model can predict what will happen during the next two seconds of a conversation...
Faster iteration and a more personal voice for digital assistants
28 Jan 2022

Shivam Mehta, doctoral student at the Division of Speech, Music and Hearing, congratulations on winning best poster at the EECS winter conference. Tell us a little about your research.
ICMI 2021 Best Paper Award Nomination!
21 Oct 2021
TMH gets Jury award at IVA Gala 2021
7 Oct 2021
TMH awarded best paper award at HRI 2021
7 Oct 2021
