TMH Publications (latest 50)
Below are the 50 latest publications from the Department of Speech, Music and Hearing.
[1]
Pandey, A., Edlund, J., Le Maguer, S. & Harte, N. (2026).
The use of variable length stimuli for assessing segmental distortion in TTS evaluation.
Computer speech & language (Print), 97.
[2]
Bokkahalli Satish, S. H., Henter, G. E., Székely, É. (2026).
When Voice Matters : Evidence of Gender Disparity in Positional Bias of SpeechLLMs.
In Speech and Computer - 27th International Conference, SPECOM 2025, Proceedings. (pp. 25-38). Springer Nature.
[3]
Amerotti, M., Benford, S., Sturm, B. L.T., Vear, C. (2026).
A Live Performance Rule System Informed by Irish Traditional Dance Music.
In Music and Sound Generation in the AI Era - 16th International Symposium, CMMR 2023, Revised Selected Papers. (pp. 127-139). Springer Nature.
[4]
Vaddadi, B., Axelsson, A., Skantze, G. (2026).
The Role of Social Robots in Autonomous Public Transport.
In Transport Transitions: Advancing Sustainable and Inclusive Mobility: Proceedings of the 10th TRA Conference, 2024, Dublin, Ireland - Volume 1: Safe and Equitable Transport. (pp. 711-716). Springer Nature.
[5]
Gurstad-Nilsson, H., Kanhov, E., Bryngelsson, P., Niklasson, M. & Degerman, P. (2025).
Going forward by moving backwards : a perpetual dialectic movement.
In Patrick Hopkinson; Mats Niklasson (Ed.), Discovery of International Digital Collaborative Autoethnographical Psychobiography: Knowing You Knowing Me (pp. 63-92). Emerald.
[6]
Andin, J., Ellis, R., Ingo, E. & Nordqvist, P. (2025).
Effects of remote work and hearing loss status on well-being and communication in individuals with hearing loss before, during, and after the COVID-19 pandemic : a retrospective survey study.
International Journal of Audiology.
[7]
Borg, A., Schiött, J., Ivegren, W., Gentline, C., Huss, V., Hugelius, A. ... Parodis, I. (2025).
AI-Enhanced Social Robotic Versus Computer-Based Virtual Patients for Clinical Reasoning Training in Medical Education : Observational Crossover Cohort Study.
Journal of Medical Internet Research, 27.
[8]
Borg, A., Jobs, B., Huss, V., Gentline, C., Espinosa, F., Ruiz, M. ... Parodis, I. (2025).
A qualitative comparison of clinical reasoning training : LLM-powered social robotic versus computer-based virtual patients for undergraduate medical education in rheumatology.
Scandinavian Journal of Rheumatology, 54(Suppl. 132), 302-302.
[9]
Willemsen, B., Skantze, G. (2025).
Detecting Referring Expressions in Visually Grounded Dialogue with Autoregressive Language Models.
Presented at XLLM @ ACL 2025, The 1st Joint Workshop on Large Language Models and Structure Modeling, Vienna, Austria, Aug 1st, 2025.
[10]
Stinkeste, C., Skantze, G. (2025).
Linguistic Anthropomorphism in Chatbots : Effects of Style, Topic, and Interaction on Users’ Perceptions and Behaviors.
In HAI '25: Proceedings of the 13th International Conference on Human-Agent Interaction. (pp. 321-331). Association for Computing Machinery (ACM).
[11]
Stinkeste, C., Wikström Kempe, A., Skantze, G. (2025).
Manners Matter : How Robot Politeness Influences Human Risk-Taking and Social Perception.
In Proceedings 34th IEEE International Conference on Robot and Human Interactive Communication, RO-MAN 2025. (pp. 1025-1032). Institute of Electrical and Electronics Engineers (IEEE).
[12]
Stinkeste, C., Dreber, A., Olofsson, J. & Skantze, G. (2025).
Comparing the audience effect of anthropomorphic robots and humans in economic games.
Computers in Human Behavior: Artificial Humans, 6.
[13]
Ashkenazi, S., Srour-Zreik, R., Skantze, G., Stuart-Smith, J., Foster, M. E. (2025).
Participatory Design for Human-Robot Interaction with Syrian Refugees and Asylum Seekers.
In Social Robotics + AI 17th International Conference, ICSR+AI 2025, Naples, Italy, September 10–12, 2025, Proceedings, Part II. (pp. 92-105). Springer Nature.
[14]
Irfan, B., Miniota, J., Thunberg, S., Lagerstedt, E., Kuoppamäki, S., Skantze, G. & Abelho Pereira, A. T. (2025).
Human-Robot Interaction Conversational User Enjoyment Scale (HRI CUES).
IEEE Transactions on Affective Computing.
[15]
Kontogiorgos, D. & Schlangen, D. (2025).
Beyond speech : leveraging mouse movements for information adaptation in voice interfaces.
Frontiers in Computer Science, 7.
[17]
Francis, J., Gustafsson, J., Székely, É. (2025).
From Static to Dynamic : Enhancing AAC with Generative Imagery and Zero-Shot TTS.
In Interspeech 2025. (pp. 4960-4962). International Speech Communication Association.
[18]
Bokkahalli Satish, S. H., Henter, G. E., Székely, É. (2025).
Hear Me Out : Interactive evaluation and bias discovery platform for speech-to-speech conversational AI.
In Interspeech 2025. (pp. 2151-2152). International Speech Communication Association.
[19]
Netzorg, R., Carvalho, N., Guzman, A., Wang, L., Francis, J., Garoute, K. V., Johnson, K., Anumanchipalli, G. K. (2025).
On the Production and Perception of a Single Speaker's Gender.
In Interspeech 2025. (pp. 669-673). International Speech Communication Association.
[20]
Park, M., Ontakhrai, S., Kittimathaveenan, K., Alfredsson, J., Ternström, S. (2025).
How to make closed-back headphones transparent for a vocalist’s own direct sound.
Presented at AES 159th Convention 2025 October 23–25, Long Beach, CA, USA. (p. 8). Audio Engineering Society, Inc.
[21]
Tånnander, C., House, D., Beskow, J., Edlund, J. (2025).
Intrasentential English in Swedish TTS : perceived English-accentedness.
In Interspeech 2025. (pp. 1638-1642). International Speech Communication Association.
[22]
Malisz, Z., Foremski, J., Kul, M. (2025).
Contextual predictability effects on acoustic distinctiveness in read Polish speech.
In Interspeech 2025. (pp. 335-339). International Speech Communication Association.
[23]
Thulinsson, F., Söderlund, N., Rafiei, S., Schenkman, B., Djupsjöbacka, A., Andrén, B., Brunnström, K. (2025).
Impact of Camera height and Field-of-View on distance judgement and gap selection in digital rear-view mirrors in vehicles.
In IS and T International Symposium on Electronic Imaging Science and Technology. Society for Imaging Science & Technology.
[24]
Ekström, A. G., Gärdenfors, P., Snyder, W. D., Friedrichs, D., McCarthy, R. C., Tsapos, M. ... Moran, S. (2025).
Correlates of Vocal Tract Evolution in Late Pliocene and Pleistocene Hominins.
Human Nature, 36(1), 22-69.
[25]
Moëll, B. (2025).
Evaluation of Artificial Intelligence in the Medical Domain : Speech, Language and Applications
(Doctoral thesis, KTH Royal Institute of Technology, Stockholm, TRITA-EECS-AVL 2025:83). Retrieved from https://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-371738.
[26]
Moell, B. & Sand Aronsson, F. (2025).
Automatic Evaluation of the Pataka Test Using Machine Learning and Audio Signal Processing.
Acta Logopaedica, 2.
[27]
Moell, B., Aronsson, F. S. & Akbar, S. (2025).
Medical reasoning in LLMs : an in-depth analysis of DeepSeek R1.
Frontiers in Artificial Intelligence, 8.
[28]
Leite, I., Ahlberg, W., Pereira, A., Sestini, A., Gisslen, L., Tollmar, K. (2025).
A Call for Deeper Collaboration Between Robotics and Game Development.
In Proceedings of the IEEE 2025 Conference on Games, CoG 2025. Institute of Electrical and Electronics Engineers (IEEE).
[29]
Jacka, R., Peña, P. R., Leonard, S. J., Székely, É., Cowan, B. R. (2025).
Impact Of Disfluent Speech Agent On Partner Models And Perspective Taking.
In CUI 2025 - Proceedings of the 2025 ACM Conference on Conversational User Interfaces. Association for Computing Machinery (ACM).
[30]
Moëll, B. & Sand Aronsson, F. (2025).
Journaling with large language models : a novel UX paradigm for AI-driven personal health management.
Frontiers in Artificial Intelligence, 8.
[31]
Grouwels, J., Jonason, N., Sturm, B. (2025).
Exploring the Expressive Space of an Articulatory Vocal Model using Quality-Diversity Optimization with Multimodal Embeddings.
In GECCO 2025 - Proceedings of the 2025 Genetic and Evolutionary Computation Conference. (pp. 1362-1370). Association for Computing Machinery (ACM).
[32]
Cavalcanti, J. C., Skantze, G. (2025).
"Dyadosyncrasy", Idiosyncrasy and Demographic Factors in Turn-Taking.
In Proceedings of the Interspeech 2025. Rotterdam, The Netherlands: International Speech Communication Association.
[33]
Moëll, B. & Sand Aronsson, F. (2025).
Harm Reduction Strategies for Thoughtful Use of Large Language Models in the Medical Domain : Perspectives for Patients and Clinicians.
Journal of Medical Internet Research, 27.
[34]
Mehta, S., Gamper, H., Jojic, N. (2025).
Make Some Noise : Towards LLM audio reasoning and generation using sound tokens.
In ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). (pp. 1-5). Institute of Electrical and Electronics Engineers (IEEE).
[35]
Best, P., Araya-Salas, M., Ekström, A. G., Freitas, B., Jensen, F. H., Kershenbaum, A. ... Marxer, R. (2025).
Bioacoustic fundamental frequency estimation : a cross-species dataset and deep learning baseline.
Bioacoustics, 34(4), 419-446.
[36]
Torubarova, E. (2025).
Brain-Focused Multimodal Approach for Studying Conversational Engagement in HRI.
In HRI 2025 - Proceedings of the 2025 ACM/IEEE International Conference on Human-Robot Interaction. (pp. 1894-1896). Institute of Electrical and Electronics Engineers (IEEE).
[37]
Torubarova, E., Arvidsson, C., Berrebi, J., Uddén, J., Abelho Pereira, A. T. (2025).
NeuroEngage: A Multimodal Dataset Integrating fMRI for Analyzing Conversational Engagement in Human-Human and Human-Robot Interactions.
In HRI 2025 - Proceedings of the 2025 ACM/IEEE International Conference on Human-Robot Interaction. (pp. 849-858). Institute of Electrical and Electronics Engineers (IEEE).
[38]
Irfan, B., Churamani, N., Zhao, M., Ayub, A., Rossi, S. (2025).
Lifelong Learning and Personalization in Long-Term Human-Robot Interaction (LEAP-HRI) : Overcoming Inequalities with Adaptation.
In HRI 2025 - Proceedings of the 2025 ACM/IEEE International Conference on Human-Robot Interaction. (pp. 1970-1972). Institute of Electrical and Electronics Engineers (IEEE).
[39]
Skantze, G., Irfan, B. (2025).
Applying General Turn-Taking Models to Conversational Human-Robot Interaction.
In HRI 2025 - Proceedings of the 2025 ACM/IEEE International Conference on Human-Robot Interaction. (pp. 859-868). Institute of Electrical and Electronics Engineers (IEEE).
[40]
Irfan, B., Skantze, G. (2025).
Between You and Me: Ethics of Self-Disclosure in Human-Robot Interaction.
In HRI 2025 - Proceedings of the 2025 ACM/IEEE International Conference on Human-Robot Interaction. (pp. 1357-1362). Institute of Electrical and Electronics Engineers (IEEE).
[41]
Janssens, R., Pereira, A., Skantze, G., Irfan, B., Belpaeme, T. (2025).
Online Prediction of User Enjoyment in Human-Robot Dialogue with LLMs.
In HRI 2025 - Proceedings of the 2025 ACM/IEEE International Conference on Human-Robot Interaction. (pp. 1363-1367). Institute of Electrical and Electronics Engineers (IEEE).
[42]
Cros Vila, L., Sturm, B. (2025).
(Mis)Communicating with our AI Systems.
In Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems. New York, NY, USA: Association for Computing Machinery (ACM).
[43]
Kamelabad, A. M., Inoue, E., Skantze, G. (2025).
Comparing Monolingual and Bilingual Social Robots as Conversational Practice Companions in Language Learning.
In Proceedings of the 2025 ACM/IEEE International Conference on Human-Robot Interaction. (pp. 829-838).
[44]
Cai, H. & Ternström, S. (2025).
A WaveNet-based model for predicting the electroglottographic signal from the acoustic voice signal.
Journal of the Acoustical Society of America, 157(4), 3033-3044.
[45]
Marcinek, L., Beskow, J., Gustafsson, J. (2025).
A Dual-Control Dialogue Framework for Human-Robot Interaction Data Collection : Integrating Human Emotional and Contextual Awareness with Conversational AI.
In Social Robotics - 16th International Conference, ICSR + AI 2024, Proceedings. (pp. 290-297). Springer Nature.
[46]
Herbst, C. T., Tokuda, I. T., Nishimura, T., Ternström, S., Ossio, V., Levy, M. ... Dunn, J. C. (2025).
‘Monkey yodels’—frequency jumps in New World monkey vocalizations greatly surpass human vocal register transitions.
Philosophical Transactions of the Royal Society of London. Biological Sciences, 380(1923).
[47]
Irfan, B., Kuoppamäki, S., Hosseini, A. & Skantze, G. (2025).
Between reality and delusion : challenges of applying large language models to companion robots for open-domain dialogues with older adults.
Autonomous Robots, 49(1).
[48]
Cai, H. (2025).
Mapping voice quality in normal, pathological and synthetic voices
(Doctoral thesis, KTH Royal Institute of Technology, Stockholm, TRITA-EECS-AVL 2025:25). Retrieved from https://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-360211.
[49]
Kanhov, E., Kaila, A.-K. & Sturm, B. L. T. (2025).
Innovation, data colonialism and ethics : critical reflections on the impacts of AI on Irish traditional music.
Journal of New Music Research, 1-17.
[50]
Włodarczak, M., Ludusan, B., Sundberg, J. & Heldner, M. (2025).
Classification of voice quality using neck-surface acceleration : Comparison with glottal flow and radiated sound.
Journal of Voice, 39(1), 10-24.