TMH Publications (latest 50)

Below are the 50 latest publications from the Department of Speech, Music and Hearing.

[1]
Traum, D., Skantze, G., Nishizaki, H., Higashinaka, R., Minato, T. & Nagai, T. (2024). Special issue on multimodal processing and robotics for dialogue systems (Part II). Advanced Robotics, 38(4), 193-194.
[2]
Borg, A., Parodis, I., Skantze, G. (2024). Creating Virtual Patients using Robots and Large Language Models: A Preliminary Study with Medical Students. In HRI 2024 Companion - Companion of the 2024 ACM/IEEE International Conference on Human-Robot Interaction. (pp. 273-277). Association for Computing Machinery (ACM).
[3]
Ashkenazi, S., Skantze, G., Stuart-Smith, J., Foster, M. E. (2024). Goes to the Heart: Speaking the User's Native Language. In HRI 2024 Companion - Companion of the 2024 ACM/IEEE International Conference on Human-Robot Interaction. (pp. 214-218). Association for Computing Machinery (ACM).
[4]
Kamelabad, A. M. (2024). The Question Is Not Whether; It Is How!. In HRI 2024 Companion - Companion of the 2024 ACM/IEEE International Conference on Human-Robot Interaction. (pp. 112-114). Association for Computing Machinery (ACM).
[5]
Irfan, B., Staffa, M., Bobu, A., Churamani, N. (2024). Lifelong Learning and Personalization in Long-Term Human-Robot Interaction (LEAP-HRI): Open-World Learning. In HRI 2024 Companion - Companion of the 2024 ACM/IEEE International Conference on Human-Robot Interaction. (pp. 1323-1325). Association for Computing Machinery (ACM).
[6]
Axelsson, A., Vaddadi, B., Bogdan, C. M., Skantze, G. (2024). Robots in autonomous buses: Who hosts when no human is there?. In HRI 2024 Companion - Companion of the 2024 ACM/IEEE International Conference on Human-Robot Interaction. (pp. 1278-1280). Association for Computing Machinery (ACM).
[7]
Wolfert, P., Henter, G. E. & Belpaeme, T. (2024). Exploring the Effectiveness of Evaluation Practices for Computer-Generated Nonverbal Behaviour. Applied Sciences, 14(4).
[9]
Cumbal, R., Engwall, O. (2024). Speaking Transparently : Social Robots in Educational Settings. In Companion of the 2024 ACM/IEEE International Conference on Human-Robot Interaction (HRI '24 Companion), March 11–14, 2024, Boulder, CO, USA.
[10]
Cumbal, R. (2024). Robots Beyond Borders : The Role of Social Robots in Spoken Second Language Practice (Doctoral thesis , KTH Royal Institute of Technology, Stockholm, TRITA-EECS-AVL 2024:23). Retrieved from https://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-343863.
[12]
Sundberg, J., Salomão, G. L. & Scherer, K. R. (2024). Emotional expressivity in singing : Assessing physiological and acoustic indicators of two opera singers' voice characteristics. Journal of the Acoustical Society of America, 155(1), 18-28.
[13]
Deichler, A., Mehta, S., Alexanderson, S., Beskow, J. (2023). Diffusion-Based Co-Speech Gesture Generation Using Joint Text and Audio Representation. In PROCEEDINGS OF THE 25TH INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION, ICMI 2023. (pp. 755-762). Association for Computing Machinery (ACM).
[14]
Wozniak, M. K., Stower, R., Jensfelt, P., Abelho Pereira, A. T. (2023). Happily Error After : Framework Development and User Study for Correcting Robot Perception Errors in Virtual Reality. In 2023 32ND IEEE INTERNATIONAL CONFERENCE ON ROBOT AND HUMAN INTERACTIVE COMMUNICATION, RO-MAN. (pp. 1573-1580). Institute of Electrical and Electronics Engineers (IEEE).
[15]
Torre, I., Lagerstedt, E., Dennler, N., Seaborn, K., Leite, I., Székely, É. (2023). Can a gender-ambiguous voice reduce gender stereotypes in human-robot interactions?. In 2023 32ND IEEE INTERNATIONAL CONFERENCE ON ROBOT AND HUMAN INTERACTIVE COMMUNICATION, RO-MAN. (pp. 106-112). Institute of Electrical and Electronics Engineers (IEEE).
[16]
Miniotaitė, J., Wang, S., Beskow, J., Gustafson, J., Székely, É., Abelho Pereira, A. T. (2023). Hi robot, it's not what you say, it's how you say it. In 2023 32ND IEEE INTERNATIONAL CONFERENCE ON ROBOT AND HUMAN INTERACTIVE COMMUNICATION, RO-MAN. (pp. 307-314). Institute of Electrical and Electronics Engineers (IEEE).
[17]
D'Amario, S., Ternström, S., Goebl, W. & Bishop, L. (2023). Body motion of choral singers. Frontiers in Psychology, 14.
[18]
Figueroa, C., Ochs, M., Skantze, G. (2023). Classification of Feedback Functions in Spoken Dialog Using Large Language Models and Prosodic Features. In 27th Workshop on the Semantics and Pragmatics of Dialogue. (pp. 15-24). Maribor: University of Maribor.
[19]
Offrede, T., Mishra, C., Skantze, G., Fuchs, S., Mooshammer, C. (2023). Do Humans Converge Phonetically When Talking to a Robot?. In Proceedings of the 20th International Congress of Phonetic Sciences, Prague 2023. (pp. 3507-3511).
[20]
Gustafsson, J., Székely, É., Beskow, J. (2023). Generation of speech and facial animation with controllable articulatory effort for amusing conversational characters. In 23rd ACM International Conference on Intelligent Virtual Agents (IVA 2023). Institute of Electrical and Electronics Engineers (IEEE).
[21]
Engwall, O., Bandera Rubio, J. P., Bensch, S., Haring, K. S., Kanda, T., Núñez, P. ... Sgorbissa, A. (2023). Editorial : Socially, culturally and contextually aware robots. Frontiers in Robotics and AI, 10.
[22]
Yoon, Y., Kucherenko, T., Woo, J., Wolfert, P., Nagy, R., Henter, G. E. (2023). GENEA Workshop 2023 : The 4th Workshop on Generation and Evaluation of Non-verbal Behaviour for Embodied Agents. In ICMI 2023: Proceedings of the 25th International Conference on Multimodal Interaction. (pp. 822-823). Association for Computing Machinery (ACM).
[23]
Wolfert, P., Henter, G. E., Belpaeme, T. (2023). "Am I listening?", Evaluating the Quality of Generated Data-driven Listening Motion. In ICMI 2023 Companion: Companion Publication of the 25th International Conference on Multimodal Interaction. (pp. 6-10). Association for Computing Machinery (ACM).
[24]
Axelsson, A. (2023). Adaptive Robot Presenters : Modelling Grounding in Multimodal Interaction (Doctoral thesis , KTH Royal Institute of Technology, Stockholm, TRITA-EECS-AVL 2023:70). Retrieved from https://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-338178.
[25]
Feindt, K., Rossi, M., Esfandiari-Baiat, G., Ekström, A. G., Zellers, M. (2023). Cues to next-speaker projection in conversational Swedish: Evidence from reaction times. In Interspeech 2023. (pp. 1040-1044). International Speech Communication Association.
[26]
Ekstedt, E., Wang, S., Székely, É., Gustafsson, J., Skantze, G. (2023). Automatic Evaluation of Turn-taking Cues in Conversational Speech Synthesis. In Interspeech 2023. (pp. 5481-5485). International Speech Communication Association.
[27]
Cao, X., Fan, Z., Svendsen, T., Salvi, G. (2023). An Analysis of Goodness of Pronunciation for Child Speech. In Interspeech 2023. (pp. 4613-4617). International Speech Communication Association.
[28]
Fallgren, P., Edlund, J. (2023). Crowdsource-based validation of the audio cocktail as a sound browsing tool. In Interspeech 2023. (pp. 2178-2182). International Speech Communication Association.
[29]
Lameris, H., Gustafsson, J., Székely, É. (2023). Beyond style : synthesizing speech with pragmatic functions. In Interspeech 2023. (pp. 3382-3386). International Speech Communication Association.
[31]
Getman, Y., Phan, N., Al-Ghezi, R., Voskoboinik, E., Singh, M., Grosz, T. ... Ylinen, S. (2023). Developing an AI-Assisted Low-Resource Spoken Language Learning App for Children. IEEE Access, 11, 86025-86037.
[32]
Tånnander, C., House, D., Edlund, J. (2023). Analysis-by-synthesis : phonetic-phonological variation in deep neural network-based text-to-speech synthesis. In Proceedings of the 20th International Congress of Phonetic Sciences, Prague 2023. (pp. 3156-3160). Prague, Czech Republic: GUARANT International.
[33]
Sturm, B., Flexer, A. (2023). A Review of Validity and its Relationship to Music Information Research. In Proc. Int. Symp. Music Information Retrieval.
[34]
Amerotti, M., Benford, S., Sturm, B., Vear, C. (2023). A Live Performance Rule System Informed by Irish Traditional Dance Music. In Proc. International Symposium on Computer Music Multidisciplinary Research.
[35]
Wang, S., Henter, G. E., Gustafsson, J., Székely, É. (2023). A Comparative Study of Self-Supervised Speech Representations in Read and Spontaneous TTS. In ICASSPW 2023: 2023 IEEE International Conference on Acoustics, Speech and Signal Processing Workshops, Proceedings. Institute of Electrical and Electronics Engineers (IEEE).
[36]
Sundberg, J., La, F. & Granqvist, S. (2023). Fundamental frequency disturbances in female and male singers' pitch glides through long tube with varied resistances. Journal of the Acoustical Society of America, 154(2), 801-807.
[37]
McMillan, D., Jaber, R., Cowan, B. R., Fischer, J. E., Irfan, B., Cumbal, R., Zargham, N., Lee, M. (2023). Human-Robot Conversational Interaction (HRCI). In HRI 2023: Companion of the ACM/IEEE International Conference on Human-Robot Interaction. (pp. 923-925). Association for Computing Machinery (ACM).
[38]
Peña, P. R., Doyle, P. R., Ip, E. Y., Di Liberto, G., Higgins, D., McDonnell, R., Branigan, H., Gustafsson, J., McMillan, D., Moore, R. J., Cowan, B. R. (2023). A Special Interest Group on Developing Theories of Language Use in Interaction with Conversational User Interfaces. In CHI 2023: Extended Abstracts of the 2023 CHI Conference on Human Factors in Computing Systems. Association for Computing Machinery (ACM).
[39]
Axelsson, A., Skantze, G. (2023). Do you follow? : A fully automated system for adaptive robot presenters. In HRI 2023: Proceedings of the 2023 ACM/IEEE International Conference on Human-Robot Interaction. (pp. 102-111). Association for Computing Machinery (ACM).
[40]
Mishra, C., Offrede, T., Fuchs, S., Mooshammer, C. & Skantze, G. (2023). Does a robot's gaze aversion affect human gaze aversion?. Frontiers in Robotics and AI, 10.
[41]
Nyatsanga, S., Kucherenko, T., Ahuja, C., Henter, G. E. & Neff, M. (2023). A Comprehensive Review of Data-Driven Co-Speech Gesture Generation. Computer graphics forum (Print), 42(2), 569-596.
[42]
Ekström, A. G. & Edlund, J. (2023). Evolution of the human tongue and emergence of speech biomechanics. Frontiers in Psychology, 14.
[43]
Leijon, A., von Gablenz, P., Holube, I., Taghia, J. & Smeds, K. (2023). Bayesian analysis of Ecological Momentary Assessment (EMA) data collected in adults before and after hearing rehabilitation. Frontiers in Digital Health, 5.
[44]
Pérez Zarazaga, P., Henter, G. E., Malisz, Z. (2023). A processing framework to access large quantities of whispered speech found in ASMR. In ICASSP 2023: 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Rhodes, Greece: IEEE Signal Processing Society.
[45]
Wang, S., Henter, G. E., Gustafsson, J., Székely, É. (2023). A comparative study of self-supervised speech representations in read and spontaneous TTS. (Manuscript).
[46]
Adiban, M., Siniscalchi, S. M. & Salvi, G. (2023). A step-by-step training method for multi generator GANs with application to anomaly detection and cybersecurity. Neurocomputing, 537, 296-308.
[47]
Stenwig, E., Salvi, G., Rossi, P. S. & Skjaervold, N. K. (2023). Comparison of correctly and incorrectly classified patients for in-hospital mortality prediction in the intensive care unit. BMC Medical Research Methodology, 23(1).
[48]
Falk, S., Sturm, B., Ahlbäck, S. (2023). Automatic legato transcription based on onset detection. In SMC 2023: Proceedings of the Sound and Music Computing Conference 2023. (pp. 214-221). Sound and Music Computing Network.
[49]
Déguernel, K., Sturm, B. (2023). Bias in Favour or Against Computational Creativity : A Survey and Reflection on the Importance of Socio-cultural Context in its Evaluation. In Proc. International Conference on Computational Creativity.
[50]
Huang, R., Holzapfel, A., Sturm, B. & Kaila, A.-K. (2023). Beyond Diverse Datasets : Responsible MIR, Interdisciplinarity, and the Fractured Worlds of Music. Transactions of the International Society for Music Information Retrieval, 6(1), 43-59.
Full list in the KTH publications portal
Page responsible: Web editors at EECS
Belongs to: Speech, Music and Hearing (TMH)
Last changed: Aug 15, 2023