TMH Publications (latest 50)
Below are the 50 latest publications from the Department of Speech, Music and Hearing.
[1]
Wolfert, P., Henter, G. E. & Belpaeme, T. (2024).
Exploring the Effectiveness of Evaluation Practices for Computer-Generated Nonverbal Behaviour.
Applied Sciences, 14(4).
[2]
Mehta, S., Frisk, K. & Nyborg, L. (2024).
Role of Cr in Mn-rich precipitates for Al–Mn–Cr–Zr-based alloys tailored for additive manufacturing.
Calphad, 84.
[3]
Cumbal, R., Engwall, O. (2024).
Speaking Transparently: Social Robots in Educational Settings.
In Companion of the 2024 ACM/IEEE International Conference on Human-Robot Interaction (HRI '24 Companion), March 11–14, 2024, Boulder, CO, USA.
[4]
Cumbal, R. (2024).
Robots Beyond Borders: The Role of Social Robots in Spoken Second Language Practice
(Doctoral thesis, KTH Royal Institute of Technology, Stockholm, TRITA-EECS-AVL 2024:23). Retrieved from https://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-343863.
[5]
Ternström, S. (2024).
Update 3.1 to FonaDyn: A system for real-time analysis of the electroglottogram, over the voice range.
SoftwareX, 26.
[6]
Sundberg, J., Salomão, G. L. & Scherer, K. R. (2024).
Emotional expressivity in singing: Assessing physiological and acoustic indicators of two opera singers' voice characteristics.
Journal of the Acoustical Society of America, 155(1), 18-28.
[7]
D'Amario, S., Ternström, S., Goebl, W., Bishop, L. (2023).
Impact of singing togetherness and task complexity on choristers' body motion.
In SMAC 2023: Proceedings of the Stockholm Music Acoustics Conference 2023 (pp. 146-150). Stockholm: KTH Royal Institute of Technology.
[8]
Deichler, A., Mehta, S., Alexanderson, S., Beskow, J. (2023).
Diffusion-Based Co-Speech Gesture Generation Using Joint Text and Audio Representation.
In Proceedings of the 25th International Conference on Multimodal Interaction, ICMI 2023 (pp. 755-762). Association for Computing Machinery (ACM).
[9]
Wozniak, M. K., Stower, R., Jensfelt, P., Abelho Pereira, A. T. (2023).
Happily Error After: Framework Development and User Study for Correcting Robot Perception Errors in Virtual Reality.
In 2023 32nd IEEE International Conference on Robot and Human Interactive Communication (RO-MAN) (pp. 1573-1580). Institute of Electrical and Electronics Engineers (IEEE).
[10]
Torre, I., Lagerstedt, E., Dennler, N., Seaborn, K., Leite, I., Székely, É. (2023).
Can a gender-ambiguous voice reduce gender stereotypes in human-robot interactions?
In 2023 32nd IEEE International Conference on Robot and Human Interactive Communication (RO-MAN) (pp. 106-112). Institute of Electrical and Electronics Engineers (IEEE).
[11]
Miniotaitė, J., Wang, S., Beskow, J., Gustafson, J., Székely, É., Abelho Pereira, A. T. (2023).
Hi robot, it's not what you say, it's how you say it.
In 2023 32nd IEEE International Conference on Robot and Human Interactive Communication (RO-MAN) (pp. 307-314). Institute of Electrical and Electronics Engineers (IEEE).
[12]
D'Amario, S., Ternström, S., Goebl, W. & Bishop, L. (2023).
Body motion of choral singers.
Frontiers in Psychology, 14.
[13]
Figueroa, C., Ochs, M., Skantze, G. (2023).
Classification of Feedback Functions in Spoken Dialog Using Large Language Models and Prosodic Features.
In 27th Workshop on the Semantics and Pragmatics of Dialogue (pp. 15-24). Maribor: University of Maribor.
[14]
Offrede, T., Mishra, C., Skantze, G., Fuchs, S., Mooshammer, C. (2023).
Do Humans Converge Phonetically When Talking to a Robot?
In Proceedings of the 20th International Congress of Phonetic Sciences, Prague 2023 (pp. 3507-3511).
[15]
Gustafsson, J., Székely, É., Beskow, J. (2023).
Generation of speech and facial animation with controllable articulatory effort for amusing conversational characters.
In 23rd ACM International Conference on Intelligent Virtual Agents (IVA 2023). Institute of Electrical and Electronics Engineers (IEEE).
[16]
Engwall, O., Bandera Rubio, J. P., Bensch, S., Haring, K. S., Kanda, T., Núñez, P. ... Sgorbissa, A. (2023).
Editorial: Socially, culturally and contextually aware robots.
Frontiers in Robotics and AI, 10.
[17]
Yoon, Y., Kucherenko, T., Woo, J., Wolfert, P., Nagy, R., Henter, G. E. (2023).
GENEA Workshop 2023: The 4th Workshop on Generation and Evaluation of Non-verbal Behaviour for Embodied Agents.
In ICMI 2023: Proceedings of the 25th International Conference on Multimodal Interaction (pp. 822-823). Association for Computing Machinery (ACM).
[18]
Wolfert, P., Henter, G. E., Belpaeme, T. (2023).
"Am I listening?": Evaluating the Quality of Generated Data-driven Listening Motion.
In ICMI 2023 Companion: Companion Publication of the 25th International Conference on Multimodal Interaction (pp. 6-10). Association for Computing Machinery (ACM).
[19]
Axelsson, A. (2023).
Adaptive Robot Presenters: Modelling Grounding in Multimodal Interaction
(Doctoral thesis, KTH Royal Institute of Technology, Stockholm, TRITA-EECS-AVL 2023:70). Retrieved from https://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-338178.
[20]
Feindt, K., Rossi, M., Esfandiari-Baiat, G., Ekström, A. G., Zellers, M. (2023).
Cues to next-speaker projection in conversational Swedish: Evidence from reaction times.
In Interspeech 2023 (pp. 1040-1044). International Speech Communication Association.
[21]
Ekstedt, E., Wang, S., Székely, É., Gustafsson, J., Skantze, G. (2023).
Automatic Evaluation of Turn-taking Cues in Conversational Speech Synthesis.
In Interspeech 2023 (pp. 5481-5485). International Speech Communication Association.
[22]
Cao, X., Fan, Z., Svendsen, T., Salvi, G. (2023).
An Analysis of Goodness of Pronunciation for Child Speech.
In Interspeech 2023 (pp. 4613-4617). International Speech Communication Association.
[23]
Fallgren, P., Edlund, J. (2023).
Crowdsource-based validation of the audio cocktail as a sound browsing tool.
In Interspeech 2023 (pp. 2178-2182). International Speech Communication Association.
[24]
Lameris, H., Gustafsson, J., Székely, É. (2023).
Beyond style: synthesizing speech with pragmatic functions.
In Interspeech 2023 (pp. 3382-3386). International Speech Communication Association.
[25]
Kalpakchi, D. (2023).
Ask and distract: Data-driven methods for the automatic generation of multiple-choice reading comprehension questions from Swedish texts
(Doctoral thesis, KTH Royal Institute of Technology, TRITA-EECS-AVL 2023:56). Retrieved from https://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-336531.
[26]
Getman, Y., Phan, N., Al-Ghezi, R., Voskoboinik, E., Singh, M., Grosz, T. ... Ylinen, S. (2023).
Developing an AI-Assisted Low-Resource Spoken Language Learning App for Children.
IEEE Access, 11, 86025-86037.
[27]
Tånnander, C., House, D., Edlund, J. (2023).
Analysis-by-synthesis: phonetic-phonological variation in deep neural network-based text-to-speech synthesis.
In Proceedings of the 20th International Congress of Phonetic Sciences, Prague 2023 (pp. 3156-3160). Prague, Czech Republic: GUARANT International.
[28]
Sturm, B., Flexer, A. (2023).
A Review of Validity and its Relationship to Music Information Research.
In Proceedings of the International Symposium on Music Information Retrieval.
[29]
Amerotti, M., Benford, S., Sturm, B., Vear, C. (2023).
A Live Performance Rule System Informed by Irish Traditional Dance Music.
In Proceedings of the International Symposium on Computer Music Multidisciplinary Research.
[30]
Alexanderson, S., Nagy, R., Beskow, J. & Henter, G. E. (2023).
Listen, Denoise, Action! Audio-Driven Motion Synthesis with Diffusion Models.
ACM Transactions on Graphics, 42(4).
[31]
Wang, S., Henter, G. E., Gustafsson, J., Székely, É. (2023).
A Comparative Study of Self-Supervised Speech Representations in Read and Spontaneous TTS.
In ICASSPW 2023: 2023 IEEE International Conference on Acoustics, Speech and Signal Processing Workshops, Proceedings. Institute of Electrical and Electronics Engineers (IEEE).
[32]
Sundberg, J., La, F. & Granqvist, S. (2023).
Fundamental frequency disturbances in female and male singers' pitch glides through long tube with varied resistances.
Journal of the Acoustical Society of America, 154(2), 801-807.
[33]
Irfan, B., Ramachandran, A., Staffa, M., Gunes, H. (2023).
Lifelong Learning and Personalization in Long-Term Human-Robot Interaction (LEAP-HRI): Adaptivity for All.
In HRI 2023: Companion of the ACM/IEEE International Conference on Human-Robot Interaction (pp. 929-931). Association for Computing Machinery (ACM).
[34]
McMillan, D., Jaber, R., Cowan, B. R., Fischer, J. E., Irfan, B., Cumbal, R., Zargham, N., Lee, M. (2023).
Human-Robot Conversational Interaction (HRCI).
In HRI 2023: Companion of the ACM/IEEE International Conference on Human-Robot Interaction (pp. 923-925). Association for Computing Machinery (ACM).
[35]
Peña, P. R., Doyle, P. R., Ip, E. Y., Di Liberto, G., Higgins, D., McDonnell, R., Branigan, H., Gustafsson, J., McMillan, D., Moore, R. J., Cowan, B. R. (2023).
A Special Interest Group on Developing Theories of Language Use in Interaction with Conversational User Interfaces.
In CHI 2023: Extended Abstracts of the 2023 CHI Conference on Human Factors in Computing Systems. Association for Computing Machinery (ACM).
[36]
Axelsson, A., Skantze, G. (2023).
Do you follow? A fully automated system for adaptive robot presenters.
In HRI 2023: Proceedings of the 2023 ACM/IEEE International Conference on Human-Robot Interaction (pp. 102-111). Association for Computing Machinery (ACM).
[37]
Mishra, C., Offrede, T., Fuchs, S., Mooshammer, C. & Skantze, G. (2023).
Does a robot's gaze aversion affect human gaze aversion?
Frontiers in Robotics and AI, 10.
[38]
Borin, L., Domeij, R., Edlund, J. & Forsberg, M. (2023).
Language Report Swedish.
In Cognitive Technologies (pp. 219-222). Springer Nature.
[39]
Nyatsanga, S., Kucherenko, T., Ahuja, C., Henter, G. E. & Neff, M. (2023).
A Comprehensive Review of Data-Driven Co-Speech Gesture Generation.
Computer graphics forum (Print), 42(2), 569-596.
[40]
Ekström, A. G. & Edlund, J. (2023).
Evolution of the human tongue and emergence of speech biomechanics.
Frontiers in Psychology, 14.
[41]
Leijon, A., von Gablenz, P., Holube, I., Taghia, J. & Smeds, K. (2023).
Bayesian analysis of Ecological Momentary Assessment (EMA) data collected in adults before and after hearing rehabilitation.
Frontiers in Digital Health, 5.
[42]
Pérez Zarazaga, P., Henter, G. E., Malisz, Z. (2023).
A processing framework to access large quantities of whispered speech found in ASMR.
In ICASSP 2023: 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Rhodes, Greece: IEEE Signal Processing Society.
[43]
Wang, S., Henter, G. E., Gustafsson, J., Székely, É. (2023).
A comparative study of self-supervised speech representations in read and spontaneous TTS.
(Manuscript).
[44]
Adiban, M., Siniscalchi, S. M. & Salvi, G. (2023).
A step-by-step training method for multi generator GANs with application to anomaly detection and cybersecurity.
Neurocomputing, 537, 296-308.
[45]
Stenwig, E., Salvi, G., Rossi, P. S. & Skjaervold, N. K. (2023).
Comparison of correctly and incorrectly classified patients for in-hospital mortality prediction in the intensive care unit.
BMC Medical Research Methodology, 23(1).
[46]
Falk, S., Sturm, B., Ahlbäck, S. (2023).
Automatic legato transcription based on onset detection.
In SMC 2023: Proceedings of the Sound and Music Computing Conference 2023 (pp. 214-221). Sound and Music Computing Network.
[47]
Déguernel, K., Sturm, B. (2023).
Bias in Favour or Against Computational Creativity: A Survey and Reflection on the Importance of Socio-cultural Context in its Evaluation.
In Proceedings of the International Conference on Computational Creativity.
[48]
Deichler, A., Wang, S., Alexanderson, S. & Beskow, J. (2023).
Learning to generate pointing gestures in situated embodied conversational agents.
Frontiers in Robotics and AI, 10.
[49]
Huang, R., Holzapfel, A., Sturm, B. & Kaila, A.-K. (2023).
Beyond Diverse Datasets: Responsible MIR, Interdisciplinarity, and the Fractured Worlds of Music.
Transactions of the International Society for Music Information Retrieval, 6(1), 43-59.
[50]
Kamelabad, A. M., Skantze, G. (2023).
I Learn Better Alone! Collaborative and Individual Word Learning With a Child and Adult Robot.
In Proceedings of the 2023 ACM/IEEE International Conference on Human-Robot Interaction (pp. 368-377). New York, NY, United States: Association for Computing Machinery (ACM).