
TMH Publications (latest 50)

Below are the 50 latest publications from the Department of Speech, Music and Hearing.

[1]
Saponaro, G., Jamone, L., Bernardino, A. & Salvi, G. (2020). Beyond the Self: Using Grounded Affordances to Interpret and Describe Others’ Actions. IEEE Transactions on Cognitive and Developmental Systems, 12(2), 209-221.
[2]
Ternström, S., D'Amario, S. & Selamtzis, A. (2020). Effects of the lung volume on the electroglottographic waveform in trained female singers. Journal of Voice, 34(3), 485.e1-485.e21.
[3]
Pabon, P. & Ternström, S. (2020). Feature maps of the acoustic spectrum of the voice. Journal of Voice, 34(1), 161.e1-161.e26.
[4]
Stefanov, K. (2020). Self-Supervised Vision-Based Detection of the Active Speaker as Support for Socially-Aware Language Acquisition. IEEE Transactions on Cognitive and Developmental Systems, 12(1).
[5]
Sundberg, J., Salomão, G. L. & Scherer, K. R. (2021). Analyzing Emotion Expression in Singing via Flow Glottograms, Long-Term-Average Spectra, and Expert Listener Evaluation. Journal of Voice, 35(1), 52-60.
[6]
Kontogiorgos, D., van Waveren, S., Wallberg, O., Abelho Pereira, A. T., Leite, I., Gustafson, J. (2020). Embodiment Effects in Interactions with Failing Robots. Presented at the SIGCHI Conference on Human Factors in Computing Systems, CHI ’20, April 25–30, 2020, Honolulu, HI, USA. ACM Digital Library.
[7]
Alexanderson, S., Henter, G. E., Kucherenko, T. & Beskow, J. (2020). Style-Controllable Speech-Driven Gesture Synthesis Using Normalising Flows. Computer graphics forum (Print), 39(2), 487-496.
[8]
Abelho Pereira, A. T., Oertel, C., Fermoselle, L., Mendelson, J., Gustafson, J. (2020). Effects of Different Interaction Contexts when Evaluating Gaze Models in HRI. Presented at the ACM/IEEE International Conference on Human-Robot Interaction (HRI), March 23–26, 2020, Cambridge, England (pp. 131-138). Association for Computing Machinery (ACM).
[9]
Kontogiorgos, D., Abelho Pereira, A. T., Sahindal, B., van Waveren, S., Gustafson, J. (2020). Behavioural Responses to Robot Conversational Failures. In HRI '20: Proceedings of the 2020 ACM/IEEE International Conference on Human-Robot Interaction. ACM Digital Library.
[10]
Kontogiorgos, D., Pelikan, H. (2020). Towards Adaptive and Least-Collaborative-Effort Social Robots. In Companion of the 2020 ACM/IEEE International Conference on Human-Robot Interaction, HRI 2020, Cambridge, UK, March 23-26, 2020. IEEE conference proceedings.
[11]
Engwall, O., David Lopes, J. & Åhlund, A. (2020). Robot interaction styles for conversation practice in second language learning. International Journal of Social Robotics.
[12]
Lã, F. M.B. & Ternström, S. (2020). Flow ball-assisted voice training : Immediate effects on vocal fold contacting. Biomedical Signal Processing and Control, 62.
[13]
Shahrebabaki, A. S., Siniscalchi, M. S., Salvi, G., Svendsen, T. K. (2020). Sequence-to-sequence articulatory inversion through time convolution of sub-band frequency signals. In Interspeech (pp. 2882-2886). Shanghai, China: The International Speech Communication Association (ISCA).
[14]
Engwall, O. & David Lopes, J. (2020). Interaction and collaboration in robot-assisted language learning for adults. Computer Assisted Language Learning.
[15]
Nix, J., Jers, H. & Ternström, S. (2020). Acoustical, psychoacoustical, and pedagogical considerations for choral singing with covid-19 health measures. Choral Journal, 61(3), 32-40.
[16]
Ben-Tal, O., Harris, M. & Sturm, B. (2020). How music AI is useful : Engagements with composers, performers, and audiences. Leonardo music journal, 1-13.
[18]
Petrovska, D., Hennebert, J., Melin, H., Genoud, D. (2020). Polycost : A telephone-speech database for speaker recognition. In RLA2C 1998 - Speaker Recognition and its Commercial and Forensic Applications (pp. 211-214). International Speech Communication Association.
[19]
Elowsson, A. (2020). Polyphonic pitch tracking with deep layered learning. Journal of the Acoustical Society of America, 148(1), 446-468.
[20]
Gill, B. P., Lã, F. M. B., Lee, J. & Sundberg, J. (2020). Spectrum Effects of a Velopharyngeal Opening in Singing. Journal of Voice, 34(3), 346-351.
[21]
Saldías, M., Laukkanen, A.-M., Guzmán, M., Miranda, G., Stoney, J., Alku, P. & Sundberg, J. (2020). The Vocal Tract in Loud Twang-Like Singing While Producing High and Low Pitches. Journal of Voice.
[22]
Henter, G. E., Alexanderson, S. & Beskow, J. (2020). MoGlow: Probabilistic and controllable motion synthesis using normalising flows. ACM Transactions on Graphics, 39(4), 236:1-236:14.
[23]
Jonason, N., Sturm, B., Thomé, C. (2020). The control-synthesis approach for making expressive and controllable neural music synthesizers. In Proceedings of the 2020 AI Music Creativity Conference.
[24]
Cumbal, R., David Lopes, J., Engwall, O. (2020). Detection of Listener Uncertainty in Robot-Led Second Language Conversation Practice. In Proceedings of ICMI '20: International Conference on Multimodal Interaction. Association for Computing Machinery (ACM).
[25]
Dabbaghchian, S., Arnela, M., Engwall, O. & Guasch, O. (2021). Simulation of vowel-vowel utterances using a 3D biomechanical-acoustic model. International Journal for Numerical Methods in Biomedical Engineering, 37(1).
[26]
Székely, É., Edlund, J., Gustafson, J. (2020). Augmented Prompt Selection for Evaluation of Spontaneous Speech Synthesis. In Proceedings of the 12th Language Resources and Evaluation Conference.
[27]
Székely, É., Henter, G. E., Beskow, J., Gustafson, J. (2020). Breathing and Speech Planning in Spontaneous Speech Synthesis. In 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 7649-7653). IEEE.
[28]
White, L. & Malisz, Z. (2020). Speech rhythm and timing. In Carlos Gussenhoven and Aoju Chen (Eds.), Oxford Handbook of Language Prosody. Oxford: Oxford University Press.
[29]
Strömbergsson, S., Holm, K., Edlund, J., Lagerberg, T. & McAllister, A. (2020). Audience Response System-Based Evaluation of Intelligibility of Children's Connected Speech - Validity, Reliability and Listener Differences. Journal of Communication Disorders, 87.
[30]
Alexanderson, S., Henter, G. E., Kucherenko, T. & Beskow, J. (2020). Style-Controllable Speech-Driven Gesture Synthesis Using Normalising Flows. Computer graphics forum (Print), 39(2), 487-496.
[31]
Kucherenko, T., Jonell, P., van Waveren, S., Henter, G. E., Alexanderson, S., Leite, I., Kjellström, H. (2020). Gesticulator : A framework for semantically-aware speech-driven gesture generation. In ICMI '20: Proceedings of the 2020 International Conference on Multimodal Interaction. Association for Computing Machinery (ACM).
[32]
Leijon, A., Dillon, H., Hickson, L., Kinkel, M., Kramer, S. E. & Nordqvist, P. (2020). Analysis of data from the International Outcome Inventory for Hearing Aids (IOI-HA) using Bayesian Item Response Theory. International Journal of Audiology.
[33]
Patel, R. R., Sundberg, J., Gill, B. & Lã, F. M. B. (2020). Glottal Airflow and Glottal Area Waveform Characteristics of Flow Phonation in Untrained Vocally Healthy Adults. Journal of Voice.
[34]
Chettri, B., Benetos, E. & Sturm, B. (2020). Dataset Artefacts in Anti-Spoofing Systems : A Case Study on the ASVspoof 2017 Benchmark. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 28, 3018-3028.
[35]
Henter, G. E., Alexanderson, S. & Beskow, J. (2020). MoGlow : Probabilistic and Controllable Motion Synthesis Using Normalising Flows. ACM Transactions on Graphics, 39(6).
[36]
Kalpakchi, D., Boye, J. (2020). UDon2: a library for manipulating Universal Dependencies trees. In Proceedings of the Fourth Workshop on Universal Dependencies (UDW 2020) (pp. 120-125).
[37]
Sturm, B. (Ed.). (2020). Proceedings of The 2020 Joint Conference on AI Music Creativity. KTH Royal Institute of Technology.
[38]
Skantze, G. (2021). Turn-taking in Conversational Systems and Human-Robot Interaction : A Review. Computer speech & language (Print), 67.
[39]
Jonell, P., Kucherenko, T., Henter, G. E., Beskow, J. (2020). Let’s face it : Probabilistic multi-modal interlocutor-aware generation of facial gestures in dyadic settings. In IVA '20: Proceedings of the 20th ACM International Conference on Intelligent Virtual Agents. Association for Computing Machinery (ACM).
[40]
Jonell, P., Kucherenko, T., Torre, I., Beskow, J. (2020). Can we trust online crowdworkers? : Comparing online and offline participants in a preference test of virtual agents. In IVA '20: Proceedings of the 20th ACM International Conference on Intelligent Virtual Agents. Association for Computing Machinery (ACM).
[41]
Kucherenko, T., Hasegawa, D., Kaneko, N., Henter, G. E. & Kjellström, H. (2021). Moving Fast and Slow : Analysis of Representations and Post-Processing in Speech-Driven Automatic Gesture Generation. International Journal of Human-Computer Interaction, 1-17.
[42]
Torre, I., Dogan, F. I., Kontogiorgos, D. (2021). Voice, Embodiment, and Autonomy as Identity Affordances. In HRI '21 Companion: Companion of the 2021 ACM/IEEE International Conference on Human-Robot Interaction.
[43]
Kontogiorgos, D., Sibirtseva, E., Gustafson, J. (2020). Chinese whispers : A multimodal dataset for embodied language grounding. In LREC 2020 - 12th International Conference on Language Resources and Evaluation, Conference Proceedings (pp. 743-749). European Language Resources Association (ELRA).
[44]
Fornhammar, L., Sundberg, J., Fuchs, M. & Pieper, L. (2020). Measuring Voice Effects of Vibrato-Free and Ingressive Singing : A Study of Phonation Threshold Pressures. Journal of Voice.
[45]
Domeij, R., Edlund, J., Eriksson, G., Fallgren, P., House, D., Lindström, E., Skog, S. N., Öqvist, J. (2020). Exploring the archives for textual entry points to speech - Experiences of interdisciplinary collaboration in making cultural heritage accessible for research. In CEUR Workshop Proceedings (pp. 45-55). CEUR-WS.
[46]
Mishra, S., Benetos, E., Sturm, B., Dixon, S. (2020). Reliable Local Explanations for Machine Listening. In Proceedings of the International Joint Conference on Neural Networks. Institute of Electrical and Electronics Engineers Inc.
[47]
Székely, É., Edlund, J., Gustafson, J. (2020). Augmented prompt selection for evaluation of spontaneous speech synthesis. In LREC 2020 - 12th International Conference on Language Resources and Evaluation, Conference Proceedings (pp. 6368-6374). European Language Resources Association (ELRA).
[48]
Ghosh, A., Honore, A., Liu, D., Henter, G. E., Chatterjee, S. (2020). Robust classification using hidden Markov models and mixtures of normalizing flows. In IEEE International Workshop on Machine Learning for Signal Processing, MLSP. IEEE Computer Society.
[49]
Gillet, S., Cumbal, R., Pereira, A., Lopes, J., Engwall, O., Leite, I. (2021). Robot Gaze Can Mediate Participation Imbalance in Groups with Different Skill Levels. In Proceedings of the 2021 ACM/IEEE International Conference on Human-Robot Interaction (pp. 303-311). Association for Computing Machinery.
[50]
Alexanderson, S., Székely, É., Henter, G. E., Kucherenko, T., Beskow, J. (2020). Generating coherent spontaneous speech and gesture from text. In Proceedings of the 20th ACM International Conference on Intelligent Virtual Agents, IVA 2020. Association for Computing Machinery, Inc.
Full list in KTH's publication portal