
TMH Publications (latest 50)

Below are the 50 latest publications from the Department of Speech, Music and Hearing.


[1]
Saponaro, G., Jamone, L., Bernardino, A. & Salvi, G. (2020). Beyond the Self: Using Grounded Affordances to Interpret and Describe Others’ Actions. IEEE Transactions on Cognitive and Developmental Systems, 12(2), 209-221.
[2]
Selamtzis, A., Castellana, A., Salvi, G., Carullo, A. & Astolfi, A. (2019). Effect of vowel context in cepstral and entropy analysis of pathological voices. Biomedical Signal Processing and Control, 47, 350-357.
[3]
Fallgren, P., Malisz, Z., Edlund, J. (2019). Bringing order to chaos: A non-sequential approach for browsing large sets of found audio data. In LREC 2018 - 11th International Conference on Language Resources and Evaluation. (pp. 4307-4311). European Language Resources Association (ELRA).
[4]
Ternström, S., D'Amario, S. & Selamtzis, A. (2020). Effects of the lung volume on the electroglottographic waveform in trained female singers. Journal of Voice, 34(3), 485.e1-485.e21.
[5]
Pabon, P. & Ternström, S. (2020). Feature maps of the acoustic spectrum of the voice. Journal of Voice, 34(1), 161.e1-161.e26.
[6]
Hallström, E., Mossmyr, S., Sturm, B., Vegeborn, V., Wedin, J. (2019). From Jigs and Reels to Schottisar och Polskor: Generating Scandinavian-like Folk Music with Deep Recurrent Networks. Presented at The 16th Sound & Music Computing Conference, Malaga, Spain, 28-31 May 2019.
[7]
Bisesi, E., Friberg, A. & Parncutt, R. (2019). A Computational Model of Immanent Accent Salience in Tonal Music. Frontiers in Psychology, 10(317), 1-19.
[8]
Kucherenko, T., Hasegawa, D., Henter, G. E., Kaneko, N., Kjellström, H. (2019). Analyzing Input and Output Representations for Speech-Driven Gesture Generation. In 19th ACM International Conference on Intelligent Virtual Agents. New York, NY, USA: ACM Publications.
[9]
Ternström, S., Pabon, P. (2019). Accounting for variability over the voice range. In Proceedings of the ICA 2019 and EAA Euroregio. (pp. 7775-7780). Aachen, DE: Deutsche Gesellschaft für Akustik (DEGA e.V.).
[10]
Sturm, B., Iglesias, M., Ben-Tal, O., Miron, M. & Gómez, E. (2019). Artificial Intelligence and Music: Open Questions of Copyright Law and Engineering Praxis. MDPI Arts, 8(3).
[11]
Lã, F. M. B., Ternström, S. (2019). Flow ball-assisted training: immediate effects on vocal fold contacting. In Pan-European Voice Conference 2019. (pp. 50-51). University of Copenhagen.
[12]
Rodríguez-Algarra, F., Sturm, B. & Dixon, S. (2019). Characterising Confounding Effects in Music Classification Experiments through Interventions. Transactions of the International Society for Music Information Retrieval, 52-66.
[13]
Kontogiorgos, D., Abelho Pereira, A. T., Gustafson, J. (2019). Estimating Uncertainty in Task Oriented Dialogue. In ICMI 2019 - Proceedings of the 2019 International Conference on Multimodal Interaction. (pp. 414-418). ACM Digital Library.
[14]
Stefanov, K. (2020). Self-Supervised Vision-Based Detection of the Active Speaker as Support for Socially-Aware Language Acquisition. IEEE Transactions on Cognitive and Developmental Systems, 12(1).
[15]
Zhang, C., Oztireli, C., Mandt, S., Salvi, G. (2019). Active Mini-Batch Sampling Using Repulsive Point Processes. Presented at the 33rd AAAI Conference on Artificial Intelligence / 31st Innovative Applications of Artificial Intelligence Conference / 9th AAAI Symposium on Educational Advances in Artificial Intelligence, Honolulu, HI, Jan 27-Feb 1, 2019. (pp. 5741-5748). Association for the Advancement of Artificial Intelligence (AAAI).
[16]
Székely, É., Henter, G. E., Gustafson, J. (2019). Casting to corpus: Segmenting and selecting spontaneous dialogue for TTS with a CNN-LSTM speaker-dependent breath detector. In 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). (pp. 6925-6929). IEEE.
[17]
Gulz, T., Holzapfel, A., Friberg, A. (2019). Developing a Method for Identifying Improvisation Strategies in Jazz Duos. In Proc. of the 14th International Symposium on CMMR. (pp. 482-489). Marseille Cedex.
[19]
Ardal, D., Alexandersson, S., Lempert, M., Abelho Pereira, A. T. (2019). A Collaborative Previsualization Tool for Filmmaking in Virtual Reality. In Proceedings - CVMP 2019: 16th ACM SIGGRAPH European Conference on Visual Media Production. ACM Digital Library.
[20]
Chettri, B., Stoller, D., Morfi, V., Martínez Ramírez, M. A., Benetos, E., Sturm, B. (2019). Ensemble models for spoofing detection in automatic speaker verification. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH 2019. (pp. 1018-1022). International Speech Communication Association.
[21]
Patel, R., Ternström, S. (2019). Electroglottographic voice maps of untrained vocally healthy adults with gender differences and gradients. In Models and Analysis of Vocal Emissions for Biomedical Applications (MAVEBA): 11th International Workshop. (pp. 107-110). Firenze, Italy: Firenze University Press.
[22]
Kontogiorgos, D., van Waveren, S., Wallberg, O., Abelho Pereira, A. T., Leite, I., Gustafson, J. (2020). Embodiment Effects in Interactions with Failing Robots. Presented at the SIGCHI Conference on Human Factors in Computing Systems, CHI ’20, April 25–30, 2020, Honolulu, HI, USA. ACM Digital Library.
[23]
Jonell, P., Lopes, J., Fallgren, P., Wennberg, U., Doğan, F. I., Skantze, G. (2019). Crowdsourcing a self-evolving dialog graph. In CUI '19: Proceedings of the 1st International Conference on Conversational User Interfaces. Association for Computing Machinery (ACM).
[24]
Alexanderson, S., Henter, G. E., Kucherenko, T. & Beskow, J. (2020). Style-Controllable Speech-Driven Gesture Synthesis Using Normalising Flows. Computer graphics forum (Print), 39(2), 487-496.
[25]
Abelho Pereira, A. T., Oertel, C., Fermoselle, L., Mendelson, J., Gustafson, J. (2020). Effects of Different Interaction Contexts when Evaluating Gaze Models in HRI. Presented at the International Conference on Human Robot Interaction (HRI).
[26]
Kontogiorgos, D., Abelho Pereira, A. T., Sahindal, B., van Waveren, S., Gustafson, J. (2020). Behavioural Responses to Robot Conversational Failures. In HRI '20: Proceedings of the 2020 ACM/IEEE International Conference on Human-Robot Interaction. ACM Digital Library.
[27]
Ibrahim, O., Skantze, G., Stoll, S., Dellwo, V. (2019). Fundamental frequency accommodation in multi-party human-robot game interactions: The effect of winning or losing. In Proceedings Interspeech 2019. (pp. 3980-3984). International Speech Communication Association.
[28]
Leijon, A., Dahlquist, M. & Smeds, K. (2019). Bayesian analysis of paired-comparison sound quality ratings. Journal of the Acoustical Society of America, 146(5), 3174-3183.
[29]
Kontogiorgos, D., Pelikan, H. (2020). Towards Adaptive and Least-Collaborative-Effort Social Robots. In Companion of the 2020 ACM/IEEE International Conference on Human-Robot Interaction, HRI 2020, Cambridge, UK, March 23-26, 2020. IEEE conference proceedings.
[30]
Engwall, O., David Lopes, J. & Åhlund, A. (2020). Robot interaction styles for conversation practice in second language learning. International Journal of Social Robotics.
[31]
Cumbal, R., Lopes, J., Engwall, O. (2020). Uncertainty in robot assisted second language conversation practice. In HRI '20: Companion of the 2020 ACM/IEEE International Conference on Human-Robot Interaction. (pp. 171-173). Association for Computing Machinery (ACM).
[32]
Shore, T., Skantze, G. (2020). Using lexical alignment and referring ability to address data sparsity in situated dialog reference resolution. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018. (pp. 2288-2297). Association for Computational Linguistics.
[33]
Lã, F. M. B. & Ternström, S. (2020). Flow ball-assisted voice training: Immediate effects on vocal fold contacting. Biomedical Signal Processing and Control, 62.
[34]
Engwall, O. & David Lopes, J. (2020). Interaction and collaboration in robot-assisted language learning for adults. Computer Assisted Language Learning.
[35]
Tånnander, C., Edlund, J. (2019). First steps towards text profiling for speech synthesis. In CEUR Workshop Proceedings. (pp. 457-468). CEUR-WS.
[36]
Nix, J., Jers, H. & Ternström, S. (2020). Acoustical, psychoacoustical, and pedagogical considerations for choral singing with covid-19 health measures. Choral Journal, 61(3), 32-40.
[37]
Ben-Tal, O., Harris, M. & Sturm, B. (2020). How music AI is useful: Engagements with composers, performers, and audiences. Leonardo music journal, 1-13.
[39]
Petrovska, D., Hennebert, J., Melin, H., Genoud, D. (2020). Polycost: A telephone-speech database for speaker recognition. In RLA2C 1998 - Speaker Recognition and its Commercial and Forensic Applications. (pp. 211-214). International Speech Communication Association.
[40]
Elowsson, A. (2020). Polyphonic pitch tracking with deep layered learning. Journal of the Acoustical Society of America, 148(1), 446-468.
[41]
Gill, B. P., Lã, F. M. B., Lee, J. & Sundberg, J. (2020). Spectrum Effects of a Velopharyngeal Opening in Singing. Journal of Voice, 34(3), 346-351.
[42]
Saldías, M., Laukkanen, A.-M., Guzmán, M., Miranda, G., Stoney, J., Alku, P. & Sundberg, J. (2020). The Vocal Tract in Loud Twang-Like Singing While Producing High and Low Pitches. Journal of Voice.
[43]
Axelsson, N., Skantze, G. (2020). Using knowledge graphs and behaviour trees for feedback-aware presentation agents. In Proceedings of IVA 2020.
[44]
Henter, G. E., Alexanderson, S. & Beskow, J. (2020). MoGlow: Probabilistic and controllable motion synthesis using normalising flows. ACM Transactions on Graphics, 39(4), 236:1-236:14.
[45]
Jonason, N., Sturm, B., Thomé, C. (2020). The control-synthesis approach for making expressive and controllable neural music synthesizers. In Proceedings of the 2020 AI Music Creativity Conference.
[46]
Cumbal, R., David Lopes, J., Engwall, O. (2020). Detection of Listener Uncertainty in Robot-Led Second Language Conversation Practice. Presented at the 22nd ACM International Conference on Multimodal Interaction, Utrecht, the Netherlands.
[47]
Abelho Pereira, A. T., Oertel, C., Fermoselle, L., Mendelson, J., Gustafson, J. (2020). Effects of Different Interaction Contexts when Evaluating Gaze Models in HRI. In Proceedings of the 2020 ACM/IEEE international conference on human-robot interaction (HRI '20). (pp. 131-138). Association for Computing Machinery (ACM).
[48]
Székely, É., Edlund, J., Gustafson, J. (2020). Augmented Prompt Selection for Evaluation of Spontaneous Speech Synthesis. In Proceedings of The 12th Language Resources and Evaluation Conference.
[49]
Székely, É., Henter, G. E., Beskow, J., Gustafson, J. (2020). Breathing and Speech Planning in Spontaneous Speech Synthesis. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE.
[50]
White, L. & Malisz, Z. (2020). Speech rhythm and timing. In C. Gussenhoven & A. Chen (Eds.), Oxford Handbook of Language Prosody. Oxford: Oxford University Press.
Full list in KTH's publication portal