TMH Publications (latest 50)

Below are the 50 latest publications from the Department of Speech, Music and Hearing.

[1]
Saponaro, G., Jamone, L., Bernardino, A. & Salvi, G. (2019). Beyond the Self: Using Grounded Affordances to Interpret and Describe Others’ Actions. IEEE Transactions on Cognitive and Developmental Systems.
[2]
Selamtzis, A., Castellana, A., Salvi, G., Carullo, A. & Astolfi, A. (2019). Effect of vowel context in cepstral and entropy analysis of pathological voices. Biomedical Signal Processing and Control, 47, 350-357.
[3]
Li, C., Androulakaki, T., Gao, A. Y., Yang, F., Saikia, H., Peters, C., Skantze, G. (2018). Effects of Posture and Embodiment on Social Distance in Human-Agent Interaction in Mixed Reality. In Proceedings of the 18th International Conference on Intelligent Virtual Agents. (pp. 191-196). ACM Digital Library.
[4]
Peters, C., Li, C., Yang, F., Avramova, V., Skantze, G. (2018). Investigating Social Distances between Humans, Virtual Humans and Virtual Robots in Mixed Reality. In Proceedings of 17th International Conference on Autonomous Agents and MultiAgent Systems. (pp. 2247-2249).
[5]
Fallgren, P., Malisz, Z., Edlund, J. (2018). A tool for exploring large amounts of found audio data. In CEUR Workshop Proceedings. (pp. 499-503). CEUR-WS.
[6]
Jonell, P., Bystedt, M., Fallgren, P., Kontogiorgos, D., David Aguas Lopes, J., Malisz, Z., Mascarenhas, S., Oertel, C., Raveh, E., Shore, T. (2018). FARMI: A Framework for Recording Multi-Modal Interactions. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018). (pp. 3969-3974). Paris: European Language Resources Association.
[7]
Malisz, Z., Żygis, M. (2018). Lexical stress in Polish : Evidence from focus and phrase-position differentiated production data. In Proceedings of the International Conference on Speech Prosody. (pp. 1008-1012). International Speech Communications Association.
[8]
Fallgren, P., Malisz, Z., Edlund, J. (2019). Bringing order to chaos : A non-sequential approach for browsing large sets of found audio data. In LREC 2018 - 11th International Conference on Language Resources and Evaluation. (pp. 4307-4311). European Language Resources Association (ELRA).
[9]
Kontogiorgos, D., Avramova, V., Alexanderson, S., Jonell, P., Oertel, C., Beskow, J., Skantze, G., Gustafson, J. (2018). A Multimodal Corpus for Mutual Gaze and Joint Attention in Multiparty Situated Interaction. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018). (pp. 119-127). Paris.
[10]
Kontogiorgos, D., Sibirtseva, E., Pereira, A., Skantze, G., Gustafson, J. (2018). Multimodal reference resolution in collaborative assembly tasks. In Multimodal reference resolution in collaborative assembly tasks. ACM Digital Library.
[11]
Roddy, M., Skantze, G., Harte, N. (2018). Multimodal Continuous Turn-Taking Prediction Using Multiscale RNNs. In ICMI 2018 - Proceedings of the 2018 International Conference on Multimodal Interaction. (pp. 186-190).
[12]
Shore, T., Androulakaki, T., Skantze, G. (2019). KTH Tangrams: A Dataset for Research on Alignment and Conceptual Pacts in Task-Oriented Dialogue. In LREC 2018 - 11th International Conference on Language Resources and Evaluation. (pp. 768-775). Tokyo.
[13]
Jonell, P., Oertel, C., Kontogiorgos, D., Beskow, J., Gustafson, J. (2018). Crowdsourced Multimodal Corpora Collection Tool. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018). (pp. 728-734). Paris.
[14]
Sibirtseva, E., Kontogiorgos, D., Nykvist, O., Karaoguz, H., Leite, I., Gustafson, J., Kragic, D. (2018). A Comparison of Visualisation Methods for Disambiguating Verbal Requests in Human-Robot Interaction. In 2018 27th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN).
[15]
Holzapfel, A., Sturm, B. & Coeckelbergh, M. (2018). Ethical Dimensions of Music Information Retrieval Technology. Transactions of the International Society for Music Information Retrieval, 1(1), 44-55.
[16]
Körner Gustafsson, J., Södersten, M., Ternström, S. & Schalling, E. (2019). Long-term effects of Lee Silverman Voice Treatment on daily voice use in Parkinson’s disease as measured with a portable voice accumulator. Logopedics, Phoniatrics, Vocology, 44(3), 124-133.
[17]
Pabon, P. & Ternström, S. (2018). Feature maps of the acoustic spectrum of the voice. Journal of Voice.
[18]
Selamtzis, A., Ternström, S., Richter, B., Burk, F., Köberlein, M., Echternach, M. (2018). A comparison of electroglottographic and glottal area waveforms for phonation type differentiation in male professional singers. (Manuscript).
[19]
Selamtzis, A., Ternström, S., Richter, B., Burk, F., Köberlein, M. & Echternach, M. (2018). A comparison of electroglottographic and glottal area waveforms for phonation type differentiation in male professional singers. Journal of the Acoustical Society of America, 144(6), 3275-3288.
[20]
Ternström, S., D'Amario, S. & Selamtzis, A. (2018). Effects of the lung volume on the electroglottographic waveform in trained female singers. Journal of Voice.
[21]
Ternström, S., Johansson, D. & Selamtzis, A. (2018). FonaDyn - A system for real-time analysis of the electroglottogram, over the voice range. SoftwareX, 7, 74-80.
[22]
Hultén, M., Artman, H. & House, D. (2018). A model to analyse students’ cooperative idea generation in conceptual design. International Journal of Technology and Design Education, 28(2), 451-470.
[23]
Chettri, B., Sturm, B., Benetos, E. (2018). Analysing Replay Spoofing Countermeasure Performance under Varied Conditions. In 2018 IEEE 28th International Workshop on Machine Learning for Signal Processing (MLSP). IEEE.
[25]
Dabbaghchian, S. (2018). Computational Modeling of the Vocal Tract : Applications to Speech Production (Doctoral thesis, KTH Royal Institute of Technology, TRITA-EECS-AVL 2018:90). Retrieved from http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-239071.
[29]
Pabon, P. (2018). Mapping Individual Voice Quality over the Voice Range : The Measurement Paradigm of the Voice Range Profile (Doctoral thesis, KTH Royal Institute of Technology, Stockholm, TRITA-EECS-AVL 2018:70). Retrieved from http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-235824.
[30]
Elowsson, A. (2018). Modeling Music : Studies of Music Transcription, Music Perception and Music Production (Doctoral thesis, KTH Royal Institute of Technology, Stockholm, TRITA-EECS-AVL 2018-35). Retrieved from http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-226894.
[31]
Roddy, M., Skantze, G., Harte, N. (2018). Investigating speech features for continuous turn-taking prediction using LSTMs. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. (pp. 586-590). International Speech Communication Association.
[32]
Bisesi, E., Friberg, A. & Parncutt, R. (2019). A Computational Model of Immanent Accent Salience in Tonal Music. Frontiers in Psychology, 10(317), 1-19.
[33]
Finkel, S., Veit, R., Lotze, M., Friberg, A., Vuust, P., Soekadar, S. ... Kleber, B. (2019). Intermittent theta burst stimulation over right somatosensory larynx cortex enhances vocal pitch‐regulation in nonsingers. Human Brain Mapping.
[34]
Kragic, D., Gustafson, J., Karaoǧuz, H., Jensfelt, P., Krug, R. (2018). Interactive, collaborative robots : Challenges and opportunities. In IJCAI International Joint Conference on Artificial Intelligence. (pp. 18-25). International Joint Conferences on Artificial Intelligence.
[35]
Hallström, E., Mossmyr, S., Sturm, B., Vegeborn, V., Wedin, J. (2019). From Jigs and Reels to Schottisar och Polskor : Generating Scandinavian-like Folk Music with Deep Recurrent Networks. Presented at The 16th Sound & Music Computing Conference, Malaga, Spain, 28-31 May 2019.
[36]
Chettri, B., Mishra, S., Sturm, B. (2018). Analysing the predictions of a CNN-based replay spoofing detection system. In 2018 IEEE Workshop on Spoken Language Technology (SLT 2018). (pp. 92-97). IEEE.
[37]
Kucherenko, T., Hasegawa, D., Kaneko, N., Henter, G. E., Kjellström, H. (2019). On the Importance of Representations for Speech-Driven Gesture Generation : Extended Abstract. Presented at International Conference on Autonomous Agents and Multiagent Systems (AAMAS '19), May 13-17, 2019, Montréal, Canada. (pp. 2072-2074). The International Foundation for Autonomous Agents and Multiagent Systems (IFAAMAS).
[38]
Kontogiorgos, D., Abelho Pereira, A. T., Gustafson, J. (2019). The Trade-off between Interaction Time and Social Facilitation with Collaborative Social Robots. In The Challenges of Working on Social Robots that Collaborate with People.
[39]
Skantze, G., Gustafson, J. & Beskow, J. (2019). Multimodal Conversational Interaction with Robots. In Sharon Oviatt, Björn Schuller, Philip R. Cohen, Daniel Sonntag, Gerasimos Potamianos, Antonio Krüger (Ed.), The Handbook of Multimodal-Multisensor Interfaces, Volume 3: Language Processing, Software, Commercialization, and Emerging Directions. ACM Press.
[40]
Friberg, A., Bisesi, E., Addessi, A. R. & Baroni, M. (2019). Probing the Underlying Principles of Perceived Immanent Accents Using a Modeling Approach. Frontiers in Psychology, 10.
[41]
Kucherenko, T., Hasegawa, D., Henter, G. E., Kaneko, N., Kjellström, H. (2019). Analyzing Input and Output Representations for Speech-Driven Gesture Generation. In 19th ACM International Conference on Intelligent Virtual Agents. New York, NY, USA: ACM Publications.
[42]
Ternström, S. (2019). Normalized time-domain parameters for electroglottographic waveforms. Journal of the Acoustical Society of America, 146(1), EL65-EL70.
[43]
Kontogiorgos, D., Skantze, G., Abelho Pereira, A. T., Gustafson, J. (2019). The Effects of Embodiment and Social Eye-Gaze in Conversational Agents. In Proceedings of the 41st Annual Conference of the Cognitive Science Society (CogSci).
[44]
Rodríguez-Algarra, F., Sturm, B. & Dixon, S. (2019). Characterising Confounding Effects in Music Classification Experiments through Interventions. Transactions of the International Society for Music Information Retrieval, 52-66.
[45]
Mishra, S., Stoller, D., Benetos, E., Sturm, B., Dixon, S. (2019). GAN-Based Generation and Automatic Selection of Explanations for Neural Networks. Presented at Safe Machine Learning 2019 Workshop at the International Conference on Learning Representations.
[46]
Stefanov, K., Salvi, G., Kontogiorgos, D., Kjellström, H. & Beskow, J. (2019). Modeling of Human Visual Attention in Multiparty Open-World Dialogues. ACM Transactions on Human-Robot Interaction, 8(2).
[47]
Sturm, B., Iglesias, M., Ben-Tal, O., Miron, M. & Gómez, E. (2019). Artificial Intelligence and Music: Open Questions of Copyright Law and Engineering Praxis. MDPI Arts, 8(3).
[48]
Kontogiorgos, D. (2019). Multimodal Language Grounding for Human-Robot Collaboration : YRRSDS 2019 - Dimosthenis Kontogiorgos. In Young Researchers Roundtable on Spoken Dialogue Systems.
[49]
Lã, F. M.B., Ternström, S. (2019). Flow ball-assisted training : immediate effects on vocal fold contacting. In Pan-European Voice Conference 2019. (pp. 50-51). University of Copenhagen.
[50]
Ternström, S., Pabon, P. (2019). Accounting for variability over the voice range. In Proceedings of the ICA 2019 and EAA Euroregio. (pp. 4146-4151). Aachen, DE: Deutsche Gesellschaft für Akustik (DEGA e.V.).
Full list in the KTH publications portal
Page responsible: Web editors at EECS
Belongs to: Speech, Music and Hearing
Last changed: Oct 17, 2018