TMH Publications (latest 50)

Below are the 50 latest publications from the Department of Speech, Music and Hearing.


[1]
Friberg, A., Lindeberg, T., Hellwagner, M., Helgason, P., Salomão, G. L., Elowsson, A. ... Ternström, S. (2018). Prediction of three articulatory categories in vocal sound imitations using models for auditory receptive fields. Journal of the Acoustical Society of America, 144(3), 1467-1483.
[2]
Saponaro, G. (2019). Beyond the Self: Using Grounded Affordances to Interpret and Describe Others’ Actions. IEEE Transactions on Cognitive and Developmental Systems.
[3]
Selamtzis, A., Castellana, A., Salvi, G., Carullo, A. & Astolfi, A. (2019). Effect of vowel context in cepstral and entropy analysis of pathological voices. Biomedical Signal Processing and Control, 47, 350-357.
[4]
Li, C., Androulakaki, T., Gao, A. Y., Yang, F., Saikia, H., Peters, C., Skantze, G. (2018). Effects of Posture and Embodiment on Social Distance in Human-Agent Interaction in Mixed Reality. In Proceedings of the 18th International Conference on Intelligent Virtual Agents. (pp. 191-196). ACM Digital Library.
[5]
Peters, C., Li, C., Yang, F., Avramova, V., Skantze, G. (2018). Investigating Social Distances between Humans, Virtual Humans and Virtual Robots in Mixed Reality. In Proceedings of 17th International Conference on Autonomous Agents and MultiAgent Systems. (pp. 2247-2249).
[6]
Szabo Portela, A., Granqvist, S., Ternström, S. & Södersten, M. (2018). Vocal Behavior in Environmental Noise : Comparisons Between Work and Leisure Conditions in Women With Work-related Voice Disorders and Matched Controls. Journal of Voice, 32(1), 126.e23-126.e38.
[7]
Wistbacka, G., Andrade, P. A., Simberg, S., Hammarberg, B., Södersten, M., Svec, J. G. & Granqvist, S. (2018). Resonance Tube Phonation in Water-the Effect of Tube Diameter and Water Depth on Back Pressure and Bubble Characteristics at Different Airflows. Journal of Voice, 32(1).
[8]
Fallgren, P., Malisz, Z., Edlund, J. (2018). A tool for exploring large amounts of found audio data. In CEUR Workshop Proceedings. (pp. 499-503). CEUR-WS.
[9]
Jonell, P., Bystedt, M., Fallgren, P., Kontogiorgos, D., Lopes, J. D. A., Malisz, Z., Mascarenhas, S., Oertel, C., Raveh, E., Shore, T. (2018). FARMI: A Framework for Recording Multi-Modal Interactions. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018). (pp. 3969-3974). Paris: European Language Resources Association.
[10]
Malisz, Z., Żygis, M. (2018). Lexical stress in Polish : Evidence from focus and phrase-position differentiated production data. In Proceedings of the International Conference on Speech Prosody. (pp. 1008-1012). International Speech Communication Association.
[11]
Fallgren, P., Malisz, Z., Edlund, J. (2019). Bringing order to chaos : A non-sequential approach for browsing large sets of found audio data. In LREC 2018 - 11th International Conference on Language Resources and Evaluation. (pp. 4307-4311). European Language Resources Association (ELRA).
[12]
Kontogiorgos, D., Avramova, V., Alexanderson, S., Jonell, P., Oertel, C., Beskow, J., Skantze, G., Gustafson, J. (2018). A Multimodal Corpus for Mutual Gaze and Joint Attention in Multiparty Situated Interaction. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018). (pp. 119-127). Paris.
[13]
Kontogiorgos, D., Sibirtseva, E., Pereira, A., Skantze, G., Gustafson, J. (2018). Multimodal reference resolution in collaborative assembly tasks. ACM Digital Library.
[14]
Lopes, J., Engwall, O., Skantze, G. (2017). A First Visit to the Robot Language Café. In Proceedings of the ISCA workshop on Speech and Language Technology in Education. Stockholm.
[15]
Roddy, M., Skantze, G., Harte, N. (2018). Multimodal Continuous Turn-Taking Prediction Using Multiscale RNNs. In ICMI 2018 - Proceedings of the 2018 International Conference on Multimodal Interaction. (pp. 186-190).
[16]
Shore, T., Androulakaki, T., Skantze, G. (2019). KTH Tangrams: A Dataset for Research on Alignment and Conceptual Pacts in Task-Oriented Dialogue. In LREC 2018 - 11th International Conference on Language Resources and Evaluation. (pp. 768-775). Tokyo.
[17]
Shore, T., Skantze, G. (2018). Using Lexical Alignment and Referring Ability to Address Data Sparsity in Situated Dialog Reference Resolution. In Proceedings of 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP). (pp. 2288-2297).
[18]
Jonell, P., Oertel, C., Kontogiorgos, D., Beskow, J., Gustafson, J. (2018). Crowdsourced Multimodal Corpora Collection Tool. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018). (pp. 728-734). Paris.
[19]
Sibirtseva, E., Kontogiorgos, D., Nykvist, O., Karaoguz, H., Leite, I., Gustafson, J., Kragic, D. (2018). A Comparison of Visualisation Methods for Disambiguating Verbal Requests in Human-Robot Interaction. In 2018 27th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN).
[20]
Holzapfel, A., Sturm, B. & Coeckelbergh, M. (2018). Ethical Dimensions of Music Information Retrieval Technology. Transactions of the International Society for Music Information Retrieval, 1(1), 44-55.
[21]
Borin, L., Forsberg, M., Edlund, J., Domeij, R. (2018). Språkbanken 2018 : Research resources for text, speech, & society. In CEUR Workshop Proceedings. (pp. 504-506). CEUR-WS.
[22]
Körner Gustafsson, J., Södersten, M., Ternström, S. & Schalling, E. (2018). Long-term effects of Lee Silverman Voice Treatment on daily voice use in Parkinson’s disease as measured with a portable voice accumulator. Logopedics, Phoniatrics, Vocology, 1-10.
[23]
Pabon, P. & Ternström, S. (2018). Feature maps of the acoustic spectrum of the voice. Journal of Voice.
[24]
Selamtzis, A., Ternström, S., Richter, B., Burk, F., Köberlein, M., Echternach, M. (2018). A comparison of electroglottographic and glottal area waveforms for phonation type differentiation in male professional singers. (Manuscript).
[25]
Selamtzis, A., Ternström, S., Richter, B., Burk, F., Köberlein, M. & Echternach, M. (2018). A comparison of electroglottographic and glottal area waveforms for phonation type differentiation in male professional singers. Journal of the Acoustical Society of America, 144(6), 3275-3288.
[26]
Ternström, S., D'Amario, S. & Selamtzis, A. (2018). Effects of the lung volume on the electroglottographic waveform in trained female singers. Journal of Voice.
[27]
Ternström, S., Johansson, D. & Selamtzis, A. (2018). FonaDyn - A system for real-time analysis of the electroglottogram, over the voice range. SoftwareX, 7, 74-80.
[28]
Hultén, M., Artman, H. & House, D. (2018). A model to analyse students’ cooperative idea generation in conceptual design. International journal of technology and design education, 28(2), 451-470.
[29]
Chettri, B., Sturm, B., Benetos, E. (2018). Analysing replay spoofing countermeasure performance under varied conditions. In 2018 IEEE 28th International Workshop on Machine Learning for Signal Processing (MLSP). IEEE.
[31]
Dabbaghchian, S. (2018). Computational Modeling of the Vocal Tract : Applications to Speech Production (Doctoral thesis, KTH Royal Institute of Technology, TRITA-EECS-AVL 2018:90). Retrieved from http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-239071.
[35]
Pabon, P. (2018). Mapping Individual Voice Quality over the Voice Range : The Measurement Paradigm of the Voice Range Profile (Doctoral thesis, KTH Royal Institute of Technology, Stockholm, TRITA-EECS-AVL 2018:70). Retrieved from http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-235824.
[36]
Elowsson, A. (2018). Modeling Music : Studies of Music Transcription, Music Perception and Music Production (Doctoral thesis, KTH Royal Institute of Technology, Stockholm, TRITA-EECS-AVL 2018-35). Retrieved from http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-226894.
[37]
Jansson, E. V. & Kabała, A. (2018). On the influence of arching and material on the vibration of a shell - Towards understanding the soloist violin. Vibrations in Physical Systems, 29.
[39]
Dabbaghchian, S., Arnela, M., Engwall, O. & Guasch, O. (2018). Reconstruction of vocal tract geometries from biomechanical simulations. International Journal for Numerical Methods in Biomedical Engineering.
[40]
Dabbaghchian, S., Arnela, M., Engwall, O. & Guasch, O. (2018). Synthesis of vowels and vowel-vowel utterances using a 3D biomechanical-acoustic model. IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[41]
Vijayan, A. E., Alexanderson, S., Beskow, J., Leite, I. (2018). Using Constrained Optimization for Real-Time Synchronization of Verbal and Nonverbal Robot Behavior. In 2018 IEEE International Conference on Robotics and Automation (ICRA). (pp. 1955-1961). IEEE Computer Society.
[42]
Sturm, B. (2018). What do these 5,599,881 parameters mean? : An analysis of a specific LSTM music transcription model, starting with the 70,281 parameters of its softmax layer. In Proceedings of the 6th International Workshop on Musical Metacreation (MUME 2018).
[43]
Roddy, M., Skantze, G., Harte, N. (2018). Investigating speech features for continuous turn-taking prediction using LSTMs. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. (pp. 586-590). International Speech Communication Association.
[44]
Bisesi, E., Friberg, A. & Parncutt, R. (2019). A Computational Model of Immanent Accent Salience in Tonal Music. Frontiers in Psychology, 10(317), 1-19.
[45]
Finkel, S., Veit, R., Lotze, M., Friberg, A., Vuust, P., Soekadar, S. ... Kleber, B. (2019). Intermittent theta burst stimulation over right somatosensory larynx cortex enhances vocal pitch regulation in nonsingers. Human Brain Mapping.
[46]
Kragic, D., Gustafson, J., Karaoğuz, H., Jensfelt, P., Krug, R. (2018). Interactive, collaborative robots : Challenges and opportunities. In IJCAI International Joint Conference on Artificial Intelligence. (pp. 18-25). International Joint Conferences on Artificial Intelligence.
[47]
Hallström, E., Mossmyr, S., Sturm, B., Vegeborn, V., Wedin, J. (2019). From Jigs and Reels to Schottisar och Polskor : Generating Scandinavian-like Folk Music with Deep Recurrent Networks. Presented at The 16th Sound & Music Computing Conference, Malaga, Spain, 28-31 May 2019.
[48]
Chettri, B., Mishra, S., Sturm, B. (2018). Analysing the predictions of a CNN-based replay spoofing detection system. In 2018 IEEE Workshop on Spoken Language Technology (SLT 2018). (pp. 92-97). IEEE.
[49]
Kucherenko, T., Hasegawa, D., Kaneko, N., Henter, G. E., Kjellström, H. (2019). On the Importance of Representations for Speech-Driven Gesture Generation : Extended Abstract. Presented at International Conference on Autonomous Agents and Multiagent Systems (AAMAS '19), May 13-17, 2019, Montréal, Canada. (pp. 2072-2074). The International Foundation for Autonomous Agents and Multiagent Systems (IFAAMAS).
[50]
Kontogiorgos, D., Abelho Pereira, A. T., Gustafson, J. (2019). The Trade-off between Interaction Time and Social Facilitation with Collaborative Social Robots. In The Challenges of Working on Social Robots that Collaborate with People.
Full list in the KTH publications portal