TMH Publications (latest 50)

Below are the 50 latest publications from the Department of Speech, Music and Hearing.


[1]
Friberg, A., Lindeberg, T., Hellwagner, M., Helgason, P., Salomão, G. L., Elowsson, A. ... Ternström, S. (2018). Prediction of three articulatory categories in vocal sound imitations using models for auditory receptive fields. Journal of the Acoustical Society of America, 144(3), 1467-1483.
[2]
Saponaro, G. (2019). Beyond the Self: Using Grounded Affordances to Interpret and Describe Others’ Actions. IEEE Transactions on Cognitive and Developmental Systems.
[3]
Selamtzis, A., Castellana, A., Salvi, G., Carullo, A. & Astolfi, A. (2019). Effect of vowel context in cepstral and entropy analysis of pathological voices. Biomedical Signal Processing and Control, 47, 350-357.
[4]
Li, C., Androulakaki, T., Gao, A. Y., Yang, F., Saikia, H., Peters, C., Skantze, G. (2018). Effects of Posture and Embodiment on Social Distance in Human-Agent Interaction in Mixed Reality. In Proceedings of the 18th International Conference on Intelligent Virtual Agents (pp. 191-196). ACM Digital Library.
[5]
Peters, C., Li, C., Yang, F., Avramova, V., Skantze, G. (2018). Investigating Social Distances between Humans, Virtual Humans and Virtual Robots in Mixed Reality. In Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems (pp. 2247-2249).
[6]
Szabo Portela, A., Granqvist, S., Ternström, S. & Södersten, M. (2018). Vocal Behavior in Environmental Noise: Comparisons Between Work and Leisure Conditions in Women With Work-related Voice Disorders and Matched Controls. Journal of Voice, 32(1), 126.e23-126.e38.
[7]
Wistbacka, G., Andrade, P. A., Simberg, S., Hammarberg, B., Södersten, M., Švec, J. G. & Granqvist, S. (2018). Resonance Tube Phonation in Water: The Effect of Tube Diameter and Water Depth on Back Pressure and Bubble Characteristics at Different Airflows. Journal of Voice, 32(1).
[8]
Fallgren, P., Malisz, Z., Edlund, J. (2018). A tool for exploring large amounts of found audio data. In CEUR Workshop Proceedings (pp. 499-503). CEUR-WS.
[9]
Jonell, P., Bystedt, M., Fallgren, P., Kontogiorgos, D., Lopes, J. D. A., Malisz, Z., Mascarenhas, S., Oertel, C., Raveh, E., Shore, T. (2018). FARMI: A Framework for Recording Multi-Modal Interactions. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018) (pp. 3969-3974). Paris: European Language Resources Association.
[10]
Malisz, Z., Żygis, M. (2018). Lexical stress in Polish: Evidence from focus and phrase-position differentiated production data. In Proceedings of the International Conference on Speech Prosody (pp. 1008-1012). International Speech Communication Association.
[11]
Fallgren, P., Malisz, Z., Edlund, J. (2019). Bringing order to chaos: A non-sequential approach for browsing large sets of found audio data. In LREC 2018 - 11th International Conference on Language Resources and Evaluation (pp. 4307-4311). European Language Resources Association (ELRA).
[12]
Kontogiorgos, D., Avramova, V., Alexanderson, S., Jonell, P., Oertel, C., Beskow, J., Skantze, G., Gustafson, J. (2018). A Multimodal Corpus for Mutual Gaze and Joint Attention in Multiparty Situated Interaction. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018) (pp. 119-127). Paris.
[13]
Kontogiorgos, D., Sibirtseva, E., Pereira, A., Skantze, G., Gustafson, J. (2018). Multimodal reference resolution in collaborative assembly tasks. ACM Digital Library.
[14]
Lopes, J., Engwall, O., Skantze, G. (2017). A First Visit to the Robot Language Café. In Proceedings of the ISCA Workshop on Speech and Language Technology in Education. Stockholm.
[15]
Roddy, M., Skantze, G., Harte, N. (2018). Multimodal Continuous Turn-Taking Prediction Using Multiscale RNNs. In ICMI 2018 - Proceedings of the 2018 International Conference on Multimodal Interaction (pp. 186-190).
[16]
Shore, T., Androulakaki, T., Skantze, G. (2019). KTH Tangrams: A Dataset for Research on Alignment and Conceptual Pacts in Task-Oriented Dialogue. In LREC 2018 - 11th International Conference on Language Resources and Evaluation (pp. 768-775). Tokyo.
[17]
Shore, T., Skantze, G. (2018). Using Lexical Alignment and Referring Ability to Address Data Sparsity in Situated Dialog Reference Resolution. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP) (pp. 2288-2297).
[18]
Jonell, P., Oertel, C., Kontogiorgos, D., Beskow, J., Gustafson, J. (2018). Crowdsourced Multimodal Corpora Collection Tool. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018) (pp. 728-734). Paris.
[19]
Sibirtseva, E., Kontogiorgos, D., Nykvist, O., Karaoğuz, H., Leite, I., Gustafson, J., Kragic, D. (2018). A Comparison of Visualisation Methods for Disambiguating Verbal Requests in Human-Robot Interaction. In 2018 27th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN).
[20]
Holzapfel, A., Sturm, B. & Coeckelbergh, M. (2018). Ethical Dimensions of Music Information Retrieval Technology. Transactions of the International Society for Music Information Retrieval, 1(1), 44-55.
[21]
Borin, L., Forsberg, M., Edlund, J., Domeij, R. (2018). Språkbanken 2018: Research resources for text, speech, & society. In CEUR Workshop Proceedings (pp. 504-506). CEUR-WS.
[22]
Körner Gustafsson, J., Södersten, M., Ternström, S. & Schalling, E. (2018). Long-term effects of Lee Silverman Voice Treatment on daily voice use in Parkinson’s disease as measured with a portable voice accumulator. Logopedics, Phoniatrics, Vocology, 1-10.
[23]
Pabon, P. & Ternström, S. (2018). Feature maps of the acoustic spectrum of the voice. Journal of Voice.
[24]
Selamtzis, A., Ternström, S., Richter, B., Burk, F., Köberlein, M., Echternach, M. (2018). A comparison of electroglottographic and glottal area waveforms for phonation type differentiation in male professional singers. (Manuscript).
[25]
Selamtzis, A., Ternström, S., Richter, B., Burk, F., Köberlein, M. & Echternach, M. (2018). A comparison of electroglottographic and glottal area waveforms for phonation type differentiation in male professional singers. Journal of the Acoustical Society of America, 144(6), 3275-3288.
[26]
Ternström, S., D'Amario, S. & Selamtzis, A. (2018). Effects of the lung volume on the electroglottographic waveform in trained female singers. Journal of Voice.
[27]
Ternström, S., Johansson, D. & Selamtzis, A. (2018). FonaDyn - A system for real-time analysis of the electroglottogram, over the voice range. SoftwareX, 7, 74-80.
[28]
Hultén, M., Artman, H. & House, D. (2018). A model to analyse students' cooperative idea generation in conceptual design. International journal of technology and design education, 28(2), 451-470.
[29]
Chettri, B., Sturm, B., Benetos, E. (2018). Analysing replay spoofing countermeasure performance under varied conditions. In 2018 IEEE 28th International Workshop on Machine Learning for Signal Processing (MLSP). IEEE.
[31]
Dabbaghchian, S. (2018). Computational Modeling of the Vocal Tract: Applications to Speech Production (Doctoral thesis, KTH Royal Institute of Technology, TRITA-EECS-AVL 2018:90). Retrieved from http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-239071.
[35]
Pabon, P. (2018). Mapping Individual Voice Quality over the Voice Range: The Measurement Paradigm of the Voice Range Profile (Doctoral thesis, KTH Royal Institute of Technology, Stockholm, TRITA-EECS-AVL 2018:70). Retrieved from http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-235824.
[36]
Elowsson, A. (2018). Modeling Music: Studies of Music Transcription, Music Perception and Music Production (Doctoral thesis, KTH Royal Institute of Technology, Stockholm, TRITA-EECS-AVL 2018-35). Retrieved from http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-226894.
[37]
Jansson, E. V. & Kabała, A. (2018). On the influence of arching and material on the vibration of a shell - Towards understanding the soloist violin. Vibrations in Physical Systems, 29.
[39]
Dabbaghchian, S., Arnela, M., Engwall, O. & Guasch, O. (2018). Reconstruction of vocal tract geometries from biomechanical simulations. International Journal for Numerical Methods in Biomedical Engineering.
[40]
Dabbaghchian, S., Arnela, M., Engwall, O. & Guasch, O. (2018). Synthesis of vowels and vowel-vowel utterances using a 3D biomechanical-acoustic model. IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[41]
Vijayan, A. E., Alexanderson, S., Beskow, J., Leite, I. (2018). Using Constrained Optimization for Real-Time Synchronization of Verbal and Nonverbal Robot Behavior. In 2018 IEEE International Conference on Robotics and Automation (ICRA) (pp. 1955-1961). IEEE Computer Society.
[42]
Sturm, B. (2018). What do these 5,599,881 parameters mean? An analysis of a specific LSTM music transcription model, starting with the 70,281 parameters of its softmax layer. In Proceedings of the 6th International Workshop on Musical Metacreation (MUME 2018).
[43]
Roddy, M., Skantze, G., Harte, N. (2018). Investigating speech features for continuous turn-taking prediction using LSTMs. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH (pp. 586-590). International Speech Communication Association.
[44]
Bisesi, E., Friberg, A. & Parncutt, R. (2019). A Computational Model of Immanent Accent Salience in Tonal Music. Frontiers in Psychology, 10(317), 1-19.
[45]
Finkel, S., Veit, R., Lotze, M., Friberg, A., Vuust, P., Soekadar, S. ... Kleber, B. (2019). Intermittent theta burst stimulation over right somatosensory larynx cortex enhances vocal pitch‐regulation in nonsingers. Human Brain Mapping.
[46]
Kragic, D., Gustafson, J., Karaoğuz, H., Jensfelt, P., Krug, R. (2018). Interactive, collaborative robots: Challenges and opportunities. In IJCAI International Joint Conference on Artificial Intelligence (pp. 18-25). International Joint Conferences on Artificial Intelligence.
[47]
Hallström, E., Mossmyr, S., Sturm, B., Vegeborn, V., Wedin, J. (2019). From Jigs and Reels to Schottisar och Polskor: Generating Scandinavian-like Folk Music with Deep Recurrent Networks. Presented at The 16th Sound & Music Computing Conference, Malaga, Spain, 28-31 May 2019.
[48]
Chettri, B., Mishra, S., Sturm, B. (2018). Analysing the predictions of a CNN-based replay spoofing detection system. In 2018 IEEE Workshop on Spoken Language Technology (SLT 2018) (pp. 92-97). IEEE.
[49]
Kucherenko, T., Hasegawa, D., Kaneko, N., Henter, G. E., Kjellström, H. (2019). On the Importance of Representations for Speech-Driven Gesture Generation: Extended Abstract. Presented at the International Conference on Autonomous Agents and Multiagent Systems (AAMAS '19), May 13-17, 2019, Montréal, Canada (pp. 2072-2074). The International Foundation for Autonomous Agents and Multiagent Systems (IFAAMAS).
[50]
Kontogiorgos, D., Abelho Pereira, A. T., Gustafson, J. (2019). The Trade-off between Interaction Time and Social Facilitation with Collaborative Social Robots. In The Challenges of Working on Social Robots that Collaborate with People.
Full list in KTH's publication portal