TMH Publications (latest 50)

Below are the 50 latest publications from the Department of Speech, Music and Hearing.


[1]
Friberg, A., Lindeberg, T., Hellwagner, M., Helgason, P., Salomão, G. L., Elowsson, A. ... Ternström, S. (2018). Prediction of three articulatory categories in vocal sound imitations using models for auditory receptive fields. Journal of the Acoustical Society of America, 144(3), 1467-1483.
[2]
Selamtzis, A., Castellana, A., Salvi, G., Carullo, A. & Astolfi, A. (2019). Effect of vowel context in cepstral and entropy analysis of pathological voices. Biomedical Signal Processing and Control, 47, 350-357.
[3]
Li, C., Androulakaki, T., Gao, A. Y., Yang, F., Saikia, H., Peters, C., Skantze, G. (2018). Effects of Posture and Embodiment on Social Distance in Human-Agent Interaction in Mixed Reality. In Proceedings of the 18th International Conference on Intelligent Virtual Agents. (pp. 191-196). ACM Digital Library.
[4]
Peters, C., Li, C., Yang, F., Avramova, V., Skantze, G. (2018). Investigating Social Distances between Humans, Virtual Humans and Virtual Robots in Mixed Reality. In Proceedings of 17th International Conference on Autonomous Agents and MultiAgent Systems. (pp. 2247-2249).
[5]
Szabo Portela, A., Granqvist, S., Ternström, S. & Södersten, M. (2018). Vocal Behavior in Environmental Noise : Comparisons Between Work and Leisure Conditions in Women With Work-related Voice Disorders and Matched Controls. Journal of Voice, 32(1), 126.e23-126.e38.
[6]
Wistbacka, G., Andrade, P. A., Simberg, S., Hammarberg, B., Sodersten, M., Svec, J. G. & Granqvist, S. (2018). Resonance Tube Phonation in Water-the Effect of Tube Diameter and Water Depth on Back Pressure and Bubble Characteristics at Different Airflows. Journal of Voice, 32(1).
[7]
Fallgren, P., Malisz, Z., Edlund, J. (2018). A tool for exploring large amounts of found audio data. In CEUR Workshop Proceedings. (pp. 499-503). CEUR-WS.
[8]
Jonell, P., Bystedt, M., Fallgren, P., Kontogiorgos, D., Lopes, J. D., Malisz, Z., Mascarenhas, S., Oertel, C., Raveh, E., Shore, T. (2018). FARMI: A Framework for Recording Multi-Modal Interactions. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018). (pp. 3969-3974). Paris: European Language Resources Association.
[9]
Malisz, Z., Żygis, M. (2018). Lexical stress in Polish : Evidence from focus and phrase-position differentiated production data. In Proceedings of the International Conference on Speech Prosody. (pp. 1008-1012). International Speech Communications Association.
[10]
Fallgren, P., Malisz, Z., Edlund, J. (2019). Bringing order to chaos : A non-sequential approach for browsing large sets of found audio data. In LREC 2018 - 11th International Conference on Language Resources and Evaluation. (pp. 4307-4311). European Language Resources Association (ELRA).
[11]
Johansson, R., Skantze, G., Jönsson, A. (2017). A psychotherapy training environment with virtual patients implemented using the furhat robot platform. In 17th International Conference on Intelligent Virtual Agents, IVA 2017. (pp. 184-187). Springer.
[12]
Kontogiorgos, D., Avramova, V., Alexanderson, S., Jonell, P., Oertel, C., Beskow, J., Skantze, G., Gustafson, J. (2018). A Multimodal Corpus for Mutual Gaze and Joint Attention in Multiparty Situated Interaction. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018). (pp. 119-127). Paris.
[13]
Kontogiorgos, D., Sibirtseva, E., Pereira, A., Skantze, G., Gustafson, J. (2018). Multimodal reference resolution in collaborative assembly tasks. ACM Digital Library.
[14]
Lopes, J., Engwall, O., Skantze, G. (2017). A First Visit to the Robot Language Café. In Proceedings of the ISCA workshop on Speech and Language Technology in Education. Stockholm.
[15]
Roddy, M., Skantze, G., Harte, N. (2018). Multimodal Continuous Turn-Taking Prediction Using Multiscale RNNs. In ICMI 2018 - Proceedings of the 2018 International Conference on Multimodal Interaction. (pp. 186-190).
[16]
Shore, T., Androulakaki, T., Skantze, G. (2019). KTH Tangrams: A Dataset for Research on Alignment and Conceptual Pacts in Task-Oriented Dialogue. In LREC 2018 - 11th International Conference on Language Resources and Evaluation. (pp. 768-775). Tokyo.
[17]
Shore, T., Skantze, G. (2018). Using Lexical Alignment and Referring Ability to Address Data Sparsity in Situated Dialog Reference Resolution. In Proceedings of 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP). (pp. 2288-2297).
[18]
Jonell, P., Oertel, C., Kontogiorgos, D., Beskow, J., Gustafson, J. (2018). Crowdsourced Multimodal Corpora Collection Tool. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018). (pp. 728-734). Paris.
[19]
Sibirtseva, E., Kontogiorgos, D., Nykvist, O., Karaoguz, H., Leite, I., Gustafson, J., Kragic, D. (2018). A Comparison of Visualisation Methods for Disambiguating Verbal Requests in Human-Robot Interaction. In 2018 27th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN).
[20]
Holzapfel, A., Sturm, B. & Coeckelbergh, M. (2018). Ethical Dimensions of Music Information Retrieval Technology. Transactions of the International Society for Music Information Retrieval, 1(1), 44-55.
[21]
Borin, L., Forsberg, M., Edlund, J., Domeij, R. (2018). Språkbanken 2018 : Research resources for text, speech, & society. In CEUR Workshop Proceedings. (pp. 504-506). CEUR-WS.
[22]
Körner Gustafsson, J., Södersten, M., Ternström, S. & Schalling, E. (2018). Long-term effects of Lee Silverman Voice Treatment on daily voice use in Parkinson’s disease as measured with a portable voice accumulator. Logopedics, Phoniatrics, Vocology, 1-10.
[23]
Pabon, P. & Ternström, S. (2018). Feature maps of the acoustic spectrum of the voice. Journal of Voice.
[24]
Selamtzis, A., Ternström, S., Richter, B., Burk, F., Köberlein, M., Echternach, M. (2018). A comparison of electroglottographic and glottal area waveforms for phonation type differentiation in male professional singers. (Manuscript).
[25]
Selamtzis, A., Ternström, S., Richter, B., Burk, F., Köberlein, M. & Echternach, M. (2018). A comparison of electroglottographic and glottal area waveforms for phonation type differentiation in male professional singers. Journal of the Acoustical Society of America, 144(6), 3275-3288.
[26]
Ternström, S., D'Amario, S. & Selamtzis, A. (2018). Effects of the lung volume on the electroglottographic waveform in trained female singers. Journal of Voice.
[27]
Ternström, S., Johansson, D. & Selamtzis, A. (2018). FonaDyn - A system for real-time analysis of the electroglottogram, over the voice range. SoftwareX, 7, 74-80.
[28]
Hultén, M., Artman, H. & House, D. (2018). A model to analyse students’ cooperative idea generation in conceptual design. International Journal of Technology and Design Education, 28(2), 451-470.
[29]
Stefanov, K., Beskow, J. (2017). A Real-time Gesture Recognition System for Isolated Swedish Sign Language Signs. In Proceedings of the 4th European and 7th Nordic Symposium on Multimodal Communication (MMSYM 2016). Linköping University Electronic Press.
[30]
Arnela, M., Dabbaghchian, S., Guasch, O., Engwall, O. (2017). A semi-polar grid strategy for the three-dimensional finite element simulation of vowel-vowel sequences. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH 2017. (pp. 3477-3481). The International Speech Communication Association (ISCA).
[31]
Chettri, B., Sturm, B., Benetos, E. (2018). Analysing replay spoofing countermeasure performance under varied conditions. In 2018 IEEE 28th International Workshop on Machine Learning for Signal Processing (MLSP). IEEE.
[33]
Dabbaghchian, S. (2018). Computational Modeling of the Vocal Tract : Applications to Speech Production (Doctoral thesis, KTH Royal Institute of Technology, TRITA-EECS-AVL 2018:90). Retrieved from http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-239071.
[37]
Pabon, P. (2018). Mapping Individual Voice Quality over the Voice Range : The Measurement Paradigm of the Voice Range Profile (Doctoral thesis, KTH Royal Institute of Technology, Stockholm, TRITA-EECS-AVL 2018:70). Retrieved from http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-235824.
[38]
Elowsson, A. (2018). Modeling Music : Studies of Music Transcription, Music Perception and Music Production (Doctoral thesis, KTH Royal Institute of Technology, Stockholm, TRITA-EECS-AVL 2018-35). Retrieved from http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-226894.
[39]
Jansson, E. V. & Kabała, A. (2018). On the influence of arching and material on the vibration of a shell - Towards understanding the soloist violin. Vibrations in Physical Systems, 29.
[41]
Dabbaghchian, S., Arnela, M., Engwall, O. & Guasch, O. (2018). Reconstruction of vocal tract geometries from biomechanical simulations. International Journal for Numerical Methods in Biomedical Engineering.
[42]
Dabbaghchian, S., Arnela, M., Engwall, O. & Guasch, O. (2018). Synthesis of vowels and vowel-vowel utterances using a 3D biomechanical-acoustic model. IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[43]
Vijayan, A. E., Alexanderson, S., Beskow, J., Leite, I. (2018). Using Constrained Optimization for Real-Time Synchronization of Verbal and Nonverbal Robot Behavior. In 2018 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA). (pp. 1955-1961). IEEE Computer Society.
[44]
Sturm, B. (2018). What do these 5,599,881 parameters mean? : An analysis of a specific LSTM music transcription model, starting with the 70,281 parameters of its softmax layer. In Proceedings of the 6th International Workshop on Musical Metacreation (MUME 2018).
[45]
Roddy, M., Skantze, G., Harte, N. (2018). Investigating speech features for continuous turn-taking prediction using LSTMs. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. (pp. 586-590). International Speech Communication Association.
[46]
Bisesi, E., Friberg, A. & Parncutt, R. (2019). A Computational Model of Immanent Accent Salience in Tonal Music. Frontiers in Psychology, 10(317), 1-19.
[47]
Finkel, S., Veit, R., Lotze, M., Friberg, A., Vuust, P., Soekadar, S. ... Kleber, B. (2019). Intermittent theta burst stimulation over right somatosensory larynx cortex enhances vocal pitch-regulation in nonsingers. Human Brain Mapping.
[48]
Hallström, E., Mossmyr, S., Sturm, B., Vegeborn, V., Wedin, J. (2019). From Jigs and Reels to Schottisar och Polskor : Generating Scandinavian-like Folk Music with Deep Recurrent Networks. Presented at Sound and Music Computing.
[49]
Chettri, B., Sturm, B., Benetos, E. (2018). Analysing replay spoofing countermeasure performance under varied conditions. Presented at the International Workshop on Machine Learning for Signal Processing (MLSP). Aalborg, DK.
[50]
Chettri, B., Mishra, S., Sturm, B. (2018). Analysing the predictions of a CNN-based replay spoofing detection system. Presented at the IEEE Spoken Language Technology Workshop (SLT). IEEE.
Full list in the KTH publications portal