TMH Publications (latest 50)

Below are the 50 latest publications from the Department of Speech, Music and Hearing.

[1]
Selamtzis, A., Castellana, A., Salvi, G., Carullo, A. & Astolfi, A. (2019). Effect of vowel context in cepstral and entropy analysis of pathological voices. Biomedical Signal Processing and Control, 47, 350-357.
[3]
Selamtzis, A., Ternström, S., Richter, B., Burk, F., Köberlein, M., Echternach, M. (2018). A comparison of electroglottographic and glottal area waveforms for phonation type differentiation in male professional singers. (Manuscript).
[4]
Sibirtseva, E., Kontogiorgos, D., Nykvist, O., Karaoguz, H., Leite, I., Gustafson, J., Kragic, D. (2018). A Comparison of Visualisation Methods for Disambiguating Verbal Requests in Human-Robot Interaction. In 2018 27th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN).
[5]
Hultén, M., Artman, H. & House, D. (2018). A model to analyse students’ cooperative idea generation in conceptual design. International journal of technology and design education, 28(2), 451-470.
[6]
Kontogiorgos, D., Avramova, V., Alexanderson, S., Jonell, P., Oertel, C., Beskow, J., Skantze, G., Gustafson, J. (2018). A Multimodal Corpus for Mutual Gaze and Joint Attention in Multiparty Situated Interaction. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018). (pp. 119-127). Paris.
[7]
Fallgren, P., Malisz, Z., Edlund, J. (2018). A tool for exploring large amounts of found audio data. In CEUR Workshop Proceedings. (pp. 499-503). CEUR-WS.
[9]
Dabbaghchian, S. (2018). Computational Modeling of the Vocal Tract : Applications to Speech Production (Doctoral thesis, KTH Royal Institute of Technology, TRITA-EECS-AVL 2018:90). Retrieved from http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-239071.
[10]
Jonell, P., Oertel, C., Kontogiorgos, D., Beskow, J., Gustafson, J. (2018). Crowdsourced Multimodal Corpora Collection Tool. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018). (pp. 728-734). Paris.
[12]
Ternström, S., D'Amario, S. & Selamtzis, A. (2018). Effects of the lung volume on the electroglottographic waveform in trained female singers. Journal of Voice.
[13]
Holzapfel, A., Sturm, B. & Coeckelbergh, M. (2018). Ethical Dimensions of Music Information Retrieval Technology. Transactions of the International Society for Music Information Retrieval, 1(1), 44-55.
[14]
Jonell, P., Mattias, B., Per, F., Kontogiorgos, D., David Aguas Lopes, J., Malisz, Z., Samuel, M., Oertel, C., Eran, R., Shore, T. (2018). FARMI: A Framework for Recording Multi-Modal Interactions. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018). (pp. 3969-3974). Paris: European Language Resources Association.
[15]
Pabon, P. & Ternström, S. (2018). Feature maps of the acoustic spectrum of the voice. Journal of Voice.
[17]
Ternström, S., Johansson, D. & Selamtzis, A. (2018). FonaDyn - A system for real-time analysis of the electroglottogram, over the voice range. SoftwareX, 7, 74-80.
[18]
Shore, T., Androulakaki, T., Skantze, G. (2018). KTH Tangrams: A Dataset for Research on Alignment and Conceptual Pacts in Task-Oriented Dialogue. Presented at Eleventh International Conference on Language Resources and Evaluation (LREC 2018). Tokyo.
[20]
Malisz, Z., Żygis, M. (2018). Lexical stress in Polish : Evidence from focus and phrase-position differentiated production data. In Proceedings of the International Conference on Speech Prosody. (pp. 1008-1012). International Speech Communication Association.
[21]
Körner Gustafsson, J., Södersten, M., Ternström, S. & Schalling, E. (2018). Long-term effects of Lee Silverman Voice Treatment on daily voice use in Parkinson’s disease as measured with a portable voice accumulator. Logopedics, Phoniatrics, Vocology, 1-10.
[22]
Pabon, P. (2018). Mapping Individual Voice Quality over the Voice Range : The Measurement Paradigm of the Voice Range Profile (Doctoral thesis, KTH Royal Institute of Technology, Stockholm, TRITA-EECS-AVL 2018:70). Retrieved from http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-235824.
[23]
Elowsson, A. (2018). Modeling Music : Studies of Music Transcription, Music Perception and Music Production (Doctoral thesis, KTH Royal Institute of Technology, Stockholm, TRITA-EECS-AVL 2018-35). Retrieved from http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-226894.
[24]
Kontogiorgos, D., Sibirtseva, E., Pereira, A., Skantze, G., Gustafson, J. (2018). Multimodal reference resolution in collaborative assembly tasks. Presented at Workshop on Multimodal Analyses enabling Artificial Agents in Human-Machine Interaction. ACM Digital Library.
[25]
Jansson, E. V. & Kabała, A. (2018). On the influence of arching and material on the vibration of a shell - Towards understanding the soloist violin. Vibrations in Physical Systems, 29.
[26]
Friberg, A., Lindeberg, T., Hellwagner, M., Helgason, P., Salomão, G. L., Elowsson, A. ... Ternström, S. (2018). Prediction of three articulatory categories in vocal sound imitations using models for auditory receptive fields. Journal of the Acoustical Society of America.
[28]
Dabbaghchian, S., Arnela, M., Engwall, O. & Guasch, O. (2018). Reconstruction of vocal tract geometries from biomechanical simulations. International Journal for Numerical Methods in Biomedical Engineering.
[29]
Wistbacka, G., Andrade, P. A., Simberg, S., Hammarberg, B., Södersten, M., Švec, J. G. & Granqvist, S. (2018). Resonance Tube Phonation in Water - the Effect of Tube Diameter and Water Depth on Back Pressure and Bubble Characteristics at Different Airflows. Journal of Voice, 32(1).
[30]
Borin, L., Forsberg, M., Edlund, J., Domeij, R. (2018). Språkbanken 2018 : Research resources for text, speech, & society. In CEUR Workshop Proceedings. (pp. 504-506). CEUR-WS.
[31]
Dabbaghchian, S., Arnela, M., Engwall, O. & Guasch, O. (2018). Synthesis of vowels and vowel-vowel utterances using a 3D biomechanical-acoustic model. IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[32]
Vijayan, A. E., Alexanderson, S., Beskow, J., Leite, I. (2018). Using Constrained Optimization for Real-Time Synchronization of Verbal and Nonverbal Robot Behavior. In 2018 IEEE International Conference on Robotics and Automation (ICRA). (pp. 1955-1961). IEEE Computer Society.
[33]
Szabo Portela, A., Granqvist, S., Ternström, S. & Södersten, M. (2018). Vocal Behavior in Environmental Noise : Comparisons Between Work and Leisure Conditions in Women With Work-related Voice Disorders and Matched Controls. Journal of Voice, 32(1), 126.e23-126.e38.
[34]
Sturm, B. (2018). What do these 5,599,881 parameters mean? : An analysis of a specific LSTM music transcription model, starting with the 70,281 parameters of its softmax layer. In Proceedings of the 6th International Workshop on Musical Metacreation (MUME 2018).
[35]
Lopes, J., Engwall, O., Skantze, G. (2017). A First Visit to the Robot Language Café. In Proceedings of the ISCA workshop on Speech and Language Technology in Education. Stockholm.
[36]
Johansson, R., Skantze, G., Jönsson, A. (2017). A psychotherapy training environment with virtual patients implemented using the furhat robot platform. In 17th International Conference on Intelligent Virtual Agents, IVA 2017. (pp. 184-187). Springer.
[37]
Stefanov, K., Beskow, J. (2017). A Real-time Gesture Recognition System for Isolated Swedish Sign Language Signs. In Proceedings of the 4th European and 7th Nordic Symposium on Multimodal Communication (MMSYM 2016). Linköping University Electronic Press.
[38]
Arnela, M., Dabbaghchian, S., Guasch, O., Engwall, O. (2017). A semi-polar grid strategy for the three-dimensional finite element simulation of vowel-vowel sequences. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH 2017. (pp. 3477-3481). The International Speech Communication Association (ISCA).
[39]
Degirmenci, N. C., Jansson, J., Hoffman, J., Arnela, M., Sánchez-Martín, P., Guasch, O., Ternström, S. (2017). A Unified Numerical Simulation of Vowel Production That Comprises Phonation and the Emitted Sound. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH 2017. (pp. 3492-3496). The International Speech Communication Association (ISCA).
[40]
Avramova, V., Yang, F., Li, C., Peters, C., Skantze, G. (2017). A virtual poster presenter using mixed reality. In 17th International Conference on Intelligent Virtual Agents, IVA 2017. (pp. 25-28). Springer.
[41]
Strömbergsson, S., Edlund, J., Götze, J., Björkenstam, K. N. (2017). Approximating phonotactic input in children's linguistic environments from orthographic transcripts. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH 2017. (pp. 2213-2217). International Speech Communication Association.
[42]
Mendelson, J., Aylett, M. (2017). Beyond the listening test : An interactive approach to TTS Evaluation. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. (pp. 249-253). International Speech Communication Association.
[43]
Castellana, A., Selamtzis, A., Salvi, G., Carullo, A., Astolfi, A. (2017). Cepstral and entropy analyses in vowels excerpted from continuous speech of dysphonic and control speakers. In Proceedings of the Annual Conference of the International Speech Communication Association, Interspeech 2017. (pp. 1814-1818). International Speech Communication Association.
[45]
Friberg, A. (2017). Commentary on Polak "How short is the shortest metric subdivision?" Empirical Musicology Review, 12(3-4), 227-228.
[46]
Karipidou, K., Ahnlund, J., Friberg, A., Alexanderson, S., Kjellström, H. (2017). Computer Analysis of Sentiment Interpretation in Musical Conducting. In Proceedings - 12th IEEE International Conference on Automatic Face and Gesture Recognition, FG 2017. (pp. 400-405). IEEE.
[47]
Malisz, Z., Berthelsen, H., Beskow, J., Gustafson, J. (2017). Controlling prominence realisation in parametric DNN-based speech synthesis. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH 2017. (pp. 1079-1083). International Speech Communication Association.
[49]
Friberg, A., Choi, K., Schön, R., Downie, J. S., Elowsson, A. (2017). Cross-cultural aspects of perceptual features in K-pop : A pilot study comparing Chinese and Swedish listeners. In 2017 ICMC/EMW - 43rd International Computer Music Conference and the 6th International Electronic Music Week. (pp. 291-296). Shanghai Conservatory of Music.
[50]
Jonell, P., Oertel, C., Kontogiorgos, D., Beskow, J., Gustafson, J. (2017). Crowd-powered design of virtual attentive listeners. In 17th International Conference on Intelligent Virtual Agents, IVA 2017. (pp. 188-191). Springer.
Full list in the KTH publications portal