TMH Publications (latest 50)

Below are the 50 latest publications from the Department of Speech, Music and Hearing.

[1]
Fallgren, P., Malisz, Z., Edlund, J. (2019). Bringing order to chaos : A non-sequential approach for browsing large sets of found audio data. In LREC 2018 - 11th International Conference on Language Resources and Evaluation. (pp. 4307-4311). European Language Resources Association (ELRA).
[2]
Selamtzis, A., Castellana, A., Salvi, G., Carullo, A. & Astolfi, A. (2019). Effect of vowel context in cepstral and entropy analysis of pathological voices. Biomedical Signal Processing and Control, 47, 350-357.
[3]
Selamtzis, A., Ternström, S., Richter, B., Burk, F., Köberlein, M., Echternach, M. (2018). A comparison of electroglottographic and glottal area waveforms for phonation type differentiation in male professional singers. (Manuscript).
[4]
Selamtzis, A., Ternström, S., Richter, B., Burk, F., Köberlein, M. & Echternach, M. (2018). A comparison of electroglottographic and glottal area waveforms for phonation type differentiation in male professional singers. Journal of the Acoustical Society of America, 144(6), 3275-3288.
[5]
Sibirtseva, E., Kontogiorgos, D., Nykvist, O., Karaoguz, H., Leite, I., Gustafson, J., Kragic, D. (2018). A Comparison of Visualisation Methods for Disambiguating Verbal Requests in Human-Robot Interaction. In 2018 27th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN).
[6]
Hultén, M., Artman, H. & House, D. (2018). A model to analyse students’ cooperative idea generation in conceptual design. International journal of technology and design education, 28(2), 451-470.
[7]
Kontogiorgos, D., Avramova, V., Alexanderson, S., Jonell, P., Oertel, C., Beskow, J., Skantze, G., Gustafson, J. (2018). A Multimodal Corpus for Mutual Gaze and Joint Attention in Multiparty Situated Interaction. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018). (pp. 119-127). Paris.
[8]
Fallgren, P., Malisz, Z., Edlund, J. (2018). A tool for exploring large amounts of found audio data. In CEUR Workshop Proceedings. (pp. 499-503). CEUR-WS.
[9]
Chettri, B., Sturm, B., Benetos, E. (2018). Analysing Replay Spoofing Countermeasure Performance Under Varied Conditions. In 2018 IEEE 28th International Workshop on Machine Learning for Signal Processing (MLSP). IEEE.
[11]
Dabbaghchian, S. (2018). Computational Modeling of the Vocal Tract : Applications to Speech Production (Doctoral thesis, KTH Royal Institute of Technology, TRITA-EECS-AVL 2018:90). Retrieved from http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-239071.
[12]
Jonell, P., Oertel, C., Kontogiorgos, D., Beskow, J., Gustafson, J. (2018). Crowdsourced Multimodal Corpora Collection Tool. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018). (pp. 728-734). Paris.
[14]
Ternström, S., D'Amario, S. & Selamtzis, A. (2018). Effects of the lung volume on the electroglottographic waveform in trained female singers. Journal of Voice.
[15]
Holzapfel, A., Sturm, B. & Coeckelbergh, M. (2018). Ethical Dimensions of Music Information Retrieval Technology. Transactions of the International Society for Music Information Retrieval, 1(1), 44-55.
[16]
Jonell, P., Bystedt, M., Fallgren, P., Kontogiorgos, D., David Aguas Lopes, J., Malisz, Z., Mascarenhas, S., Oertel, C., Raveh, E., Shore, T. (2018). FARMI: A Framework for Recording Multi-Modal Interactions. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018). (pp. 3969-3974). Paris: European Language Resources Association.
[17]
Pabon, P. & Ternström, S. (2018). Feature maps of the acoustic spectrum of the voice. Journal of Voice.
[18]
[19]
Ternström, S., Johansson, D. & Selamtzis, A. (2018). FonaDyn - A system for real-time analysis of the electroglottogram, over the voice range. SoftwareX, 7, 74-80.
[20]
Peters, C., Li, C., Yang, F., Avramova, V., Skantze, G. (2018). Investigating Social Distances between Humans, Virtual Humans and Virtual Robots in Mixed Reality. In Proceedings of 17th International Conference on Autonomous Agents and MultiAgent Systems. (pp. 2247-2249).
[21]
Shore, T., Androulakaki, T., Skantze, G. (2018). KTH Tangrams: A Dataset for Research on Alignment and Conceptual Pacts in Task-Oriented Dialogue. Presented at Eleventh International Conference on Language Resources and Evaluation (LREC 2018). Tokyo.
[23]
Malisz, Z., Żygis, M. (2018). Lexical stress in Polish : Evidence from focus and phrase-position differentiated production data. In Proceedings of the International Conference on Speech Prosody. (pp. 1008-1012). International Speech Communication Association.
[24]
Körner Gustafsson, J., Södersten, M., Ternström, S. & Schalling, E. (2018). Long-term effects of Lee Silverman Voice Treatment on daily voice use in Parkinson’s disease as measured with a portable voice accumulator. Logopedics, Phoniatrics, Vocology, 1-10.
[25]
Pabon, P. (2018). Mapping Individual Voice Quality over the Voice Range : The Measurement Paradigm of the Voice Range Profile (Doctoral thesis, KTH Royal Institute of Technology, Stockholm, TRITA-EECS-AVL 2018:70). Retrieved from http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-235824.
[26]
Elowsson, A. (2018). Modeling Music : Studies of Music Transcription, Music Perception and Music Production (Doctoral thesis, KTH Royal Institute of Technology, Stockholm, TRITA-EECS-AVL 2018-35). Retrieved from http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-226894.
[27]
Roddy, M., Skantze, G., Harte, N. (2018). Multimodal Continuous Turn-Taking Prediction Using Multiscale RNNs. In ICMI 2018 - Proceedings of the 2018 International Conference on Multimodal Interaction. (pp. 186-190).
[28]
Kontogiorgos, D., Sibirtseva, E., Pereira, A., Skantze, G., Gustafson, J. (2018). Multimodal reference resolution in collaborative assembly tasks. Presented at Workshop on Multimodal Analyses enabling Artificial Agents in Human-Machine Interaction. ACM Digital Library.
[29]
Jansson, E. V. & Kabała, A. (2018). On the influence of arching and material on the vibration of a shell - Towards understanding the soloist violin. Vibrations in Physical Systems, 29.
[30]
Friberg, A., Lindeberg, T., Hellwagner, M., Helgason, P., Salomão, G. L., Elowsson, A., ... Ternström, S. (2018). Prediction of three articulatory categories in vocal sound imitations using models for auditory receptive fields. Journal of the Acoustical Society of America, 144(3), 1467-1483.
[32]
Dabbaghchian, S., Arnela, M., Engwall, O. & Guasch, O. (2018). Reconstruction of vocal tract geometries from biomechanical simulations. International Journal for Numerical Methods in Biomedical Engineering.
[33]
Wistbacka, G., Andrade, P. A., Simberg, S., Hammarberg, B., Södersten, M., Švec, J. G. & Granqvist, S. (2018). Resonance Tube Phonation in Water - the Effect of Tube Diameter and Water Depth on Back Pressure and Bubble Characteristics at Different Airflows. Journal of Voice, 32(1).
[34]
Borin, L., Forsberg, M., Edlund, J., Domeij, R. (2018). Språkbanken 2018 : Research resources for text, speech, & society. In CEUR Workshop Proceedings. (pp. 504-506). CEUR-WS.
[35]
Dabbaghchian, S., Arnela, M., Engwall, O. & Guasch, O. (2018). Synthesis of vowels and vowel-vowel utterances using a 3D biomechanical-acoustic model. IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[36]
Vijayan, A. E., Alexanderson, S., Beskow, J., Leite, I. (2018). Using Constrained Optimization for Real-Time Synchronization of Verbal and Nonverbal Robot Behavior. In 2018 IEEE International Conference on Robotics and Automation (ICRA). (pp. 1955-1961). IEEE Computer Society.
[37]
Shore, T., Skantze, G. (2018). Using Lexical Alignment and Referring Ability to Address Data Sparsity in Situated Dialog Reference Resolution. In Proceedings of 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP). (pp. 2288-2297).
[38]
Szabo Portela, A., Granqvist, S., Ternström, S. & Södersten, M. (2018). Vocal Behavior in Environmental Noise : Comparisons Between Work and Leisure Conditions in Women With Work-related Voice Disorders and Matched Controls. Journal of Voice, 32(1), 126.e23-126.e38.
[39]
Sturm, B. (2018). What do these 5,599,881 parameters mean? : An analysis of a specific LSTM music transcription model, starting with the 70,281 parameters of its softmax layer. In Proceedings of the 6th International Workshop on Musical Metacreation (MUME 2018).
[40]
Lopes, J., Engwall, O., Skantze, G. (2017). A First Visit to the Robot Language Café. In Proceedings of the ISCA workshop on Speech and Language Technology in Education. Stockholm.
[41]
Johansson, R., Skantze, G., Jönsson, A. (2017). A psychotherapy training environment with virtual patients implemented using the Furhat robot platform. In 17th International Conference on Intelligent Virtual Agents, IVA 2017. (pp. 184-187). Springer.
[42]
Stefanov, K., Beskow, J. (2017). A Real-time Gesture Recognition System for Isolated Swedish Sign Language Signs. In Proceedings of the 4th European and 7th Nordic Symposium on Multimodal Communication (MMSYM 2016). Linköping University Electronic Press.
[43]
Arnela, M., Dabbaghchian, S., Guasch, O., Engwall, O. (2017). A semi-polar grid strategy for the three-dimensional finite element simulation of vowel-vowel sequences. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH 2017. (pp. 3477-3481). The International Speech Communication Association (ISCA).
[44]
Degirmenci, N. C., Jansson, J., Hoffman, J., Arnela, M., Sánchez-Martín, P., Guasch, O., Ternström, S. (2017). A Unified Numerical Simulation of Vowel Production That Comprises Phonation and the Emitted Sound. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH 2017. (pp. 3492-3496). The International Speech Communication Association (ISCA).
[45]
Avramova, V., Yang, F., Li, C., Peters, C., Skantze, G. (2017). A virtual poster presenter using mixed reality. In 17th International Conference on Intelligent Virtual Agents, IVA 2017. (pp. 25-28). Springer.
[46]
Strömbergsson, S., Edlund, J., Götze, J., Björkenstam, K. N. (2017). Approximating phonotactic input in children's linguistic environments from orthographic transcripts. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH 2017. (pp. 2213-2217). International Speech Communication Association.
[47]
Mendelson, J., Aylett, M. (2017). Beyond the listening test : An interactive approach to TTS Evaluation. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. (pp. 249-253). International Speech Communication Association.
[48]
Castellana, A., Selamtzis, A., Salvi, G., Carullo, A., Astolfi, A. (2017). Cepstral and entropy analyses in vowels excerpted from continuous speech of dysphonic and control speakers. In Proceedings of the Annual Conference of the International Speech Communication Association, Interspeech 2017. (pp. 1814-1818). International Speech Communication Association.
[49]
[50]
Friberg, A. (2017). Commentary on Polak "How short is the shortest metric subdivision?". Empirical Musicology Review, 12(3-4), 227-228.
Full list in the KTH publications portal
Belongs to: Speech, Music and Hearing
Last changed: Oct 17, 2018