TMH Publications (latest 50)

Below are the 50 latest publications from the Department of Speech, Music and Hearing.

[1]
Saponaro, G., Jamone, L., Bernardino, A. & Salvi, G. (2019). Beyond the Self: Using Grounded Affordances to Interpret and Describe Others’ Actions. IEEE Transactions on Cognitive and Developmental Systems.
[2]
Selamtzis, A., Castellana, A., Salvi, G., Carullo, A. & Astolfi, A. (2019). Effect of vowel context in cepstral and entropy analysis of pathological voices. Biomedical Signal Processing and Control, 47, 350-357.
[3]
Fallgren, P., Malisz, Z., Edlund, J. (2018). A tool for exploring large amounts of found audio data. In CEUR Workshop Proceedings. (pp. 499-503). CEUR-WS.
[4]
Fallgren, P., Malisz, Z., Edlund, J. (2019). Bringing order to chaos: A non-sequential approach for browsing large sets of found audio data. In LREC 2018 - 11th International Conference on Language Resources and Evaluation. (pp. 4307-4311). European Language Resources Association (ELRA).
[5]
Kontogiorgos, D., Avramova, V., Alexanderson, S., Jonell, P., Oertel, C., Beskow, J., Skantze, G., Gustafson, J. (2018). A Multimodal Corpus for Mutual Gaze and Joint Attention in Multiparty Situated Interaction. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018). (pp. 119-127). Paris.
[6]
Shore, T., Androulakaki, T., Skantze, G. (2019). KTH Tangrams: A Dataset for Research on Alignment and Conceptual Pacts in Task-Oriented Dialogue. In LREC 2018 - 11th International Conference on Language Resources and Evaluation. (pp. 768-775). Tokyo.
[7]
Sibirtseva, E., Kontogiorgos, D., Nykvist, O., Karaoguz, H., Leite, I., Gustafson, J., Kragic, D. (2018). A Comparison of Visualisation Methods for Disambiguating Verbal Requests in Human-Robot Interaction. In 2018 27th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN).
[8]
Körner Gustafsson, J., Södersten, M., Ternström, S. & Schalling, E. (2019). Long-term effects of Lee Silverman Voice Treatment on daily voice use in Parkinson’s disease as measured with a portable voice accumulator. Logopedics, Phoniatrics, Vocology, 44(3), 124-133.
[9]
Selamtzis, A., Ternström, S., Richter, B., Burk, F., Köberlein, M., Echternach, M. (2018). A comparison of electroglottographic and glottal area waveforms for phonation type differentiation in male professional singers. (Manuscript).
[10]
Selamtzis, A., Ternström, S., Richter, B., Burk, F., Köberlein, M. & Echternach, M. (2018). A comparison of electroglottographic and glottal area waveforms for phonation type differentiation in male professional singers. Journal of the Acoustical Society of America, 144(6), 3275-3288.
[11]
Hultén, M., Artman, H. & House, D. (2018). A model to analyse students’ cooperative idea generation in conceptual design. International Journal of Technology and Design Education, 28(2), 451-470.
[12]
Chettri, B., Sturm, B., Benetos, E. (2018). Analysing Replay Spoofing Countermeasure Performance under Varied Conditions. In 2018 IEEE 28th International Workshop on Machine Learning for Signal Processing (MLSP). IEEE.
[13]
Bisesi, E., Friberg, A. & Parncutt, R. (2019). A Computational Model of Immanent Accent Salience in Tonal Music. Frontiers in Psychology, 10(317), 1-19.
[14]
Finkel, S., Veit, R., Lotze, M., Friberg, A., Vuust, P., Soekadar, S. ... Kleber, B. (2019). Intermittent theta burst stimulation over right somatosensory larynx cortex enhances vocal pitch‐regulation in nonsingers. Human Brain Mapping.
[15]
Hallström, E., Mossmyr, S., Sturm, B., Vegeborn, V., Wedin, J. (2019). From Jigs and Reels to Schottisar och Polskor: Generating Scandinavian-like Folk Music with Deep Recurrent Networks. Presented at The 16th Sound & Music Computing Conference, Malaga, Spain, 28-31 May 2019.
[16]
Kucherenko, T., Hasegawa, D., Kaneko, N., Henter, G. E., Kjellström, H. (2019). On the Importance of Representations for Speech-Driven Gesture Generation: Extended Abstract. Presented at International Conference on Autonomous Agents and Multiagent Systems (AAMAS '19), May 13-17, 2019, Montréal, Canada. (pp. 2072-2074). The International Foundation for Autonomous Agents and Multiagent Systems (IFAAMAS).
[17]
Kontogiorgos, D., Abelho Pereira, A. T., Gustafson, J. (2019). The Trade-off between Interaction Time and Social Facilitation with Collaborative Social Robots. In The Challenges of Working on Social Robots that Collaborate with People.
[18]
Skantze, G., Gustafson, J. & Beskow, J. (2019). Multimodal Conversational Interaction with Robots. In Sharon Oviatt, Björn Schuller, Philip R. Cohen, Daniel Sonntag, Gerasimos Potamianos, Antonio Krüger (Eds.), The Handbook of Multimodal-Multisensor Interfaces, Volume 3: Language Processing, Software, Commercialization, and Emerging Directions. ACM Press.
[19]
Friberg, A., Bisesi, E., Addessi, A. R. & Baroni, M. (2019). Probing the Underlying Principles of Perceived Immanent Accents Using a Modeling Approach. Frontiers in Psychology, 10.
[20]
Kucherenko, T., Hasegawa, D., Henter, G. E., Kaneko, N., Kjellström, H. (2019). Analyzing Input and Output Representations for Speech-Driven Gesture Generation. In 19th ACM International Conference on Intelligent Virtual Agents. New York, NY, USA: ACM Publications.
[21]
Ternström, S. (2019). Normalized time-domain parameters for electroglottographic waveforms. Journal of the Acoustical Society of America, 146(1), EL65-EL70.
[22]
Kontogiorgos, D., Skantze, G., Abelho Pereira, A. T., Gustafson, J. (2019). The Effects of Embodiment and Social Eye-Gaze in Conversational Agents. In Proceedings of the 41st Annual Conference of the Cognitive Science Society (CogSci).
[23]
Rodríguez-Algarra, F., Sturm, B. & Dixon, S. (2019). Characterising Confounding Effects in Music Classification Experiments through Interventions. Transactions of the International Society for Music Information Retrieval, 52-66.
[24]
Mishra, S., Stoller, D., Benetos, E., Sturm, B., Dixon, S. (2019). GAN-Based Generation and Automatic Selection of Explanations for Neural Networks. Presented at Safe Machine Learning 2019 Workshop at the International Conference on Learning Representations.
[25]
Stefanov, K., Salvi, G., Kontogiorgos, D., Kjellström, H. & Beskow, J. (2019). Modeling of Human Visual Attention in Multiparty Open-World Dialogues. ACM Transactions on Human-Robot Interaction, 8(2).
[26]
Sturm, B., Iglesias, M., Ben-Tal, O., Miron, M. & Gómez, E. (2019). Artificial Intelligence and Music: Open Questions of Copyright Law and Engineering Praxis. MDPI Arts, 8(3).
[27]
Kontogiorgos, D. (2019). Multimodal Language Grounding for Human-Robot Collaboration: YRRSDS 2019 - Dimosthenis Kontogiorgos. In Young Researchers Roundtable on Spoken Dialogue Systems.
[28]
Lã, F. M. B., Ternström, S. (2019). Flow ball-assisted training: immediate effects on vocal fold contacting. In Pan-European Voice Conference 2019. (pp. 50-51). University of Copenhagen.
[29]
Ternström, S., Pabon, P. (2019). Accounting for variability over the voice range. In Proceedings of the ICA 2019 and EAA Euroregio. (pp. 7775-7780). Aachen, DE: Deutsche Gesellschaft für Akustik (DEGA e.V.).
[30]
Stefanov, K. (2019). Self-Supervised Vision-Based Detection of the Active Speaker as Support for Socially-Aware Language Acquisition. IEEE Transactions on Cognitive and Developmental Systems.
[31]
Clark, L., Cowan, B. R., Edwards, J., Munteanu, C., Murad, C., Aylett, M., Moore, R. K., Edlund, J., Székely, É., Healey, P., Harte, N., Torre, I., Doyle, P. (2019). Mapping Theoretical and Methodological Perspectives for Understanding Speech Interface Interactions. In CHI EA '19 Extended Abstracts: Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems. Association for Computing Machinery (ACM).
[32]
Székely, É., Henter, G. E., Gustafson, J. (2019). Casting to Corpus: Segmenting and Selecting Spontaneous Dialogue for TTS with a CNN-LSTM Speaker-Dependent Breath Detector. In 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). (pp. 6925-6929). IEEE.
[33]
Jonell, P., Kucherenko, T., Ekstedt, E., Beskow, J. (2019). Learning Non-verbal Behavior for a Social Robot from YouTube Videos. Presented at ICDL-EpiRob Workshop on Naturalistic Non-Verbal and Affective Human-Robot Interactions, Oslo, Norway, August 19, 2019.
[34]
Kontogiorgos, D., Pereira, A., Gustafson, J. (2019). Estimating Uncertainty in Task Oriented Dialogue. Presented at 21st ACM International Conference on Multimodal Interaction, Suzhou, Jiangsu, China. October 14-18, 2019.
[35]
Betz, S., Zarrieß, S., Székely, É., Wagner, P. (2019). The greennn tree - lengthening position influences uncertainty perception. Presented at Interspeech 2019.
[36]
Székely, É., Henter, G. E., Beskow, J., Gustafson, J. (2019). Spontaneous conversational speech synthesis from found data. Presented at Interspeech.
[37]
Székely, É., Henter, G. E., Beskow, J., Gustafson, J. (2019). Off the cuff: Exploring extemporaneous speech delivery with TTS. Presented at Interspeech.
[38]
Székely, É., Henter, G. E., Beskow, J., Gustafson, J. (2019). How to train your fillers: uh and um in spontaneous speech synthesis. Presented at The 10th ISCA Speech Synthesis Workshop.
[39]
Zhang, C., Oztireli, C., Mandt, S., Salvi, G. (2019). Active Mini-Batch Sampling Using Repulsive Point Processes. Presented at 33rd AAAI Conference on Artificial Intelligence / 31st Innovative Applications of Artificial Intelligence Conference / 9th AAAI Symposium on Educational Advances in Artificial Intelligence, Honolulu, HI, January 27-February 1, 2019. (pp. 5741-5748). Association for the Advancement of Artificial Intelligence (AAAI).
[40]
Elowsson, A., Friberg, A. (2019). Modeling Music Modality with a Key-Class Invariant Pitch Chroma CNN. Presented at 20th International Society for Music Information Retrieval Conference, Delft, Netherlands, November 4-8, 2019.
[41]
Dubois, J., Elowsson, A., Friberg, A. (2019). Predicting Perceived Dissonance of Piano Chords Using a Chord-Class Invariant CNN and Deep Layered Learning. In Proceedings of 16th Sound & Music Computing Conference (SMC), Malaga, Spain. (pp. 530-536).
[42]
Kalpakchi, D., Boye, J. (2019). SpaceRefNet: a neural approach to spatial reference resolution in a real city environment. In Proceedings of the 20th Annual SIGdial Meeting on Discourse and Dialogue. (pp. 422-431). Association for Computational Linguistics.
[43]
Kontogiorgos, D., Abelho Pereira, A. T., Andersson, O., Koivisto, M., Gonzalez Rabal, E., Vartiainen, V., Gustafson, J. (2019). The effects of anthropomorphism and non-verbal social behaviour in virtual assistants. In IVA 2019 - Proceedings of the 19th ACM International Conference on Intelligent Virtual Agents. (pp. 133-140). Association for Computing Machinery (ACM).
[44]
Gulz, T., Holzapfel, A., Friberg, A. (2019). Developing a Method for Identifying Improvisation Strategies in Jazz Duos. In Proc. of the 14th International Symposium on CMMR. (pp. 482-489). Marseille Cedex.
[45]
Jonell, P. (2019). Using Social and Physiological Signals for User Adaptation in Conversational Agents. In AAMAS '19: Proceedings of the 18th International Conference on Autonomous Agents and Multiagent Systems. (pp. 2420-2422). Association for Computing Machinery (ACM).
[46]
[47]
Arnela, M., Dabbaghchian, S., Guasch, O. & Engwall, O. (2019). MRI-based vocal tract representations for the three-dimensional finite element synthesis of diphthongs. IEEE Transactions on Audio, Speech, and Language Processing, 27(12), 2173-2182.
[48]
Malisz, Z., Henter, G. E., Valentini-Botinhao, C., Watts, O., Beskow, J., Gustafson, J. (2019). Modern speech synthesis for phonetic sciences: A discussion and an evaluation. In Proceedings of ICPhS.
[49]
Székely, É., Henter, G. E., Beskow, J., Gustafson, J. (2019). Off the cuff: Exploring extemporaneous speech delivery with TTS. Presented at The 20th Annual Conference of the International Speech Communication Association, INTERSPEECH 2019, Graz, Austria, Sep. 15-19, 2019. (pp. 3687-3688).
[50]
Székely, É., Henter, G. E., Beskow, J., Gustafson, J. (2019). Spontaneous conversational speech synthesis from found data. Presented at The 20th Annual Conference of the International Speech Communication Association, INTERSPEECH 2019, Graz, Austria, Sep. 15-19, 2019.
Full list in the KTH publications portal
Page responsible: Web editors at EECS
Belongs to: Speech, Music and Hearing
Last changed: Oct 17, 2018