Publications by Jens Edlund

Peer reviewed

Articles

[1]

Pandey, A., Edlund, J., Le Maguer, S. & Harte, N. (2026). The use of variable length stimuli for assessing segmental distortion in TTS evaluation. Computer speech & language (Print), 97.

[2]

Ekström, A. G., Gärdenfors, P., Snyder, W. D., Friedrichs, D., McCarthy, R. C., Tsapos, M. ... Moran, S. (2025). Correlates of Vocal Tract Evolution in Late Pliocene and Pleistocene Hominins. Human Nature, 36(1), 22-69.

[3]

Ekström, A. G., Gannon, C., Edlund, J., Moran, S. & Lameira, A. R. (2024). Chimpanzee utterances refute purported missing links for novel vocalizations and syllabic speech. Scientific Reports, 14(1).

[4]

Ekström, A. G. & Edlund, J. (2024). Sketches of chimpanzee (Pan troglodytes) hoo’s : vowels by any other name? Primates, 65(2), 81-88.

[5]

Ekström, A. G. & Edlund, J. (2023). Evolution of the human tongue and emergence of speech biomechanics. Frontiers in Psychology, 14.

[6]

Strömbergsson, S., Götze, J., Edlund, J. & Nilsson Björkenstam, K. (2022). Simulating Speech Error Patterns Across Languages and Different Datasets. Language and Speech, 65(1), 105-142.

[7]

Strömbergsson, S., Edlund, J., McAllister, A. & Lagerberg, T. (2021). Understanding acceptability of disordered speech through Audience Response Systems-based evaluation. Speech Communication, 131, 13-22.

[8]

Strömbergsson, S., Holm, K., Edlund, J., Lagerberg, T. & McAllister, A. (2020). Audience Response System-Based Evaluation of Intelligibility of Children's Connected Speech - Validity, Reliability and Listener Differences. Journal of Communication Disorders, 87.

[9]

Clark, L., Doyle, P., Garaialde, D., Gilmartin, E., Schloegl, S., Edlund, J. ... Cowan, B. R. (2019). The State of Speech in HCI : Trends, Themes and Challenges. Interacting with computers, 31(4), 349-371.

[10]

Oertel, C., Cummins, F., Edlund, J., Wagner, P. & Campbell, N. (2013). D64 : A corpus of richly recorded conversational interaction. Journal on Multimodal User Interfaces, 7(1-2), 19-28.

[11]

Al Moubayed, S., Edlund, J. & Beskow, J. (2012). Taming Mona Lisa : communicating gaze faithfully in 2D and 3D facial projections. ACM Transactions on Interactive Intelligent Systems, 1(2), 25.

[12]

Heldner, M. & Edlund, J. (2010). Pauses, gaps and overlaps in conversations. Journal of Phonetics, 38(4), 555-568.

[13]

Edlund, J. & Beskow, J. (2009). MushyPeek : A Framework for Online Investigation of Audiovisual Dialogue Phenomena. Language and Speech, 52, 351-367.

[14]

Hincks, R. & Edlund, J. (2009). Promoting increased pitch variation in oral presentations with transient visual feedback. Language Learning & Technology, 13(3), 32-50.

[15]

Edlund, J., Gustafson, J., Heldner, M. & Hjalmarsson, A. (2008). Towards human-like spoken dialogue systems. Speech Communication, 50(8-9), 630-645.

[16]

Heldner, M. & Edlund, J. (2007). What turns speech into conversation? : A project description. TMH-QPSR, 50(1), 45-48.

Conference papers

[17]

Tånnander, C., House, D., Beskow, J., Edlund, J. (2025). Intrasentential English in Swedish TTS : perceived English-accentedness. In Interspeech 2025. (pp. 1638-1642). International Speech Communication Association.

[18]

Kirkland, A., Edlund, J. (2025). Who knows best? Effects of speech disfluencies on incentivized decision-making. In Interspeech 2025. (pp. 4508-4512). International Speech Communication Association.

[19]

Edlund, J., Tånnander, C., Le Maguer, S., Wagner, P. (2024). Assessing the impact of contextual framing on subjective TTS quality. In Interspeech 2024. (pp. 1205-1209). International Speech Communication Association.

[20]

Tånnander, C., Mehta, S., Beskow, J., Edlund, J. (2024). Beyond graphemes and phonemes: continuous phonological features in neural text-to-speech synthesis. In Interspeech 2024. (pp. 2815-2819). International Speech Communication Association.

[21]

Tånnander, C., O'Regan, J., House, D., Edlund, J., Beskow, J. (2024). Prosodic characteristics of English-accented Swedish neural TTS. In Proceedings of Speech Prosody 2024. (pp. 1035-1039). Leiden, The Netherlands: International Speech Communication Association.

[22]

Tånnander, C., Edlund, J., Gustafsson, J. (2024). Revisiting Three Text-to-Speech Synthesis Experiments with a Web-Based Audience Response System. In 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024 - Main Conference Proceedings. (pp. 14111-14121). European Language Resources Association (ELRA).

[23]

Esfandiari-Baiat, G., Edlund, J. (2024). The MEET Corpus: Collocated, Distant and Hybrid Three-party Meetings with a Ranking Task. In ISA 2024: 20th Joint ACL - ISO Workshop on Interoperable Semantic Annotation at LREC-COLING 2024, Workshop Proceedings. (pp. 1-7). European Language Resources Association (ELRA).

[24]

Tånnander, C., House, D., Edlund, J. (2023). Analysis-by-synthesis : phonetic-phonological variation in deep neural network-based text-to-speech synthesis. In Proceedings of the 20th International Congress of Phonetic Sciences, Prague 2023. (pp. 3156-3160). Prague, Czech Republic: GUARANT International.

[25]

Fallgren, P., Edlund, J. (2023). Crowdsource-based validation of the audio cocktail as a sound browsing tool. In Interspeech 2023. (pp. 2178-2182). International Speech Communication Association.

[26]

Pandey, A., Edlund, J., Le Maguer, S., Harte, N. (2023). Listener sensitivity to deviating obstruents in WaveNet. In Interspeech 2023. (pp. 1080-1084). International Speech Communication Association.

[27]

Edlund, J., Brodén, D., Fridlund, M., Lindhé, C., Olsson, L. -., Ängsal, M., Öhberg, P. (2022). A Multimodal Digital Humanities Study of Terrorism in Swedish Politics : An Interdisciplinary Mixed Methods Project on the Configuration of Terrorism in Parliamentary Debates, Legislation, and Policy Networks 1968–2018. In Lecture Notes in Networks and Systems. (pp. 435-449). Springer Nature.

[28]

Tånnander, C., House, D., Edlund, J. (2022). Syllable duration as a proxy to latent prosodic features. In Proceedings of Speech Prosody 2022. (pp. 220-224). Lisbon, Portugal: International Speech Communication Association.

[29]

Fallgren, P., Edlund, J. (2021). Human-in-the-Loop Efficiency Analysis for Binary Classification in Edyson. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. (pp. 3685-3689). International Speech Communication Association.

[30]

Tånnander, C., Edlund, J. (2021). Methods of slowing down speech. In Proceedings. 11th ISCA Speech Synthesis Workshop (SSW 11). (pp. 43-47).

[31]

Székely, É., Edlund, J., Gustafsson, J. (2020). Augmented Prompt Selection for Evaluation of Spontaneous Speech Synthesis. In Proceedings of The 12th Language Resources and Evaluation Conference. (pp. 6368-6374). European Language Resources Association.

[32]

Domeij, R., Edlund, J., Eriksson, G., Fallgren, P., House, D., Lindström, E., Skog, S. N., Öqvist, J. (2020). Exploring the archives for textual entry points to speech - Experiences of interdisciplinary collaboration in making cultural heritage accessible for research. In CEUR Workshop Proceedings. (pp. 45-55). CEUR-WS.

[33]

Fallgren, P., Malisz, Z., Edlund, J. (2019). Bringing order to chaos : A non-sequential approach for browsing large sets of found audio data. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018). (pp. 4307-4311). European Language Resources Association (ELRA).

[34]

Tånnander, C., Edlund, J. (2019). First steps towards text profiling for speech synthesis. In CEUR Workshop Proceedings. (pp. 457-468). CEUR-WS.

[35]

Fallgren, P., Malisz, Z., Edlund, J. (2019). How to annotate 100 hours in 45 minutes. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. (pp. 341-345). ISCA.

[36]

Clark, L., Cowan, B. R., Edwards, J., Munteanu, C., Murad, C., Aylett, M., Moore, R. K., Edlund, J., Székely, É., Healey, P., Harte, N., Torre, I., Doyle, P. (2019). Mapping Theoretical and Methodological Perspectives for Understanding Speech Interface Interactions. In CHI EA '19: Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems. Association for Computing Machinery.

[37]

Bystedt, M., Edlund, J. (2019). New applications of gaze tracking in speech science. In CEUR Workshop Proceedings. (pp. 73-78). CEUR-WS.

[38]

Tånnander, C., Edlund, J. (2019). Preliminary guidelines for the efficient management of OOV words for spoken text. In Speech Synthesis Workshop (SSW). (pp. 137-142).

[39]

Edlund, J. (2019). Shoehorning in the name of science. In Procs. of CUI19. ACM Digital Library.

[40]

Wagner, P., Beskow, J., Betz, S., Edlund, J., Gustafson, J., Henter, G. E., Le Maguer, S., Malisz, Z., Székely, É., Tånnander, C. (2019). Speech Synthesis Evaluation : State-of-the-Art Assessment and Suggestion for a Novel Research Program. In Proceedings of the 10th Speech Synthesis Workshop (SSW10).

[41]

Tånnander, C., Fallgren, P., Edlund, J., Gustafson, J. (2019). Spot the pleasant people! Navigating the cocktail party buzz. In Proceedings Interspeech 2019, 20th Annual Conference of the International Speech Communication Association. (pp. 4220-4224).

[42]

Fallgren, P., Malisz, Z., Edlund, J. (2019). Towards fast browsing of found audio data : 11 presidents. In CEUR Workshop Proceedings. (pp. 133-142). CEUR-WS.

[43]

Fallgren, P., Malisz, Z., Edlund, J. (2018). A tool for exploring large amounts of found audio data. In CEUR Workshop Proceedings. (pp. 499-503). CEUR-WS.

[44]

Borin, L., Forsberg, M., Edlund, J., Domeij, R. (2018). Språkbanken 2018 : Research resources for text, speech, & society. In CEUR Workshop Proceedings. (pp. 504-506). CEUR-WS.

[45]

Strömbergsson, S., Edlund, J., Götze, J., Björkenstam, K. N. (2017). Approximating phonotactic input in children's linguistic environments from orthographic transcripts. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH 2017. (pp. 2213-2217). International Speech Communication Association.

[46]

Edlund, J., Gustafson, J. (2016). Hidden resources - Strategies to acquire and exploit potential spoken language resources in national archives. In Proceedings of the 10th International Conference on Language Resources and Evaluation, LREC 2016. (pp. 4531-4534). European Language Resources Association (ELRA).

[47]

Edlund, J., Tånnander, C., Gustafson, J. (2015). Audience response system-based assessment for analysis-by-synthesis. In Proc. of ICPhS 2015. ICPhS.

[48]

Włodarczak, M., Heldner, M., Edlund, J. (2015). Communicative needs and respiratory constraints. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. (pp. 3051-3055). The International Speech Communication Association (ISCA).

[49]

Edlund, J., Heldner, M., Wlodarczak, M. (2014). Catching wind of multiparty conversation. In LREC 2014 - Ninth International Conference on Language Resources and Evaluation.

[50]

Edlund, J., Edelstam, F., Gustafson, J. (2014). Human pause and resume behaviours for unobtrusive humanlike in-car spoken dialogue systems. In Proceedings of the EACL 2014 Workshop on Dialogue in Motion (DM). (pp. 73-77). Gothenburg, Sweden.

[51]

Dalmas, T., Götze, J., Gustafsson, J., Janarthanam, S., Kleindienst, J., Mueller, C., Stent, A., Vlachos, A., Artzi, Y., Benotti, L., Boye, J., Clark, S., Curin, J., Dethlefs, N., Edlund, J., Goldwasser, D., Heeman, P., Jurcicek, F., Kelleher, J., Komatani, K., Kwiatkowski, T., Larsson, S., Lemon, O., Lenke, N., Macek, J., Macek, T., Mooney, R., Ramachandran, D., Rieser, V., Shi, H., Tenbrink, T., Williams, J. (2014). Introduction. In Proceedings 2014 Workshop on Dialogue in Motion, DM 2014. Association for Computational Linguistics (ACL).

[52]

Strömbergsson, S., Tånnander, C., Edlund, J. (2014). Ranking severity of speech errors by their phonological impact in context. In Proceedings of the Annual Conference of the International Speech Communication Association. (pp. 1568-1572).

[53]

Al Moubayed, S., Edlund, J., Gustafson, J. (2013). Analysis of gaze and speech patterns in three-party quiz game interaction. In Interspeech 2013. (pp. 1126-1130). The International Speech Communication Association (ISCA).

[54]

Heldner, M., Hjalmarsson, A., Edlund, J. (2013). Backchannel relevance spaces. In Nordic Prosody: Proceedings of the XIth Conference, Tartu 2012. (pp. 137-146). Frankfurt am Main, Germany: Peter Lang Publishing Group.

[55]

Edlund, J., Al Moubayed, S., Tånnander, C., Gustafson, J. (2013). Temporal precision and reliability of audience response system based annotation. In Proc. of Multimodal Corpora 2013.

[56]

Oertel, C., Salvi, G., Götze, J., Edlund, J., Gustafson, J., Heldner, M. (2013). The KTH Games Corpora : How to Catch a Werewolf. In IVA 2013 Workshop Multimodal Corpora: Beyond Audio and Video: MMC 2013.

[57]

Strömbergsson, S., Hjalmarsson, A., Edlund, J., House, D. (2013). Timing responses to questions in dialogue. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH 2013. (pp. 2583-2587). Lyon, France: International Speech Communication Association.

[58]

Edlund, J., Alexanderson, S., Beskow, J., Gustavsson, L., Heldner, M., Hjalmarsson, A., Kallionen, P., Marklund, E. (2012). 3rd party observer gaze as a continuous measure of dialogue flow. In Proceedings of the 8th International Conference on Language Resources and Evaluation, LREC 2012. (pp. 1354-1358). Istanbul, Turkey: European Language Resources Association.

[59]

Edlund, J., Heldner, M., Hjalmarsson, A. (2012). 3rd party observer gaze during backchannels. In Proc. of the Interspeech 2012 Interdisciplinary Workshop on Feedback Behaviors in Dialog. Skamania Lodge, WA, USA.

[60]

Strömbergsson, S., Edlund, J., House, D. (2012). A study of Swedish questions and their prosodic characteristics. In Proceedings of Workshop on Innovation and Applications in Speech Technology (IAST). (pp. 61-64). Dublin, Ireland.

[61]

Oertel, C., Wlodarczak, M., Edlund, J., Wagner, P., Gustafson, J. (2012). Gaze Patterns in Turn-Taking. In 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012, Vol 3. (pp. 2243-2246). Portland, Oregon, US.

[62]

Edlund, J., Oertel, C., Gustafson, J. (2012). Investigating negotiation for load-time in the GetHomeSafe project. In Proc. of Workshop on Innovation and Applications in Speech Technology (IAST). (pp. 45-48). Dublin, Ireland.

[63]

Edlund, J., Hjalmarsson, A. (2012). Is it really worth it? : Cost-based selection of system responses to speech-in-overlap. In Proc. of the IVA 2012 workshop on Realtime Conversational Virtual Agents (RCVA 2012). Santa Cruz, CA, USA.

[64]

Laskowski, K., Heldner, M., Edlund, J. (2012). On the dynamics of overlap in multi-party conversation. In 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012. (pp. 846-849).

[65]

Edlund, J., Heldner, M., Gustafson, J. (2012). On the effect of the acoustic environment on the accuracy of perception of speaker orientation from auditory cues alone. In 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012, Vol 2. (pp. 1482-1485).

[66]

Strömbergsson, S., Edlund, J., House, D. (2012). Prosodic measurements and question types in the Spontal corpus of Swedish dialogues. In 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012, Vol 1. (pp. 838-841).

[67]

Edlund, J., House, D., Strömbergsson, S. (2012). Question types and some prosodic correlates in 600 questions in the Spontal database of Swedish dialogues. In Proceedings of the 6th International Conference on Speech Prosody, Vols I and II. (pp. 737-740). Shanghai, China: Tongji Univ Press.

[68]

Strömbergsson, S., Edlund, J., House, D. (2012). Question types and some prosodic correlates in the Spontal corpus of Swedish dialogues. In Proceedings of Fonetik 2012. Gothenburg, Sweden.

[69]

Strömbergsson, S., Edlund, J., House, D. (2012). Questions and reported speech in Swedish dialogues. In Nordic Prosody: Proceedings of the XIth Conference, Tartu 2012. Tartu, Estonia.

[70]

Edlund, J., Strömbergsson, S., House, D. (2012). Telling questions from statements in spoken dialogue systems. In Proc. of SLTC 2012. Lund, Sweden.

[71]

Edlund, J., Heldner, M., Gustafson, J. (2012). Who am I speaking at? : perceiving the head orientation of speakers from acoustic cues alone. In Proc. of LREC Workshop on Multimodal Corpora 2012. Istanbul, Turkey.

[72]

Laskowski, K., Edlund, J., Heldner, M. (2011). A single-port non-parametric model of turn-taking in multi-party conversation. In Proc. of ICASSP 2011. (pp. 5600-5603). Prague, Czech Republic.

[73]

Al Moubayed, S., Beskow, J., Edlund, J., Granström, B., House, D. (2011). Animated Faces for Robotic Heads : Gaze and Beyond. In Analysis of Verbal and Nonverbal Communication and Enactment: The Processing Issues. (pp. 19-35). Springer Berlin/Heidelberg.

[74]

Laskowski, K., Edlund, J., Heldner, M. (2011). Incremental learning and forgetting in incremental stochastic turn-taking models. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH. (pp. 2080-2083). Florence, Italy.

[75]

Beskow, J., Alexanderson, S., Al Moubayed, S., Edlund, J., House, D. (2011). Kinetic Data for Large-Scale Analysis and Modeling of Face-to-Face Conversation. In Proceedings of International Conference on Audio-Visual Speech Processing 2011. (pp. 103-106). Stockholm: KTH Royal Institute of Technology.

[76]

Landsiedel, C., Edlund, J., Eyben, F., Neiberg, D., Schuller, B. (2011). Syllabification of conversational speech using bidirectional long-short-term memory neural networks. In Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on. (pp. 5256-5259). Prague, Czech Republic.

[77]

Edlund, J., Al Moubayed, S., Beskow, J. (2011). The Mona Lisa Gaze Effect as an Objective Metric for Perceived Cospatiality. In Proc. of the Intelligent Virtual Agents 10th International Conference (IVA 2011). (pp. 439-440). Springer.

[78]

Heldner, M., Edlund, J., Hjalmarsson, A., Laskowski, K. (2011). Very short utterances and timing in turn-taking. In Proceedings of Interspeech 2011. (pp. 2848-2851).

[79]

Laskowski, K., Edlund, J. (2010). A Snack Implementation and Tcl/Tk Interface to the Fundamental Frequency Variation Spectrum Algorithm. In Proceedings of the International Conference on Language Resources and Evaluation, LREC 2010. (pp. 3742-3749). Valletta, Malta: European Language Resources Association.

[80]

Edlund, J., Beskow, J. (2010). Capturing massively multimodal dialogues : affordable synchronization and visualization. In Proc. of Multimodal Corpora: Advances in Capturing, Coding and Analyzing Multimodality (MMC 2010). (pp. 160-161).

[81]

Oertel, C., Cummins, F., Campbell, N., Edlund, J., Wagner, P. (2010). D64: A corpus of richly recorded conversational interaction. In Proceedings of LREC 2010 Workshop on Multimodal Corpora: Advances in Capturing, Coding and Analyzing Multimodality. (pp. 27-30).

[82]

Beskow, J., Edlund, J., Granström, B., Gustafsson, J., House, D. (2010). Face-to-Face Interaction and the KTH Cooking Show. In Development of multimodal interfaces: Active listening and synchrony. (pp. 157-168).

[83]

Heldner, M., Edlund, J., Hirschberg, J. (2010). Pitch similarity in the vicinity of backchannels. In Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010. (pp. 3054-3057). Makuhari, Japan.

[84]

Laskowski, K., Heldner, M., Edlund, J. (2010). Preliminaries to an account of multi-party conversational turn-taking as an antiferromagnetic spin glass. In Proceedings of NIPS Workshop on Modeling Human Communication Dynamics. Vancouver, B.C., Canada.

[85]

Edlund, J., Beskow, J., Elenius, K., Hellmer, K., Strömbergsson, S., House, D. (2010). Spontal : a Swedish spontaneous dialogue corpus of audio, video and motion capture. In Proc. of the Seventh conference on International Language Resources and Evaluation (LREC'10). (pp. 2992-2995).

[86]

Sikveland, R.-O., Öttl, A., Amdal, I., Ernestus, M., Svendsen, T., Edlund, J. (2010). Spontal-N : A Corpus of Interactional Spoken Norwegian. In Proc. of the Seventh conference on International Language Resources and Evaluation (LREC'10). (pp. 2986-2991).

[87]

Laskowski, K., Heldner, M., Edlund, J. (2009). A general-purpose 32 ms prosodic vector for Hidden Markov Modeling. In Proceedings of Interspeech 2009. (pp. 724-729). Brighton, UK: ISCA.

[88]

Laskowski, K., Heldner, M., Edlund, J. (2009). Exploring the prosody of floor mechanisms in English using the fundamental frequency variation spectrum. In Proceedings of the 2009 European Signal Processing Conference (EUSIPCO-2009). (pp. 2539-2543). Glasgow, Scotland.

[89]

Edlund, J., Heldner, M., Hirschberg, J. (2009). Pause and gap length in face-to-face interaction. In INTERSPEECH 2009: 10th Annual Conference of the International Speech Communication Association 2009. (pp. 2779-2782). Baixas: International Speech Communication Association (ISCA).

[90]

Heldner, M., Edlund, J., Laskowski, K., Pelcé, A. (2009). Prosodic features in the vicinity of pauses, gaps and overlaps. In Nordic Prosody: Proceedings of the Xth Conference. (pp. 95-106). Frankfurt am Main: Peter Lang.

[91]

Edlund, J., Heldner, M., Pelcé, A. (2009). Prosodic features of very short utterances in dialogue. In Nordic Prosody: Proceedings of the Xth Conference. (pp. 57-68). Frankfurt am Main: Peter Lang.

[92]

Beskow, J., Edlund, J., Granström, B., Gustafson, J., Skantze, G., Tobiasson, H. (2009). The MonAMI Reminder : a spoken dialogue system for face-to-face interaction. In Proceedings of the 10th Annual Conference of the International Speech Communication Association, INTERSPEECH 2009. (pp. 300-303). Brighton, U.K.

[93]

Hincks, R., Edlund, J. (2009). Using speech technology to promote increased pitch variation in oral presentations. In Proc. of SLaTE Workshop on Speech and Language Technology in Education. Wroxall, UK.

[94]

Laskowski, K., Edlund, J., Heldner, M. (2008). An instantaneous vector representation of delta pitch for speaker-change prediction in conversational dialogue systems. In 2008 IEEE International Conference on Acoustics, Speech and Signal Processing. (pp. 5041-5044). New York: IEEE.

[95]

Laskowski, K., Wölfel, M., Heldner, M., Edlund, J. (2008). Computing the fundamental frequency variation spectrum in conversational spoken dialogue systems. In Proceedings of Acoustics'08. (pp. 3305-3310). Paris, France.

[96]

Gustafson, J., Edlund, J. (2008). EXPROS : A toolkit for exploratory experimentation with prosody in customized diphone voices. In Perception In Multimodal Dialogue Systems, Proceedings. (pp. 293-296).

[97]

Hjalmarsson, A., Edlund, J. (2008). Human-likeness in utterance generation : Effects of variability. In Perception In Multimodal Dialogue Systems, Proceedings. (pp. 252-255).

[98]

Beskow, J., Edlund, J., Granström, B., Gustafson, J., Skantze, G. (2008). Innovative interfaces in MonAMI : The Reminder. In Perception In Multimodal Dialogue Systems, Proceedings. (pp. 272-275).

[99]

Laskowski, K., Edlund, J., Heldner, M. (2008). Learning prosodic sequences using the fundamental frequency variation spectrum. In Proceedings of the Speech Prosody 2008 Conference. (pp. 151-154). Campinas, Brazil: Editora RG/CNPq.

[100]

Gustafson, J., Heldner, M., Edlund, J. (2008). Potential benefits of human-like dialogue behaviour in the call routing domain. In Perception In Multimodal Dialogue Systems, Proceedings. (pp. 240-251).

[101]

Edlund, J., Beskow, J. (2007). Pushy versus meek : using avatars to influence turn-taking behaviour. In INTERSPEECH 2007: 8th Annual Conference of the International Speech Communication Association. (pp. 2784-2787). Baixas: International Speech Communication Association (ISCA).

[102]

Heldner, M., Edlund, J., Carlson, R. (2006). Interruption impossible. In Nordic Prosody: Proceedings of the IXth Conference, Lund 2004. (pp. 97-105). Frankfurt am Main, Germany.

[103]

Skantze, G., Edlund, J., Carlson, R. (2006). Talking with Higgins : Research challenges in a spoken dialogue system. In Perception and Interactive Technologies, Proceedings. (pp. 193-196). Berlin: Springer-Verlag Berlin.

[104]

Wallers, Å., Edlund, J., Skantze, G. (2006). The effect of prosodic features on the interpretation of synthesised backchannels. In Perception And Interactive Technologies, Proceedings. (pp. 183-187).

[105]

Edlund, J., Heldner, M., Gustafson, J. (2006). Two faces of spoken dialogue systems. In Interspeech 2006 - ICSLP Satellite Workshop Dialogue on Dialogues: Multidisciplinary Evaluation of Advanced Speech-based Interactive Systems. Pittsburgh PA, USA.

[106]

Skantze, G., House, D., Edlund, J. (2006). User Responses to Prosodic Variation in Fragmentary Grounding Utterances in Dialog. In INTERSPEECH 2006 and 9th International Conference on Spoken Language Processing. (pp. 2002-2005). Baixas: International Speech Communication Association (ISCA).

[107]

Edlund, J., Heldner, M. (2006). /nailon/ : Software for Online Analysis of Prosody. Presented at 9th International Conference on Spoken Language Processing/INTERSPEECH 2006, Pittsburgh, PA, USA, 17-21 September 2006. (pp. 2022-2025). Baixas: International Speech Communication Association (ISCA).

[108]

Edlund, J., Hjalmarsson, A. (2005). Applications of distributed dialogue systems : the KTH Connector. In Proceedings of ISCA Tutorial and Research Workshop on Applied Spoken Language Interaction in Distributed Environments (ASIDE 2005).

[109]

Edlund, J., House, D., Skantze, G. (2005). The effects of prosodic features on the interpretation of clarification ellipses. In Proceedings of Interspeech 2005: Eurospeech. (pp. 2389-2392).

[110]

Heldner, M., Edlund, J., Björkenstam, T. (2004). Automatically extracted F0 features as acoustic correlates of prosodic boundaries. In Fonetik 2004: Proc of The XVIIth Swedish Phonetics Conference. (pp. 52-55). Stockholm University.

[111]

Skantze, G., Edlund, J. (2004). Early error detection on word level. In Proceedings of ISCA Tutorial and Research Workshop (ITRW) on Robustness Issues in Conversational Interaction.

[112]

Edlund, J., Skantze, G., Carlson, R. (2004). Higgins : a spoken dialogue system for investigating error handling techniques. In Proceedings of the International Conference on Spoken Language Processing, ICSLP 04. (pp. 229-231).

[113]

Skantze, G., Edlund, J. (2004). Robust interpretation in the Higgins spoken dialogue system. In Proceedings of ISCA Tutorial and Research Workshop (ITRW) on Robustness Issues in Conversational Interaction.

[114]

Gustafson, J., Bell, L., Boye, J., Edlund, J., Wirén, M. (2002). Constraint Manipulation and Visualization in a Multimodal Dialogue System. In Proceedings of MultiModal Dialogue in Mobile Environments.

Books

[115]

Borin, L., Brandt, M. D., Edlund, J., Lindh, J. & Parkvall, M. (2012). The Swedish Language in the Digital Age/Svenska språket i den digitala tidsåldern. Springer.

Chapters in books

[116]

Edlund, J., Al Moubayed, S. & Beskow, J. (2013). Co-present or Not? : Embodiment, Situatedness and the Mona Lisa Gaze Effect. In Nakano, Yukiko; Conati, Cristina; Bader, Thomas (Ed.), Eye gaze in intelligent user interfaces: gaze-based analyses, models and applications (pp. 185-203). London: Springer London.

[117]

Edlund, J., House, D. & Beskow, J. (2012). Gesture movement profiles in dialogues from a Swedish multimodal database of spontaneous speech. In Bergmann, Pia; Brenning, Jana; Pfeiffer, Martin C.; Reber, Elisabeth (Ed.), Prosodic and Visual Resources in Interactional Grammar. Walter de Gruyter.

[118]

Edlund, J. & Gustafson, J. (2010). Ask the experts : Part II: Analysis. In Juel Henrichsen, Peter (Ed.), Linguistic Theory and Raw Sound (pp. 183-198). Frederiksberg: Samfundslitteratur.

[119]

Gustafson, J. & Edlund, J. (2010). Ask the experts - Part I: Elicitation. In Juel Henrichsen, Peter (Ed.), Linguistic Theory and Raw Sound (pp. 169-182). Samfundslitteratur.

[120]

Beskow, J., Carlson, R., Edlund, J., Granström, B., Heldner, M., Hjalmarsson, A. & Skantze, G. (2009). Multimodal Interaction Control. In Waibel, Alexander; Stiefelhagen, Rainer (Ed.), Computers in the Human Interaction Loop (pp. 143-158). Berlin/Heidelberg: Springer Berlin/Heidelberg.

[121]

Edlund, J. & Heldner, M. (2007). Underpinning /nailon/ - automatic estimation of pitch range and speaker relative pitch. In Müller, C. (Ed.), Speaker Classification I: Fundamentals, Features, and Methods. Berlin: Springer.

[122]

Beskow, J., Edlund, J. & Nordstrand, M. (2005). A Model for Multimodal Dialogue System Output Applied to an Animated Talking Head. In Minker, Wolfgang; Bühler, Dirk; Dybkjær, Laila (Ed.), Spoken Multimodal Human-Computer Dialogue in Mobile Environments (pp. 93-113). Dordrecht: Springer.

[123]

Edlund, J., Heldner, M. & Gustafson, J. (2005). Utterance segmentation and turn-taking in spoken dialogue systems. In Fisseni, B.; Schmitz, H-C.; Schröder, B.; Wagner, P. (Ed.), Computer Studies in Language and Speech (pp. 576-587). Frankfurt am Main, Germany: Peter Lang.

Non-peer reviewed

Conference papers

[124]

Tånnander, C., Edlund, J. (2022). Mapping specific characteristics of spoken text to listener ratings. In Proceedings of Fonetik 2022. Stockholm, Sweden.

[125]

Tånnander, C., Edlund, J. (2022). Sardin : speech-oriented text processing. In Proceedings of Fonetik 2022. Stockholm, Sweden.

[126]

Tånnander, C., Edlund, J. (2022). Towards a Swedish test set for speech-oriented text normalisation. Presented at Swedish Language Technology Conference (SLTC), November 18-20 2020, Göteborg. Göteborg: Göteborgs universitet.

[127]

Tånnander, C., Edlund, J. (2021). Self-perceived preferences of voice and speaking style characteristics in spoken text. Presented at Swedish Language Technology Conference (SLTC) 2021.

[128]

Tånnander, C., Edlund, J. (2021). Stress manipulation in text-to-speech synthesis using speaking rate categories. In Proceedings of Fonetik 2021, Centre for Languages and Literature, Lund University. (pp. 17-22). Lund.

[129]

Edlund, J., Al Moubayed, S., Tånnander, C., Gustafson, J. (2013). Audience response system based annotation of speech. In Proceedings of Fonetik 2013. (pp. 13-16). Linköping: Linköping University.

[130]

Heldner, M., Edlund, J. (2012). Continuer relevance spaces. In Proc. of Nordic Prosody XI. Tartu, Estonia.

[131]

Renklint, E., Cardell, F., Dahlbäck, J., Edlund, J., Heldner, M. (2012). Conversational gaze in light and darkness. In Proc. of Fonetik 2012. (pp. 59-60). Gothenburg, Sweden.

[132]

Edlund, J., Hjalmarsson, A., Tånnander, C. (2012). Unconventional methods in perception experiments. In Proc. of Nordic Prosody XI. Tartu, Estonia.

[133]

Edlund, J. (2011). How deeply rooted are the turns we take? In SemDial 2011: Proceedings of the 15th Workshop on the Semantics and Pragmatics of Dialogue. (pp. 196-197).

[134]

Edlund, J., Gustafson, J., Beskow, J. (2010). Cocktail : a demonstration of massively multi-component audio environments for illustration and analysis. In SLTC 2010, The Third Swedish Language Technology Conference (SLTC 2010): Proceedings of the Conference.

[135]

Beskow, J., Edlund, J., Gustafson, J., Heldner, M., Hjalmarsson, A., House, D. (2010). Modelling humanlike conversational behaviour. In SLTC 2010: The Third Swedish Language Technology Conference (SLTC 2010), Proceedings of the Conference. (pp. 9-10). Linköping, Sweden.

[136]

Beskow, J., Edlund, J., Gustafson, J., Heldner, M., Hjalmarsson, A., House, D. (2010). Research focus : Interactional aspects of spoken face-to-face communication. In Proceedings from Fonetik, Lund, June 2-4, 2010. (pp. 7-10). Lund, Sweden: Lund University.

[137]

Edlund, J., Heldner, M., Al Moubayed, S., Gravano, A., Hirschberg, J. (2010). Very short utterances in conversation. In Proceedings from Fonetik 2010, Lund, June 2-4, 2010. (pp. 11-16). Lund, Sweden: Lund University.

[138]

Beskow, J., Edlund, J., Elenius, K., Hellmer, K., House, D., Strömbergsson, S. (2009). Project presentation: Spontal : multimodal database of spontaneous dialog. In Proceedings of Fonetik 2009: The XXIIth Swedish Phonetics Conference. (pp. 190-193). Stockholm: Stockholm University.

[139]

Hincks, R., Edlund, J. (2009). Transient visual feedback on pitch variation for Chinese speakers of English. In Proc. of Fonetik 2009. Stockholm.

[140]

Gustafson, J., Edlund, J. (2008). EXPROS : Tools for exploratory experimentation with prosody. In Proceedings of FONETIK 2008. (pp. 17-20). Gothenburg, Sweden.

[141]

Beskow, J., Edlund, J., Granström, B., Gustafson, J., Jonsson, O., Skantze, G. (2008). Speech technology in the European project MonAMI. In Proceedings of FONETIK 2008. (pp. 33-36). Gothenburg, Sweden: University of Gothenburg.

[142]

Laskowski, K., Heldner, M., Edlund, J. (2008). The fundamental frequency variation spectrum. In Proceedings of FONETIK 2008. (pp. 29-32). Gothenburg, Sweden: Department of Linguistics, University of Gothenburg.

[143]

Edlund, J., Beskow, J., Heldner, M. (2007). MushyPeek : an experiment framework for controlled investigation of human-human interaction control behaviour. In Proceedings of Fonetik 2007. (pp. 61-64).

[144]

Edlund, J., Heldner, M. (2006). /nailon/ - online analysis of prosody. In Working Papers 52: Proceedings of Fonetik 2006. (pp. 37-40). Lund University, Centre for Languages & Literature, Dept. of Linguistics & Phonetics.

[145]

Skantze, G., House, D., Edlund, J. (2006). Grounding and prosody in dialog. In Working Papers 52: Proceedings of Fonetik 2006. (pp. 117-120). Lund, Sweden: Lund University, Centre for Languages & Literature, Dept. of Linguistics & Phonetics.

[146]

Heldner, M., Edlund, J. (2006). Prosodic cues for interaction control in spoken dialogue systems. In Proceedings of Fonetik 2006. (pp. 53-56). Lund, Sweden: Lund University, Centre for Languages & Literature, Dept. of Linguistics & Phonetics.

[147]

Carlson, R., Edlund, J., Heldner, M., Hjalmarsson, A., House, D., Skantze, G. (2006). Towards human-like behaviour in spoken dialog systems. In Proceedings of Swedish Language Technology Conference (SLTC 2006). Gothenburg, Sweden.

[148]

Edlund, J., House, D., Skantze, G. (2005). Prosodic Features in the Perception of Clarification Ellipses. In Proceedings of Fonetik 2005: The XVIIIth Swedish Phonetics Conference. (pp. 107-110). Gothenburg, Sweden.

Chapters in books

[149]

Borin, L., Domeij, R., Edlund, J. & Forsberg, M. (2023). Language Report Swedish. In Cognitive Technologies (pp. 219-222). Springer Nature.

Theses

[150]

Edlund, J. (2011). In search for the conversational homunculus : serving to understand spoken human face-to-face interaction (Doctoral thesis, KTH Royal Institute of Technology, Stockholm, Trita-CSC-A 11:03). Retrieved from https://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-31172.

Other

[151]

Fallgren, P., Edlund, J. (). Edyson: rapid human-in-the-loop browsing, exploration and annotation of large speech and audio data. (Manuscript).

[152]

Ekström, A. G., Crockford, C., Grawunder, S., Moran, S., Edlund, J. (). Evolution and function of hominid air sacs : A synthesis bearing on vowel production. (Manuscript).

[153]

Ekström, A. G., Holmer, S., Sward, K., Moran, S., Lameira, A. R., Friedrichs, D., Edlund, J. (). Gibbon vowel-like quality is tied to superhuman articulator landmarks. (Manuscript).

[154]

Ekström, A. G., Gannon, C., Edlund, J., Moran, S., Lameira, A. R. (). No neural “missing link” for verbal control in chimpanzees. (Manuscript).

[155]

Ekström, A. G., Gärdenfors, P., Snyder, W., Friedrichs, D., McCarthy, R. C., Tsapos, M., Tennie, C., Strait, D. S., Edlund, J., Moran, S. (). Phonetic correlates of hominin evolution in the late Pliocene and Pleistocene epochs : Becoming pre-adapted for speech. (Manuscript).

[156]

Ekström, A. G., Bortolato, T., Wittig, R. W., Shumaker, R. W., Masi, S., Nellissen, L., Crockford, C., Moran, S., Lameira, A. R., Edlund, J. (). Reverse engineering great ape vocal tract configurations with implications for evolving speech biomechanics. (Manuscript).

[157]

Fallgren, P., Edlund, J. (). The audio cocktail as a sound browsing tool - a crowdsourcing based validation. (Manuscript).

Latest sync with DiVA:

2026-06-20 23:30:23 UTC
