
Publications by Jonas Beskow

Peer-reviewed

Articles

[1]
A. Deichler et al., "Learning to generate pointing gestures in situated embodied conversational agents," Frontiers in Robotics and AI, vol. 10, 2023.
[2]
S. Alexanderson et al., "Listen, Denoise, Action! Audio-Driven Motion Synthesis with Diffusion Models," ACM Transactions on Graphics, vol. 42, no. 4, 2023.
[3]
M. Cohn et al., "Vocal accommodation to technology: the role of physical form," Language sciences (Oxford), vol. 99, 2023.
[5]
G. Valle-Perez et al., "Transflower : probabilistic autoregressive dance generation with multimodal attention," ACM Transactions on Graphics, vol. 40, no. 6, 2021.
[6]
G. E. Henter, S. Alexanderson and J. Beskow, "MoGlow : Probabilistic and controllable motion synthesis using normalising flows," ACM Transactions on Graphics, vol. 39, no. 6, pp. 1-14, 2020.
[7]
K. Stefanov, J. Beskow and G. Salvi, "Self-Supervised Vision-Based Detection of the Active Speaker as Support for Socially-Aware Language Acquisition," IEEE Transactions on Cognitive and Developmental Systems, vol. 12, no. 2, pp. 250-259, 2020.
[8]
S. Alexanderson et al., "Style-Controllable Speech-Driven Gesture Synthesis Using Normalising Flows," Computer graphics forum (Print), vol. 39, no. 2, pp. 487-496, 2020.
[9]
K. Stefanov et al., "Modeling of Human Visual Attention in Multiparty Open-World Dialogues," ACM Transactions on Human-Robot Interaction, vol. 8, no. 2, 2019.
[10]
S. Alexanderson et al., "Mimebot—Investigating the Expressibility of Non-Verbal Communication Across Agent Embodiments," ACM Transactions on Applied Perception, vol. 14, no. 4, 2017.
[11]
S. Alexanderson, C. O'Sullivan and J. Beskow, "Real-time labeling of non-rigid motion capture marker sets," Computers & graphics, vol. 69, no. Supplement C, pp. 59-67, 2017.
[12]
S. Alexanderson and J. Beskow, "Towards Fully Automated Motion Capture of Signs -- Development and Evaluation of a Key Word Signing Avatar," ACM Transactions on Accessible Computing, vol. 7, no. 2, pp. 7:1-7:17, 2015.
[13]
S. Alexanderson and J. Beskow, "Animated Lombard speech : Motion capture, facial animation and visual intelligibility of speech produced in adverse conditions," Computer speech & language (Print), vol. 28, no. 2, pp. 607-618, 2014.
[14]
N. Mirnig et al., "Face-To-Face With A Robot : What do we actually talk about?," International Journal of Humanoid Robotics, vol. 10, no. 1, pp. 1350011, 2013.
[15]
S. Al Moubayed, G. Skantze and J. Beskow, "The Furhat Back-Projected Humanoid Head-Lip Reading, Gaze And Multi-Party Interaction," International Journal of Humanoid Robotics, vol. 10, no. 1, pp. 1350005, 2013.
[16]
S. Al Moubayed, J. Edlund and J. Beskow, "Taming Mona Lisa : communicating gaze faithfully in 2D and 3D facial projections," ACM Transactions on Interactive Intelligent Systems, vol. 1, no. 2, pp. 25, 2012.
[17]
S. Al Moubayed, J. Beskow and B. Granström, "Auditory visual prominence From intelligibility to behavior," Journal on Multimodal User Interfaces, vol. 3, no. 4, pp. 299-309, 2009.
[18]
J. Edlund and J. Beskow, "MushyPeek : A Framework for Online Investigation of Audiovisual Dialogue Phenomena," Language and Speech, vol. 52, pp. 351-367, 2009.
[19]
G. Salvi et al., "SynFace-Speech-Driven Facial Animation for Virtual Speech-Reading Support," Eurasip Journal on Audio, Speech, and Music Processing, vol. 2009, pp. 191940, 2009.
[20]
J. Beskow et al., "Visualization of speech and audio for hearing-impaired persons," Technology and Disability, vol. 20, no. 2, pp. 97-107, 2008.
[21]
B. Lidestam and J. Beskow, "Motivation and appraisal in perception of poorly specified speech," Scandinavian Journal of Psychology, vol. 47, no. 2, pp. 93-101, 2006.
[22]
B. Lidestam and J. Beskow, "Visual phonemic ambiguity and speechreading," Journal of Speech, Language and Hearing Research, vol. 49, no. 4, pp. 835-847, 2006.
[23]
J. Beskow, "Trainable articulatory control models for visual speech synthesis," International Journal of Speech Technology, vol. 7, no. 4, pp. 335-349, 2004.

Conference papers

[24]
A. Deichler et al., "Difusion-Based Co-Speech Gesture Generation Using Joint Text and Audio Representation," in PROCEEDINGS OF THE 25TH INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION, ICMI 2023, 2023, pp. 755-762.
[25]
J. Gustafsson, É. Székely and J. Beskow, "Generation of speech and facial animation with controllable articulatory effort for amusing conversational characters," in 23rd ACM International Conference on Interlligent Virtual Agent (IVA 2023), 2023.
[26]
J. Miniotaitė et al., "Hi robot, it's not what you say, it's how you say it," in 2023 32ND IEEE INTERNATIONAL CONFERENCE ON ROBOT AND HUMAN INTERACTIVE COMMUNICATION, RO-MAN, 2023, pp. 307-314.
[27]
S. Mehta et al., "OverFlow : Putting flows on top of neural transducers for better TTS," in Interspeech 2023, 2023, pp. 4279-4283.
[28]
S. Mehta et al., "Neural HMMs are all you need (for high-quality attention-free TTS)," in 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022, pp. 7457-7461.
[29]
B. Moell et al., "Speech Data Augmentation for Improving Phoneme Transcriptions of Aphasic Speech Using Wav2Vec 2.0 for the PSST Challenge," in The RaPID4 Workshop : Resources and ProcessIng of linguistic, para-linguistic and extra-linguistic Data from people with various forms of cognitive/psychiatric/developmental impairments, 2022, pp. 62-70.
[30]
A. Deichler et al., "Towards Context-Aware Human-like Pointing Gestures with RL Motion Imitation," in Context-Awareness in Human-Robot Interaction: Approaches and Challenges, workshop at 2022 ACM/IEEE International Conference on Human-Robot Interaction, 2022, p. 2022.
[31]
J. Beskow et al., "Expressive Robot Performance based on Facial Motion Capture," in INTERSPEECH 2021, 2021, pp. 2343-2344.
[32]
J. Beskow et al., "Expressive robot performance based on facial motion capture," in Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 2021, pp. 2165-2166.
[33]
S. Wang et al., "Integrated Speech and Gesture Synthesis," in ICMI 2021 - Proceedings of the 2021 International Conference on Multimodal Interaction, 2021, pp. 177-185.
[34]
P. Jonell et al., "Mechanical Chameleons : Evaluating the effects of a social robot’snon-verbal behavior on social influence," in Proceedings of SCRITA 2021, a workshop at IEEE RO-MAN 2021, 2021.
[35]
É. Székely et al., "Breathing and Speech Planning in Spontaneous Speech Synthesis," in 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020, pp. 7649-7653.
[36]
P. Jonell et al., "Can we trust online crowdworkers? : Comparing online and offline participants in a preference test of virtual agents.," in IVA '20: Proceedings of the 20th ACM International Conference on Intelligent Virtual Agents, 2020.
[37]
M. Cohn et al., "Embodiment and gender interact in alignment to TTS voices," in Proceedings for the 42nd Annual Meeting of the Cognitive Science Society : Developing a Mind: Learning in Humans, Animals, and Machines, CogSci 2020, 2020, pp. 220-226.
[38]
S. Alexanderson et al., "Generating coherent spontaneous speech and gesture from text," in Proceedings of the 20th ACM International Conference on Intelligent Virtual Agents, IVA 2020, 2020.
[39]
P. Jonell et al., "Let’s face it : Probabilistic multi-modal interlocutor-aware generation of facial gestures in dyadic settings," in IVA '20: Proceedings of the 20th ACM International Conference on Intelligent Virtual Agents, 2020.
[40]
K. Håkansson et al., "Robot-assisted detection of subclinical dementia : progress report and preliminary findings," in 2020 Alzheimer's Association International Conference (ALZ), 2020.
[41]
C. Chen et al., "Equipping social robots with culturally-sensitive facial expressions of emotion using data-driven methods," in 2019 14th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2019), 2019, pp. 1-8.
[42]
É. Székely et al., "How to train your fillers: uh and um in spontaneous speech synthesis," in The 10th ISCA Speech Synthesis Workshop, 2019.
[43]
P. Jonell et al., "Learning Non-verbal Behavior for a Social Robot from YouTube Videos," in ICDL-EpiRob Workshop on Naturalistic Non-Verbal and Affective Human-Robot Interactions, Oslo, Norway, August 19, 2019, 2019.
[45]
É. Székely et al., "Off the cuff : Exploring extemporaneous speech delivery with TTS," in Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 2019, pp. 3687-3688.
[47]
Z. Malisz et al., "PROMIS: a statistical-parametric speech synthesis system with prominence control via a prominence network," in Proceedings of SSW 10 - The 10th ISCA Speech Synthesis Workshop, 2019.
[48]
P. Wagner et al., "Speech Synthesis Evaluation : State-of-the-Art Assessment and Suggestion for a Novel Research Program," in Proceedings of the 10th Speech Synthesis Workshop (SSW10), 2019.
[49]
É. Székely et al., "Spontaneous conversational speech synthesis from found data," in Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 2019, pp. 4435-4439.
[50]
Z. Malisz et al., "The speech synthesis phoneticians need is both realistic and controllable," in Proceedings from FONETIK 2019, 2019.
[51]
Z. Malisz, P. Jonell and J. Beskow, "The visual prominence of whispered speech in Swedish," in Proceedings of 19th International Congress of Phonetic Sciences, 2019.
[52]
D. Kontogiorgos et al., "A Multimodal Corpus for Mutual Gaze and Joint Attention in Multiparty Situated Interaction," in Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), 2018, pp. 119-127.
[53]
P. Jonell et al., "Crowdsourced Multimodal Corpora Collection Tool," in Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), 2018, pp. 728-734.
[54]
H. -. Vögel et al., "Emotion-awareness for intelligent vehicle assistants : A research agenda," in Proceedings - International Conference on Software Engineering, 2018, pp. 11-15.
[55]
C. Chen et al., "Reverse engineering psychologically valid facial expressions of emotion into social robots," in 2018 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018), 2018, pp. 448-452.
[56]
A. E. Vijayan et al., "Using Constrained Optimization for Real-Time Synchronization of Verbal and Nonverbal Robot Behavior," in 2018 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2018, pp. 1955-1961.
[57]
K. Stefanov and J. Beskow, "A Real-time Gesture Recognition System for Isolated Swedish Sign Language Signs," in Proceedings of the 4th European and 7th Nordic Symposium on Multimodal Communication (MMSYM 2016), 2017.
[58]
Z. Malisz et al., "Controlling prominence realisation in parametric DNN-based speech synthesis," in Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH 2017, 2017, pp. 1079-1083.
[59]
C. Oertel et al., "Crowd-Sourced Design of Artificial Attentive Listeners," in Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 2017, pp. 854-858.
[60]
P. Jonell et al., "Crowd-powered design of virtual attentive listeners," in 17th International Conference on Intelligent Virtual Agents, IVA 2017, 2017, pp. 188-191.
[61]
C. Oertel et al., "Crowdsourced design of artificial attentive listeners," in INTERSPEECH: Situated Interaction, Augusti 20-24 Augusti, 2017, 2017.
[62]
Y. Zhang, J. Beskow and H. Kjellström, "Look but Don’t Stare : Mutual Gaze Interaction in Social Robots," in 9th International Conference on Social Robotics, ICSR 2017, 2017, pp. 556-566.
[63]
M. S. L. Khan et al., "Moveable facial features in a social mediator," in 17th International Conference on Intelligent Virtual Agents, IVA 2017, 2017, pp. 205-208.
[64]
J. Beskow et al., "Preface," in 17th International Conference on Intelligent Virtual Agents, IVA 2017, 2017, pp. V-VI.
[65]
K. Stefanov, J. Beskow and G. Salvi, "Vision-based Active Speaker Detection in Multiparty Interaction," in Grounding Language Understanding, 2017.
[66]
K. Stefanov and J. Beskow, "A Multi-party Multi-modal Dataset for Focus of Visual Attention in Human-human and Human-robot Interaction," in Proceedings of the 10th edition of the Language Resources and Evaluation Conference, 2016.
[67]
J. Beskow and H. Berthelsen, "A hybrid harmonics-and-bursts modelling approach to speech synthesis," in Proceedings 9th ISCA Speech Synthesis Workshop, SSW 2016, 2016, pp. 208-213.
[68]
S. Alexanderson, D. House and J. Beskow, "Automatic annotation of gestural units in spontaneous face-to-face interaction," in MA3HMI 2016 - Proceedings of the Workshop on Multimodal Analyses Enabling Artificial Agents in Human-Machine Interaction, 2016, pp. 15-19.
[69]
K. Stefanov and J. Beskow, "Gesture Recognition System for Isolated Sign Language Signs," in The 4th European and 7th Nordic Symposium on Multimodal Communication, 29-30 September 2016, University of Copenhagen, Denmark, 2016, pp. 57-59.
[70]
K. Stefanov, A. Sugimoto and J. Beskow, "Look Who’s Talking : Visual Identification of the Active Speaker in Multi-party Human-robot Interaction," in 2nd Workshop on Advancements in Social Signal Processing for Multimodal Interaction 2016, ASSP4MI 2016 - Held in conjunction with the 18th ACM International Conference on Multimodal Interaction 2016, ICMI 2016, 2016, pp. 22-27.
[71]
S. Alexanderson, C. O'Sullivan and J. Beskow, "Robust online motion capture labeling of finger markers," in Proceedings - Motion in Games 2016 : 9th International Conference on Motion in Games, MIG 2016, 2016, pp. 7-13.
[72]
J. Beskow, "Spoken and non-verbal interaction experiments with a social robot," in The Journal of the Acoustical Society of America, 2016.
[73]
G. Skantze, M. Johansson and J. Beskow, "A Collaborative Human-Robot Game as a Test-bed for Modelling Multi-party, Situated Interaction," in INTELLIGENT VIRTUAL AGENTS, IVA 2015, 2015, pp. 348-351.
[74]
G. Skantze, M. Johansson and J. Beskow, "Exploring Turn-taking Cues in Multi-party Human-robot Discussions about Objects," in Proceedings of the 2015 ACM International Conference on Multimodal Interaction, 2015.
[75]
S. Al Moubayed et al., "Human-robot Collaborative Tutoring Using Multiparty Multimodal Spoken Dialogue," in 9th Annual ACM/IEEE International Conference on Human-Robot Interaction, Bielefeld, Germany, 2014.
[76]
S. Al Moubayed, J. Beskow and G. Skantze, "Spontaneous spoken dialogues with the Furhat human-like robot head," in HRI '14 Proceedings of the 2014 ACM/IEEE international conference on Human-robot interaction, 2014, p. 326.
[77]
J. Beskow et al., "Tivoli - Learning Signs Through Games and Interaction for Children with Communicative Disorders," in 6th Biennial Conference of the International Society for Augmentative and Alternative Communication, Lisbon, Portugal, 2014.
[78]
S. Al Moubayed et al., "Tutoring Robots: Multiparty Multimodal Social Dialogue With an Embodied Tutor," in 9th International Summer Workshop on Multimodal Interfaces, Lisbon, Portugal, 2014.
[79]
K. Stefanov and J. Beskow, "A Kinect Corpus of Swedish Sign Language Signs," in Proceedings of the 2013 Workshop on Multimodal Corpora : Beyond Audio and Video, 2013.
[80]
S. Alexanderson, D. House and J. Beskow, "Aspects of co-occurring syllables and head nods in spontaneous dialogue," in Proceedings of 12th International Conference on Auditory-Visual Speech Processing (AVSP2013), 2013, pp. 169-172.
[81]
S. Alexanderson, D. House and J. Beskow, "Extracting and analysing co-speech head gestures from motion-capture data," in Proceedings of Fonetik 2013, 2013, pp. 1-4.
[82]
S. Alexanderson, D. House and J. Beskow, "Extracting and analyzing head movements accompanying spontaneous dialogue," in Conference Proceedings TiGeR 2013 : Tilburg Gesture Research Meeting, 2013.
[83]
B. Bollepalli, J. Beskow and J. Gustafsson, "Non-Linear Pitch Modification in Voice Conversion using Artificial Neural Networks," in Advances in nonlinear speech processing : 6th International Conference, NOLISP 2013, Mons, Belgium, June 19-21, 2013 : proceedings, 2013, pp. 97-103.
[84]
S. Al Moubayed, J. Beskow and G. Skantze, "The Furhat Social Companion Talking Head," in Interspeech 2013 - Show and Tell, 2013, pp. 747-749.
[85]
J. Beskow et al., "The Tivoli System - A Sign-driven Game for Children with Communicative Disorders," in 1st Symposium on Multimodal Communication, Msida, Malta, 2013.
[86]
J. Beskow and K. Stefanov, "Web-enabled 3D Talking Avatars Based on WebGL and HTML5," in 13th International Conference on Intelligent Virtual Agents, Edinburgh, UK, 2013.
[87]
J. Edlund et al., "3rd party observer gaze as a continuous measure of dialogue flow," in Proceedings of the 8th International Conference on Language Resources and Evaluation, LREC 2012, 2012, pp. 1354-1358.
[88]
S. Alexanderson and J. Beskow, "Can Anybody Read Me? Motion Capture Recordings for an Adaptable Visual Speech Synthesizer," in In proceedings of The Listening Talker, 2012, pp. 52-52.
[90]
S. Al Moubayed et al., "Furhat : A Back-projected Human-like Robot Head for Multiparty Human-Machine Interaction," in Cognitive Behavioural Systems : COST 2102 International Training School, Dresden, Germany, February 21-26, 2011, Revised Selected Papers, 2012, pp. 114-130.
[91]
G. Skantze et al., "Furhat at Robotville : A Robot Head Harvesting the Thoughts of the Public through Multi-party Dialogue," in Proceedings of the Workshop on Real-time Conversation with Virtual Agents IVA-RCVA, 2012.
[92]
S. Al Moubayed et al., "Furhat goes to Robotville: a large-scale multiparty human-robot interaction data collection in a public space," in Proc of LREC Workshop on Multimodal Corpora, 2012.
[93]
B. Bollepalli, J. Beskow and J. Gustafson, "HMM based speech synthesis system for Swedish Language," in The Fourth Swedish Language Technology Conference, 2012.
[94]
S. Al Moubayed, G. Skantze and J. Beskow, "Lip-reading : Furhat audio visual intelligibility of a back projected animated face," in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2012, pp. 196-203.
[95]
S. Al Moubayed et al., "Multimodal Multiparty Social Interaction with the Furhat Head," in 14th ACM International Conference on Multimodal Interaction, Santa Monica, CA, 2012, pp. 293-294.
[96]
S. Al Moubayed et al., "A robotic head using projected animated faces," in Proceedings of the International Conference on Audio-Visual Speech Processing 2011, 2011, p. 71.
[97]
S. Al Moubayed et al., "Animated Faces for Robotic Heads : Gaze and Beyond," in Analysis of Verbal and Nonverbal Communication and Enactment : The Processing Issues, 2011, pp. 19-35.
[98]
J. Beskow et al., "Kinetic Data for Large-Scale Analysis and Modeling of Face-to-Face Conversation," in Proceedings of International Conference on Audio-Visual Speech Processing 2011, 2011, pp. 103-106.
[99]
J. Edlund, S. Al Moubayed and J. Beskow, "The Mona Lisa Gaze Effect as an Objective Metric for Perceived Cospatiality," in Proc. of the Intelligent Virtual Agents 10th International Conference (IVA 2011), 2011, pp. 439-440.
[100]
S. Al Moubayed et al., "Audio-Visual Prosody : Perception, Detection, and Synthesis of Prominence," in 3rd COST 2102 International Training School on Toward Autonomous, Adaptive, and Context-Aware Multimodal Interfaces : Theoretical and Practical Issues, 2010, pp. 55-71.
[101]
J. Edlund and J. Beskow, "Capturing massively multimodal dialogues : affordable synchronization and visualization," in Proc. of Multimodal Corpora : Advances in Capturing, Coding and Analyzing Multimodality (MMC 2010), 2010, pp. 160-161.
[102]
J. Beskow et al., "Face-to-Face Interaction and the KTH Cooking Show," in Development of multimodal interfaces : Active listing and synchrony, 2010, pp. 157-168.
[103]
J. Beskow and S. Al Moubayed, "Perception of Gaze Direction in 2D and 3D Facial Projections," in The ACM / SSPNET 2nd International Symposium on Facial Analysis and Animation, 2010, pp. 24-24.
[104]
S. Al Moubayed and J. Beskow, "Perception of Nonverbal Gestures of Prominence in Visual Speech Animation," in Proceedings of the ACM/SSPNET 2nd International Symposium on Facial Analysis and Animation, 2010, p. 25.
[105]
S. Al Moubayed and J. Beskow, "Prominence Detection in Swedish Using Syllable Correlates," in Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010, 2010, pp. 1784-1787.
[106]
S. Schötz et al., "Simulating Intonation in Regional Varieties of Swedish," in Speech Prosody 2010, 2010.
[107]
J. Edlund et al., "Spontal : a Swedish spontaneous dialogue corpus of audio, video and motion capture," in Proc. of the Seventh conference on International Language Resources and Evaluation (LREC'10), 2010, pp. 2992-2995.
[108]
S. Al Moubayed and J. Beskow, "Effects of Visual Prominence Cues on Speech Intelligibility," in Proceedings of Auditory-Visual Speech Processing AVSP'09, 2009.
[109]
F. López-Colino, J. Beskow and J. Colas, "Mobile Synface : Talking head interface for mobile VoIP telephone calls," in Actas del X Congreso Internacional de Interaccion Persona-Ordenador, INTERACCION 2009, 2009.
[110]
J. Beskow, G. Salvi and S. Al Moubayed, "SynFace : Verbal and Non-verbal Face Animation from Audio," in Proceedings of The International Conference on Auditory-Visual Speech Processing AVSP'09, 2009.
[111]
J. Beskow, G. Salvi and S. Al Moubayed, "SynFace - Verbal and Non-verbal Face Animation from Audio," in Auditory-Visual Speech Processing 2009, AVSP 2009, 2009.
[112]
J. Beskow et al., "The MonAMI Reminder : a spoken dialogue system for face-to-face interaction," in Proceedings of the 10th Annual Conference of the International Speech Communication Association, INTERSPEECH 2009, 2009, pp. 300-303.
[113]
S. Al Moubayed et al., "Virtual Speech Reading Support for Hard of Hearing in a Domestic Multi-Media Setting," in INTERSPEECH 2009 : 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, 2009, pp. 1443-1446.
[114]
J. Beskow and L. Cerrato, "Evaluation of the expressivity of a Swedish talking head in the context of human-machine interaction," in Comunicazione parlata e manifestazione delle emozioni : Atti del I Convegno GSCP, Padova 29 novembre - 1 dicembre 2004, 2008.
[115]
J. Beskow et al., "Hearing at Home : Communication support in home environments for hearing impaired persons," in INTERSPEECH 2008 : 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, 2008, pp. 2203-2206.
[116]
J. Beskow et al., "Innovative interfaces in MonAMI : The Reminder," in Perception In Multimodal Dialogue Systems, Proceedings, 2008, pp. 272-275.
[117]
J. Beskow et al., "Recognizing and Modelling Regional Varieties of Swedish," in INTERSPEECH 2008 : 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, 2008, pp. 512-515.
[118]
J. Beskow, B. Granström and D. House, "Analysis and synthesis of multimodal verbal and non-verbal interaction for animated interface agents," in VERBAL AND NONVERBAL COMMUNICATION BEHAVIOURS, 2007, pp. 250-263.
[119]
J. Edlund and J. Beskow, "Pushy versus meek : using avatars to influence turn-taking behaviour," in INTERSPEECH 2007 : 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, 2007, pp. 2784-2787.
[120]
E. Agelfors et al., "User evaluation of the SYNFACE talking head telephone," in Computers Helping People With Special Needs, Proceedings, 2006, pp. 579-586.
[121]
J. Beskow, B. Granström and D. House, "Visual correlates to prominence in several expressive modes," in INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, 2006, pp. 1272-1275.
[122]
J. Beskow and M. Nordenberg, "Data-driven synthesis of expressive visual speech using an MPEG-4 talking head," in 9th European Conference on Speech Communication and Technology, 2005, pp. 793-796.
[123]
O. Engwall et al., "Design strategies for a virtual language tutor," in INTERSPEECH 2004, ICSLP, 8th International Conference on Spoken Language Processing, Jeju Island, Korea, October 4-8, 2004, 2004, pp. 1693-1696.
[124]
J. Beskow et al., "Expressive animated agents for affective dialogue systems," in AFFECTIVE DIALOGUE SYSTEMS, PROCEEDINGS, 2004, pp. 240-243.
[125]
J. Beskow et al., "Preliminary cross-cultural evaluation of expressiveness in synthetic faces," in Affective Dialogue Systems, Proceedings, 2004, pp. 301-304.
[126]
J. Beskow et al., "SYNFACE - A talking head telephone for the hearing-impaired," in COMPUTERS HELPING PEOPLE WITH SPECIAL NEEDS : PROCEEDINGS, 2004, pp. 1178-1185.
[127]
K.-E. Spens et al., "SYNFACE, a talking head telephone for the hearing impaired," in IFHOH 7th World Congress for the Hard of Hearing. Helsinki Finland. July 4-9, 2004, 2004.
[128]
J. Beskow et al., "The Swedish PFs-Star Multimodal Corpora," in Proceedings of LREC Workshop on Models of Human Behaviour for the Specification and Evaluation of Multimodal Input and Output Interfaces, 2004, pp. 34-37.
[129]
E. Agelfors et al., "A synthetic face as a lip-reading support for hearing impaired telephone users - problems and positive results," in European audiology in 1999 : proceeding of the 4th European Conference in Audiology, Oulu, Finland, June 6-10, 1999, 1999.
[130]
E. Agelfors et al., "Synthetic visual speech driven from auditory speech," in Proceedings of Audio-Visual Speech Processing (AVSP'99)), 1999.

Book chapters

[131]
G. Skantze, J. Gustafson and J. Beskow, "Multimodal Conversational Interaction with Robots," in The Handbook of Multimodal-Multisensor Interfaces, Volume 3 : Language Processing, Software, Commercialization, and Emerging Directions, Sharon Oviatt, Björn Schuller, Philip R. Cohen, Daniel Sonntag, Gerasimos Potamianos, Antonio Krüger Ed., : ACM Press, 2019.
[132]
J. Edlund, S. Al Moubayed and J. Beskow, "Co-present or Not? : Embodiment, Situatedness and the Mona Lisa Gaze Effect," in Eye gaze in intelligent user interfaces : gaze-based analyses, models and applications, Nakano, Yukiko; Conati, Cristina; Bader, Thomas Ed., London : Springer London, 2013, pp. 185-203.
[133]
J. Edlund, D. House and J. Beskow, "Gesture movement profiles in dialogues from a Swedish multimodal database of spontaneous speech," in Prosodic and Visual Resources in Interactional Grammar, Bergmann, Pia; Brenning, Jana; Pfeiffer, Martin C.; Reber, Elisabeth Ed., : Walter de Gruyter, 2012.
[134]
J. Beskow et al., "Multimodal Interaction Control," in Computers in the Human Interaction Loop, Waibel, Alexander; Stiefelhagen, Rainer Ed., Berlin/Heidelberg : Springer Berlin/Heidelberg, 2009, pp. 143-158.
[135]
J. Beskow, J. Edlund and M. Nordstrand, "A Model for Multimodal Dialogue System Output Applied to an Animated Talking Head," in SPOKEN MULTIMODAL HUMAN-COMPUTER DIALOGUE IN MOBILE ENVIRONMENTS, Minker, Wolfgang; Bühler, Dirk; Dybkjær, Laila Ed., Dordrecht : Springer, 2005, pp. 93-113.

Non-peer-reviewed

Conference papers

[136]
D. House, S. Alexanderson and J. Beskow, "On the temporal domain of co-speech gestures: syllable, phrase or talk spurt?," in Proceedings of Fonetik 2015, 2015, pp. 63-68.
[137]
S. Al Moubayed et al., "Talking with Furhat - multi-party interaction with a back-projected robot head," in Proceedings of Fonetik 2012, 2012, pp. 109-112.
[138]
S. Al Moubayed and J. Beskow, "A novel Skype interface using SynFace for virtual speech reading support," in Proceedings from Fonetik 2011, June 8 - June 10, 2011 : Speech, Music and Hearing, Quarterly Progress and Status Report, TMH-OPSR, Volume 51, 2011, 2011, pp. 33-36.
[139]
J. Edlund, J. Gustafson and J. Beskow, "Cocktail : a demonstration of massively multi-component audio environments for illustration and analysis," in SLTC 2010, The Third Swedish Language Technology Conference (SLTC 2010) : Proceedings of the Conference, 2010.
[140]
J. Beskow and B. Granström, "Goda utsikter för teckenspråksteknologi," in Språkteknologi för ökad tillgänglighet : Rapport från ett nordiskt seminarium, 2010, pp. 77-86.
[141]
J. Beskow et al., "Modelling humanlike conversational behaviour," in SLTC 2010 : The Third Swedish Language Technology Conference (SLTC 2010), Proceedings of the Conference, 2010, pp. 9-10.
[142]
J. Beskow et al., "Research focus : Interactional aspects of spoken face-to-face communication," in Proceedings from Fonetik, Lund, June 2-4, 2010 : , 2010, pp. 7-10.
[143]
S. Schötz et al., "Simulating Intonation in Regional Varieties of Swedish," in Fonetik 2010, 2010.
[144]
J. Beskow and J. Gustafson, "Experiments with Synthesis of Swedish Dialects," in Proceedings of Fonetik 2009, 2009, pp. 28-29.
[145]
J. Beskow et al., "Project presentation: Spontal : multimodal database of spontaneous dialog," in Proceedings of Fonetik 2009 : The XXIIth Swedish Phonetics Conference, 2009, pp. 190-193.
[146]
S. Al Moubayed et al., "Studies on Using the SynFace Talking Head for the Hearing Impaired," in Proceedings of Fonetik'09 : The XXIIth Swedish Phonetics Conference, June 10-12, 2009, 2009, pp. 140-143.
[147]
J. Beskow et al., "Human Recognition of Swedish Dialects," in Proceedings of Fonetik 2008 : The XXIst Swedish Phonetics Conference, 2008, pp. 61-64.
[148]
F. López-Colino, J. Beskow and J. Colás, "Mobile SynFace : Ubiquitous visual interface for mobile VoIP telephone calls," in Proceedings of The second Swedish Language Technology Conference (SLTC), 2008.
[149]
J. Beskow et al., "Speech technology in the European project MonAMI," in Proceedings of FONETIK 2008, 2008, pp. 33-36.
[150]
S. Al Moubayed, J. Beskow and G. Salvi, "SynFace Phone Recognizer for Swedish Wideband and Narrowband Speech," in Proceedings of The second Swedish Language Technology Conference (SLTC), 2008, pp. 3-6.
[151]
J. Edlund, J. Beskow and M. Heldner, "MushyPeek : an experiment framework for controlled investigation of human-human interaction control behaviour," in Proceedings of Fonetik 2007, 2007, pp. 61-64.
[152]
J. Beskow, B. Granström and D. House, "Focal accent and facial movements in expressive speech," in Proceedings from Fonetik 2006, Lund, June, 7-9, 2006, 2006, pp. 9-12.
[153]
C. Siciliano et al., "Evaluation of a Multilingual Synthetic Talking Face as a Communication Aid for the Hearing Impaired," in Proceedings of the 15th International Congress of Phonetic Sciences (ICPhS'03), 2003, pp. 131-134.
[154]
J. Beskow, O. Engwall and B. Granström, "Resynthesis of Facial and Intraoral Articulation fromSimultaneous Measurements," in Proceedings of the 15th International Congress of phonetic Sciences (ICPhS'03), 2003.
[155]
D. W. Massaro et al., "Picture My Voice : Audio to Visual Speech Synthesis using Artificial Neural Networks," in Proceedings of International Conference on Auditory-Visual Speech Processing, 1999, pp. 133-138.
[156]
M. M. Cohen, J. Beskow and D. W. Massaro, "Recent developments in facial animation : an inside view," in Proceedings of International Conference on Auditory-Visual Speech Processing, 1998, pp. 201-206.
[157]
J. Beskow, "ANIMATION OF TALKING AGENTS," in Proceedings of International Conference on Auditory-Visual Speech Processing, 1997, pp. 149-152.
[158]
J. Beskow, "RULE-BASED VISUAL SPEECH SYNTHESIS," in Proceedings of the 4th European Conference on Speech Communication and Technology, 1995, pp. 299-302.

Book chapters

[159]
D. W. Massaro et al., "Animated speech : Research progress and applications," in Audiovisual Speech Processing, : Cambridge University Press, 2012, pp. 309-345.

Theses

[160]
J. Beskow, "Talking Heads - Models and Applications for Multimodal Speech Synthesis," Doctoral thesis : Institutionen för talöverföring och musikakustik, Trita-TMH, 2003:7, 2003.
Last synchronized with DiVA: 2024-05-05 02:15:25