How to predict a conversation
Gabriel Skantze and Erik Ekstedt from the Division of Speech, Music and Hearing (TMH)
The SIGDIAL Best Paper Award went to Erik Ekstedt and Gabriel Skantze from Speech, Music and Hearing (TMH). Their model learns to predict what will happen in the next two seconds of a conversation. The research improves the interaction between humans and conversational systems, such as social robots or voice assistants.
Congratulations on winning the Best Paper Award at the SIGDIAL conference in Edinburgh. Please tell us about your paper.
"Thank you! We are improving the interaction between humans and conversational systems, such as social robots or voice assistants. More specifically, we are interested in modelling fluent turn-taking in conversation.
We have recently developed a deep learning model that is trained on large amounts of spoken interaction between humans. The model learns to continuously predict what will happen in the next two seconds of the conversation.
From a scientific perspective, a challenge when training deep learning models is that it is tough to know what they learn. In this paper, we present a method for analysing our model and show that it has learned to pick up very subtle prosodic cues, such as the tone of the voice, which is essential for human listeners."
What impact could your research have on society?
"Our model can be directly applied to improve turn-taking in today's conversational systems and enable applications in health care, education, and entertainment.
The present analysis can also be a powerful tool for improving our scientific understanding of how humans coordinate turn-taking in conversation. Doing psycholinguistic experiments on humans can be very expensive and limited, so it is interesting to see that we can complement such studies with large-scale experiments using our computational models."
What's the most exciting research in your field?
"Research on conversational systems has exploded in recent years in academia and industry; for example, the second Best Paper Award at SIGDIAL 2022 went to Google.
Many people might, for example, have read about the Google LaMDA chatbot or OpenAI's GPT-3, which can sometimes engage in very human-like interactions. However, most of these systems are text-based, and our focus is on how to allow for a more natural spoken interaction."