Analyzing speech synthesis quality in application-centered, interactive settings

Time: Fri 2018-09-14 15.00

Lecturer: Petra Wagner

Location: Fantum

In this seminar, I will argue for the necessity of a paradigm shift in
speech synthesis evaluation, moving away from the common
decontextualized studies that treat speech (synthesis) quality as if it
were independent of its application or usage context. I will furthermore
present a case study in which we compared the quality of an incremental
TTS including hesitation signals both (i) integrated in a dialogue
system as part of a Smart Home and realized as a function of a user's
attention towards a humanoid agent, and (ii) with the help of a
traditional crowdsourced MOS-test. I will close the seminar with a set
of propositions as to how our community should initiate the
investigation of novel and established methods for TTS evaluation so as
to create a set of Best Practice standards and thus optimize their usage in
the future.

Petra Wagner received an M.A. degree in linguistics in 1998 (Bielefeld
University), and a doctoral degree in phonetics and communication
sciences from the University of Bonn in 2002. She continued working in
Bonn as a lecturer and researcher until 2008, when she was appointed
professor for phonetics and phonology at Bielefeld University. Her main
research interests lie in the area of (multimodal) prosody, especially
the production and perception of prominence and speech rhythm,
conversational speech synthesis, human-machine interaction and the
phonetic aspects of conversational speech.