EUNISON – Extensive UNIfied-domain SimulatiON of the human voice

This project seeks to build physics-based simulations of the human voice that are more detailed and more complete than before, using numerical models that have been validated against mechanical replicas. The simulations will be accessible for control in the mechanical, neuromotor and phonetic domains. The project engages seven research groups in Spain, Germany, France and Sweden.

Everyone needs their voice, and speech has a pivotal function in modern society. A detailed, working model of the voice would contribute to the human atlas and would find profound applications in fields such as speech technology, medical research, pedagogy, linguistics and the arts. But the physics is very intricate: we make the sounds of speech, song and emotion using multiple mechanisms, and these are under exquisite control through muscle activation patterns acquired over years of training. Physically, voice involves complex interactions between laminar and turbulent airflow; vibrating, deforming, colliding elastic solids; and sound waves resonating in a contorting duct.

So far, these mechanisms have had to be studied one at a time, using disparate tools and often gross approximations for each of the subproblems. Now, advances in computing techniques suggest the possibility of simulating the entire voice organ, including its biomechanics and aeroacoustics, in a unified numerical domain. Meeting this major computational challenge would bring research and education much closer to reality.

In the EUNISON project, we seek to build a new voice simulator that is based on physical first principles to an unprecedented degree. From given inputs representing topology, muscle activations or phonemes, it will render the 3-D physics of the voice, including its acoustic output. This will give important insights into how the voice works, and how it fails. The goal is not a speech synthesis system but a voice simulation engine with many applications: given the right controls and enough computer time, it could be made to speak in any language or sing in any style. The model will be operable online, as a reference and a platform for others to exploit in further studies. The long-term prospects include more natural speech synthesis, improved clinical procedures, greater public awareness of voice, better voice pedagogy and new forms of cultural expression.

Group: Sound and Music Computing

Staff:

Sten Ternström (Project leader)