
SIGNBOT - Generative AI for Sign Language

A woman wearing motion-capture devices makes gestures to be recorded.

The processing of spoken languages has advanced rapidly in recent years, but sign language, used by more than 70 million people worldwide, has not seen the same development. In this project, we aim to perform high-quality generative modelling of sign language using state-of-the-art generative AI models for animation and language. Our unique starting point for this mission comes from our experience in developing state-of-the-art neural motion synthesis techniques and neural end-to-end speech generation systems.

The project is highly interdisciplinary in nature, reflected in the composition of the research team from KTH and Stockholm University, which collectively encompasses expertise in multimodal speech and language processing, avatars and embodied agents, neural motion generation, and motion capture, as well as sign language linguistics, corpus collection, and annotation.

Sign languages present unique challenges not found in written or spoken language due to their visuo-spatial nature and highly parallel structure. This makes sign language a suitable problem to approach using generative AI methods, and suggests that a combination of approaches from the text, speech, motion, and image domains may be required. The societal significance of this problem cannot be overstated, as solutions could greatly increase accessibility and inclusion for sign language users, providing them with resources and communication modes that the rest of society takes for granted.

Researchers

Funding

WASP/WARA ML and Vetenskapsrådet, the Swedish Research Council (grant no. 2023-04548)

Duration

2024-01-01 → 2027-12-31