
Internal Projects

Internal Master Degree Projects at Speech, Music and Hearing (TMH)

The division of Speech, Music and Hearing hosts a number of Master's students each year, who carry out research-related projects over six months. The students get office space at the department. Below, we list topics for open project positions, both at KTH and at other places with which we collaborate. If you do not find a project that suits you, contact your favourite TMH faculty member to tailor a project. We welcome students' own ideas for their Bachelor's and Master's thesis projects, especially within the department's research areas.


Reinforcement learning of speech articulation

When children learn to speak, they do so through imitation, by comparing the sounds they can make through their own articulations to what they hear from their environment, and gradually refining articulatory strategies. 

In speech science, articulatory models have been developed that can produce the sounds corresponding to a given articulation (see https://dood.al/pinktrombone/ for a fun example). However, controlling such models so that they output intelligible speech has turned out to be a difficult problem.

The goal of the proposed project is to explore novel machine learning techniques for learning to control an articulatory model. Specifically, reinforcement learning and imitation learning will be explored. In the generative adversarial imitation learning (GAIL) setting, the learning agent interacts with the environment, but the true reward function is not given. Instead, the reward comes from the discriminator of a generative adversarial network (GAN), which is trained alongside the policy network. Such techniques have previously been applied with great success to the animation of human characters, where, for example, a system is trained to achieve a goal (e.g. kicking a ball) while exhibiting motion similar to example data from human recordings.
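To make the idea of a discriminator-derived reward concrete, here is a minimal Python sketch. It is not the project's method: the logistic-regression discriminator, the one-dimensional "articulatory feature", the reward form -log(1 - D(x)) (one common convention among several), and the names Discriminator and train_demo are all illustrative assumptions. The policy samples are also held fixed here, whereas in actual GAIL they change as the policy is updated.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class Discriminator:
    """Logistic-regression stand-in for the GAN discriminator (an assumption)."""

    def __init__(self, dim, lr=0.1):
        self.w = [0.0] * dim
        self.b = 0.0
        self.lr = lr

    def prob_expert(self, x):
        # Estimated probability that feature vector x came from human data.
        z = sum(wi * xi for wi, xi in zip(self.w, x)) + self.b
        return sigmoid(z)

    def update(self, expert_batch, policy_batch):
        # One pass of stochastic gradient ascent on the log-likelihood,
        # labelling expert samples 1 and policy samples 0.
        for batch, label in ((expert_batch, 1.0), (policy_batch, 0.0)):
            for x in batch:
                err = label - self.prob_expert(x)
                self.w = [wi + self.lr * err * xi for wi, xi in zip(self.w, x)]
                self.b += self.lr * err

    def reward(self, x, eps=1e-8):
        # Surrogate reward: high when the sample looks like expert data.
        return -math.log(1.0 - self.prob_expert(x) + eps)


def train_demo(seed=0, epochs=20):
    # Hypothetical toy data: expert features cluster near +1,
    # features produced by the (fixed) policy cluster near -1.
    rng = random.Random(seed)
    expert = [[rng.gauss(1.0, 0.3)] for _ in range(50)]
    policy = [[rng.gauss(-1.0, 0.3)] for _ in range(50)]
    disc = Discriminator(dim=1)
    for _ in range(epochs):
        disc.update(expert, policy)
    return disc


if __name__ == "__main__":
    disc = train_demo()
    print("reward for expert-like sample:", disc.reward([1.0]))
    print("reward for policy-like sample:", disc.reward([-1.0]))
```

In a full system, a reinforcement learning algorithm would update the policy to maximise this surrogate reward while the discriminator keeps being retrained on fresh policy samples; that loop is omitted here.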

Suitable background of the student: machine learning and deep learning. Experience with speech/audio/signal processing and reinforcement learning is a plus.

Contact: Jonas Beskow beskow@kth.se, Anna Deichler deichler@kth.se 

Conversational Systems and Human-Robot Interaction

In Gabriel Skantze's research group, we have a number of projects related to Conversational Systems and Human-Robot Interaction where we welcome MSc Thesis students:

  • Modelling turn-taking in conversational systems
  • Social robots as hosts on self-driving buses
  • Social robots for language learning
  • Visual grounding of language in dialogue
  • Social robots as virtual patients
  • Understanding engagement in robot-human presentations

For more information about these projects, see this link

Gabriel Skantze (skantze@kth.se)


Voice Science and Technical Vocology

Measuring, simulating or synthesizing the human voice has many applications in medicine, pedagogy, media and the arts. There are many unsolved problems in this area, and the voice is something that is close to us all.


Suitable background:
- Some knowledge of analog and digital audio
- Good programming skills

More info at this link.


Toward the next generation of collaborative embodied artificial intelligence

We are working towards creating an autonomous robot that can interact socially with other players and competently play the tabletop game Pandemic. We will explore physical setups with robots as well as virtual reality and augmented reality environments. This project requires cross-disciplinary collaboration between people with different skills. We are therefore opening multiple positions on this project in the areas of artificial intelligence, human-robot interaction, and extended reality research.

Project 1: From game action to embodied social dialog
Meta recently released ‘Cicero’, an AI that plays the game Diplomacy and can generate social conversation informed by game actions. Similarly, in this project, you will create a conversation model based on the game actions needed to play a different game (Pandemic). However, social dialog acts in humans are accompanied by embodied behaviors such as body gestures and facial expressions. In this project, you will therefore not only go from game actions to social dialog but also enrich the dialog with embodied information that allows an AI to be more competent in the physical world. Using the game Pandemic as a research tool, you will focus on machine learning techniques for natural language and perform user studies that compare embodied social agents with facial expressions to embodied agents with more neutral facial expressions.

Project 2: XR - Virtual and Augmented Reality Robots
In this project, you will compare a physical environment, where people play Pandemic with a physical robot, against different eXtended Reality (XR) setups that replace the robot with a VR and/or AR version of it. The challenge in this task will be to translate the physical experience to a digital one while maximizing immersion between humans and the virtual/physical robot. In the end, you will conduct a human-robot interaction experiment that compares the enjoyment of playing tabletop games with AR/VR agents and with physical robots.

Project X: Discuss additional projects with us on the same topic
Feel free to contact us if you want to work on a different variation of the projects above or have a different idea on the same or similar topic.

Contact: André Pereira (atap@kth.se) and Jura Miniotaite (jura@kth.se).


Social Robotics

Several proposals are currently available, see the page of Ronald Cumbal.


Generative Machine Learning

Several projects are available to strong students interested in generative deep learning, especially with applications to audio, 3D animation, images, and VR.

Please see Gustav Eje Henter's thesis project suggestions for more information (only visible if logged in).


Human-Robot Interaction

With contingent interpersonal interactions, we create a neural sense of grounding when the quality, intensity, and timing of others’ signals clearly reflect the signals that we have sent. In HRI, we operationalise contingency as a correlation between robot behaviour and changes in its environment.
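As a rough illustration of contingency as a correlation, the Python sketch below correlates a robot behaviour signal with an environment-change signal observed a few time steps later. The binary encoding of both signals, the fixed lag, the choice of Pearson correlation, and the function names pearson and contingency are all assumptions made for this example; the project may operationalise contingency differently.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) * (x - mx) for x in xs)
    vy = sum((y - my) * (y - my) for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def contingency(robot, env, lag=1):
    """Correlate robot behaviour at time t with environment change at time t + lag."""
    if len(robot) != len(env):
        raise ValueError("signals must have the same length")
    if lag < 0 or lag >= len(robot):
        raise ValueError("lag must be in [0, len(robot))")
    return pearson(robot[: len(robot) - lag], env[lag:])


if __name__ == "__main__":
    # Hypothetical data: 1 = robot acts / environment changes, 0 = nothing happens.
    # Here the environment always changes one step after the robot acts.
    robot = [0, 1, 0, 0, 1, 0, 1, 0]
    env = [0, 0, 1, 0, 0, 1, 0, 1]
    print("lag 0:", contingency(robot, env, lag=0))
    print("lag 1:", contingency(robot, env, lag=1))
```

With this toy data, the score at the matching lag is much higher than at lag 0, which is the pattern one would expect when the environment responds to the robot with a short delay.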

Given a set of social actions, it is important for a robot to know what is appropriate to do while in dialogue with humans. In this Master's thesis project, you will investigate quantitative and qualitative indicators for assessing human reactions in human-robot dialogue. You will design the interaction and a task-oriented dialogue, and explore objective and subjective measures from human users. Further, you will experiment with sensor data and build a machine learning classifier to interpret which features from human users contribute to the understanding of robot actions.

You will experiment with open-source platforms such as OpenFace and OpenSmile and one of our robotic platforms (Furhat or Nao) to build an application that combines multimodal signals and generates appropriate robot responses.

Required skills:
- Knowledge in human-computer interaction
- Good programming skills in Python
- Knowledge in Machine Learning is a plus (equivalent to KTH machine learning course)


Contact Dimos Kontogiorgos (diko@kth.se) or Joakim Gustafson (jocke@speech.kth.se).


Robotics: Factories of the future (FACT)

The project "Factories of the Future: Human Robot Cooperative Systems", or FACT for short, is a five-year endeavour in which the departments Robotics, Perception and Learning and Speech, Music and Hearing are collaborating to develop methods that allow humans and robots to share the same workspace and perform object manipulation tasks jointly. One of the main enabling technologies necessary to realise this is a framework that enables the robot to cooperate smoothly with the human, working towards the same goal, which may not be explicitly communicated to the robot before the task is initiated. For human-robot collaboration to become as efficient as human-human collaboration, a robot must be able to perform both the active and passive parts of the interaction, just as a human would. Building a system with these capabilities requires research beyond the state of the art in areas such as object handling and manipulation, programming by demonstration, natural and embodied interaction, control, and perception.

For more information about the project, look at the FACT webpage

For thesis project suggestions, please look at this page

We believe that an MSc project has to be tailored to each student, and we therefore do not list specific thesis projects. We want to define them together with you, based on what you know, what you and we are interested in, and what fits in the context of the project. A handful of PhD students are involved in the project, and we envision thesis projects connected to their topics. To find out more and to be directed to the doctoral students involved, contact Patric Jensfelt (patric@kth.se) or Joakim Gustafson (jocke@speech.kth.se).