Version created by Sten Ternström, 2021-10-07 13:29


Internal Projects

Internal Master Degree Projects at Speech, Music and Hearing (TMH)

The Division of Speech, Music and Hearing hosts a number of Master's students each year, who carry out research-related projects over six months. The students get office space at the department. Below, we list topics for open project positions, both at KTH and at partner organisations with which we collaborate. If you do not find a project that suits you, contact your favourite TMH faculty member to tailor one. We welcome students' own ideas for their Bachelor's and Master's thesis projects, especially within the department's research areas.

Conversational Systems and Human-Robot Interaction

In Gabriel Skantze's research group, we have a number of projects related to Conversational Systems and Human-Robot Interaction where we welcome MSc Thesis students:

  • Modelling turn-taking in conversational systems
  • Social robots as hosts on self-driving buses
  • Social robots for language learning
  • Visual grounding of language in dialogue
  • Social robots as virtual patients
  • Understanding engagement in robot-human presentations

For more information about these projects, see this link

Contact:
Gabriel Skantze (skantze@kth.se)

-------------------------------

Voice Science and Technical Vocology

Measuring, simulating or synthesizing the human voice has many applications in medicine, pedagogy, media and the arts. Many problems in this area remain unsolved, and the voice is something close to us all.

Prerequisites:

- Some knowledge of analog and digital audio
- Good programming skills

More info at this link.

-------------------------------------------------------

Social Robotics

Several proposals are currently open; see Ronald Cumbal's page.

--------------------------------------------------------

Generative Machine Learning

Several projects are available to strong students interested in generative deep learning, especially with applications to audio, 3D animation, images, and VR. Please see Gustav Eje Henter's thesis project suggestions for more information (requires being logged in).

--------------------------------------------------------

Human-Robot Interaction

Topic:
In contingent interpersonal interactions, we create a neural sense of grounding when the quality, intensity, and timing of others' signals clearly reflect the signals that we have sent. In HRI, we operationalise contingency as a correlation between robot behaviour and changes in its environment.
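The correlation view of contingency can be made concrete with a minimal sketch. Everything here is an illustrative assumption: the signal names, the fixed response lag, and the toy data standing in for real robot and sensor recordings.

```python
import numpy as np

def contingency(robot_signal, env_signal, lag):
    """Pearson correlation between a robot behaviour signal and an
    environment signal shifted by `lag` samples (hypothetical measure)."""
    r = np.asarray(robot_signal, dtype=float)
    e = np.asarray(env_signal, dtype=float)
    if lag > 0:
        r, e = r[:-lag], e[lag:]
    return np.corrcoef(r, e)[0, 1]

# Toy example: the environment echoes the robot's behaviour 2 samples later.
rng = np.random.default_rng(0)
robot = rng.normal(size=200)
env = np.concatenate([np.zeros(2), robot[:-2]]) + 0.1 * rng.normal(size=200)

print(contingency(robot, env, lag=2))  # close to 1.0: highly contingent
print(contingency(robot, env, lag=0))  # near 0: no contingency at this lag
```

In a real study the lag would itself be estimated, e.g. by scanning lags and taking the maximum correlation.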

Given a set of social actions, it is important for a robot to know what is appropriate to do while in dialogue with humans. In this Master's thesis project, you will investigate quantitative and qualitative indicators to assess human reactions in human-robot dialogue. You will design the interaction and a task-oriented dialogue, and explore objective and subjective measures from human users. Further, you will experiment with sensor data and build a machine learning classifier to interpret which features from human users contribute to the understanding of robot actions.

You will experiment with open-source platforms such as OpenFace and OpenSmile and one of our robotic platforms (Furhat or Nao) to build an application that combines multimodal signals and generates appropriate robot responses.
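The classification step could look roughly like the sketch below. The feature matrix, labels, and class meanings are all placeholder assumptions: in the actual project the features would come from OpenFace/OpenSmile output on recorded interactions, not from random data.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Stand-in for per-frame multimodal feature vectors (e.g. gaze, action
# units, prosody). Real data would be extracted from user recordings.
rng = np.random.default_rng(42)
n_frames, n_features = 500, 20
X = rng.normal(size=(n_frames, n_features))
# Hypothetical labels: 1 = user understood the robot action, 0 = did not.
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
print(f"held-out accuracy: {clf.score(X_te, y_te):.2f}")

# Feature importances hint at which user signals matter most.
top = np.argsort(clf.feature_importances_)[::-1][:3]
print("most informative features:", top)
```

Inspecting feature importances (or a comparable attribution method) is one way to answer the project's question of which user features signal understanding.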

Required skills:
- Knowledge in human-computer interaction
- Good programming skills in Python
- Knowledge in Machine Learning is a plus (equivalent to KTH machine learning course)

Contact:
Dimos Kontogiorgos (diko@kth.se) or Joakim Gustafson (jocke@speech.kth.se).

-------------------------------------------------------

Learning muscle activation-acoustic map using a (deep) neural network

Biomechanical models of the human speech production apparatus have recently been developed (www.artisynth.org). The purpose of such a model is to study speech production and to understand the relation between muscle activation patterns, articulation and acoustics. Exploring this relation exhaustively would require a very large number of simulations. An alternative is to choose a limited number of patterns from the muscle activation space, run the simulations, and save the articulatory and acoustic output. A neural network (NN) is then trained on the relation between the two spaces, and its generalization capability is used to predict the articulatory or acoustic output for any muscle activation pattern. In this thesis, the biomechanical model will be used to generate training and test data sets for training and evaluating the NN. The results will be analyzed to explore how speech production is planned and what the limitations of this method are. The results of this study could be published in a conference or journal.
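The sample-simulate-learn loop described above can be sketched as follows. The "simulator" here is a made-up smooth function standing in for an ArtiSynth simulation, and all dimensions and names are illustrative assumptions.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(1)

def make_simulator(n_act=10, n_out=3, seed=0):
    """Placeholder for a biomechanical simulation: maps a 10-dim muscle
    activation pattern to 3 'acoustic' outputs via a fixed smooth map."""
    W = np.random.default_rng(seed).normal(size=(n_act, n_out))
    return lambda A: np.tanh(A @ W)

simulate = make_simulator()

# Step 1: sample a limited number of activation patterns and simulate.
A_train = rng.uniform(0.0, 1.0, size=(2000, 10))
Y_train = simulate(A_train)

# Step 2: train an NN to learn the activation-to-acoustics map.
net = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000,
                   random_state=0).fit(A_train, Y_train)

# Step 3: use the NN in place of the (expensive) simulator.
A_test = rng.uniform(0.0, 1.0, size=(200, 10))
err = np.abs(net.predict(A_test) - simulate(A_test)).mean()
print(f"mean absolute prediction error on unseen activations: {err:.3f}")
```

Once trained, the network predicts acoustics in microseconds, which is what makes large-scale exploration of the activation space feasible.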

Requirements of applicant: Knowledge in neural networks and speech technology, MATLAB, and Java programming
Suitable as: Master Project

Supervisor and contact: Olov Engwall

-----------------------------

From vocal tract resonance frequencies to vocal tract area function

The human speech production apparatus is a very complex system that has been studied by researchers from many fields, yet many questions remain unanswered. One aspect of speech is acoustics, which studies wave propagation in the vocal tract. The vocal tract's cross-sectional area along its length (the area function) is analyzed to calculate the resonance frequencies. In some applications, we need to solve the inverse problem: estimating the area function that yields desired resonance frequencies. Based on Fant's perturbation theory, desired formants can be reached with an iterative method. Alternatively, one could generate samples of area functions, calculate the corresponding formants, and use a machine learning method (e.g. a neural network) to learn the relationship between area function and formants. The generalization capability of the algorithm may then be used to predict the area function for unseen formant patterns. The results of this study could be published in a conference or journal.
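The forward direction (area function to resonances) has a closed form in the simplest case, which is a useful sanity check for any learned inverse. A uniform tube, closed at the glottis and open at the lips, resonates at odd quarter-wavelength frequencies, F_n = (2n - 1)c / (4L):

```python
def uniform_tube_formants(length_m, n=3, c=350.0):
    """First n resonance frequencies (Hz) of a uniform tube closed at
    one end; c is the speed of sound in warm, moist air (m/s)."""
    return [(2 * k - 1) * c / (4 * length_m) for k in range(1, n + 1)]

# A 17.5 cm vocal tract (typical adult male) gives the classic
# schwa-like formants.
print([round(f) for f in uniform_tube_formants(0.175)])  # [500, 1500, 2500]
```

For non-uniform area functions there is no such closed form, which is exactly why perturbation theory or a learned inverse mapping is needed.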

Requirements of applicant: Knowledge in speech technology and machine learning, MATLAB, and Java programming
Suitable as: Master Project

Supervisor and contact: Olov Engwall

-----------------------

Robotics: Factories of the future (FACT)

The project "Factories of the Future: Human Robot Cooperative Systems", or FACT for short, is a five-year endeavour in which the divisions of Robotics, Perception and Learning and of Speech, Music and Hearing collaborate to develop methods that allow humans and robots to share the same workspace and perform object manipulation tasks jointly. One of the main enabling technologies is a framework that lets the robot cooperate smoothly with the human towards a shared goal, which may not be explicitly communicated to the robot before the task is initiated. For human-robot collaboration to become as efficient as human-human collaboration, a robot must be able to perform both the active and passive parts of the interaction, just as a human would. Building a system with these capabilities requires research beyond the state of the art in object handling and manipulation, programming by demonstration, natural and embodied interaction, control, perception, and more.

For more information about the project, look at the FACT webpage

For thesis project suggestions, please look at this page

We believe that the MSc project has to be tailored for each student and therefore do not list specific thesis projects here. We want to define them together with you, based on what you know, what you and we are interested in, and what fits in the context of the project. A handful of PhD students are involved in the project, and we envision thesis projects connected to their topics. To find out more, and to be directed to the doctoral students involved, contact Patric Jensfelt (patric@kth.se) or Joakim Gustafson (jocke@speech.kth.se).