DT2118 Speech and Speaker Recognition 7.5 credits

Course offerings are missing for current or upcoming semesters.
Headings with content from the Course syllabus DT2118 (Spring 2016–) are denoted with an asterisk (*)

Content and learning outcomes

Course contents

The course consists of lectures, three practical laboratory exercises with assignments, and a written essay on a chosen topic. The essay will be presented orally during a closing seminar.

Included topics:

  • algorithms for training, recognition and adaptation to speaker and transmission channel, mainly based on Hidden Markov Models (HMMs) and Deep Neural Networks (DNNs)
  • methods for reducing the sensitivity to external noise and distortion
  • probability theory
  • signal processing and feature extraction
  • acoustic modelling of static and time-varying spectral properties of speech
  • statistical modelling of language in spontaneous speech and written text
  • search strategies – basic methods and algorithms for large vocabularies
  • specific analysis and decision techniques for speaker recognition.

The laboratory exercises are intended to give practical experience of designing different aspects of a speech recognition application. They consist of implementing functions from given prototypes and testing those functions with real speech data.

Intended learning outcomes

After the course, the student will be able to:

  • use the described methods to recognise speech and speaker
  • design a system for a given application
  • adapt and modify existing algorithms for speech and speaker recognition
  • evaluate the performance of speech and speaker recognition systems
  • pursue research in the domain.

Literature and preparations

Specific prerequisites

Single course students: 90 university credits, including 45 university credits in Mathematics or Information Technology. English B, or equivalent.

Recommended prerequisites

Some knowledge of machine learning, for example from DD2431, DD2434 or EN2202

Some programming knowledge, preferably in Python

Some knowledge of signal processing

Equipment

No information inserted

Literature

  • Huang, X., Acero, A., Hon, H.-W. Spoken Language Processing – A Guide to Theory, Algorithm and System Development, Prentice Hall, 2001.
  • Automatic Speech Recognition: A deep learning approach, Dong Yu and Li Deng, Springer 2015. You can download the PDF through KTH Library.
  • Reprints of articles on speaker recognition and other topics not covered in the book.

Examination and completion

If the course is discontinued, students may request to be examined during the following two academic years.

Grading scale

P, F

Examination

  • LABC - Computer lab, 4.5 credits, grading scale: P, F
  • PROC - Project, 3.0 credits, grading scale: P, F

Based on recommendation from KTH’s coordinator for disabilities, the examiner will decide how to adapt an examination for students with documented disability.

The examiner may apply another examination format when re-examining individual students.

In this course, all the regulations of the code of honor at the School of Computer Science and Communication apply, see: http://www.kth.se/csc/student/hederskodex/1.17237?l=en_UK.

Other requirements for final grade

Practical laboratory exercise
Computational exercises (assignment)
A written essay with oral presentation in the closing seminar
Reviewing the essays of two other course participants and acting as opponent to their presentations.

Opportunity to complete the requirements via supplementary examination

No information inserted

Opportunity to raise an approved grade via renewed examination

No information inserted

Examiner

Ethical approach

  • All members of a group are responsible for the group's work.
  • In any assessment, every student shall honestly disclose any help received and sources used.
  • In an oral assessment, every student shall be able to present and answer questions about the entire assignment and solution.

Further information

Course room in Canvas

Registered students find further information about the implementation of the course in the course room in Canvas. A link to the course room can be found under the tab Studies in the Personal menu at the start of the course.

Offered by

Main field of study

Computer Science and Engineering, Information Technology, Information and Communication Technology

Education cycle

Second cycle

Add-on studies

No information inserted

Contact

Giampiero Salvi, tel: 790 7894, e-mail: giampi@kth.se

Supplementary information

The course may be canceled or given in another form if there are too few regular registrations.