ID2214 Programming for Data Science 7.5 credits

Programmering för data science

Please note

The information on this page is based on a course syllabus that is not yet valid.

The course covers the following topics:

  • Syntax and semantics for programming languages that are particularly suited for data science, e.g. Python, Julia.
  • Routines for importing, combining, transforming and selecting data.
  • Algorithms for handling missing values, discretisation and dimensionality reduction.
  • Algorithms for supervised machine learning, e.g. naïve Bayes, decision trees, random forests.
  • Algorithms for unsupervised machine learning, e.g. k-means clustering.
  • Libraries for data analysis.
  • Evaluation methods and performance metrics.
  • Visualising and analysing results.
  • Education cycle

    Second cycle
  • Main field of study

    Computer Science and Engineering
  • Grading scale

    A, B, C, D, E, FX, F

Course offerings

Autumn 19 for programme students

Autumn 18 for programme students

Autumn 18 Doktorand for single-course students

  • Periods

    Autumn 18 P2 (7.5 credits)

  • Application code

    10172

  • Start date

    29/10/2018

  • End date

    14/01/2019

  • Language of instruction

    English

  • Campus

    Campus Kista

  • Tutoring time

    Daytime

  • Form of study

    Normal

  • Number of places *

    Max. 1

    *) If there are more applicants than places, a selection will be made.

  • Course responsible

    Henrik Boström <bostromh@kth.se>

  • Teacher

    Henrik Boström <bostromh@kth.se>

  • Target group

    For doctoral students at KTH

Intended learning outcomes

Having passed the course, the student should be able to:

  • account for and discuss the application of i) techniques for converting data to an appropriate format for data analysis, ii) algorithms for analysing data through supervised and unsupervised machine learning, and iii) techniques and performance measures for evaluating data analysis results;
  • implement and apply i) techniques for converting data to an appropriate format for data analysis, ii) algorithms for supervised and unsupervised machine learning, and iii) techniques and performance measures for evaluating data analysis results.

Course main content

Syntax and semantics for programming languages that are particularly suited for data science, e.g. Python.

Routines for importing, combining, transforming and selecting data.

Algorithms for handling missing values, discretisation and dimensionality reduction.

Algorithms for supervised machine learning, e.g. naïve Bayes, decision trees, random forests.

Algorithms for unsupervised machine learning, e.g. k-means clustering.

Libraries for data analysis.

Evaluation methods and performance measures.

Visualisation and analysis of data analysis results.
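
As a rough illustration of the kind of preprocessing listed above (handling missing values and discretisation), the following is a minimal sketch in plain Python. It is not taken from the course material: the function names `impute_mean` and `equal_width_bins`, the choice of mean imputation and equal-width binning, and the toy `ages` data are all assumptions made for this example.

```python
from statistics import mean


def impute_mean(values):
    """Replace None entries with the mean of the observed (non-None) values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def equal_width_bins(values, n_bins):
    """Map each value to a bin index in 0..n_bins-1 using equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        # All values are identical, so everything falls into a single bin.
        return [0 for _ in values]
    # The maximum value would otherwise land in bin n_bins, so clamp it
    # into the last valid bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Hypothetical toy data: None marks a missing value.
ages = [22, None, 35, 58, None, 41]
filled = impute_mean(ages)
bins = equal_width_bins(filled, n_bins=3)
print(filled)
print(bins)
```

In practice, a data analysis library would normally provide these operations; the sketch only makes the two steps explicit.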

Disposition

Eligibility

Admitted to the Master's (120 credits) programme at KTH in the main field of study.

Literature

I. Witten, E. Frank, M. Hall and C. Pal, Data Mining: Practical Machine Learning Tools and Techniques (4th ed.), Morgan Kaufmann, 2016, ISBN: 9780128042915.

J. VanderPlas, Python Data Science Handbook: Essential Tools for Working with Data (1st ed.), O’Reilly Media Inc., 2016, ISBN: 9781491912058.

Required equipment

Examination

  • INL1 - Assignment, 4.5 credits, grading scale: A, B, C, D, E, FX, F
  • TEN1 - Examination, 3.0 credits, grading scale: A, B, C, D, E, FX, F

Written examination. Written assignments.

In agreement with KTH's coordinator for disabilities, it is the examiner who decides to adapt an examination for students in possession of a valid medical certificate. The examiner may permit other examination forms at the re-examination of a few students.

Requirements for final grade

Offered by

EECS/Computer Science

Contact

Henrik Boström, bostromh@kth.se

Examiner

Henrik Boström <bostromh@kth.se>

Version

Course syllabus valid from: Autumn 2019.
Examination information valid from: Spring 2019.