ID2214 Programming for Data Science 7.5 credits

Programmering för data science

The course covers the following topics:

  • Syntax and semantics for programming languages that are particularly suited for data science, e.g. Python, Julia.
  • Routines for importing, combining, transforming and selecting data.
  • Algorithms for handling missing values, discretisation and dimensionality reduction.
  • Algorithms for supervised machine learning, e.g. naïve Bayes, decision trees, random forests.
  • Algorithms for unsupervised machine learning, e.g. k-means clustering.
  • Libraries for data analysis.
  • Evaluation methods and performance metrics.
  • Visualising and analysing results.
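
As a rough illustration of two of the preprocessing topics listed above (handling missing values and discretisation), the following is a minimal sketch in plain Python. It is not taken from the course material: the function names `impute_mean` and `equal_width_bins` and the sample data are hypothetical, and mean imputation and equal-width binning are just one simple choice among many techniques.

```python
from statistics import mean


def impute_mean(values):
    """Replace missing entries (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute: no observed values")
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def equal_width_bins(values, n_bins):
    """Map each value to a bin index in 0..n_bins-1 using equal-width intervals."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so everything falls into a single bin.
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it to the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Hypothetical data: ages with two missing entries.
ages = [22, None, 35, 58, None, 41]
filled = impute_mean(ages)          # missing ages replaced by the observed mean
bins = equal_width_bins(filled, 3)  # three equal-width age groups
print(filled)
print(bins)
```

In practice, a data analysis library would normally be used for these steps; the sketch only shows the underlying idea.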

Course information

Content and learning outcomes

Course contents *

Syntax and semantics of programming languages that are particularly suited to data science, e.g. Python.

Routines for importing, combining, converting and selecting data.

Algorithms for handling missing values, discretisation and dimensionality reduction.

Algorithms for supervised machine learning, e.g. naïve Bayes, decision trees, random forests.

Algorithms for unsupervised machine learning, e.g. k-means clustering.

Libraries for data analysis.

Evaluation methods and performance measures.

Visualisation and analysis of data analysis results.
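
To make the item on performance measures more concrete, here is a minimal sketch, assuming a binary classification task, of how accuracy, precision and recall can be computed from true and predicted labels. The function name `binary_metrics` and the label vectors are hypothetical and not part of the syllabus; the course may cover other measures as well.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision and recall for binary labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    return {
        "accuracy": correct / len(pairs),
        # Convention assumed here: report 0.0 when the denominator is zero.
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Hypothetical true labels and classifier predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
scores = binary_metrics(y_true, y_pred)
print(scores)
```

Which measure is most relevant depends on the task, for example on how costly false positives are compared with false negatives.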

Intended learning outcomes *

Having passed the course, the student should be able to:

  • account for and discuss the application of i) techniques for converting data to an appropriate format for data analysis, ii) algorithms for analysing data through supervised and unsupervised machine learning, and iii) techniques and performance measures for evaluating data analysis results
  • implement and apply i) techniques for converting data to an appropriate format for data analysis, ii) algorithms for supervised and unsupervised machine learning, and iii) techniques and performance measures for evaluating data analysis results.

Literature and preparations

Specific prerequisites *

Admitted to the Master's (120 credits) programme at KTH in the main field of study.

Literature

I. Witten, E. Frank, M. Hall and C. Pal, Data Mining: Practical Machine Learning Tools and Techniques (4th ed.), Morgan Kaufmann, 2016. ISBN: 9780128042915.

J. VanderPlas, Python Data Science Handbook: Essential Tools for Working with Data (1st ed.), O’Reilly Media Inc., 2016. ISBN: 9781491912058.

Examination and completion

Grading scale *

A, B, C, D, E, FX, F

Examination *

  • INL1 - Assignment, 4.5 credits, Grading scale: A, B, C, D, E, FX, F
  • TEN1 - Examination, 3.0 credits, Grading scale: A, B, C, D, E, FX, F

Based on a recommendation from KTH’s coordinator for disabilities, the examiner will decide how to adapt an examination for students with a documented disability.

The examiner may apply another examination format when re-examining individual students.

Written examination. Written assignments.

In agreement with KTH’s coordinator for disabilities, the examiner decides how to adapt an examination for students who hold a valid medical certificate. The examiner may permit other examination forms when re-examining a small number of students.

Examiner

Henrik Boström

Further information

Course web

Further information about the course can be found on the Course web at the link below. Information on the Course web will later be moved to this site.

Course web ID2214

Offered by

EECS/Computer Science

Main field of study *

Computer Science and Engineering

Education cycle *

Second cycle

Contact

Henrik Boström, bostromh@kth.se

Ethical approach *

  • All members of a group are responsible for the group's work.
  • In any assessment, every student shall honestly disclose any help received and sources used.
  • In an oral assessment, every student shall be able to present and answer questions about the entire assignment and solution.

Supplementary information

In this course, the EECS code of honor applies, see: http://www.kth.se/en/eecs/utbildning/hederskodex.