Programming for Data Science

The course "Programming for Data Science" is given both at the master level (2nd cycle) with the course code ID2214, and at the PhD level (3rd cycle) with the course code FID3214. The lectures and seminars are joint, while the examination differs (see below).

Intended learning outcomes

Having passed the course, the student should be able to:

  • account for and discuss the application of
    i) techniques to convert data to an appropriate format for data analysis
    ii) algorithms to analyse data through supervised and unsupervised machine learning
    iii) techniques and performance measurements for evaluation of data analysis results

  • implement and apply
    i) techniques to convert data to an appropriate format for data analysis
    ii) algorithms for supervised and unsupervised machine learning
    iii) techniques and performance measurements for evaluation of data analysis results.

Course main content

  • Syntax and semantics for programming languages that are particularly suited for data science, e.g., Python
  • Routines to import, combine, convert and select data
  • Algorithms for handling missing values, discretisation and dimensionality reduction
  • Algorithms for supervised machine learning, e.g., naïve Bayes, decision trees, random forests
  • Algorithms for unsupervised machine learning, e.g., k-means clustering
  • Libraries for data analysis
  • Evaluation methods and performance measures
  • Visualisation and analysis of results of data analysis

Literature

I. Witten, E. Frank, M. Hall and C. Pal, Data Mining: Practical Machine Learning Tools and Techniques (4th ed.), Morgan Kaufmann, 2016. ISBN: 9780128042915.

J. VanderPlas, Python Data Science Handbook: Essential Tools for Working with Data (1st ed.), O’Reilly Media Inc., 2016. ISBN: 9781491912058. Available online for free here.

Examination

Links below point to Canvas

  • INL1 - Assignment, 4.5 ECTS, grading scale: A-F
  • TEN1 - Examination, 3.0 ECTS, grading scale: A-F
    • See example examination here and a solution here
    • See examination from Jan. 7, 2019 here and a solution here
      (note that the requirements to pass have since been revised; see the example examination)
    • See examination from April 17, 2019 here and a solution here
      (note that the requirements to pass have since been revised; see the example examination)
  • Course grading:
    • For ID2214, the course grading scale is A-F, and the course grade is an equally weighted average of the grades for INL1 and TEN1 (rounded upwards), where at least an E is required on each part.
    • For FID3214, the course grading scale is Pass/Fail, and the course grade is Pass if the grade on INL1 is at least E and the grade on TEN1 is at least C.
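
The grading rules above can be illustrated with a minimal Python sketch. It is not an official calculator: the numeric mapping A=5 down to E=1 and the function names id2214_grade and fid3214_grade are assumptions made for illustration, since the course page does not say how letter grades are averaged.

```python
import math

# Assumed numeric mapping (not stated on the course page): A=5 ... E=1.
# Any grade outside this mapping (e.g. F) is treated as a fail.
GRADE_POINTS = {"A": 5, "B": 4, "C": 3, "D": 2, "E": 1}
POINTS_GRADE = {points: grade for grade, points in GRADE_POINTS.items()}


def id2214_grade(inl1: str, ten1: str) -> str:
    """Equally weighted average of INL1 and TEN1, rounded upwards.

    At least an E is required on each part; otherwise the result is F.
    """
    if inl1 not in GRADE_POINTS or ten1 not in GRADE_POINTS:
        return "F"
    mean = (GRADE_POINTS[inl1] + GRADE_POINTS[ten1]) / 2
    return POINTS_GRADE[math.ceil(mean)]


def fid3214_grade(inl1: str, ten1: str) -> str:
    """Pass if INL1 is at least E and TEN1 is at least C, else Fail."""
    inl1_ok = GRADE_POINTS.get(inl1, 0) >= GRADE_POINTS["E"]
    ten1_ok = GRADE_POINTS.get(ten1, 0) >= GRADE_POINTS["C"]
    return "Pass" if inl1_ok and ten1_ok else "Fail"
```

Under this assumed mapping, a B on INL1 and an A on TEN1 average to 4.5, which rounds upwards to an A for ID2214, while an A on INL1 with a D on TEN1 gives Fail for FID3214.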

Teachers