ID2222 Data Mining 7.5 credits
Datautvinning
Education cycle
Second cycle
Main field of study
Computer Science and Engineering
Grading scale
A, B, C, D, E, FX, F
Course offerings
Autumn 19 for programme students
Periods
Autumn 19 P2 (7.5 credits)
Application code
50243
Start date
28/10/2019
End date
14/01/2020
Language of instruction
English
Campus
Campus Kista
Tutoring time
Daytime
Form of study
Normal
Number of places *
Min. 25
*) The course offering may be cancelled if the number of admitted students is less than the minimum number of places.
Course responsible
Vladimir Vlassov <vladv@kth.se>
Teacher
Sarunas Girdzijauskas <sarunasg@kth.se>
Vladimir Vlassov <vladv@kth.se>
Target group
Open to all programmes.
Part of programme
- Master's Programme, ICT Innovation, 120 credits, year 2, DAMO, Optional
- Master's Programme, ICT Innovation, 120 credits, year 2, DASC, Mandatory
- Master's Programme, ICT Innovation, 120 credits, year 2, DASE, Mandatory
- Master's Programme, Machine Learning, 120 credits, year 1, Conditionally Elective
- Master's Programme, Machine Learning, 120 credits, year 2, Conditionally Elective
- Master's Programme, Software Engineering of Distributed Systems, 120 credits, year 2, Optional
- Master's Programme, Software Engineering of Distributed Systems, 120 credits, year 2, DASC, Conditionally Elective
- Master's Programme, Software Engineering of Distributed Systems, 120 credits, year 2, PVT, Optional
Autumn 18 for programme students
Periods
Autumn 18 P2 (7.5 credits)
Application code
50603
Start date
29/10/2018
End date
14/01/2019
Language of instruction
English
Campus
KTH Kista
Tutoring time
Daytime
Form of study
Normal
Number of places *
Min. 25
*) The course offering may be cancelled if the number of admitted students is less than the minimum number of places.
Schedule
Planned timeslots
P2: B1, D2, H2
Course responsible
Vladimir Vlassov <vladv@kth.se>
Teacher
Sarunas Girdzijauskas <sarunasg@kth.se>
Vladimir Vlassov <vladv@kth.se>
Target group
Open to all programmes.
Part of programme
- Master's Programme, ICT Innovation, 120 credits, year 2, DAMO, Optional
- Master's Programme, ICT Innovation, 120 credits, year 2, DASC, Mandatory
- Master's Programme, Machine Learning, 120 credits, year 1, Conditionally Elective
- Master's Programme, Machine Learning, 120 credits, year 2, Conditionally Elective
- Master's Programme, Medical Engineering, 120 credits, year 1, Conditionally Elective
- Master's Programme, Medical Engineering, 120 credits, year 2, Conditionally Elective
- Master's Programme, Software Engineering of Distributed Systems, 120 credits, year 2, Optional
- Master's Programme, Software Engineering of Distributed Systems, 120 credits, year 2, DASC, Conditionally Elective
- Master's Programme, Software Engineering of Distributed Systems, 120 credits, year 2, PVT, Optional
Intended learning outcomes
The course covers the fundamentals of data mining, data stream processing, and machine learning algorithms for analyzing very large amounts of data. We will use big data processing platforms, such as MapReduce, Spark, and Apache Flink, to implement parallel algorithms, as well as computation systems for data stream processing, such as Storm and InfoSphere.
After this course, students will be able to mine different types of data, e.g., high-dimensional data, graph data, and infinite (never-ending) data streams, and to program and build data-mining applications. They are also expected to know how to solve problems that arise in real-world applications, e.g., recommender systems, association rules, link analysis, and duplicate detection. Moreover, they will master various mathematical techniques, e.g., linear algebra, optimization, and dynamic programming.
Course main content
- Introduction to Data Mining
- Frequent Itemsets
- Finding Similar Items
- Clustering
- Recommendation Systems
- Mining Data Streams
- Dimensionality Reduction
- Large-Scale Machine Learning
Eligibility
Literature
The contents of the course are derived from the following two textbooks:
A. Rajaraman and J. D. Ullman, Mining of Massive Datasets. Cambridge University Press, 2012 (alternative: J. Han, M. Kamber, J. Pei, Data Mining: Concepts and Techniques, 3rd ed., Morgan Kaufmann, 2012)
Examination
- LAB1 - Programming Assignments, 3.0, grading scale: P, F
- TEN1 - Examination, 4.5, grading scale: A, B, C, D, E, FX, F
Written examination. Laboratory tasks.
Offered by
EECS/Computer Science
Examiner
Vladimir Vlassov <vladv@kth.se>
Version
Course syllabus valid from: Spring 2019.
Examination information valid from: Spring 2019.