FDT3303 Critical Perspectives on Data Science and Machine Learning 7.5 credits

This course prepares students for critical reflection upon developments in the disciplines of data science and machine learning, within both the commercial and academic spheres. The course can be taken by PhD students with sufficient experience in statistics, data science, and/or machine learning and artificial intelligence.

Upon successful completion of this course, the student will be able to:

• describe and explain problems and pitfalls when interpreting standard experiments performed in these disciplines
• interpret existing work based on fundamental principles (e.g., no free lunch, bias-variance tradeoff, information theory)
• identify weaknesses and limitations of an existing work, and assess the claims made from the evidence presented
• analyse the reproducibility and replicability of an existing work, and propose improvements
• think broadly about the ethical implications of specific applications of machine learning and data science

The course is built around presentations of a series of articles (new and “classic”) that reflect upon research in data science and machine learning and in related disciplines, e.g., applied statistics. Each student will select and present a paper, and help lead the discussion of its topic.

Information per course offering

Course offerings are missing for current or upcoming semesters.

Course syllabus as PDF

Please note: all information from the Course syllabus is available on this page in an accessible format.

Course syllabus FDT3303 (Autumn 2019–)

Information for research students about course offerings

October – December 2019

Headings with content from the Course syllabus FDT3303 (Autumn 2019–) are denoted with an asterisk (*)

Content and learning outcomes

Course contents

The course is built around presentations of a series of “classic” articles that critically reflect upon work in data science and machine learning and in related disciplines, e.g., applied statistics.

Intended learning outcomes

Upon successful completion of this course, the student will be able to:

• describe and explain problems and pitfalls when interpreting standard experiments performed in these disciplines

• interpret existing work based on fundamental principles (e.g., no free lunch, bias-variance tradeoff, information theory)

• identify weaknesses and limitations of an existing work, and assess the claims made from the evidence presented

• analyse the reproducibility and replicability of an existing work

• assess the ethical implications of an existing work

• propose improvements to an existing work

Literature and preparations

Specific prerequisites

The course can be taken by PhD students with sufficient experience in statistics, data science, and/or machine learning and artificial intelligence.

Recommended prerequisites

The course can be taken by PhD students with sufficient experience in statistics, data science, and/or machine learning and artificial intelligence.

Equipment

None

Literature

Examples of literature for review include:

• W. Kimball, “Errors of the third kind in statistical consulting,” J. American Statistical Assoc., vol. 52, pp. 133–142, June 1957.

• D. J. Hand, “Deconstructing statistical questions,” J. Royal Statist. Soc. A (Statistics in Society), vol. 157, no. 3, pp. 317–356, 1994.

• D. J. Hand, “Classifier technology and the illusion of progress,” Statistical Science, vol. 21, no. 1, pp. 1–15, 2006.

• K. L. Wagstaff, “Machine learning that matters,” in Proc. Int. Conf. Machine Learning, pp. 529–536, 2012.

• C. Drummond and N. Japkowicz, “Warning: Statistical benchmarking is addictive. Kicking the habit in machine learning,” J. Experimental Theoretical Artificial Intell., vol. 22, pp. 67–80, 2010.

• P. Langley, “Advice to authors of machine learning papers,” Machine Learning, vol. 5, pp. 233–237, 1990.

• R. Holte, “Very simple classification rules perform well on most commonly used datasets,” Machine Learning, vol. 11, pp. 63–91, 1993.

• E. Keogh and J. Lin, “Clustering of time series subsequences is meaningless: Implications for past and future research,” in Knowledge and Information Systems, Springer-Verlag, 2004.

• E. R. Dougherty and L. A. Dalton, “Scientific knowledge is possible with small-sample classification,” EURASIP J. Bioinformatics and Systems Biology, vol. 2013:10, 2013.

• J. Bryson and A. Winfield, “Standardizing ethical design for artificial intelligence and autonomous systems,” Computer, vol. 50, pp. 116–119, May 2017.

• A.-L. Boulesteix, “Ten simple rules for reducing overoptimistic reporting in methodological computational research,” PLoS Comput Biol, vol. 11, p. e1004191, Apr. 2015.

Examination and completion

If the course is discontinued, students may request to be examined during the following two academic years.

Grading scale

P, F

Examination

EXA1 - Examination, 7.5 credits, grading scale: P, F

Based on a recommendation from KTH’s coordinator for disabilities, the examiner will decide how to adapt an examination for students with a documented disability.

The examiner may apply another examination format when re-examining individual students.

The examination includes a short research project, which must be documented in a written report and an oral presentation.

Other requirements for final grade

20-minute oral presentation during one seminar

80% of seminar preparations (homework)

Approved project report

Opportunity to complete the requirements via supplementary examination

No information inserted

Opportunity to raise an approved grade via renewed examination

No information inserted

Examiner

Bobby Lee Townsend Sturm JR

Ethical approach

All members of a group are responsible for the group's work.
In any assessment, every student shall honestly disclose any help received and sources used.
In an oral assessment, every student shall be able to present and answer questions about the entire assignment and solution.

Further information

Course room in Canvas

Registered students can find further information about the implementation of the course in the course room in Canvas. A link to the course room can be found under the tab Studies in the Personal menu at the start of the course.

Offered by

EECS/Speech, Music and Hearing

Main field of study

This course does not belong to any Main field of study.

Education cycle

Third cycle

Add-on studies

No information inserted

Contact

Bob Sturm (bobs@kth.se)

