DD2475 Information Retrieval 9.0 credits

Informationssökning

Please note

This course has been cancelled.

A course in computer science focusing on basic theory, models, and methods for information retrieval.

  • Education cycle: Second cycle
  • Main field of study: Computer Science and Engineering
  • Grading scale: A, B, C, D, E, FX, F

Last planned examination: autumn 2013.

At present this course is not scheduled to be offered.

Intended learning outcomes

After completing the course you will be able to:

  • Explain the concepts of indexing, vocabulary, normalization and dictionary in Information Retrieval
  • Define a Boolean model and a vector space model, and explain the differences between them
  • Explain the differences between classification and clustering
  • Discuss the differences between different classification and clustering methods
  • Choose a suitable classification or clustering method depending on the problem constraints at hand
  • Implement classification in a Boolean model and a vector space model
  • Implement a basic clustering method
  • Give an account of a basic spectral method
  • Evaluate information retrieval algorithms, and give an account of the difficulties of evaluation
  • Explain the basics of XML and Web search

Course main content

Basic and advanced techniques for information systems: information extraction; efficient text indexing; indexing of non-text data; Boolean and vector space retrieval models; evaluation and interface issues; XML, structure of Web search engines; clustering, classification; spectral methods, random indexing; data mining.

Eligibility

Single course students: 90 university credits including 45 university credits in Mathematics or Information Technology. English B, or equivalent.

Recommended prerequisites

Mathematics corresponding to at least 30 credits, including courses in Linear Algebra, Calculus in One and Several Variables, and Mathematical Statistics, as well as Computer Science corresponding to at least 15 credits. It is also beneficial to have taken courses in Machine Learning, Artificial Intelligence, Language Engineering and/or Database Technology.

Literature

C. D. Manning, P. Raghavan and H. Schütze: Introduction to Information Retrieval, Cambridge University Press, 2008.

Examination

  • LAB1 - Laboratory Work, 3.0, grading scale: P, F
  • LAB2 - Project, 3.0, grading scale: A, B, C, D, E, FX, F
  • TEN1 - Exam, 3.0, grading scale: A, B, C, D, E, FX, F

In this course, all the regulations of the code of honor at the School of Computer Science and Communication apply; see: http://www.kth.se/csc/student/hederskodex/1.17237?l=en_UK

Requirements for final grade

Students are expected to take part in all course activities, with particular emphasis on the exercises and laboratories. In addition, the course focuses on training:

  • independently acquiring knowledge
  • oral and written presentation

The examination consists of one written exam (TEN1; 3.0 credits), laboratory assignments (LAB1; 3.0 credits), and a project assignment assessed orally and in writing (LAB2; 3.0 credits).

Offered by

CSC/Computer Science

Examiner

Hedvig Kjellström <hedvig@kth.se>

Supplementary information

This course has been replaced by DD2476 Search Engines and Information Retrieval Systems as of the academic year 2011/12.

Version

Course syllabus valid from: Autumn 2010.
Examination information valid from: Autumn 2010.