Literature and examination

Literature

The lecture plan (see VT 2016 ir16, Schedule and course plan in the menu to the left) lists the chapters and articles that should be read before each lecture.

Textbook

C. D. Manning, P. Raghavan and H. Schütze, Introduction to Information Retrieval, Cambridge University Press, 2008.

The book can be bought at Kårbokhandeln or ordered from your favorite internet bookstore (ISBN-10 0521865719 / ISBN-13 9780521865715). Virtually all material from the book, including material from the Stanford and Coursera courses by Manning, is also available online at nlp.stanford.edu/IR-book/information-retrieval-book.html.

Articles

K. Avrachenkov, N. Litvak, D. Nemirovsky and N. Osipova, Monte Carlo methods in PageRank computation: When one iteration is sufficient, SIAM Journal on Numerical Analysis 45(2), 2007.
S. E. Robertson and K. Spärck Jones, Simple, Proven Approaches to Text Retrieval, 1994, www.cl.cam.ac.uk/techreports/UCAM-CL-TR-356.pdf.
M. Sahlgren, An Introduction to Random Indexing, 2005, www.sics.se/~mange/papers/RI_intro.pdf.

Other Resources

To get an idea of the state of the art in Information Retrieval research and development, take a look at the program of the annual ACM SIGIR conference.

Different benchmark datasets for evaluation of information retrieval systems can be found at:

Examination

Assignments

The course is examined through:

Three computer assignments (6 credits). The computer assignments are performed individually, and presented orally at the computer. Grade: A-F (fail).
A project assignment (3 credits). The projects are performed in groups of 4-5 students, and presented with a short written report as well as an oral poster presentation. Grade (normally the same for all group members): A-F (fail).

Details about the assignments themselves can be found under Computer Assignments and Project in the menu.

Grading

The course grade is the weighted average of the computer assignment grade and the project grade, according to the following:

Computer Assignment \ Project	A	B	C	D	E
A	A	A	B	B	B
B	B	B	B	C	C
C	B	C	C	C	D
D	C	C	D	D	D
E	D	D	D	E	E

How much of the "Monte Carlo methods in PageRank computation: When one iteration is sufficient" paper should we understand? I have not had much statistics training, but I understand that we want to iterate until we are under some error threshold. Of course, we also need to be confident that we really are under this threshold, so we consider a confidence interval when concluding that the error is acceptable.

The main point seems to be right in the title: you can get acceptable results from a single iteration, without the computational expense of iterating many times. Are we supposed to conclude that a single iteration is both acceptable and computationally much more tractable, and then use the paper to implement our own Monte Carlo methods?

Cheers,

Nick

You should understand the nature of the approximation and be able to explain the Monte Carlo principle. You should also understand the differences between the five methods and be able to reason a bit about what differences in convergence behavior you would expect.
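
To make the Monte Carlo principle concrete, here is a minimal sketch in Python of one possible variant: random walks that start at a uniformly chosen page, with each page's PageRank estimated as the fraction of walks that end there. This is an illustration under stated assumptions, not the paper's reference implementation or the assignment solution. The function name, the dictionary-based graph format, the damping value 0.85, and the choice to jump to a random page from a dangling node are all assumptions made here.

```python
import random


def mc_pagerank_endpoint(graph, num_walks, damping=0.85, seed=0):
    """Estimate PageRank from the end points of simulated random walks.

    graph: dict mapping every node to a list of its out-links
           (assumed format; every node must appear as a key).
    """
    rng = random.Random(seed)
    nodes = list(graph)
    end_counts = {node: 0 for node in nodes}

    for _ in range(num_walks):
        # Each walk starts at a page chosen uniformly at random.
        current = rng.choice(nodes)
        # With probability `damping` the walk continues, otherwise it stops.
        while rng.random() < damping:
            out_links = graph[current]
            if out_links:
                current = rng.choice(out_links)
            else:
                # Assumed treatment of dangling nodes: jump to a random page.
                current = rng.choice(nodes)
        end_counts[current] += 1

    # The fraction of walks ending at a page is its PageRank estimate.
    return {node: count / num_walks for node, count in end_counts.items()}


if __name__ == "__main__":
    toy_graph = {"a": ["b"], "b": ["a"], "c": ["a"]}
    for page, score in sorted(mc_pagerank_endpoint(toy_graph, 20000).items()):
        print(page, round(score, 3))
```

The estimate is noisy because it is an average over random samples, and its error shrinks as the number of walks grows. Comparing how quickly different variants reduce that noise is the kind of reasoning about convergence asked for above.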