40324 Modern Information Retrieval

Course Description

Information retrieval is the process through which a computer system responds to a user's query for text-based information on a specific topic. Information retrieval was one of the first problems in the domain of natural language processing and remains one of the most important. Web search is the application of information retrieval techniques to the largest corpus of text anywhere, and it is the area in which most people interact with information retrieval systems most frequently. In this course, we will cover basic and advanced techniques for building text-based information systems, including efficient text indexing; Boolean and vector-space retrieval models; evaluation and interface issues; information retrieval techniques for the web, including crawling, link-based algorithms, and metadata usage; document clustering and classification; traditional and machine learning-based ranking approaches; question answering systems; and recommender systems.

Course Information

Required Texts

  1. [MRS] Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schütze, Introduction to Information Retrieval, Cambridge University Press, 2008.

  2. [HNG] Hang Li, Learning to Rank for Information Retrieval and Natural Language Processing, Morgan & Claypool, 2011.

  3. [MC] Bhaskar Mitra and Nick Craswell, An Introduction to Neural Information Retrieval, Foundations and Trends in Information Retrieval, Vol. 13, No. 1, pp. 1-126, 2018.

Grading Policy

  1. 25%: Mid-term exam (1404/01/30).

  2. 30%: Final exam

  3. 35%: Homework.

  4. 15%: Quizzes.

Lecture Schedule


Lecture  Lecture Date  Topics                                          Related Readings and Links  Homework & Assignments  Quizzes
1        1403-11-20    Introduction                                    Chapter 1 of MRS
2        1403-11-27    Boolean information retrieval                   Chapters 1 & 2 of MRS
                       and document preprocessing
3        1403-11-29    Dictionaries and tolerant retrieval             Chapter 3 of MRS
4        1403-12-04    Index Construction                              Chapter 4 of MRS
5        1403-12-06    Index compression                               Chapter 5 of MRS
6        1403-12-11    Vector space model                              Chapter 6 of MRS
7        1403-12-13    Scores in a complete search system              Chapter 7 of MRS
8        1403-12-18    Evaluation in information retrieval             Chapter 8 of MRS
9-10     1403-12-20    Relevance feedback and query expansion          Chapter 9 of MRS
         1403-12-25
11       1404-01-16    Probabilistic Information Retrieval             Chapter 11 of MRS
12       1404-01-18    Language Models for Information Retrieval       Chapter 12 of MRS
13-14    1404-01-23    Probabilistic text classification               Chapters 13-15 of MRS
         1404-01-25    Vector space text classification
15       1404-01-30    Mid-term exam
16       1404-02-01    Vector space text classification                Chapters 13-15 of MRS
17-19    1404-02-06    Text clustering                                 Chapters 16 & 17 of MRS
         1404-02-08
         1404-02-13
20       1404-02-15    Dimensionality reduction and feature selection  Chapter 13 of MRS
21       1404-02-20    Latent Semantic Indexing                        Chapter 18 of MRS
22-23    1404-02-22    Web crawling and search                         Chapters 19 & 20 of MRS
         1404-02-27
24       1404-02-29    Link Analysis                                   Chapter 21 of MRS
25-26    1404-03-03    Neural information retrieval                    MC
         1404-03-05
27       1404-03-10    Retrieval-Augmented Generation
28       1404-03-12    Some IR applications
         1404-03-28    Final exam (at 9:00)