Course Unit Profile


Basic Information

Course Unit Title: INFORMATION RETRIEVAL

Course Unit Code: 289AA

Level of course unit

Second Cycle, Degree Programme in Computer Science

Year of study

Second year

Semester when the course is delivered

First semester

Number of ECTS credits allocated: 6

Name of Lecturer(s):

Prof.: Paolo Ferragina
Email: paolo.ferragina@unipi.it

Language of instruction

English

General Information

Learning outcomes

The student who successfully completes the course will be able to design a simple search engine or one of the many text mining tools that are at the core of modern Web applications.


Course contents

Study, design and analysis of IR systems that efficiently and effectively process, mine, search, cluster and classify documents drawn from text, HTML or XML data collections. In particular, we will:

- describe the main components of a modern search engine: Crawler, Parser, Compressor, Indexer, Query resolver, Results Ranker, Results Classifier/Clusterer;

- present and use in the Lab some interesting open-source tools for IR applications, such as Lucene and WebGraph;

- introduce some basic algorithmic techniques which are now ubiquitous in any IR application for data classification, compression, clustering, projection, and sketching.
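As a rough illustration of how a few of the components listed above fit together, here is a minimal sketch in Python. It is not course material: the function names (`tokenize`, `build_index`, `search`), the toy documents, and the choice of a plain TF-IDF score are all assumptions made for the example, and the crawling, compression, and classification/clustering stages are left out.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Parser step (hypothetical): lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Indexer step: map each term to a postings dict {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, tf in Counter(tokenize(text)).items():
            index[term][doc_id] = tf
    return index


def search(index, n_docs, query):
    """Query resolver and ranker: sum tf * idf over the query terms."""
    scores = defaultdict(float)
    for term in tokenize(query):
        postings = index.get(term, {})
        if not postings:
            continue  # term does not occur in the collection
        idf = math.log(n_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    # Highest score first; ties are broken by document id.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy collection, invented for the example.
docs = {
    1: "Lucene is an open-source search library",
    2: "Compression makes the index smaller",
    3: "A crawler fetches pages for the search index",
}
index = build_index(docs)
results = search(index, len(docs), "search index")
```

Here `results` lists document 3 first, since it is the only one containing both query terms. A production engine would typically store compressed postings lists and use a more refined ranking function than this raw TF-IDF sum.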


Specific Information

Prerequisites, co-requisites, as a prerequisite for further study

Prerequisites

For this course the prerequisite/s is/are

Co-requisites

None.

Prerequisite for

None.

Mode of delivery

Delivery

Face to face

Attendance

Advised

Teaching methods

Learning activities

Recommended or required reading

C.D. Manning, P. Raghavan, H. Schütze. Introduction to Information Retrieval. Cambridge University Press, 2008.

I.H. Witten, A. Moffat, T.C. Bell. Managing Gigabytes, Chapter 2 “Text compression”. Morgan Kaufmann, second edition, 1999.


Assessment methods and criteria

Assessment methods

Further information

Assessment criteria

The student will be assessed on their demonstrated ability to discuss the main course contents using the appropriate terminology.

Work placement

No
