198:541 Database Systems

Thursdays 3:20-6:20pm, CoRE A

Instructor: Amélie Marian
Office hours: Mondays 3-4pm, CoRE 324

TA: Minji Wu (minji-wu@cs)
Office hours: Wednesdays 1-3pm, Hill 402


Announcements

2/1: Class Sakai worksite is set up. Announcements, lecture notes and homework will now be posted through Sakai.
1/28: Class mailing list is set up. If you haven't received an email, please contact Minji.


Course Description

Data management systems theory and implementation. We will focus on the design of relational database systems, including query processing issues, as well as on advanced data management topics from recent database research: advanced query processing, information retrieval, XML, and data mining.


Textbook

Raghu Ramakrishnan, Johannes Gehrke: Database Management Systems, 3rd edition, McGraw-Hill, 2002.
Plus additional readings from recent database research papers.


Grading

15% Homework
30% Project
25% Midterm Exam
30% Final Exam


Schedule (tentative)

Date

Lectures and Readings (Chapter numbers refer to the class textbook.)

January 24

Introduction to DBMS, ER Model. (Chapters 1 and 2.)

January 31

Relational Model and Algebra. (Chapters 3 and 4.)

February 7

Relational Calculus and SQL. (Chapters 4 and 5.)

February 14

Storage and Indexing. (Chapters 8 to 11.)

February 21

External Sorting. (Chapter 13.)
Query Evaluation. (Chapter 14.)

February 28

Query Optimization. (Chapter 15.)

March 6

Top-k Query Processing.

Readings and Project Resources (available on Sakai):
Evaluating Top-k Queries over Web-Accessible Databases, Bruno, Gravano, Marian, ICDE, 2002
Shooting Stars in the Sky: An Online Algorithm for Skyline Queries, Kossmann, Ramsak, Rost, VLDB, 2002

March 13

Midterm Exam (in class)

March 20

Spring break

March 27

Blog Analysis (Guest Speaker: Smriti Bhagat)

Readings:
Applying Link-based Classification to Label Blogs. S. Bhagat, G. Cormode, I. Rozenbaum. WebKDD/SNA-KDD, Aug 2007.
No Blog is an Island - Analyzing Connections Across Information Networks. S. Bhagat, G. Cormode, S. Muthukrishnan, I. Rozenbaum, H. Xue. International Conference on Weblogs and Social Media (ICWSM), Mar 2007.

April 3

Multidimensional Indexing. (Chapter 28.)
Boolean Information Retrieval.

Readings and Project Resources (available on Sakai):
Multidimensional Access Methods, Gaede, Günther, ACM Computing Surveys, 1998
Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze, Introduction to Information Retrieval, Cambridge University Press. 2007. Available electronically at http://www-csli.stanford.edu/~schuetze/information-retrieval-book.html. (Chapter 1 "Information Retrieval Using the Boolean Model").

Justin Zobel, Alistair Moffat, and Kotagiri Ramamohanarao: Inverted Files Versus Signature Files for Text Indexing, in ACM Transactions on Database Systems, Vol. 23, No. 4 (1998), Pages 453-490.

April 10

Information Retrieval: Scoring and Ranking.

Readings:
Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze, Introduction to Information Retrieval, Cambridge University Press. 2007. Available electronically at http://www-csli.stanford.edu/~schuetze/information-retrieval-book.html. (Chapter 6 "Scoring & term weighting", Chapter 7 "Vector space retrieval", Chapter 8 "Evaluation in information retrieval").

April 17

Web Search.

Readings:
The Anatomy of a Large-Scale Hypertextual Web Search Engine, Brin, Page, 1998
Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze, Introduction to Information Retrieval, Cambridge University Press. 2007. Available electronically at http://www-csli.stanford.edu/~schuetze/information-retrieval-book.html. (Chapter 21 "Link Analysis").

April 24

XML and Web Data.

Readings:
Textbook (Chapter 27.6 to 27.8)
Holistic twig joins: Optimal XML pattern matching, Bruno, Koudas, Srivastava, SIGMOD 2002

May 1

Project Presentations

May 8

Final Exam (in class)