Query Optimization Overview

In this class we'll start a week's focus on query optimization. There are two readings:

- Surajit Chaudhuri: An Overview of Query Optimization in Relational Systems. PODS 1998: 34-43. We're reading this paper because it gives a good sample of the research on query optimization as of 1998.
- P. Selinger, M. Astrahan, D. Chamberlin, R. Lorie, T. Price. Access Path Selection in a Relational Database Management System. SIGMOD 1979. We're reading this paper because it is the basis of almost all modern query optimizers and is a good introduction to how to think about them.

To complete this reading successfully, write a good response, and be prepared for a good in-class discussion, I suggest the following strategy:

Read the overview first. It will give you an idea of what to focus on when reading Selinger et al., especially since it has a section on that paper. For the overview:
- For those of you who don't know what a group-by query is: it is a query that divides tuples into groups and applies aggregate operators to each group. For example, you might look through a relation about sailors and their ages and count the number of sailors of each age.
- A "block" in a query corresponds roughly to one SELECT-FROM-WHERE block. A "nested query" is one that consists of more than one block.
- For a basic definition of semi-join, check out the Wikipedia article. It's worth noting that the term is also used with a slightly different meaning in distributed databases, and that is the definition given in some books. You don't really need to know what it means, but now you have a reference if you're curious.
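If it helps to see the group-by and semi-join ideas concretely, here is a minimal Python sketch. The relations, names, and ages are made up for illustration and do not come from either paper; the semi-join shown is the basic relational one (keep each left tuple that has at least one match on the right), not the distributed-database variant.

```python
from collections import Counter

# Hypothetical Sailors relation as (name, age) tuples; the data is invented.
sailors = [
    ("Dustin", 45), ("Lubber", 55), ("Rusty", 35),
    ("Horatio", 35), ("Zorba", 45), ("Andy", 35),
]

# Hypothetical Reserves relation as (sailor_name, boat_id) tuples.
reserves = [("Dustin", 101), ("Rusty", 102), ("Dustin", 103)]


def count_sailors_by_age(rows):
    """Roughly: SELECT age, COUNT(*) FROM Sailors GROUP BY age."""
    # Group tuples by age, then apply the COUNT aggregate to each group.
    return dict(Counter(age for _name, age in rows))


def semi_join(left_rows, right_names):
    """Keep each left tuple that matches at least one right tuple.

    Only left-side columns appear in the result, and a left tuple is
    returned once no matter how many right tuples it matches.
    """
    names = set(right_names)
    return [row for row in left_rows if row[0] in names]


age_counts = count_sailors_by_age(sailors)
reserving_sailors = semi_join(sailors, (name for name, _boat in reserves))

print(age_counts)
print(reserving_sailors)
```

Note that Dustin has two reservations but appears only once in the semi-join result, which is the main difference from an ordinary join.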
For Selinger et al.:
- Don't worry about the details of the cost formulas; just get the main idea.
- Don't worry about the details of the join algorithms; we'll study those a bit more when we study query execution methods. The important thing to know is that there are several ways of performing a join, and that some of them leave tuples sorted, and others leave the tuples unsorted.
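To make the sorted versus unsorted point concrete, here is a small Python sketch of two join methods in the spirit of the nested-loops and merging-scans joins the paper discusses. It is a simplified in-memory illustration under my own assumptions (tuples of the form (key, value), made-up data), not the paper's implementation.

```python
def nested_loop_join(outer, inner):
    """For each outer tuple, scan the inner relation for matching keys.

    The output follows the order of the outer relation, so it is not
    sorted on the join key unless the outer input already was.
    """
    return [
        (k_out, a, b)
        for (k_out, a) in outer
        for (k_in, b) in inner
        if k_out == k_in
    ]


def sort_merge_join(left, right):
    """Sort both inputs on the join key, then merge them in one pass.

    The output comes out sorted on the join key.
    """
    left = sorted(left, key=lambda t: t[0])
    right = sorted(right, key=lambda t: t[0])
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i][0] < right[j][0]:
            i += 1
        elif left[i][0] > right[j][0]:
            j += 1
        else:
            key = left[i][0]
            # Find the run of right tuples that share this key.
            j_end = j
            while j_end < len(right) and right[j_end][0] == key:
                j_end += 1
            # Pair every left tuple with this key against that run.
            while i < len(left) and left[i][0] == key:
                for jj in range(j, j_end):
                    out.append((key, left[i][1], right[jj][1]))
                i += 1
            j = j_end
    return out


# Hypothetical data: (sailor_id, name) and (sailor_id, boat).
sailors = [(3, "Rusty"), (1, "Dustin"), (2, "Lubber")]
reserves = [(2, "boat102"), (3, "boat101"), (1, "boat103"), (3, "boat104")]

nl_result = nested_loop_join(sailors, reserves)
sm_result = sort_merge_join(sailors, reserves)

print([row[0] for row in nl_result])  # join keys in outer-relation order
print([row[0] for row in sm_result])  # join keys in sorted order
```

Both methods produce the same set of joined tuples; only the order differs. That order matters to an optimizer because a sorted result can make a later sort unnecessary, for example before a merge join on the same key or an ORDER BY.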

Rachel Pottinger
E-mail Address: rap [at] cs [dot] ubc [dot] ca

Office Location: CICSR 345
Phone: (604)822-0436
Fax: (604)822-5485
Postal/Courier address:
The Department of Computer Science
University of British Columbia
201-2366 Main Mall
Vancouver, B.C. V6T 1Z4
Canada