Databases II

Topics: redundancy, operations and queries on more than one table, views, "physical" and "logical" database.

A. More Data

Extend the pet shop database from Databases I to show which department each pet is in and who manages each department:

Pet Table — Additional Data. petID is still the primary key.
petID name price dept deptMgr
P1 (no change) CanineFred
P2 Feline Maxine
P3 Rodent Harry
P4 Rodent Harry
P5 Canine Fred
P6 Canine Fred
P7 Feline Maxine
P8 Avian Harry

B. Review

  1. We must define the columns in the table definition, then enter the data.
  2. Write an SQL query to direct a customer to departments with small pets: which departments have pets weighing less than 7 pounds?

C. Redundancy

What happens if …

  1. Harry quits, Fred becomes Rodent Department manager, and Maxine becomes Avian Department manager?
  2. We change the name of the "Canine" Department to "K-9"?

How do we feel about these changes? What problems might result?

The cause of the problems is redundancy, the storage of the same information in multiple places. (Address/telephone changes)

D. Eliminating Redundancy

The cure is to divide our single table database into multiple tables (at least two, possibly three). We then will have a new problem: our query must draw data from 2 or 3 tables. Before changing the pet database, let's see how we could operate on two tables to combine their information.

E. Operations on Two Tables

The product (×) and natural join ("bowtie" shape) operations.

Consider two tables, A and B:

Table B: size
size
L
S
Table A: color
color
yellow
blue
red

What are all the combinations of rows of A and B? The × operation gives us the answer: A × B =

A × B
color size
yellow L
yellow S
… (4 more rows)

But, frequently, × gives us more information than we want. Consider tables C and D:

Table D: Shirt Size
ID size
shirt1 L
dress2 S
Table C: Shirt Color
ID color
shirt1 yellow
dress2 blue
socks3 red

Here is the product:

C × D
C.ID color D.ID size
shirt1 yellow shirt1 L
shirt1 yellow dress2 S
… (4 more rows)

(Note that since we have ID columns from both tables, we rename them as C.ID and D.ID to avoid ambiguity.)

If we want to know what color and size each item is, we need to pick the rows where C.ID = D.ID:

select(C.ID = D.ID) from (C × D)
C.ID color D.ID size
shirt1 yellow shirt1 L
dress2 blue dress2 S

This combination of operations is so important that relational databases give us a new operation, called the natural join (bowtie symbol), that does this — and a little more: it keeps only one of the identical columns C.ID and D.ID and calls it just "ID":

C natural join D
ID color size
shirt1 yellow L
dress2 blue S

F. SQL Queries Involving Product and Natural Join

A × B is written in SQL as select * from A, B;

A natural join B is written in SQL as select * from A natural join B;

G. Back at the Pet Shop

We now split our data into two tables: the pet table and the department table.

The dept table (primary key: deptID)
deptID deptName deptMgr
DCAN Canine Fred
DFEL Feline Maxine
DROD Rodent Harry
DAVI Avian Harry
The pet table (primary key: petID)
petID name price deptID
P1 (no change) DCAN
P2 DFEL
P3 DROD
P4 DROD
P5 DCAN
P6 DCAN
P7 DFEL
P8 DAVI

Now it's easy to make those managerial changes….

H. Product and Join Operations in the Pet Shop Database

Before going back to our original query, let's try just the × and natural join operations in SQL on our pet shop database:

The × query:

select * from pet, dept;

The natural join query:

select * from pet natural join dept;

Some database systems, including Microsoft Access (and Openoffice.org) do not implement the standard SQL operation natural join, so you must work around the omission. Here is one way:

select petID, name, breed, sex, weight, price, 
       dept.deptID as deptID, deptName, deptMgr
  from pet, dept
  where pet.deptID = dept.deptID;

Snyder covers another work-around, using inner join … on …, which may be less easy to understand.

I. Views

Now that's a mouthful of typing, so to avoid complexity in our queries we can set up the natural join of our two tables as a view. Define the view "petDeptView" based on the query above.

A view is a "virtual relation" which is not physically stored as a table; instead it is stored as a query, which is recomputed whenever we need the view.

J. Finally

Back to our query, which departments have pets weighing 7 pounds or less? To answer that, We can use natural join —

select deptName
  from pet natural join dept
  where weight < 7;

or the petDeptView to answer:

select deptName
  from petDeptView
  where weight < 7;

K. Levels of Abstraction

Snyder distinguishes between the physical database — the collection of all it tables that are stored as data — and the logical database — which includes the views as well as the physical tables.