Now we want to look at joins.
Doing joins correctly in SQL
requires many of the elements we
have introduced so far. Let's
assume that we have the following
two tables,
Table Store_Information
| store_name | Sales | Date |
| Los Angeles | $1500 | Jan-05-1999 |
| San Diego | $250 | Jan-07-1999 |
| Los Angeles | $300 | Jan-08-1999 |
| Boston | $700 | Jan-08-1999 |
Table Geography
| region_name | store_name |
| East | Boston |
| East | New York |
| West | Los Angeles |
| West | San Diego |
and we want to find out sales
by region. We see that table
Geography includes information
on regions and stores, and table
Store_Information contains sales
information for each store.
To get the sales information
by region, we have to combine
the information from the two
tables. Examining the two tables,
we find that they are linked
via the common field, "store_name".
We will first present the SQL
statement and explain the use
of each segment later:
SELECT A1.region_name REGION, SUM(A2.Sales) SALES
FROM Geography A1, Store_Information A2
WHERE A1.store_name = A2.store_name
GROUP BY A1.region_name
Result:
REGION SALES
East $700
West $2050
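To try the query yourself, the script below is a minimal sketch that builds both tables and runs the join. It makes a few assumptions that are not part of the tables above: Sales is stored as a plain integer dollar amount rather than text with a "$" sign (so that SUM can add it up), the Date column is given the hypothetical name Txn_Date because DATE is a reserved word in some SQL dialects, and dates are written in ISO format. The column types and the multi-row INSERT syntax may need small adjustments for your database.

```sql
-- Assumptions: Sales is an integer dollar amount; Txn_Date is a
-- hypothetical name standing in for the "Date" column.
CREATE TABLE Store_Information (
    store_name VARCHAR(50),
    Sales      INTEGER,
    Txn_Date   DATE
);

CREATE TABLE Geography (
    region_name VARCHAR(20),
    store_name  VARCHAR(50)
);

INSERT INTO Store_Information (store_name, Sales, Txn_Date) VALUES
    ('Los Angeles', 1500, '1999-01-05'),
    ('San Diego',    250, '1999-01-07'),
    ('Los Angeles',  300, '1999-01-08'),
    ('Boston',       700, '1999-01-08');

INSERT INTO Geography (region_name, store_name) VALUES
    ('East', 'Boston'),
    ('East', 'New York'),
    ('West', 'Los Angeles'),
    ('West', 'San Diego');

-- The join from the text: match rows on store_name, then total by region.
SELECT A1.region_name REGION, SUM(A2.Sales) SALES
FROM Geography A1, Store_Information A2
WHERE A1.store_name = A2.store_name
GROUP BY A1.region_name;
```

With this data the query returns 700 for East and 2050 for West (1500 + 250 + 300); the row order may vary because there is no ORDER BY. New York adds nothing to East because it has no matching row in Store_Information.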
The first two lines tell SQL to
select two fields and where to find
them. The first field is
"region_name" from table Geography
(aliased as REGION), and the second
is the sum of the field "Sales" from
table Store_Information (aliased
as SALES). Notice how the table
aliases are used here: Geography
is aliased as A1, and
Store_Information is aliased as
A2. Without the aliasing, the
first line would become
SELECT Geography.region_name REGION, SUM(Store_Information.Sales) SALES
which is much more cumbersome. In
essence, table aliases make the
entire SQL statement easier to
understand, especially when
multiple tables are included.
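For comparison, here is a sketch of the complete statement written without table aliases. Every place that used A1 or A2 has to spell out the full table name, including the WHERE and GROUP BY lines. The setup statements are assumptions added only so the example runs on its own: Sales is stored as an integer, and Txn_Date is a hypothetical name standing in for the Date column.

```sql
-- Assumed setup (not from the text): integer Sales, Date renamed Txn_Date.
CREATE TABLE Store_Information (store_name VARCHAR(50), Sales INTEGER, Txn_Date DATE);
CREATE TABLE Geography (region_name VARCHAR(20), store_name VARCHAR(50));

INSERT INTO Store_Information (store_name, Sales, Txn_Date) VALUES
    ('Los Angeles', 1500, '1999-01-05'),
    ('San Diego',    250, '1999-01-07'),
    ('Los Angeles',  300, '1999-01-08'),
    ('Boston',       700, '1999-01-08');

INSERT INTO Geography (region_name, store_name) VALUES
    ('East', 'Boston'),
    ('East', 'New York'),
    ('West', 'Los Angeles'),
    ('West', 'San Diego');

-- Same join, with full table names in place of the aliases A1 and A2.
SELECT Geography.region_name REGION, SUM(Store_Information.Sales) SALES
FROM Geography, Store_Information
WHERE Geography.store_name = Store_Information.store_name
GROUP BY Geography.region_name;
```

Both versions return the same rows. The aliases only shorten what you have to type and read.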
Next, we turn our attention to
line 3, the WHERE clause.
This is where the condition of the
join is specified. In this case,
we want to make sure that the
content in "store_name" in table
Geography matches that in table
Store_Information, and the way to
do it is to set them equal. This
WHERE clause is essential in making
sure you get the correct output.
Without the correct WHERE clause,
a Cartesian join will result: the
query returns every possible
combination of rows from the tables
listed in the FROM clause (two in
this case, but the same holds for
any number of tables). Here, a
Cartesian join would return a total
of 4 x 4 = 16 rows, since each
table has 4 rows.
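The effect of a missing join condition is easy to check. The sketch below rebuilds the two tables under the same assumptions as before (integer Sales, and the hypothetical column name Txn_Date in place of Date), then runs the query with no WHERE clause, first counting the rows and then grouping them by region.

```sql
-- Assumed setup (not from the text): integer Sales, Date renamed Txn_Date.
CREATE TABLE Store_Information (store_name VARCHAR(50), Sales INTEGER, Txn_Date DATE);
CREATE TABLE Geography (region_name VARCHAR(20), store_name VARCHAR(50));

INSERT INTO Store_Information (store_name, Sales, Txn_Date) VALUES
    ('Los Angeles', 1500, '1999-01-05'),
    ('San Diego',    250, '1999-01-07'),
    ('Los Angeles',  300, '1999-01-08'),
    ('Boston',       700, '1999-01-08');

INSERT INTO Geography (region_name, store_name) VALUES
    ('East', 'Boston'),
    ('East', 'New York'),
    ('West', 'Los Angeles'),
    ('West', 'San Diego');

-- No WHERE clause: every Geography row pairs with every
-- Store_Information row, so this counts 4 x 4 = 16 rows.
SELECT COUNT(*) ROW_COUNT
FROM Geography A1, Store_Information A2;

-- The same mistake in the grouped query: each region's two stores are
-- paired with all four sales rows, so both regions report 5500.
SELECT A1.region_name REGION, SUM(A2.Sales) SALES
FROM Geography A1, Store_Information A2
GROUP BY A1.region_name;
```

The grouped query still runs without error, which is what makes a Cartesian join dangerous: the totals (5500 for each region, twice the overall sales of 2750) look plausible but are wrong.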