Thursday, May 5, 2011

SQL: Joining Tables


SQL: Joining Tables





Unit 1. SQL Overview



In this course you will be presented with introductory information that is helpful while you're learning SQL — IBM's Structured Query Language.
You briefly go over concepts about relational databases in this section. In the next section, you see the data used in this course. The final section in this unit covers general topics about SQL.

After completing this unit, you should be able to:
  • Understand relational concepts
  • Use course tables
  • Comprehend SQL concepts



Topic 1.1: Relational Concepts

*Storing Information in Tables
In a relational database model, the data is stored in one or more tables.
Each table consists of a specific number of columns and some number of unordered rows.
One row of a sample table named STUDENT is displayed above.

STUDENTNO  FIRSTNAME     MIDINIT  LASTNAME         MAJOR       SEX
—————————  ————————————  ———————  ———————————————  ——————————  ———
S10        GERALD        A        ATTRIX           MATH          M
S20        PHYLISS       A        LUTZ             PHYSICS       F
S30        DELBERT       A        SPRAY            MATH          M
S40        JAMES         E        SHUE                           M
S50        BEVERLY       A        BOGGS            HISTORY       F
S60        DAVE          I        HICKS            MATH          M
S70        JACKIE        I        RICTER           BUSINESS      F
S80        MARGE         E        GILHOULY                       F
S90        IRMA          A        SPRAY            HOME ECON     F
*Viewing a Complete Table
Above is the complete table. As you can see, the table consists of six columns: STUDENTNO, FIRSTNAME, MIDINIT, LASTNAME, MAJOR and SEX.
It also consists of nine rows, one for each student.
The next section goes over this table and the others used in this course.




*Columns and Rows
Columns of a table have the following characteristics:
  • Appear vertically when displayed on a report
  • Contain the same kind of data
  • Have names which are used as column titles when they are displayed on a report
Rows of a table have the following characteristics:
  • Appear horizontally when displayed on a report
  • Contain one or more columns of data
  • Unlike columns, they have no names


*How Large Can a Table Be?
The tables you work with in this course are relatively small for ease of use and reference. However, this does not mean you are limited to working with small tables.
A table can be virtually any number of rows with one or more columns.
There is a limitation, but it is unlikely to be reached under normal conditions.
A table with zero rows is called an empty table.




*Relational Databases
Relational databases are based on a model formulated by Dr. E. F. Codd of IBM in 1968. He used very formal terms to describe the concepts you just saw. Here are the equivalent terms he used:
Examine the following table
Formal Relational Term   Informal Equivalents
relation table
tuple   row or record
attribute   column or field

So you can see where the terms relational model and relational database come from. Tables were called relations.
In this course, you see the less formal terms — table, row and column.


Topic 1.2: Course Tables

*Tables
Throughout this course several tables are used for examples and exercises. This section describes each and shows the table entries.
Throughout this course, the tables may be modified slightly to illustrate a point. Also, due to the size of the screen, only portions are displayed at times.

ORG Table
The ORG table contains data that identifies each department within an organization. Each row describes one department. The columns are
DEPTNUMB—department number
DEPTNAME—department name
MANAGER—identification number
DIVISION—section identification
LOCATION—city
Examine the following table
DEPTNUMB DEPTNAME MANAGER DIVISION LOCATION
10 HEAD OFFICE 160 CORPORATE NEW YORK
15 NEW –GLAND 50 EASTERN BOSTON
20 MID ATLANTIC 10 EASTERN WASHINGTON
38 SOUTH ATLANTIC 30 EASTERN ATLANTA
42 GREAT LAKES 100 MIDWEST CHICAGO
51 PLAINS 140 MIDWEST DALLAS
66 PACIFIC 270 WESTERN SAN FRANCISCO
84 MOUNTAIN 290 WESTERN DENVER


APPLICANT Table
The APPLICANT table displays data on job applicants. Each row depicts one applicant. The columns are
TEMPID          temporary identification number
NAME            last name
ADDRESS         city, state
EDLEVEL         education level
COMMENTS        comments about applicant
Examine the following table
TEMPID NAME ADDRESS EDLEVEL COMMENTS        
400 FROMMHERZ SAN JOSE, CA 12 NO SALES EXPERIENCE
410 JACOBS POUGHKEEPSIE, NY 16 GOOD CANDIDATE FOR
420 MONTEZ DALLAS, TX 13 OFFER SALES POSITION
430 RICHOWSKI TUCSON, AZ 14 CAN'T START WORK UNT
440 REID ENDICOTT, NY 14 1 YEAR SALES EXPERIE
450 JEFFREYS PHILADELPHIA, PA 12 GOOD CLERICAL BACKGR
460 STANLEY CHICAGO, IL 11 WANTS PARTIME JOB
470 CASALS PALO ALTO, CA 14 EXPERIENCED SALESMA
480 LEEDS EAST FISHKILL, NY 12 NEEDS INTERVIEW WIT
490 GASPARD PARIS, TX 16 WORKED HERE FROM


STAFF Table
The STAFF table holds information related to staff members. Each row identifies one person on the staff. The columns are
ID            identification number                   
NAME          last name
DEPT          department number
JOB           job title/area
YEARS         years on staff
SALARY        salary amount
COMM          commission amount
Examine the following table
ID NAME DEPT JOB YEARS SALARY COMM
10 SANDERS   20 MGR    7 18357.50     -
20 PERNAL   20 SALES    8 18171.25 612.45
30 MARENGHI   38 MGR    5 17505.75     -
40 O'BRIEN   38 SALES    6 18006.00 846.55
50 HANES   15 MGR   10 20659.80     -
60 QUIGLEY   38 SALES    - 16808.30 650.2511
70 ROTHMAN   15 SALES    7 16502.83 52.00
80 JAMES   20 CLERK    - 13504.60 128.2013
90 KOONITZ   42 SALES    6 18001.75 86.70
100 PLOTZ        42 MGR      7   18352.80     -  
110 NGAN         15 CLERK    5   12508.20 206.60
120 NAUGHTON     38 CLERK    -   12954.75 180.00
130 YAMAGUCHI   42 CLERK    6   10505.90   75.60
140 FRAYE        51 MGR      6   21150.00     -  
150 WILLIAMS     51 SALES    6   19456.50 637.65
160 MOLINARE     10 MGR      7   22959.20     -  
170 KERMISCH     15 CLERK    4   12258.50 110.10
180 ABRAHAMS     38 CLERK    3   12009.75 236.50
190 SNEIDER      20 CLERK    8   14252.75 126.50
200 SCOUTTEN     42 CLERK    -   11508.60   84.20
210 LU           10 MGR     10   20010.00     -  
220 SMITH        51 SALES    7   17654.50 992.80
230 LUNDQUIST   51 CLERK    3   13369.80 189.65
240 DANIELS      10 MGR      5   19260.25     -  
250 WHEELER      51 CLERK    6   14460.00 513.30
260 JONES        10 MGR     12   21234.00     -  
270 LEA          66 MGR      9   18555.50     -  
280 WILSON       66 SALES    9   18674.50 811.50
290 QUILL        84 MGR     10   19818.00     -  
300 DAVIS        84 SALES    5   15454.50 806.10
310 GRAHAM       66 SALES   13   21000.00 200.30
320 GONZALES     66 SALES    4   16858.20 844.00
330 BURKE        66 CLERK    1   10988.00   55.50
340 EDWARDS      84 SALES    7   17844.00 1285.00
350 GAFNEY      84 CLERK    5   13030.50 188.00


PRESIDENT_TABLE
PRESIDENT_TABLE contains data about the presidents of the U.S. Each row describes one of the presidents. The columns are

PRES_NAME         last name, initial                      
BIRTH_YR          year born
YRS_SERVED        years served
DEATH_AGE         age at death
PARTY             political party
STATE_BORN        state where born
                                                                          
Examine the following table
PRES_NAME      BIRTH_YR YRS_SERVED DEATH_AGE PARTY STATE_BORN         
WASHINGTON G     1732     7         67     FEDERALIST VIRGINIA     
ADAMS J          1735     4         90     FEDERALIST MASSACHUSETTS
JEFFERSON T      1743     8         83     DEMO-REP    VIRGINIA     
MADISON J        1751     8         85     DEMO-REP    VIRGINIA     
MONROE J         1758     8         73     DEMO-REP    VIRGINIA     
ADAMS J Q        1767     4         80     DEMO-REP    MASSACHUSETTS
JACKSON A        1767     8         78     DEMOCRATIC SOUTH CAROLINA
VAN BUREN M      1782     4         79     DEMOCRATIC NEW YORK     
HARRISON W H     1773     0         68     WHIG        VIRGINIA     
TYLER J          1790     3         71     WHIG        VIRGINIA     
POLK J K         1795     4         53     DEMOCRATIC NORTH CAROLINA
TAYLOR Z         1784     1         65     WHIG        VIRGINIA     
FILLMORE M       1800     2         74     WHIG        NEW YORK     
PIERCE F         1801     4         64     DEMOCRATIC NEW HAMPSHIRE
BUCHANAN J       1791     4         77     DEMOCRATIC PENNSYLVANIA
LINCOLN A        1809     4         56     REPUBLICAN KENTUCKY     
JOHNSON A        1808     3         66     DEMOCRATIC NORTH CAROLINA
GRANT U S        1822     8         63     REPUBLICAN OHIO         
HAYES R B        1822     4         70     REPUBLICAN OHIO         
GARFIELD J A     1831     0         49     REPUBLICAN OHIO         
ARTHUR C A       1830     3         56     REPUBLICAN VERMONT      
CLEVELAND G      1837     8         71     DEMOCRATIC NEW JERSEY   
HARRISON B       1833     4         67     REPUBLICAN OHIO         
MCKINLEY W       1843     4         58     REPUBLICAN OHIO         
ROOSEVELT T      1858     7         60     REPUBLICAN NEW YORK     
TAFT W H         1857     4         72     REPUBLICAN OHIO         
WILSON W         1856     8         67     DEMOCRATIC VIRGINIA     
HARDING W G      1865     2         57     REPUBLICAN OHIO         
COOLIDGE C       1872     5         60     REPUBLICAN VERMONT      
HOOVER H C       1871     4         90     REPUBLICAN IOWA         
ROOSEVELT F D   1882    12         63     DEMOCRATIC NEW YORK     
TRUMAN H S       1884     7         88     DEMOCRATIC MISSOURI     
EISENHOWER D D   1890     8         79     REPUBLICAN TEXAS        
KENNEDY J F      1917     2         46     DEMOCRATIC MASSACHUSETTS
JOHNSON L B      1908     5         65     DEMOCRATIC TEXAS        
NIXON R M        1913     5         81      REPUBLICAN CALIFORNIA   
FORD G R         1913     2          -     REPUBLICAN NEBRASKA     
CARTER J E       1924     4          -     DEMOCRATIC GEORGIA      
REAGAN R         1911     8          -     REPUBLICAN ILLINOIS     
BUSH G H W       1924     4          -     REPUBLICAN MASSACHUSETTS
CLINTON W J     1946     8         -     DEMOCRATIC ARKANSAS     


PRES_MARRIAGE
The PRES_MARRIAGE table lists information about the marriage(s) of each president. Every row represents one marriage. The columns are

PRES_NAME          president's last name, initial             
SPOUSE_NAME        spouses last name, initial
PR_AGE             president's age at marriage
SP_AGE             spouse's age at marriage
NR_CHILDREN        number of children
MARRIAGE-YR        year when married
Examine the following table
PRES_NAME       SPOUSE_NAME     PR_AGE SP_AGE NR_CHILDREN MARRIAGE_YEAR
WASHINGTON G   CUSTIS M D     26   27      0        1759       
ADAMS J        SMITH A        28   19      5        1764       
JEFFERSON T    SKELTON M W   28   23      6        1772       
MADISON J      TODD D D P     43   26      0        1794       
MONROE J       KORTRIGHT E   27   17      3        1786       
ADAMS J Q      JOHNSON L C   30   22      4        1797       
JACKSON A      ROBARDS R D   26   26      0        1794       
VAN BUREN M    HOES H         24   23      4        1807       
HARRISON W H   SYMMOS A T     22   20     10        1795       
TYLER J        CHRISTIAN L   23   22      8        1813       
TYLER J        GARDINER J     54   24      7        1844       
POLK J K       CHILDROSS S   28   20      0        1824       
TAYLOR Z       SMITH M M      25   21      6        1810       
FILLMORE M     POWERS A       26   27      2        1826       
FILLMORE M     MCINTOSH C C   58   44      0        1858       
PIERCE F       APPLETON J M   29   28      3        1834       
LINCOLN A      TODD M         33   23      4        1842       
JOHNSON A      MCCARDLE E     18   16      5        1827       
GRANT U S      DENT J B       26   22      4        1848       
HAYES R B      WEBB L W       30   21      8        1852       
GARFIELD J A   RUDOLPH L      26   26      7        1858       
ARTHUR C A     HORNDON E L   29   22      3        1859       
CLEVELAND G    FOLSON F       49   21      5        1886       
HARRISON B     SCOTT C L      20   21      2        1853       
HARRISON B     DIMMICK M S L   62   37      1        1896       
MCKINLEY W     SAXTON I       27   23      2        1871       
ROOSEVELT T    LEE A H        22   19      1        1880       
ROOSEVELT T    CARROW E K     28   25      5        1886       
TAFT W H       HORRON H       28   25      3        1886       
WILSON W       AXSON E L      28   25      3        1885       
WILSON W       GALT E B       58   43      0        1915       
HARDING W G    DE WOLFE F K   25   30      0        1891       
COOLIDGE C     GOODHUE G A   33   26      2        1905       
HOOVER H C     HENRY L        24   23      2        1899       
ROOSEVELT F D ROOSEVELT A E   23   20      6        1905       
TRUMAN H S     WALLACE E V   35   34      1        1918       
EISENHOWER D D DOUD G         25   19      2        1916       
KENNEDY J F    BOUVIER J L   36   24      2        1953       
JOHNSON L B    TAYLOR C A     26   21      2        1934       
NIXON R M      RYAN T C       27   28      2        1940       
FORD G R       WARREN E B     35   30      4        1948       
CARTER J E     SMITH R        21   18      4        1946       
REAGAN R       WYMAN J        28   25      2        1940       
REAGAN R       DAVIS N        41   28      2        1952       
BUSH G H W     PIERCE B       20   19      6        1945       
CLINTON W J    RODHAM H D     28   27      1        1975       


PRES_HOBBY
The PRES_HOBBY table displays data about a president's hobbies. One row identifies one hobby. Multiple rows can occur for one president. The columns are

PRES_NAME        last name, initial
HOBBY            hobby
Examine the following table
PRES_NAME         HOBBY              
ADAMS J Q BILLARDS
ADAMS J Q SWIMMING
ADAMS J Q WALKING
ARTHUR C A FISHING
BUSH G W    BASEBALL
BUSH G W SKYDIVING
CLEVELAND G FISHING
CLINTON W J JOGGING
CLINTON W J SAXOPHONE
COOLIDGE C FISHING
COOLIDGE C GOLF
COOLIDGE C INDIAN CLUBS
COOLIDGE C MECHANICAL HORSE
COOLIDGE C PITCHING HAY
EISENHOWER D D BRIDGE
EISENHOWER D D GOLF
EISENHOWER D D HUNTING
EISENHOWER D D PAINTING
EISENHOWER D D FISHING
GARFIELD J A BILLIARDS
HARDING W G GOLF
HARDING W G POKER
HARDING W G RIDING
HARRISON B HUNTING
HAYES R B CROQUET
HAYES R B DRIVING
HAYES R B SHOOTING
HOOVER H C FISHING
HOOVER H C MEDICINE BALL
JACKSON A RIDING
JEFFERSON T FISHING
JEFFERSON T SAILING
JEFFERSON T RIDING
JOHNSON L B RIDING
KENNEDY J F SAILING
KENNEDY J F SWIMMING
KENNEDY J F TOUCH FOOTBALL
LINCOLN A WALKING
MCKINLEY W RIDING
MCKINLEY W SWIMMING
MCKINLEY W WALKING
NIXON R M GOLF
REAGAN R RIDING
ROOSEVELT F D FISHING
ROOSEVELT F D SAILING
ROOSEVELT F D SWIMMING
ROOSEVELT T BOXING
ROOSEVELT T HUNTING
ROOSEVELT T JUJITSU
ROOSEVELT T GOLF
ROOSEVELT T RIDING
ROOSEVELT T WRESTLING
TAFT W H GOLF
TAFT W H RIDING
TAYLOR Z RIDING
TRUMAN H S FISHING
TRUMAN H S POKER
TRUMAN H S WALKING
VAN BUREN M RIDING
WASHINGTON G FISHING
WASHINGTON G RIDING
WILSON W GOLF
WILSON W RIDING
WILSON W WALKING


STATE_TABLE
STATE_TABLE lists information about when each state entered the Union. Each row depicts one state. The columns are
STATE_NAME           full name
ADMIN_ENTERED        presidential administration
YEAR_ENTERED         year
Examine the following table
STATE_NAME         ADMIN_ENTERED YEAR_ENTERED
MASSACHUSETTS       -         1776    
PENNSYLVANIA         -         1776    
VIRGINIA             -         1776    
CONNECTICUT          -         1776    
SOUTH CAROLINA       -         1776    
MARYLAND             -         1776    
NEW JERSEY           -         1776    
GEORGIA              -         1776    
NEW HAMPSHIRE       -         1776    
DELAWARE             -         1776    
NEW YORK             -         1776    
NORTH CAROLINA       -         1776    
RHODE ISLAND         -         1776    
VERMONT              1         1791    
KENTUCKY             1         1792    
TENNESSEE            2         1796    
OHIO                 4         1803    
LOUISIANA            6         1812    
INDIANA              7         1816    
MISSISSIPPI          8         1817    
ILLINOIS             8         1818    
ALABAMA              8         1819    
MAINE                8         1820    
MISSOURI             9         1821    
ARKANSAS            12         1836    
MICHIGAN            12         1837    
FLORIDA             14         1845    
TEXAS               15         1845    
IOWA                15         1846    
WISCONSIN           15         1848    
CALIFORNIA          16         1850    
MINNESOTA           18         1858    
OREGON              18         1859    
KANSAS              18         1861    
WEST VIRGINIA      19         1863    
NEVADA              19         1864    
NEBRASKA            20         1867    
COLORADO            22         1876    
NORTH DAKOTA        26         1889    
SOUTH DAKOTA        26         1889    
MONTANA             26         1889    
WASHINGTON          26         1889    
IDAHO               26         1890    
WYOMING             26         1890    
UTAH                27         1896    
OKLAHOMA            30         1907    
NEW MEXICO          31         1912    
ARIZONA             31         1912    
ALASKA              43         1959    
HAWAII              43         1959    


ELECTION Table
The ELECTION table contains data about recent elections. Each row identifies the results for one candidate in the election. The columns are

ELECTION_YEAR             year of the election
CANDIDATE                 last name, initial
VOTES                     number of votes received
WINNER_LOSER_INDIC        won/lost indicator
Examine the following table
ELECTION_YEAR CANDIDATE       VOTES    WINNER_LOSER_INDIC
     1960      NIXON R M              219          L         
     1960      KENNEDY J F            303          W         
     1960      BYRD                    15          L         
     1964      GOLDWATER B             52          L         
     1964      JOHNSON L B            486          W         
     1968      NIXON R M              302          W         
     1968      HUMPHREY H H           191          L         
     1968      WALLACE G C             46          L         
     1972      MCGOVERN G S            17          L         
     1972      NIXON R M              520          W         
     1972      HOSPERS J                1          L         
     1976      CARTER J E             297          W         
     1976      FORD G R               240          L         
     1980      CARTER J E              49          L         
     1980      REAGAN R               489          W         
     1984      REAGAN R               523          W         
     1984      MONDALE W F        13          L      
     1988      BUSH G H W   426          W      
     1988      DUKAKIS M S   111          L      
     1992      BUSH G H W   168          L      
     1992      CLINTON W J   370          W      
     1996      CLINTON W J   379          W      
     1996      DOLE R                159          L      


STUDENT Table
The STUDENT table contains student information. Each row describes one student. The columns are

STUDENTNO         student number
FIRSTNAME         first name
MIDINIT           middle initial
LASTNAME          last name
MAJOR             major subject
SEX               M or F
Examine the following table
STUDENTNO FIRSTNAME MIDINIT LASTNAME MAJOR SEX
S10    GERALD   A ATTRIX MATH M
S20 PHYLISS     A LUTZ   PHYSICS F
S30 DELBERT   A SPRAY MATH M
S40    JAMES      E     SHUE        M
S50 BEVERLY   A BOGGS   HISTORY F
S60 DAVE      I   HICKS MATH    M
S70   JACKIE   I    RICTER BUSINESS F
S80 MARGE   E   GILHOULY    F
S90 IRMA   A   SPRAY HOME ECON F



Topic 1.3: SQL Concepts

*Introducing SQL Concepts
SQL is a high-level language that allows you to access and maintain data stored in relational databases, such as DB2 and SQL/DS. Products like FOCUS, SAS and Oracle, which create and maintain relational tables, also use SQL. SQL is one interface to products in these relational systems.
SQL is designed for programmers, system administrators and data processing professionals. Yet, it is especially nice for users who have little training in computer programming. With SQL you specify what you want rather than how to get it. You do not have to be concerned with how the data is stored, or how it is retrieved.

*Learning to Use SQL
SQL is rapidly becoming the industry standard for fourth generation query languages. SQL is easy to learn and easy to use.
Power and simplicity are both features of SQL. You can use single SQL statements to perform functions that require many lines of conventional code.
SQL has these three basic components:
  • Data Definition Language (DDL)
  • Data Manipulation Language (DML)
  • Authorization Language





*Data Definition Language
DDL (Data Definition Language) is used to define new tables. The columns to be included in your table are defined using the CREATE statement. Here is a sample CREATE statement for a STUDENT table.
CREATE TABLE STUDENT
  (STUDENTNO CHAR(3) NOT NULL,
   FIRSTNAME VARCHAR(20),
   MIDINIT CHAR(1),
   LASTNAME VARCHAR(25) NOT NULL,
   MAJOR CHAR(15),
   SEX CHAR(1) NOT NULL)
As you can see, the table consists of 6 columns: STUDENTNO, FIRSTNAME, MIDINIT, LASTNAME, MAJOR and SEX.

*Using DDL Statements
DDL statements permit you to create, change and delete various components of the relational database. In fact, you can define and modify the database online without special utility programs.
Thus, you can interactively:
* create and manipulate tables or
* test SQL statements you plan on using in programs
and not affect the ongoing processes and data already established in your database.




*Data Manipulation Language
DML (Data Manipulation Language) has two basic functions:
  • Creating reports
  • Maintaining tables
One of the simplest, and most common report requests is to view a complete table. All you need is the name of the table.
Let's look at a sample report request.

*Requesting Reports
For example, the statement below requests a report of all columns and all rows for the table named STUDENT.
SELECT * FROM STUDENT
Examine the following table
STUDENTNO FIRSTNAME MIDINIT LASTNAME MAJOR SEX
S10 GERALD A ATTRIX MATH M
S20 PHYLISS A   LUTZ PHYSICS F
S30 DELBERT A SPRAY MATH M
S40 JAMES E SHUE    M
S50 BEVERLY A BOGGS HISTORY F
S60 DAVE I HICKS MATH           M
S70 JACKIE I   RICTER   BUSINESS F
S80 MARGE E   GILHOULY    F
S90 IRMA A SPRAY HOME ECON F


SELECT Request
SELECT * is the simplest retrieval request. Other functions of the SELECT statement include:
  • Limit the request to specific columns.
  • Limit the request to specific rows.
  • Order the rows presented on the request.
  • Develop new column values through calculations.
  • Accumulate numeric column values within specific groupings.
  • Combine data from two or more tables through a common key.


*Maintaining Tables
The other function of DML is to maintain tables.
Here's a simple request to create a new row in the STUDENT table.
INSERT INTO STUDENT
  VALUES ('S02', 'RALPH', 'A', 'LARSON'    , 'PHYSICS', 'M')
This request adds the new row containing the values listed in parentheses.




*Authorization Language
The function of the Authorization Language is to provide effective control over the database resources.
Security in relational databases like DB2 and SQL/DS can be summarized in one sentence.
Nothing can be done unless you have been given the authority.
Authority is controlled by the data administration area through SQL commands.

*Obtaining Authority
Although you need authority to do anything with the database, it is a relatively simple matter to obtain it.
Authority is given through the GRANT command. For example, the command below gives user PROG1 authority to create a table in the database named SKILLS.
GRANT CREATETAB ON DATABASE SKILLS TO PROG1
Authority may also be removed with the REVOKE command. Here's an example removing CREATE authority for tables in database SKILLS from PROG1.
REVOKE CREATETAB ON DATABASE SKILLS FROM PROG1




*Uses for SQL Statements
You can use SQL statements interactively. For example, via IBM's Query Management Facility (QMF) or an interactive facility provided with the relational database.
You can also use SQL statements embedded in application programs such as APL2, BASIC, COBOL, PL/I, FORTRAN, LISP or assembler language programs
These various implementations of SQL, however, are semantically and syntactically the same. What you learn here can be carried to ANY SQL-based database management system.

*Rules for Coding SQL Statements
When coding SQL statements, follow these syntax rules:
  • Variables and keywords must be separated by one or more spaces.
  • Parentheses are often required around a keyword's values. Commas must separate values in a list (e.g. column names, table names, data values).
  • Spacing around commas, parentheses and operators (=,>,+,*,etc.) is for readability. Spaces are not required.
  • Character strings are sequences of characters that begin and end with a string delimiter. A string delimiter is either ' (apostrophe) or " (quotes) depending on the host language and the character chosen when the database system is installed. Use the same character (' or ") for the beginning and ending delimiter.



         INSERT INTO GRADE VALUES ('E01',0.000)

         INSERT INTO GRADE VALUES('E01',0.000)

         INSERT INTO GRADE VALUES ( 'E01', 0.000 )
*Looking at an Example
Above are three valid codings of the same statement showing the syntax rules.
Spaces are required between each keyword or variable (INSERT, INTO, GRADE, VALUES). In the VALUES portion, parentheses are needed around the values and commas must separate the values. The character string E01 is delimited by apostrophes. Spaces around parentheses and commas are optional.




*Following Conventions
In this course, you will see a few conventions followed for readability or emphasis. These conventions are
  • Each clause is entered on a separate line, for example

    SELECT * FROM ROSTER
      WHERE STUDENTNO = 'S10'

    rather than

    SELECT * FROM ROSTER WHERE STUDENTNO = 'S10'
  • Information coded between the symbols < > is optional
  • ... indicates information that repeats



Topic 1.4: Unit 1 Summary

The following points were covered in this unit:
  • Databases based on a relational model store data in one or more tables.
  • Each table in a relational database consists of a specific number of columns and some number of unordered rows.
  • Columns in a relational database appear vertically in a report, contain the same kind of data, and have names that are used as column titles.
  • Rows in a relational database appear horizontally in a report, contain one or more columns of data, and do not have names.
  • Formal terms to describe SQL concepts are: tables are named relations, columns are called attributes, and rows are known as tuples.
  • Structured Query Language (SQL) is a high-level language that allows you to access and maintain data stored in relational databases, such as DB2 and SQL/DS.
  • DDL (Data Definition Language) allows you to define new tables in the database. You can define and modify the database online without special utility programs.
  • DML (Data Manipulation Language) is used to create reports and maintain tables.
  • The Authorization Language provides control of database resources. Authority is controlled by the data administration area through SQL commands.



Unit 2. Combining Information From Two Tables



There may be times when the information you need is contained in two or more tables. To combine information from two tables into a single table, you use a technique called joining.
The objectives in this unit include the following topics:
  • Joining two tables
  • Types of joins
  • Joining a table to itself
  • Using other options, such as built-in functions and ORDER BY, in conjunction with a join


After completing this unit, you should be able to:
  • Comprehend theory about joining and merging tables



Topic 2.1: Theory

*Joining and Merging Tables
This section introduces the situations in which you need to join or merge tables.
Examples are provided of situations where you need to join or merge information from two tables.
On completion of the section you should be ready to learn how to create a join or merge query — to be covered in the next three objectives.

*Using Your Address Book as an Example
Being an efficient individual, you have automated your personal address book. The format of a standard address book lends itself well to creating a separate table for each section:
  • Addresses
  • Birthdays
  • Holiday card lists
Each month you print out a birthday list for the following month. It sounds simple, but it involves combining data from two tables.

*Birthday List
The first table is the birthday list, which contains only name and birthday.
Before you can mail cards, you also need to know addresses.

*Address Table
The address table contains the rest of the information you need. Therefore, you must combine these two tables.

*Combining Birthdays and Addresses
To produce a single table showing the names, birthdays and addresses of every friend with a January birthday, you have to combine information from the two tables.
To do this, there must be a column that is common to both tables.

*Finding a Common Match
These two tables have NAME in common. The system can match the row for JOHN BAKER in the BIRTHDAY table with the row for JOHN BAKER in the ADDRESS table.

*Viewing the Example
The information is combined in a single row of a temporary table created by the system.

*Combining Rows
All the rows from both tables are compared. For every NAME that appears on both tables, a row is created on the new table. Each row contains columns from both the original tables.
Any name that does not appear in both the original tables isn't included in this table, because of the type of join we chose.
Let's look now at the kinds of joins.

*Inner Joins and Outer Joins
There are two major kinds of joins — inner joins and outer joins.
When you code an inner join, you want the new table to include only rows where the values in the joined columns match.
In the example we just completed, we joined the tables using values in the NAME column. If a NAME appeared in one table but not the other, that row of information was excluded from the joined table.

*Beginning with Outer Join
However, you might want your new table to include all people in both tables, one row for each person.
To do this, you need to use an outer join. The outer join allows you to include the rows that don't have a match.

*Looking at the New Table
If we do a full outer join of the BIRTHDAY and ADDRESS tables, we get this result.
Any rows in the result tables that have missing values contain nulls. Note that the table includes all rows from both tables.

*Using Inner Joins to Create Tables
The table created by joining BIRTHDAY and ADDRESS with an inner join is like other tables: it can contain selected columns and rows (e.g., just for January birthdays) and you can order the data any way you want.
Let's look at an SQL query that could be used to join the BIRTHDAY and ADDRESS tables.

        SELECT     BIRTHDAY.NAME, BIRTH_DATE, STREET, TOWN, STATE, ZIP
          FROM     BIRTHDAY, ADDRESS
          WHERE    BIRTHDAY.NAME = ADDRESS.NAME
          ORDER BY BIRTH_DATE
WHERE Clause
The WHERE clause specifies that the rows be matched based on the value in the NAME column of both tables.
The ORDER BY clause orders the output by birth date.
How to code a query to join tables is covered in detail in the next two objectives.
Now try a couple of questions to check your understanding of joining...







*Joining and Merging Tables
Joining tables involves matching the rows of one table with related rows of another table. You produce a table consisting of rows with columns from both the original tables.
Merging tables also involves combining data from two tables, but the rows are not joined together. The rows from each table remain separate in the resulting table.

*Merging Two Tables
Here is an example of merging two tables is the baseball league tables.
By merging the two tables, you create one long table with all the teams on it. Let's see how you merge these two tables.


*Step One
First you select the data you want from the first table.

  SELECT     *                    
    FROM     AMERICAN_LEAGUE      

*Step Two
Then you join the first query to the second with the keyword UNION.

         SELECT     *                
           FROM     AMERICAN_LEAGUE     
         UNION                       
                                        

*Step Three
Then you select the data from the second table.

         SELECT     *                
           FROM     AMERICAN_LEAGUE     
         UNION                       
                                        
         SELECT     *                
           FROM     NATIONAL_LEAGUE     
                                          
*Step Four
Then you can order the result by any column you choose.

   SELECT     *                
     FROM     AMERICAN_LEAGUE     
   UNION                       
                                  
   SELECT     *                
     FROM     NATIONAL_LEAGUE     
     ORDER BY 1                
                                    
*Viewing the Results
The resulting table, in this example, contains all the rows from both tables in ascending order by team, assuming TEAM is the first column in the table.

*Restrictions for Merging Tables
There are several restrictions on when two tables can be merged.
These restrictions, and all rules for writing a query to merge two tables, are discussed in a later objective.




*Reviewing What You Have Learned
To summarize...
Joining is where two rows from two tables that have the same value in a common column are joined to create a single row.

*Reminder
To summarize...
Merging is mixing the rows from two tables, creating a table consisting of rows from both tables.


Topic 2.2: Unit 2 Summary

The following points were covered in this unit:
  • There may be times when the information you need is contained in two or more tables. To combine information from two tables into a single table, you use techniques called joining and merging.
  • Joining tables involves matching the rows of one table with related rows of another table. You produce a table consisting of rows with columns from both the original tables.
  • Inner joins include only rows where the values in the joined columns match. Outer joins allow you to include rows that don't have a match.
  • Merging tables also involves combining data from two tables, but the rows are not joined together. The rows from each table remain separate in the resulting table.



Unit 3. Inner Joins



The theory of joining tables was discussed in the first unit. This unit teaches the details of how to join two tables with an inner join. Some of the key points discussed are:
  • FROM clause format
  • SELECT statement
  • Specifying table names
  • The joining condition
  • Other WHERE clause options
  • ORDER BY clause
  • Alternatives to joining


After completing this unit, you should be able to:
  • Successfully join tables



Topic 3.1: Joining

FROM Clause Format
Another word for inner join is equi-join. Since any join operation is part of a FROM clause, let's look first at the FROM clause format.
The general format of the FROM clause for any join is
FROM table-expression-1 join-operation table-expression-2  
ON join-condition
For example,
FROM PRESIDENT_TABLE INNER JOIN PRES_MARRIAGE  
ON PRESIDENT_TABLE.PRES_NAME  =  PRES_MARRIAGE.PRES_NAME

Note: It's also valid to use just JOIN as the join operation for an inner join.


*Creating an Inner Join
For an inner join, you can optionally use a comma (,) to specify the join operation. When you do so, note that you use a WHERE clause to specify the condition.
FROM table-expression-1 join-operation table-expression-2  
WHERE join-condition
Here's an example:
FROM PRESIDENT_TABLE, PRES_MARRIAGE  
WHERE PRESIDENT_TABLE.PRES_NAME  =  PRES_MARRIAGE.PRES_NAME
Most of our examples will use this format since it's more widely in use.




*Viewing the Example
Columns from the following row in PRESIDENT_TABLE...

*Joining Columns in the Database
...are joined with columns from the following row of PRES_MARRIAGE...

*Producing a Single Output Line
...to produce a single output line.

*Exceptions
On the previous screen you saw that the two rows, one from each table, were joined side by side. But what happens in the case of Ronald Reagan, who appears once in the PRESIDENT_TABLE, but twice in the PRES_MARRIAGE table?

*Results
Two rows appear in the output table.




*Studying the Example
Here is the section of the created table for recent presidents.

*Coding to Join Tables
A properly coded query to join two tables, then, does the following:
  • It combines related information from two tables into a single row.
  • If a value appears once in the joining column of one table, but several times in the other, the higher number of occurrences appear in the output.
  • If a value exists in the joining column of one table, but not in the other, no row appears in the output for that value.
Let's look at how to code the query to join PRESIDENT_TABLE and PRES_MARRIAGE.

*Using SELECT
You can select as many or as few of the columns in either table as you want. All of the following are valid SELECTs joining PRESIDENT_TABLE and PRES_MARRIAGE.
SELECT PRESIDENT_TABLE.PRES_NAME, SPOUSE_NAME, STATE_BORN
  • PRES_NAME is the joining column found in both tables; the other two columns come from either table. Because PRES_NAME is in both tables, you must use a qualifier to indicate which table you want to select it from.

SELECT PRESIDENT_TABLE.PRES_NAME, MARRIAGE_YEAR, PARTY, NR_CHILDREN
  • This is another mix of columns from both tables.

SELECT *
  • All columns from both tables are selected, but the joining column -- the column the two tables have in common -- appears twice, once for each table.





FROM Keyword
The FROM keyword is followed by the two table names, separated by a comma or by the keyword expression INNER JOIN.
SELECT   PRESIDENT_TABLE.PRES_NAME, SPOUSE_NAME, NR_CHILDREN, STATE_BORN
  FROM   
The order in which you enter the table names is not important, as long as they are both there.
Before you go on to the WHERE clause, let's look at another example of joining two tables.

ELECTION Table
A table called ELECTION contains four columns:
ELECTION_YEAR, CANDIDATE, VOTES, WINNER_LOSER_INDIC
By combining this information with the PRESIDENT_TABLE, we can answer the question:  "Which election(s) did each president contest?"
The first question to answer is whether these two tables can be joined.




*Using a Joining Condition
You should include a joining condition in every query that joins two tables.
In an inner join, the joining condition ensures that only those presidents that appear in both tables are selected.
WHERE  PRESIDENT_TABLE.PRES_NAME = PRES_MARRIAGE.PRES_NAME
But, even more importantly, it saves you from producing a very long table that contains all possible combinations of rows for the tables identified in the FROM clause.
Look first at what happens when the join is done correctly.

*Correct Join
If you join them correctly...
SELECT  FIRST_TABLE.ITEM, PRICE, COLOR
  FROM   FIRST_TABLE, SECOND_TABLE
  WHERE  FIRST_TABLE.ITEM = SECOND_TABLE.ITEM

*Viewing the Output
... this output is produced.
SELECT  FIRST_TABLE.ITEM, PRICE, COLOR
  FROM   FIRST_TABLE, SECOND_TABLE
  WHERE  FIRST_TABLE.ITEM = SECOND_TABLE.ITEM

*Don't Forget the Joining Condition!
Should you forget it, as shown here...
SELECT  FIRST_TABLE.ITEM, PRICE, COLOR
  FROM   FIRST_TABLE, SECOND_TABLE

*What Will Happen Without Joining?
The first row of the first table was joined with every row from the second table. Then the second row in the first table was joined with every row from the second table.
And even though there weren't any entries for CRYSTAL in the second table, it was also joined with every row.
Imagine the length of the output if there are 1,000 rows in each table -- 1,000 * 1,000 rows. As well as being long, this data is totally invalid.




*Rules for the Joining Condition
You should be convinced of the importance of the joining condition. Let's go over the rules for coding the joining condition.
SELECT   PRESIDENT_TABLE.PRES_NAME,
         SPOUSE_NAME,NR_CHILDREN,STATE_BORN
  FROM   PRESIDENT_TABLE, PRES_MARRIAGE
  WHERE  
The joining condition for an inner join always checks that the values in the column common to both tables are equal.
The joining condition in this example means: join the information about Washington in the second table to the information about Washington in the first table. If a name appears in one table, but not the other, that row is not selected.

*Studying the Example
Notice that the table name, followed by a period, is put in front of each of the column names. This forces the system to compare PRES_NAME from the first table with
PRES_NAME in the second table.
Let's look at some of the data this query produces.
SELECT   PRESIDENT_TABLE.PRES_NAME,
         SPOUSE_NAME,NR_CHILDREN,STATE_BORN
  FROM   PRESIDENT_TABLE, PRES_MARRIAGE
  WHERE  

*Results of the Join
All recent presidents appear in the report because they all are in both tables. Reagan appears twice because he has two rows in the PRES_MARRIAGE table; the same information from the PRESIDENT_TABLE is joined to each of the rows from PRES_MARRIAGE.




*Limiting the Output
This time there are lots of duplicates because many presidents contested more than one election.
But what if we want to limit this output to showing the elections in which the presidents won, or just the Republican presidents?
That poses no great problem.

*Adding Search Conditions
In addition to the joining condition you can add other search conditions, which can pertain to either table.
For example, let's limit the election data to:
Living presidents     -  if PRESIDENT_TABLE.DEATH_AGE is NULL
Republicans            -  if PRESIDENT_TABLE.PARTY is 'REPUBLICAN'
Elections they won  -  if ELECTION.WINNER_LOSER_INDIC is 'W'
Let's look at how all these conditions are added to a joining query.

*Creating the Query
The query is
SELECT  PRES_NAME, ELECTION, VOTES, WINNER_LOSER_INDIC
  FROM  PRESIDENT_TABLE, ELECTION WHERE PRES_NAME = CANDIDATE
  AND   DEATH_AGE IS NULL
  AND   PARTY = 'REPUBLICAN'
AND   WINNER_LOSER_INDIC = 'W'
Note that the table name prefix is required only when the column names are identical. When the names are different, SQL knows where to find each column; SQL knows that DEATH_AGE only occurs in PRESIDENT_TABLE, for example.
Only if the same column name appears in both tables do you need to indicate which table is evaluated.

*Viewing the Output
The output from this query is much shorter.
SELECT  PRES_NAME, ELECTION, VOTES, WINNER_LOSER_INDIC
  FROM  PRESIDENT_TABLE, ELECTION
  WHERE PRES_NAME = CANDIDATE
  AND   DEATH_AGE IS NULL
  AND   PARTY = 'REPUBLICAN'
  AND   WINNER_LOSER_INDIC = 'W'
All dead presidents and Democrats are gone, as is Gerald Ford, who did not win an election. The query truly answers: "Which election(s) did living Republican presidents win?"

*What Happens Next?
Only rows meeting these two search conditions are selected from
 PRESIDENT_TABLE.SELECT  PRES_NAME, ELECTION, VOTES, WINNER_LOSER_INDIC
  FROM  PRESIDENT_TABLE, ELECTION
  WHERE PRES_NAME = CANDIDATE


  AND   WINNER_LOSER_INDIC = 'W'

*Results of the Exercise
And only winning candidates are selected from
ELECTION.SELECT  PRES_NAME, ELECTION, VOTES, WINNER_LOSER_INDIC
  FROM  PRESIDENT_TABLE, ELECTION
  WHERE PRES_NAME = CANDIDATE
  AND   DEATH_AGE IS NULL
  AND   PARTY = 'REPUBLICAN'





*Using a Joining Query
Like a standard query, a joining query gives you three different ways to enter your order by clause.            
        SELECT     TABLE_A.ITEM, PRICE, COLOR
          FROM     TABLE_A, TABLE_B
          WHERE    TABLE_A.ITEM = TABLE_B.ITEM
         ORDER BY DESC
      or
         ORDER BY DESC
      or
         ORDER BY DESC




*Built-In Functions
The last topic in this lesson addresses the use of built-in functions (AVG, MAX, etc.) when you join tables.
Question:  Can you use built-in functions when you join tables?
Answer:    Yes. It's easy.

*Creating an Example
Question:  How about an example? When would you want to use built-in functions with a join?
Answer:    How many years did Republican presidents serve on average -- and what are the maximum, minimum and average number of votes gained by Republican presidents in their winning campaigns?
Let's look at how that query can be coded.

*Coding Queries
Using built-in functions in join queries is no different from standard queries -- all the same rules apply.
SELECT  AVG(YRS_SERVED), MAX(VOTES), MIN(VOTES), AVG(VOTES)
       -- The built-in functions can refer to either table.
  FROM  PRESIDENT_TABLE, ELECTION
  WHERE  PRESIDENT_TABLE.PRES_NAME = ELECTION.CANDIDATE
  AND  PARTY = 'REPUBLICAN'
  AND  WINNER_LOSER_INDIC = 'W'

*Joining More Than Two Tables
Question:  Okay. You can use built-in functions with joined queries. But can you join more than two tables?
Answer:    Yes, but all the tables must have at least one column in common. (e.g., Table A and Table B must have a field in common and Table B and Table C must have a field in common.)

*Alternatives
Question:  Are there any alternatives to joining?
Answer:    Yes, there are several situations in which joining is not appropriate. Let's look at them.

*Point One
There are three important points to remember about joining.
1. Joining takes a lot of computer resources and can be complicated to code.
You might be able to save yourself and the system some time and effort if you can manage with two separate, non-joined tables.

*Point Two
2. Subselects are a more efficient alternative to join queries in certain circumstances.
For example, a joining query cannot tell you which elections were contested by presidents who served longest; a subselect can.
Subselects are covered later in this course.

*Point Three
3. Joining, along with subselects, can meet most of your your needs for extracting information from two tables.
You may recall that you can use nested table expressions within a join.


Topic 3.2: Unit 3 Summary

The following points were covered in this unit:
  • A properly coded query to join two tables with an inner join combines related information from two tables into a single row, values appear once in the joining column of one table, but several times in the other, the higher number of occurrences appear in the output, and values exist in the joining column of one table, but not in the other, and no row appears in the output for that value.
  • The SELECT for joining tables can select any number of columns in either table.
  • When doing an inner join, the FROM keyword is followed by the two table names, separated by a comma or by the keyword expression INNER JOIN.
  • You should include a joining condition in every query that joins two tables.
  • For inner joins, the joining condition always checks that the values in the column common to both tables are equal.
  • In addition to the joining condition, you can add other search conditions, which can pertain to either table.
  • Like a standard query, a joining query gives you three different ways to enter your ORDER BY clause.
  • Using built-in functions in join queries is no different from standard queries -- all the same rules apply.



Unit 4. Outer Joins



In the prior unit, you saw how to code inner joins.
This unit shows how to use the three types of outer joins:
  • Left outer
  • Right outer
  • Full outer


After completing this unit, you should be able to:
  • Use outer joins



Topic 4.1: Outer Joins

*Reviewing What You Have Learned
Here again is the general format of the FROM clause:
FROM table-expression-1 join-operation table-expression-2  
ON join-condition
The operands of a join (table expression) can be any result table.
The join operations for outer joins are RIGHT OUTER JOIN, LEFT OUTER JOIN and FULL OUTER JOIN.
The join condition specifies the columns to be compared.
Note: It's also valid to use just JOIN as the join operation for a full outer join. The ON keyword tells SQL it's not an inner join.

*Outer Joins
The example of an outer join shown earlier when we joined the BIRTHDAY and ADDRESS tables was of a full outer join. It included all unmatched rows from both tables.
The other versions of the outer join limit the unmatched rows to be based on only one table.
These types of joins give you a lot of flexibility in creating queries, but this flexibility creates many challenges in identifying the best way to code your queries.
Let's see how to code each type of outer join and what the results table looks like.

*Defining Outer Joins
When you are defining an outer join, you must use
  • The ON keyword
  • The equal sign (=) when comparing expressions for a full outer join
You have more flexibility with the left and right outer joins, because you can use all of the comparison operators. You can also combine join expressions using the AND keyword.

*Looking at Some Examples
As we explore outer joins, let's first use these example tables.

*Coding a Full Outer Join
If you code a full outer join of the BIRTHDAY and ADDRESS tables...
SELECT BIRTHDAY.NAME, BIRTH_DATE, STREET, CITY, STATE
  FROM BIRTHDAY ADDRESS
  ON BIRTHDAY.NAME = ADDRESS.NAME;
you get this result...

LEFT OUTER JOIN
The operation LEFT OUTER JOIN includes only unmatched rows from the table named before it, in this case BIRTHDAY.
SELECT BIRTHDAY.NAME, BIRTH_DATE, STREET, CITY, STATE
  FROM BIRTHDAY ADDRESS
  ON BIRTHDAY.NAME = ADDRESS.NAME;
Any unmatched rows from the ADDRESS table are excluded. Jean Friar and Gladys Howe are unmatched rows in the ADDRESS table, so they don't appear in the result table.

RIGHT OUTER JOIN
The operation RIGHT OUTER JOIN includes only unmatched rows from the table named after it, in this case ADDRESS.
SELECT ADDRESS.NAME, BIRTH_DATE, STREET, CITY, STATE
  FROM BIRTHDAY ADDRESS
  ON BIRTHDAY.NAME = ADDRESS.NAME;
Any unmatched rows from the BIRTHDAY table are excluded. Kendra Taylor is an unmatched row in the BIRTHDAY table, so she doesn't appear in the result table.










*Specifying the Columns
Now that you have some understanding of the types of outer joins, let's look next at some ways to better control the results you get.
When the columns you choose have the same name, you must add a qualifier to make the column name unique.
When the query is an outer join, you must be careful to specify the correct column in the SELECT. Otherwise, some rows may have nulls you don't want.

*Viewing an Example
Suppose, for example, you want to create a table that includes Buchanan, an unmarried president. If you use this query:
SELECT PRES_MARRIAGE.PRES_NAME, NR_CHILDREN, STATE_BORN
   FROM PRESIDENT_TABLE FULL OUTER JOIN PRES_MARRIAGE
   ON PRESIDENT_TABLE.PRES_NAME = PRES_MARRIAGE.PRES_NAME;
the row of information for Buchanan is selected, but the NAME information is null because it doesn't appear in the PRES_MARRIAGE table. Consequently, you have a row in the result table that has a value for only the STATE_BORN.

*Changing the Qualifier
If you change the qualifier for PRES_NAME, the query
SELECT ,  NR_CHILDREN, STATE_BORN
   FROM PRESIDENT_TABLE FULL OUTER JOIN PRES_MARRIAGE
   ON PRESIDENT_TABLE.PRES_NAME = PRES_MARRIAGE.PRES_NAME;
gives you a row with the NAME Buchanan and only the value for NR_CHILDREN is null.

VALUE Function
In the previous example, it would be convenient to not worry about having unwanted null columns accidentally.
The VALUE (or COALESCE) function is a neat solution.
The VALUE function returns the first non-null value in its arguments. (COALESCE is a synonym for VALUE, so the two can be used interchangeably.)
This function is allowed in full outer joins and provides a way to integrate data from two columns into a single column. The two columns must contain the same data type.
You must assign a name to the resulting column using the AS clause.
Let's see how this works...

*Studying an Example
For example, the query
SELECT COALESCE(PRESIDENT_TABLE.PRES_NAME, PRES_MARRIAGE.PRES_NAME)
       AS NAME, NR_CHILDREN, STATE_BORN
  FROM PRESIDENT_TABLE FULL OUTER JOIN PRES_MARRIAGE
  ON PRESIDENT_TABLE.PRES_NAME = PRES_MARRIAGE.PRES_NAME
  ORDER BY NAME;
assures that all names from both tables are shown.

*Using the AS Clause
In the result, notice that the AS clause (AS NAME), provides a name for the result of the COALESCE function and NAME is used to order the results. The AS clause is required for the COALESCE function.










*A Helpful Reminder
By the way, you are not limited to the name of a single table as an operand of a join. The operand may also be a subselect.
When you use a subselect in any FROM clause, it is called a nested table expression. When you use a subselect in a WHERE clause, it is called a subquery.
Both nested table expressions and subqueries are covered later in the course.


Topic 4.2: Unit 4 Summary

The following points were covered in this unit:
  • The three types of outer joins are left right, and full.
  • When you are defining an outer join you must use the keyword ON.
  • A full outer join includes all unmatched rows from both tables.
  • A left outer join includes only unmatched rows from the table named before the join operation keywords.
  • A right outer join includes only unmatched rows from the table named after the join operation keywords.
  • When doing an outer join of tables with identical column names, you must be careful to specify the correct column in the SELECT. Otherwise, some rows may have nulls you don't want.
  • The COALESCE (or VALUE) function provides a way to merge data from two columns into a single column. You must use the AS clause to name the new column.



Unit 5. Merging Tables



The theory of merging tables was discussed earlier. This unit teaches the details of how to merge two tables.
Some of the key points discussed in this lesson are:
  • SELECT from each table
  • The UNION and UNION ALL keywords
  • Restrictions on merging
  • ORDERing merged data


After completing this unit, you should be able to:
  • Understand how to merge tables
  • Practice merging tables in a workshop


*Illustrating  MERGE
To illustrate a MERGE, let's take two very short tables. The query to merge the two tables has three elements.
1. Select data from one table.
2. Specify UNION to indicate a merge.
3. Select data from the second table.
Let's look at an example.

*Viewing an Example
Now let's see how the system handles this query.
Examine the following table
SELECT ITEM, PRICE
FROM TABLE_A       
Select data from first table
UNION             
Union must follow the first SELECT statement
SELECT ITEM, PRICE
FROM TABLE_B       
Select data from second table


*Step One
First the rows selected from the first table are retrieved.

*Step Two
Then the rows selected from the second table are sorted with the rows from the first table. Duplicate rows are removed.

*Reviewing the Steps
To repeat the steps in a merge query:
1.  SELECT data from Table 1
2.  UNION - The UNION keyword must be entered after the first SELECT statement
3.  SELECT data from Table 2




UNION ALL Keyword
So far, you've seen an example of the UNION keyword in action. You can also use the UNION ALL keyword, which merges tables according to different rules.
Look at how the two keywords handle the same query.

*Results of UNION ALL
Remember the results from the merge with UNION as the keyword. Now look at what happens when UNION ALL is used.

*Studying the Example
On the right is the output resulting from the UNION ALL keyword. Notice that the items are in a different order and that duplicates are included.

*Reviewing
Let's go over the differences between UNION and UNION ALL.
  • UNION sorts the results table; UNION ALL does not. In other words, UNION mixes rows together in the output. UNION ALL will output all the rows from the first table that met the search condition, followed by all the rows in the second table that met the search condition.
  • UNION removes duplicate rows; UNION ALL does not.
  • UNION prohibits selecting any LONG VARCHAR columns; UNION ALL allows selecting LONG VARCHAR columns.





*Differences Between UNION and UNION ALL
You may have guessed some of the implications of the difference between UNION and UNION ALL.
  • Sorting increases system overhead, especially when you are sorting thousands of rows.
  • In addition, UNION may not sort by the columns you want or in the order you want.

Therefore, the recommendations are...

*First Recommendation
1. If you know that the output you are producing has no duplicates,
  • Use UNION ALL as the keyword to reduce system overhead, and

  • Sort the output by adding an ORDER BY clause


*Second Recommendation
2. Add literals to your query ('WAREHOUSE A' and 'WAREHOUSE B' in the query shown here). These expressions create a column with a fixed value. They show you which table the row is from and, if you do use UNION as the keyword, no rows are duplicates.
SELECT ITEM, 'WAREHOUSE A', PRICE, VALUE FROM TABLE_A
UNION ALL
SELECT ITEM, 'WAREHOUSE B', PRICE, VALUE FROM TABLE_B ORDER BY 4 DESC




*Looking at the Difference
Next, look at some points about merges that are true for both UNION and UNION ALL.
Examine the following table
UNION UNION ALL
Sorts results table. Does not sort results table.
Removes duplicate rows. Keeps duplicate rows.
Prohibits selecting LONG VARCHAR columns. Allows selecting LONG VARCHAR columns.


         SELECT   NAME, ADDRESS, CITY, BALANCE
           FROM   EAST_REGION
           WHERE  BALANCE >= 10000.00
         UNION
         SELECT   NAME, ADDRESS, CITY, BALANCE
            FROM  WEST_REGION
            WHERE BALANCE >= 10000.00
            ORDER BY 4 DESC
*Some Questions About Merging
This query addresses some of the key questions you may have about merging:
  • Does it matter which columns you select?
  • What can be specified in the WHERE clause?
  • How do you ORDER merged data?

Let's start with the columns you select.

*Matching the Columns
The columns from each table must match each other.
If your first query specifies:
SELECT NAME, ADDRESS, BALANCE
the second query must select columns in the same order...
SELECT NAME, ADDRESS, BALANCE
 ...though it is okay if the names of the columns do not match exactly:
SELECT CUSTOMER_NAME, ADDRESS, BALANCE_DUE

*Do Columns Have to Match?
The columns from each table must match each other.
    
  • Corresponding columns must have compatible data types.
  • A numeric column must correspond to a numeric column, a character column to a character column, and so on.
  • Numeric data types need not be the same. For example, an integer column can correspond to a decimal column.
  • Column lengths need not be the same.








              SELECT   NAME, ADDRESS, CITY, BALANCE
                FROM   EAST_REGION
                WHERE  BALANCE >= 10000.00
              UNION
              SELECT   NAME, ADDRESS, CITY, BALANCE
                 FROM  WEST_REGION
                 WHERE BALANCE >= 10000.00
                 ORDER BY 4 DESC
*One Question Down!
We've answered the first key question about merging.
  • Does it matter which columns you select?
  • What can be specified in the WHERE clause?
  • How do you ORDER merged data?

Let's go on to the WHERE clause and ordering merged data.

*Moving on to the WHERE Clause
You can think of the two queries linked by UNION or UNION ALL as being completely independent regarding the WHERE clause.
Rows are selected from the first table by the first query, according to whatever search criteria.
Then rows are selected from the second table, again according to whatever search criteria are specified in the second query.
There is no need for any kind of joining condition.

ORDER BY
Always put the order by clause in the last query.
Placing the ORDER BY clause in the last query causes all the output data to be ordered.

*Using a Column Number
Use a column number in the order by clause.
The two tables might not have the same column names, so you must use column numbers to avoid confusion.
The exception to this is that you may use a column name that was defined in a SELECT with an AS clause.




   SELECT     MAX(SALARY)
     FROM     WEST_REGION
   UNION
   SELECT     MAX(SALARY)
     FROM     EAST_REGION
   ORDER BY 1 DESC
*Another Question
This query addresses another key question you may have about merging:
How are built-in functions and arithmetic expressions handled? Let's see.

*Finding Answers in the Table
Assume that each region has its own employee table. Who is the highest paid employee in each region?
SELECT     MAX(SALARY)
  FROM     WEST_REGION
UNION
SELECT     MAX(SALARY)
  FROM     EAST_REGION  
  ORDER BY 1 DESC         
The maximum from each table is output.
OUTPUT MAX(SALARY)
-----------------
252000.00
228000.00
But who's who...? Read on.

*Creating Columns
As suggested earlier, you can identify the table that a row is from by creating a column containing an expression.
SELECT     MAX(SALARY), 'WEST' -- Outputs 'WEST' in second column
  FROM     WEST_REGION            of rows selected from WEST_REGION 
UNION
SELECT     MAX(SALARY), 'EAST' -- Outputs 'EAST' as second column
  FROM     EAST_REGION            of all rows selected from EAST_REGION
  ORDER BY 1 DESC
        Let's look at the output.

*Discovering the Answer
The highest paid employee is in the West Region.
Expressions can be used with any type of query to create a column with a fixed value.


Topic 5.2: Workshop

*Getting Started with a Workshop
This workshop includes problems and questions that you can resolve by joining or merging the data in two tables.
You will work on the following situations:
  • Output information about the different departments, including LOCATION and DEPTNAME (from ORG) and manager NAME and SALARY (from STAFF).
  • There is a new job opening. Current employees with less than 2 years of experience and applicants with between 12 and 14 years of education are eligible. This involves merging rows from APPLICANT and STAFF.


*Trying a Simulation
First try a simulation that uses what you learned in this unit to build a complete SQL request.
Normally when you code an SQL query, you enter all the clauses one after another until the SELECT statement is complete. In the following simulation, you'll be prompted to enter one clause at a time.
Good luck!
*Problem 1
Output information about the different departments, including LOCATION and DEPTNAME (from ORG) and manager NAME and SALARY (from STAFF).
The output is to include only departments in the Eastern Division.
Let's look first at the two tables your query joins.

ORG
ORG contains the following columns. The MANAGER column in ORG contains the managers' IDs, not their names.

STAFF
STAFF contains:

*Producing Output
You need to produce the following output:




*Good Job!
Here is the output produced by your query.
Note that the output doesn't contain rows without a match. To get those rows, you can use an outer join, which you'll do in the next problem.
   SELECT     DEPTNAME, LOCATION, NAME, SALARY
     FROM     ORG, STAFF
     WHERE    ORG.MANAGER = STAFF.ID
       AND    DIVISION = 'EASTERN'
     ORDER BY 1
   SELECT     DEPTNAME, LOCATION, NAME, SALARY
     FROM     ORG, STAFF
     WHERE    ORG.MANAGER = STAFF.ID
       AND    DIVISION = 'EASTERN'
     ORDER BY 1

*Making Alterations
For the next few questions, we are going to cut the STAFF table down to half so you will be using only the rows shown.
We will be using the ORG table as is.

*Moving On
Using the same tables, ORG and STAFF, now perform a join that will include the unmatched rows from the ORG table.
Complete the coding on the next screen.




*Results of Your Work
This is the result table from the left outer join.




*Viewing the Full Outer Join
This is the result table from the full outer join.

*Another Question
There is a new job opening. Current employees with less than 2 years of experience and applicants with between 12 and 14 years of education are eligible. This involves merging rows from APPLICANT and STAFF.
A query was already created to select the current employees. The first step is to decide whether applicants from the APPLICANT table can be added at the end.
Note: Like Problem 1, this problem is set up as a simulation, so if your entry is correct, you'll move immediately to the next question.










         SELECT   NAME, ID
           FROM   STAFF
           WHERE  YEARS < 2
         UNION ALL
         SELECT   NAME, TEMPID
           FROM   APPLICANT
           WHERE  EDLEVEL BETWEEN 12 AND 14
*Reviewing the Final Version
Here's the final version of the query, merging data from two tables.
The output produced is a list of qualified applicants from both tables.


Topic 5.3: Unit 5 Summary

The following points were covered in this unit:
  • A query to merge two tables has selects data from one table, specifies UNION to indicate a merge, and selects data from the second table.
  • The differences between UNION and UNION ALL are UNION sorts the results table, and UNION ALL does not, UNION removes duplicate rows, and UNION ALL does not, and UNION prohibits selecting any LONG VARCHAR columns, and UNION ALL allows selecting LONG VARCHAR columns.
  • If you know that the output you are producing has no duplicates use UNION ALL as the keyword to reduce system overhead, and sort the output by adding an ORDER BY clause.
  • Add literals to your query. These expressions create a column with a fixed value. Literals show you which table the row is from and, if you do use UNION as the keyword, no rows are duplicates.
  • When you merge two tables the data type of each column selected from the first table must be compatible with that of the corresponding column from the second table, and you must select the same number of columns from each table.
  • You can think of the two queries linked by UNION or UNION ALL as being completely independent regarding the WHERE clause.
  • Always put the ORDER BY clause in the last query. Always use a column number in the ORDER BY clause, unless you've defined a column name using the AS clause.



Unit 6. Subselects



Subselects are in some ways comparable to joining tables. They let you combine information from more than one table, and they let you compare different rows of the same table.
You will find that some tasks can be done by joins alone, others by either joins or subselects, and yet others by subselects alone.

After completing this unit, you should be able to:
  • Code a subquery
  • Code a nested table expression



Topic 6.1: What is a Subquery?

*Getting Started with Queries
Some typical queries you run using subselects are:
  • Which president served longest?
  • Which employees earn more than the company average?
  • Which presidents come from the thirteen original states?
  • Who is the manager of the Head Office?
  • Do any non-managers earn more than any managers?
  • Do any non-managers earn more than all managers?
  • Do any non-managers earn more than their own managers?
  • Does anybody earn more than the manager of the Head Office?


*What Is a Subselect?
A subselect is a second SELECT in a query. Its results are used as part of the search argument of the main SELECT.
When you use a subselect in any FROM clause, it is called a nested table expression.
When you use a subselect in a WHERE clause, it is called a subquery.
We'll look first at subqueries. This concept is best illustrated by an example:
Which president served longest?
Let's look first at a proposed solution.




*Asking a Question
Which president served longest?
Using two queries, you could run one query to find the maximum years served.
You can then note down the value returned by the query.

*Using a Subquery
Using the values from the first query, you can now code a second query.
If you use a subquery, the value of MAX(YRS_SERVED) is automatically passed to the second query.

*Using SELECT
The following is the query to select the name of the president who served longest.
The first SELECT uses the value returned by the second SELECT (the subselect) as part of its search condition.
SELECT     PRES_NAME, YRS_SERVED
  FROM     PRESIDENT_TABLE
  WHERE    YRS_SERVED = (SELECT   MAX(YRS_SERVED)
                           FROM   PRESIDENT_TABLE)

*Adding the WHERE Clause
First the subselect runs, returning a single value.
SELECT     PRES_NAME, YRS_SERVED
  FROM     PRESIDENT_TABLE
  WHERE    YRS_SERVED =

Then the main query runs, using the value returned by the subselect as the right operand of its WHERE clause.
WHERE YRS_SERVED = Whatever value was returned by the subselect.




*Asking Questions
You've already seen that you must use a subselect whenever you want to use aggregated and non-aggregated values together in a query to answer questions such as:
  • Which president served longest?
  • Which employees earn above the average salary for their job?
  • Which president had the most children?
In all these examples, the information is on the same table — you are not joining tables. But subselects can also be used where two separate tables are involved.

*Posing a Question
Problem: Which presidents were born in the original thirteen states?
Tables:  STATE     :  STATE_NAME, YEAR_ENTERED
         PRESIDENT :  PRES_NAME, STATE_BORN
Solution 1.  Run two queries.
Solution 2.  Use a joining query.
Solution 3.  Use a subselect.
a.  Use a subselect to select the states from the STATE table, based on the year they entered the Union. b.  Then select PRES_NAME from PRESIDENT_TABLE if STATE_BORN matches one of the states selected by the subselect.
Let's see how solution 1 works.

*Solving the Problem
Problem:  Which presidents were born in the original thirteen states?
Solution 1.  Run two queries.
STEP 1:
SELECT     STATE_NAME
  FROM     STATE_TABLE
  WHERE    YEAR_ENTERED = 1776
The result of this query is a list of thirteen states.
Note them down, and you are ready for step 2.

*Another Solution
Problem:  Which presidents were born in the original thirteen states?
Solution 1.  Run two queries.
STEP 2:
SELECT     PRES_NAME
  FROM     PRESIDENT_TABLE
  WHERE    STATE_BORN IN ('MASSACHUSETTS', 'PENNSYLVANIA',
                          'VIRGINIA', 'CONNECTICUT',
                          'SOUTH CAROLINA', 'MARYLAND',
                          'NEW JERSEY', 'GEORGIA',
                          'NEW HAMPSHIRE', 'DELAWARE',
                          'NEW YORK', 'NORTH CAROLINA',
                          'RHODE ISLAND' )
Provided you spelled all thirteen names correctly, your output will list all presidents born in the original thirteen states.
Solution 2, a joining query, automates this same process.

*An Alternate Solution
Problem:  Which presidents were born in the original thirteen states?
Solution 2.  Join the two tables:
SELECT     PRES_NAME, STATE_BORN
  FROM     PRESIDENT_TABLE, STATE_TABLE
  WHERE    STATE_BORN = STATE_NAME
  AND      YEAR_ENTERED = 1776
Joining tables was covered earlier in this course and will not be covered in detail in this unit.

*Another Solution
Problem:  Which presidents were born in the original thirteen states?
Solution 3.  Use a subselect:
SELECT     PRES_NAME,
  FROM     PRESIDENT_TABLE
  WHERE    STATE_BORN IN


The subselect selects the original thirteen states.

*Using a Subselect
Problem:  Which presidents were born in the original thirteen states?
Solution 3.  Use a subselect:


(SELECT  STATE_NAME
                            FROM  STATE_TABLE
                            WHERE YEAR_ENTERED = 1776)
The main query then selects all presidents born in any one of the thirteen states.




*Summarizing What You Have Learned
The following table summarizes when you might use subqueries.

Note: Another use of subqueries, to select rows from a table based on values contained in different rows from that same table, is covered later.



Topic 6.2: Nested Table Expressions

*Another Type of Subselect
Now let's look at another type of subselect, the nested table expression.
While you use a subquery to specify search criteria on a single column, a nested table expression returns more than one column.

*Using Nested Table Expression
A nested table expression is a subselect in a FROM clause.
A nested table expression allows you to select specific rows from tables BEFORE you join them. Using this feature can greatly improve the performance of the join.
In addition, you can use a join where you might otherwise have merged data, a more costly operation.

*Following the Rules
You should follow these rules in defining your nested table expression:
  • Enclose the subselect in parentheses.
  • Assign a new name to the nested table expression. You can use this name as the qualifier of a column name, but you cannot refer to the name within the same FROM clause.
  • Rename the columns of the result table that you plan on referencing, so they are unique.


*Joining Tables
Nested table expressions can be used in any type of join.
To create a table of the number of children that were born to presidents from Ohio, this nested table expression selects only rows for Ohio presidents to be joined with the marriage information:
SELECT NAME, NR_CHILDREN
  FROM

AS PRES_OHIO
      LEFT OUTER JOIN PRES_MARRIAGE
      ON NAME = PRES_MARRIAGE.PRES_NAME;
The WHERE clause in the nested table expression eliminates any presidents not born in Ohio before the join takes place. This allows the join to be as efficient as possible.

*Viewing the Results
Each instance of a President's marriage appears whenever there is a match for an Ohio President.
Here is the result table.

*Studying the Example
Note that in this example a correlation name (PRES_OHIO) is assigned. The AS clause is required for an operand that is a nested table expression, even if the name is not referenced in the query.
SELECT NAME, NR_CHILDREN
  FROM (SELECT PRESIDENT_TABLE.PRES_NAME AS NAME
          FROM PRESIDENT_TABLE
          WHERE STATE_BORN = 'OHIO')
      LEFT OUTER JOIN PRES_MARRIAGE
      ON NAME = PRES_MARRIAGE.PRES_NAME;
You can also just append the new name after the table name, leaving one or more blanks between them. Using the AS keyword is optional, but it makes your code more readable.




*Joining More Than Two Tables
You can join more than two tables, and you can use different types of joins in the same statement.
Let's look at an example where you need to join more than two tables and you use more than one join type in the FROM clause.
Suppose you wanted a result table that showed which presidents had a hobby of swimming, in addition to the number of children and birth state. This scenario requires you to join three tables to get all the information. The next page shows the result table.  

SELECT NAME, NR_CHILDREN, STATE_BORN
  FROM (SELECT PRES_HOBBY.PRES_NAME AS NAME FROM PRES_HOBBY
               WHERE HOBBY = 'SWIMMING') AS SWIM_TABLE
        INNER JOIN PRES_TABLE
                ON NAME = PRES_TABLE.PRES_NAME
        LEFT OUTER JOIN PRES_MARRIAGE
                ON NAME = PRES_MARRIAGE.PRES_NAME
  ORDER BY NAME;





Topic 6.3: Unit 6 Summary

The following points were covered in this unit:
  • Subselects are in some ways comparable to joining tables. They let you combine information from more than one table, and they let you compare different rows of the same table.
  • A subquery is a query where the result from one SELECT (the subselect) is used as part of the search argument of another SELECT.
  • Subqueries can be used to combine aggregated columns with non-aggregated columns.
  • Subqueries can be used to select rows from one table based on values selected from another table.
  • A nested table expression is a subselect in a FROM clause. It allows you to select specific rows from tables BEFORE you join them.
  • When the join operand is a nested table expression, you must assign a correlation name with the AS clause.
  • You can join more than two tables, and you can use different types of joins in the same statement.



Unit 7. Basic Subqueries



In this unit you will learn how to create basic subqueries. You will create several subqueries that access one table. You will also create subqueries that access two tables.
  • Select information from PRESIDENT_TABLE about presidents who had more than two children.
  • Who is the manager of the Head Office?

In writing these queries, you will also learn the basic rules governing subqueries.
  • Format of a SUBSELECT
  • How the main SELECT and SUBSELECT are connected
  • The WHERE clause
  • Use of ORDER BY


After completing this unit, you should be able to:
  • Code a simple subquery
  • Handle subqueries returning several rows
  • Practice what you have learned in a workshop



Topic 7.1: Simple Subquery

*Getting Started
Let's start with the problem introduced in the previous objective — which president served longest.
A subquery is required because you need to compare a column value from individual rows — YRS_SERVED — with an aggregated value — MAX(YRS_SERVED). This can only be done with a subquery.
  • The SUBSELECT finds MAX(YRS_SERVED)
  • The SELECT compares each president's YRS_SERVED value with the result of the subselect.
Let's look at the query.

*Looking at the Query
The columns selected by this query are PRES_NAME and YRS_SERVED for any president whose YRS_SERVED are equal to MAX(YRS_SERVED).
Let's analyze the elements that make up this query.
SELECT     PRES_NAME, YRS_SERVED
  FROM     PRESIDENT_TABLE
  WHERE    YRS_SERVED = (SELECT    MAX(YRS_SERVED)
                           FROM    PRESIDENT_TABLE)

*Viewing the Example
Though the subselect is entered last, it is executed first by the system. This subselect returns a single value — MAX(YRS_SERVED), which is passed to the main SELECT.
SELECT     PRES_NAME, YRS_SERVED
  FROM     PRESIDENT_TABLE
  WHERE    YRS_SERVED = (SELECT    MAX(YRS_SERVED)
                           FROM    PRESIDENT_TABLE)

*Using the Returned Value
The value returned by the subselect is used as the right operand of the main SELECT's WHERE clause.
SELECT     PRES_NAME, YRS_SERVED
  FROM     PRESIDENT_TABLE
  WHERE    YRS_SERVED = (SELECT    MAX(YRS_SERVED)
                           FROM    PRESIDENT_TABLE)




*Rules for Coding
The subselect replaces the right operand of the main SELECT's WHERE clause. Therefore, it must be entered following the operator, which is = in this example.
SELECT     PRES_NAME, YRS_SERVED
  FROM     PRESIDENT_TABLE
  WHERE    YRS_SERVED (SELECT    MAX(YRS_SERVED)
                           FROM    PRESIDENT_TABLE)

*A Helpful Hint
Remember that the queries are free-form, so you can start the SUBSELECT on a new line if you want to.
SELECT     PRES_NAME, YRS_SERVED
  FROM     PRESIDENT_TABLE
  WHERE    YRS_SERVED =


*Enclosing the Subselect
The whole subselect must be enclosed in parentheses.
SELECT     PRES_NAME, YRS_SERVED
  FROM     PRESIDENT_TABLE
  WHERE    YRS_SERVED =

Examine the following table


*Using SUBSELECT
A SUBSELECT can select only one column.
SELECT     PRES_NAME, YRS_SERVED
  FROM     PRESIDENT_TABLE
  WHERE    YRS_SERVED = (SELECT    
                           FROM    PRESIDENT_TABLE)
Examine the following table


Note: The column selected does not have to be a built-in function.


*One Last Rule
In addition to the three rules you have seen, there is a fourth rule.
Examine the following table
Do not use ORDER BY in the subselect. You can, however, order the main SELECT.
SELECT     PRES_NAME, YRS_SERVED
  FROM     PRESIDENT_TABLE
  WHERE    YRS_SERVED = (SELECT    MAX(YRS_SERVED)
                           FROM    PRESIDENT_TABLE)





*Using Main SELECT
The main SELECT of a subquery is like any regular SELECT, except that the right operand of one of its search conditions is a SUBSELECT instead of a value or a column name.

Note: When you use GROUP BY, the right operand of the HAVING clause can also be the subselect.


*Looking at an Example
Here's an example showing a fairly complex main SELECT.
SELECT     NAME, SALARY, COMM, SALARY+COMM
  FROM     STAFF



                       FROM   STAFF)
  ORDER BY SALARY DESC
  • This query will output information about clerks from department 10 whose salary is above the average salary for the company.


*Another Example
Here's an example showing a fairly complex main SELECT.
SELECT     NAME, SALARY, COMM, SALARY+COMM
  FROM     STAFF
  WHERE    JOB = 'CLERK'
    AND    DEPT = 10
    AND    SALARY > (SELECT   AVG(SALARY)
                       FROM   STAFF)

  • The output is to be in descending order by SALARY.


*Limiting Values
Here's an example showing a fairly complex main SELECT.
SELECT     NAME, SALARY, COMM, SALARY+COMM
  FROM     STAFF
  WHERE    JOB = 'CLERK'
    AND    DEPT = 10
    AND    SALARY >

  ORDER BY SALARY DESC
  • There are no limits on the values that can be selected — though you cannot output AVG(SALARY), which was selected by the subselect.


*Using the WHERE Clause
Here's an example showing a fairly complex main SELECT.
SELECT     NAME, SALARY, COMM, SALARY+COMM
  FROM     STAFF


(SELECT   AVG(SALARY)
                       FROM   STAFF)
  ORDER BY SALARY DESC
  • There can be several conditions in the WHERE clause.


Note: When more than one row is returned by the subselect, the standard relational operations (e.g., = or < ) may not be used. More about that in the next objective.


*Using the ORDER BY Clause
Here's an example showing a fairly complex main SELECT.
SELECT     NAME, SALARY, COMM, SALARY+COMM
  FROM     STAFF
  WHERE    JOB = 'CLERK'
    AND    DEPT = 10
    AND    SALARY > (SELECT   AVG(SALARY)
                       FROM   STAFF)

  • The ORDER BY clause comes after the SUBSELECT, but it refers to the columns selected by the main SELECT.





*Running the Subselect
First the subselect is run to find the lowest salary earned by a manager.
Then the main query is run to find out which non-management employees (WHERE JOB <> 'MGR') have a salary greater than the value returned by the subselect — AND SALARY > (SELECT MIN(SALARY). SELECT     *
  FROM     STAFF
  WHERE    JOB <> 'MGR'
    AND    SALARY > (SELECT    MIN(SALARY)
                         FROM    STAFF
                         WHERE   JOB = 'MGR')
  ORDER BY SALARY DESC

*Modifying the Query
Which presidents served longer than the average?
You already ran a query to select the president who served longest.
You can modify that query to find all the presidents who served longer than average.




*Remember
You saw in the previous query that you could use the > as well as = in the main SELECT to compare with the value returned by the subselect.
WHERE  YRS_SERVED > (SELECT  AVG(YRS_SERVED)
When a single value is returned by the subselect, you can use any operators in your main SELECT EXCEPT:
  • BETWEEN
  • LIKE
Let's try one more problem before going on to a subquery that involves two tables.




*Using a Subquery with Multiple Tables
If the SELECT just provides a value to be used in a WHERE clause, it doesn't matter which table it got that value from. Therefore, you can use a subquery with more than one table.
Let's look at an example.

*Posing a Question
How much does the manager who heads up the Home Office earn?
ORG has the employee ID of the department heads, while STAFF contains salary information.
SELECT     ID, NAME, SALARY
  FROM     STAFF
  WHERE    ID = (SELECT   MANAGER
                   FROM   ORG
                   WHERE  DEPTNAME = 'HEAD OFFICE')
The MANAGER value returned by the subselect is the employee ID of the Head Office Manager.

*Coding a Subselect
When coding the subselect...
  • Only one column, either aggregated or non-aggregated, can be selected by the subselect.
  • The value or values in that column are used by a main query as part of its search condition.



Topic 7.2: Subqueries Returning Several Rows

*Returning More Than One Value
Which presidents were born in the original 13 states? A subquery can be used to answer this question, but the subselect must return 13 values instead of just one.
How do you handle the WHERE clause of the main SELECT to compare 13 values? That is the only difference between this query and those you saw in the previous objective.
This section covers the use of the relational operators ALL, ANY, SOME and IN.

*Starting with a Query
Here's the query that lists all presidents born in the original thirteen states.
SELECT     PRES_NAME
  FROM     PRESIDENT_TABLE
  WHERE    STATE_BORN IN (SELECT     STATE_NAME
                              FROM      STATE_TABLE
                              WHERE     YEAR_ENTERED = 1776)
Think about what makes this subquery different from the others you have seen.

*Viewing the Results
When the SUBSELECT runs, it returns 13 rows.
(SELECT     STATE_NAME    
   FROM      STATE_TABLE     
   WHERE     YEAR_ENTERED = 1776)
The main query must now be able to compare each president's STATE_BORN with all thirteen values in the list. If it matches any one of them, then that row is selected.

*Using the IN Operator
The MAIN QUERY must check for a value included in this list. The IN operator can be used.
SELECT     PRES_NAME      
  FROM      PRESIDENT_TABLE
  WHERE     STATE_BORN IN   
The STATE_BORN value is compared with all the values in the list. If it matches any one of them, then that row is selected.

*An Alternative Method
Another way this query can be entered is
SELECT     PRES_NAME
  FROM     PRESIDENT_TABLE
  WHERE    STATE_BORN = ANY (SELECT      STATE_NAME
                               FROM      STATE_TABLE
                               WHERE     YEAR_ENTERED = 1776)
= ANY or = SOME can be used with subselects to mean the same as IN.
The query asks: Is STATE_BORN equal to one of the values in the list of values selected by the subquery? If so, the row is selected.




*Selecting Specific Rows
But you won't always want to check that a value is included in the list. Sometimes you may want to select rows whose value is greater or less than the values in the list. Did any salespeople earn less than clerks? Here are two queries that answer that question.
SELECT     NAME, SALARY
  FROM     STAFF
  WHERE    JOB = 'SALES'
     AND      SALARY < ANY (SELECT      SALARY
                               FROM      STAFF
                               WHERE     JOB = 'CLERK')
This first version lists all salespeople who earn less than any one or more clerks. Let's see how SQL calculates this.

*Using ANY
The condition stated:
WHERE SALARY < ANY (SELECT  SALARY
The first value, 30000.00, is compared with the subquery's values. It is not selected because it is too high.

ANY ONE
The condition stated:
WHERE SALARY < ANY (SELECT  SALARY
The second value is compared. To be selected, it only needs to be lower than ANY ONE of the values in the list. Therefore, it is selected.
Then the third value is also selected, since it is less than at least one of the values returned by the subselect.

*Using ALL
The previous version of the query listed all salespeople whose salary was less than ANY ONE of the clerks. But what if you want to know if any salesperson earns less than ALL of the clerks? — Instead of < ANY, you use < ALL.
SELECT     NAME, SALARY
  FROM     STAFF
  WHERE    JOB = 'SALES'
    AND      SALARY < ALL (SELECT      SALARY
                             FROM      STAFF
                             WHERE     JOB = 'CLERK')
Now let's see how SQL compares the results.

*Comparing the Results
The condition stated:
WHERE SALARY < ALL (SELECT  SALARY
The first value, 30000.00, is compared with the subquery's values. It is not selected because it is too high.

*Using ALL
The condition stated:
WHERE SALARY < ALL (SELECT  SALARY
The third value will not be selected either. It is not less than ALL the values from the subselect.





Topic 7.3: Workshop

*Getting Started with a Workshop
This workshop tests your understanding of subqueries. Instead of a single problem, this workshop presents a series of questions that require you to:
  • Identify appropriate applications for subqueries
  • Correct invalid subqueries
  • Complete partially written queries involving subselects
  • Show you understand when to use = ANY, = ALL and IN in a WHERE clause
All the problems involve STAFF, ORG and APPLICANT.





Topic 7.4: Unit 7 Summary

The following points were covered in this unit:
  • A subquery is required whenever you need to compare a column value from individual rows with an aggregated value.
  • Rules for coding a subselect are the subselect replaces the right operand of the main SELECT's WHERE clause, the whole subselect must be enclosed in parentheses, a SUBSELECT can select only one column, and do not use ORDER BY in the subselect.
  • The main SELECT of a subquery is like any regular SELECT, except that the right operand of one of its search conditions is a SUBSELECT instead of a value or a column name.
  • When a single value is returned by the subselect, you can use any operators in your main SELECT except BETWEEN and LIKE.
  • If the SELECT just provides a value to be used in a WHERE clause, it doesn't matter which table it got that value from. Therefore, you can use a subquery with more than one table.
  • When a subselect can select more than one row, the main concern is the relational operator comparing the values of the subselect.
  • When checking for equality, you can use IN, = ANY, or = SOME.
  • When using other operators to check for greater or less than the values in the list, you must add ALL, ANY, or SOME.