SQL: Joining Tables
Unit 1. SQL Overview
In this course you will be presented with introductory information that is helpful while you're learning SQL — IBM's Structured Query Language.
You briefly go over concepts about relational databases in this section. In the next section, you see the data used in this course. The final section in this unit covers general topics about SQL.
After completing this unit, you should be able to:
- Understand relational concepts
- Use course tables
- Comprehend SQL concepts
Topic 1.1: Relational Concepts
*Storing Information in TablesIn a relational database model, the data is stored in one or more tables.
Each table consists of a specific number of columns and some number of unordered rows.
One row of a sample table named STUDENT is displayed above.
STUDENTNO FIRSTNAME MIDINIT LASTNAME MAJOR SEX
————————— ———————————— ——————— ——————————————— —————————— ———
S10 GERALD A ATTRIX MATH M
S20 PHYLISS A LUTZ PHYSICS F
S30 DELBERT A SPRAY MATH M
S40 JAMES E SHUE M
S50 BEVERLY A BOGGS HISTORY F
S60 DAVE I HICKS MATH M
S70 JACKIE I RICTER BUSINESS F
S80 MARGE E GILHOULY F
S90 IRMA A SPRAY HOME ECON F
*Viewing a Complete Table
Above is the complete table. As you can see, the table consists of six columns: STUDENTNO, FIRSTNAME, MIDINIT, LASTNAME, MAJOR and SEX.
It also consists of nine rows, one for each student.
The next section goes over this table and the others used in this course.
*Columns and Rows
Columns of a table have the following characteristics:
- Appear vertically when displayed on a report
- Contain the same kind of data
- Have names which are used as column titles when they are displayed on a report
- Appear horizontally when displayed on a report
- Contain one or more columns of data
- Unlike columns, they have no names
*How Large Can a Table Be?
The tables you work with in this course are relatively small for ease of use and reference. However, this does not mean you are limited to working with small tables.
A table can be virtually any number of rows with one or more columns.
There is a limitation, but it is unlikely to be reached under normal conditions.
A table with zero rows is called an empty table.
*Relational Databases
Relational databases are based on a model formulated by Dr. E. F. Codd of IBM in 1968. He used very formal terms to describe the concepts you just saw. Here are the equivalent terms he used:
Formal Relational Term | Informal Equivalents |
---|---|
relation | table |
tuple | row or record |
attribute | column or field |
So you can see where the terms relational model and relational database come from. Tables were called relations.
In this course, you see the less formal terms — table, row and column.
Topic 1.2: Course Tables
*TablesThroughout this course several tables are used for examples and exercises. This section describes each and shows the table entries.
Throughout this course, the tables may be modified slightly to illustrate a point. Also, due to the size of the screen, only portions are displayed at times.
ORG Table
The ORG table contains data that identifies each department within an organization. Each row describes one department. The columns are
DEPTNUMB—department number
DEPTNAME—department name
MANAGER—identification number
DIVISION—section identification
LOCATION—city
DEPTNUMB | DEPTNAME | MANAGER | DIVISION | LOCATION |
---|---|---|---|---|
10 | HEAD OFFICE | 160 | CORPORATE | NEW YORK |
15 | NEW –GLAND | 50 | EASTERN | BOSTON |
20 | MID ATLANTIC | 10 | EASTERN | WASHINGTON |
38 | SOUTH ATLANTIC | 30 | EASTERN | ATLANTA |
42 | GREAT LAKES | 100 | MIDWEST | CHICAGO |
51 | PLAINS | 140 | MIDWEST | DALLAS |
66 | PACIFIC | 270 | WESTERN | SAN FRANCISCO |
84 | MOUNTAIN | 290 | WESTERN | DENVER |
APPLICANT Table
The APPLICANT table displays data on job applicants. Each row depicts one applicant. The columns are
TEMPID temporary identification number
NAME last name
ADDRESS city, state
EDLEVEL education level
COMMENTS comments about applicant
TEMPID | NAME | ADDRESS | EDLEVEL | COMMENTS |
---|---|---|---|---|
400 | FROMMHERZ | SAN JOSE, CA | 12 | NO SALES EXPERIENCE |
410 | JACOBS | POUGHKEEPSIE, NY | 16 | GOOD CANDIDATE FOR |
420 | MONTEZ | DALLAS, TX | 13 | OFFER SALES POSITION |
430 | RICHOWSKI | TUCSON, AZ | 14 | CAN'T START WORK UNT |
440 | REID | ENDICOTT, NY | 14 | 1 YEAR SALES EXPERIE |
450 | JEFFREYS | PHILADELPHIA, PA | 12 | GOOD CLERICAL BACKGR |
460 | STANLEY | CHICAGO, IL | 11 | WANTS PARTIME JOB |
470 | CASALS | PALO ALTO, CA | 14 | EXPERIENCED SALESMA |
480 | LEEDS | EAST FISHKILL, NY | 12 | NEEDS INTERVIEW WIT |
490 | GASPARD | PARIS, TX | 16 | WORKED HERE FROM |
STAFF Table
The STAFF table holds information related to staff members. Each row identifies one person on the staff. The columns are
ID identification number
NAME last name
DEPT department number
JOB job title/area
YEARS years on staff
SALARY salary amount
COMM commission amount
ID | NAME | DEPT | JOB | YEARS | SALARY | COMM |
---|---|---|---|---|---|---|
10 | SANDERS | 20 | MGR | 7 | 18357.50 | - |
20 | PERNAL | 20 | SALES | 8 | 18171.25 | 612.45 |
30 | MARENGHI | 38 | MGR | 5 | 17505.75 | - |
40 | O'BRIEN | 38 | SALES | 6 | 18006.00 | 846.55 |
50 | HANES | 15 | MGR | 10 | 20659.80 | - |
60 | QUIGLEY | 38 | SALES | - | 16808.30 | 650.2511 |
70 | ROTHMAN | 15 | SALES | 7 | 16502.83 | 52.00 |
80 | JAMES | 20 | CLERK | - | 13504.60 | 128.2013 |
90 | KOONITZ | 42 | SALES | 6 | 18001.75 | 86.70 |
100 | PLOTZ | 42 | MGR | 7 | 18352.80 | - |
110 | NGAN | 15 | CLERK | 5 | 12508.20 | 206.60 |
120 | NAUGHTON | 38 | CLERK | - | 12954.75 | 180.00 |
130 | YAMAGUCHI | 42 | CLERK | 6 | 10505.90 | 75.60 |
140 | FRAYE | 51 | MGR | 6 | 21150.00 | - |
150 | WILLIAMS | 51 | SALES | 6 | 19456.50 | 637.65 |
160 | MOLINARE | 10 | MGR | 7 | 22959.20 | - |
170 | KERMISCH | 15 | CLERK | 4 | 12258.50 | 110.10 |
180 | ABRAHAMS | 38 | CLERK | 3 | 12009.75 | 236.50 |
190 | SNEIDER | 20 | CLERK | 8 | 14252.75 | 126.50 |
200 | SCOUTTEN | 42 | CLERK | - | 11508.60 | 84.20 |
210 | LU | 10 | MGR | 10 | 20010.00 | - |
220 | SMITH | 51 | SALES | 7 | 17654.50 | 992.80 |
230 | LUNDQUIST | 51 | CLERK | 3 | 13369.80 | 189.65 |
240 | DANIELS | 10 | MGR | 5 | 19260.25 | - |
250 | WHEELER | 51 | CLERK | 6 | 14460.00 | 513.30 |
260 | JONES | 10 | MGR | 12 | 21234.00 | - |
270 | LEA | 66 | MGR | 9 | 18555.50 | - |
280 | WILSON | 66 | SALES | 9 | 18674.50 | 811.50 |
290 | QUILL | 84 | MGR | 10 | 19818.00 | - |
300 | DAVIS | 84 | SALES | 5 | 15454.50 | 806.10 |
310 | GRAHAM | 66 | SALES | 13 | 21000.00 | 200.30 |
320 | GONZALES | 66 | SALES | 4 | 16858.20 | 844.00 |
330 | BURKE | 66 | CLERK | 1 | 10988.00 | 55.50 |
340 | EDWARDS | 84 | SALES | 7 | 17844.00 | 1285.00 |
350 | GAFNEY | 84 | CLERK | 5 | 13030.50 | 188.00 |
PRESIDENT_TABLE
PRESIDENT_TABLE contains data about the presidents of the U.S. Each row describes one of the presidents. The columns are
PRES_NAME last name, initial
BIRTH_YR year born
YRS_SERVED years served
DEATH_AGE age at death
PARTY political party
STATE_BORN state where born
PRES_NAME | BIRTH_YR | YRS_SERVED | DEATH_AGE | PARTY | STATE_BORN |
---|---|---|---|---|---|
WASHINGTON G | 1732 | 7 | 67 | FEDERALIST | VIRGINIA |
ADAMS J | 1735 | 4 | 90 | FEDERALIST | MASSACHUSETTS |
JEFFERSON T | 1743 | 8 | 83 | DEMO-REP | VIRGINIA |
MADISON J | 1751 | 8 | 85 | DEMO-REP | VIRGINIA |
MONROE J | 1758 | 8 | 73 | DEMO-REP | VIRGINIA |
ADAMS J Q | 1767 | 4 | 80 | DEMO-REP | MASSACHUSETTS |
JACKSON A | 1767 | 8 | 78 | DEMOCRATIC | SOUTH CAROLINA |
VAN BUREN M | 1782 | 4 | 79 | DEMOCRATIC | NEW YORK |
HARRISON W H | 1773 | 0 | 68 | WHIG | VIRGINIA |
TYLER J | 1790 | 3 | 71 | WHIG | VIRGINIA |
POLK J K | 1795 | 4 | 53 | DEMOCRATIC | NORTH CAROLINA |
TAYLOR Z | 1784 | 1 | 65 | WHIG | VIRGINIA |
FILLMORE M | 1800 | 2 | 74 | WHIG | NEW YORK |
PIERCE F | 1801 | 4 | 64 | DEMOCRATIC | NEW HAMPSHIRE |
BUCHANAN J | 1791 | 4 | 77 | DEMOCRATIC | PENNSYLVANIA |
LINCOLN A | 1809 | 4 | 56 | REPUBLICAN | KENTUCKY |
JOHNSON A | 1808 | 3 | 66 | DEMOCRATIC | NORTH CAROLINA |
GRANT U S | 1822 | 8 | 63 | REPUBLICAN | OHIO |
HAYES R B | 1822 | 4 | 70 | REPUBLICAN | OHIO |
GARFIELD J A | 1831 | 0 | 49 | REPUBLICAN | OHIO |
ARTHUR C A | 1830 | 3 | 56 | REPUBLICAN | VERMONT |
CLEVELAND G | 1837 | 8 | 71 | DEMOCRATIC | NEW JERSEY |
HARRISON B | 1833 | 4 | 67 | REPUBLICAN | OHIO |
MCKINLEY W | 1843 | 4 | 58 | REPUBLICAN | OHIO |
ROOSEVELT T | 1858 | 7 | 60 | REPUBLICAN | NEW YORK |
TAFT W H | 1857 | 4 | 72 | REPUBLICAN | OHIO |
WILSON W | 1856 | 8 | 67 | DEMOCRATIC | VIRGINIA |
HARDING W G | 1865 | 2 | 57 | REPUBLICAN | OHIO |
COOLIDGE C | 1872 | 5 | 60 | REPUBLICAN | VERMONT |
HOOVER H C | 1871 | 4 | 90 | REPUBLICAN | IOWA |
ROOSEVELT F D | 1882 | 12 | 63 | DEMOCRATIC | NEW YORK |
TRUMAN H S | 1884 | 7 | 88 | DEMOCRATIC | MISSOURI |
EISENHOWER D D | 1890 | 8 | 79 | REPUBLICAN | TEXAS |
KENNEDY J F | 1917 | 2 | 46 | DEMOCRATIC | MASSACHUSETTS |
JOHNSON L B | 1908 | 5 | 65 | DEMOCRATIC | TEXAS |
NIXON R M | 1913 | 5 | 81 | REPUBLICAN | CALIFORNIA |
FORD G R | 1913 | 2 | - | REPUBLICAN | NEBRASKA |
CARTER J E | 1924 | 4 | - | DEMOCRATIC | GEORGIA |
REAGAN R | 1911 | 8 | - | REPUBLICAN | ILLINOIS |
BUSH G H W | 1924 | 4 | - | REPUBLICAN | MASSACHUSETTS |
CLINTON W J | 1946 | 8 | - | DEMOCRATIC | ARKANSAS |
PRES_MARRIAGE
The PRES_MARRIAGE table lists information about the marriage(s) of each president. Every row represents one marriage. The columns are
PRES_NAME president's last name, initial
SPOUSE_NAME spouses last name, initial
PR_AGE president's age at marriage
SP_AGE spouse's age at marriage
NR_CHILDREN number of children
MARRIAGE-YR year when married
PRES_NAME | SPOUSE_NAME | PR_AGE | SP_AGE | NR_CHILDREN | MARRIAGE_YEAR |
---|---|---|---|---|---|
WASHINGTON G | CUSTIS M D | 26 | 27 | 0 | 1759 |
ADAMS J | SMITH A | 28 | 19 | 5 | 1764 |
JEFFERSON T | SKELTON M W | 28 | 23 | 6 | 1772 |
MADISON J | TODD D D P | 43 | 26 | 0 | 1794 |
MONROE J | KORTRIGHT E | 27 | 17 | 3 | 1786 |
ADAMS J Q | JOHNSON L C | 30 | 22 | 4 | 1797 |
JACKSON A | ROBARDS R D | 26 | 26 | 0 | 1794 |
VAN BUREN M | HOES H | 24 | 23 | 4 | 1807 |
HARRISON W H | SYMMOS A T | 22 | 20 | 10 | 1795 |
TYLER J | CHRISTIAN L | 23 | 22 | 8 | 1813 |
TYLER J | GARDINER J | 54 | 24 | 7 | 1844 |
POLK J K | CHILDROSS S | 28 | 20 | 0 | 1824 |
TAYLOR Z | SMITH M M | 25 | 21 | 6 | 1810 |
FILLMORE M | POWERS A | 26 | 27 | 2 | 1826 |
FILLMORE M | MCINTOSH C C | 58 | 44 | 0 | 1858 |
PIERCE F | APPLETON J M | 29 | 28 | 3 | 1834 |
LINCOLN A | TODD M | 33 | 23 | 4 | 1842 |
JOHNSON A | MCCARDLE E | 18 | 16 | 5 | 1827 |
GRANT U S | DENT J B | 26 | 22 | 4 | 1848 |
HAYES R B | WEBB L W | 30 | 21 | 8 | 1852 |
GARFIELD J A | RUDOLPH L | 26 | 26 | 7 | 1858 |
ARTHUR C A | HORNDON E L | 29 | 22 | 3 | 1859 |
CLEVELAND G | FOLSON F | 49 | 21 | 5 | 1886 |
HARRISON B | SCOTT C L | 20 | 21 | 2 | 1853 |
HARRISON B | DIMMICK M S L | 62 | 37 | 1 | 1896 |
MCKINLEY W | SAXTON I | 27 | 23 | 2 | 1871 |
ROOSEVELT T | LEE A H | 22 | 19 | 1 | 1880 |
ROOSEVELT T | CARROW E K | 28 | 25 | 5 | 1886 |
TAFT W H | HORRON H | 28 | 25 | 3 | 1886 |
WILSON W | AXSON E L | 28 | 25 | 3 | 1885 |
WILSON W | GALT E B | 58 | 43 | 0 | 1915 |
HARDING W G | DE WOLFE F K | 25 | 30 | 0 | 1891 |
COOLIDGE C | GOODHUE G A | 33 | 26 | 2 | 1905 |
HOOVER H C | HENRY L | 24 | 23 | 2 | 1899 |
ROOSEVELT F D | ROOSEVELT A E | 23 | 20 | 6 | 1905 |
TRUMAN H S | WALLACE E V | 35 | 34 | 1 | 1918 |
EISENHOWER D D | DOUD G | 25 | 19 | 2 | 1916 |
KENNEDY J F | BOUVIER J L | 36 | 24 | 2 | 1953 |
JOHNSON L B | TAYLOR C A | 26 | 21 | 2 | 1934 |
NIXON R M | RYAN T C | 27 | 28 | 2 | 1940 |
FORD G R | WARREN E B | 35 | 30 | 4 | 1948 |
CARTER J E | SMITH R | 21 | 18 | 4 | 1946 |
REAGAN R | WYMAN J | 28 | 25 | 2 | 1940 |
REAGAN R | DAVIS N | 41 | 28 | 2 | 1952 |
BUSH G H W | PIERCE B | 20 | 19 | 6 | 1945 |
CLINTON W J | RODHAM H D | 28 | 27 | 1 | 1975 |
PRES_HOBBY
The PRES_HOBBY table displays data about a president's hobbies. One row identifies one hobby. Multiple rows can occur for one president. The columns are
PRES_NAME last name, initial
HOBBY hobby
PRES_NAME | HOBBY |
---|---|
ADAMS J Q | BILLARDS |
ADAMS J Q | SWIMMING |
ADAMS J Q | WALKING |
ARTHUR C A | FISHING |
BUSH G W | BASEBALL |
BUSH G W | SKYDIVING |
CLEVELAND G | FISHING |
CLINTON W J | JOGGING |
CLINTON W J | SAXOPHONE |
COOLIDGE C | FISHING |
COOLIDGE C | GOLF |
COOLIDGE C | INDIAN CLUBS |
COOLIDGE C | MECHANICAL HORSE |
COOLIDGE C | PITCHING HAY |
EISENHOWER D D | BRIDGE |
EISENHOWER D D | GOLF |
EISENHOWER D D | HUNTING |
EISENHOWER D D | PAINTING |
EISENHOWER D D | FISHING |
GARFIELD J A | BILLIARDS |
HARDING W G | GOLF |
HARDING W G | POKER |
HARDING W G | RIDING |
HARRISON B | HUNTING |
HAYES R B | CROQUET |
HAYES R B | DRIVING |
HAYES R B | SHOOTING |
HOOVER H C | FISHING |
HOOVER H C | MEDICINE BALL |
JACKSON A | RIDING |
JEFFERSON T | FISHING |
JEFFERSON T | SAILING |
JEFFERSON T | RIDING |
JOHNSON L B | RIDING |
KENNEDY J F | SAILING |
KENNEDY J F | SWIMMING |
KENNEDY J F | TOUCH FOOTBALL |
LINCOLN A | WALKING |
MCKINLEY W | RIDING |
MCKINLEY W | SWIMMING |
MCKINLEY W | WALKING |
NIXON R M | GOLF |
REAGAN R | RIDING |
ROOSEVELT F D | FISHING |
ROOSEVELT F D | SAILING |
ROOSEVELT F D | SWIMMING |
ROOSEVELT T | BOXING |
ROOSEVELT T | HUNTING |
ROOSEVELT T | JUJITSU |
ROOSEVELT T | GOLF |
ROOSEVELT T | RIDING |
ROOSEVELT T | WRESTLING |
TAFT W H | GOLF |
TAFT W H | RIDING |
TAYLOR Z | RIDING |
TRUMAN H S | FISHING |
TRUMAN H S | POKER |
TRUMAN H S | WALKING |
VAN BUREN M | RIDING |
WASHINGTON G | FISHING |
WASHINGTON G | RIDING |
WILSON W | GOLF |
WILSON W | RIDING |
WILSON W | WALKING |
STATE_TABLE
STATE_TABLE lists information about when each state entered the Union. Each row depicts one state. The columns are
STATE_NAME full name
ADMIN_ENTERED presidential administration
YEAR_ENTERED year
STATE_NAME | ADMIN_ENTERED | YEAR_ENTERED |
---|---|---|
MASSACHUSETTS | - | 1776 |
PENNSYLVANIA | - | 1776 |
VIRGINIA | - | 1776 |
CONNECTICUT | - | 1776 |
SOUTH CAROLINA | - | 1776 |
MARYLAND | - | 1776 |
NEW JERSEY | - | 1776 |
GEORGIA | - | 1776 |
NEW HAMPSHIRE | - | 1776 |
DELAWARE | - | 1776 |
NEW YORK | - | 1776 |
NORTH CAROLINA | - | 1776 |
RHODE ISLAND | - | 1776 |
VERMONT | 1 | 1791 |
KENTUCKY | 1 | 1792 |
TENNESSEE | 2 | 1796 |
OHIO | 4 | 1803 |
LOUISIANA | 6 | 1812 |
INDIANA | 7 | 1816 |
MISSISSIPPI | 8 | 1817 |
ILLINOIS | 8 | 1818 |
ALABAMA | 8 | 1819 |
MAINE | 8 | 1820 |
MISSOURI | 9 | 1821 |
ARKANSAS | 12 | 1836 |
MICHIGAN | 12 | 1837 |
FLORIDA | 14 | 1845 |
TEXAS | 15 | 1845 |
IOWA | 15 | 1846 |
WISCONSIN | 15 | 1848 |
CALIFORNIA | 16 | 1850 |
MINNESOTA | 18 | 1858 |
OREGON | 18 | 1859 |
KANSAS | 18 | 1861 |
WEST VIRGINIA | 19 | 1863 |
NEVADA | 19 | 1864 |
NEBRASKA | 20 | 1867 |
COLORADO | 22 | 1876 |
NORTH DAKOTA | 26 | 1889 |
SOUTH DAKOTA | 26 | 1889 |
MONTANA | 26 | 1889 |
WASHINGTON | 26 | 1889 |
IDAHO | 26 | 1890 |
WYOMING | 26 | 1890 |
UTAH | 27 | 1896 |
OKLAHOMA | 30 | 1907 |
NEW MEXICO | 31 | 1912 |
ARIZONA | 31 | 1912 |
ALASKA | 43 | 1959 |
HAWAII | 43 | 1959 |
ELECTION Table
The ELECTION table contains data about recent elections. Each row identifies the results for one candidate in the election. The columns are
ELECTION_YEAR year of the election
CANDIDATE last name, initial
VOTES number of votes received
WINNER_LOSER_INDIC won/lost indicator
ELECTION_YEAR | CANDIDATE | VOTES | WINNER_LOSER_INDIC |
---|---|---|---|
1960 | NIXON R M | 219 | L |
1960 | KENNEDY J F | 303 | W |
1960 | BYRD | 15 | L |
1964 | GOLDWATER B | 52 | L |
1964 | JOHNSON L B | 486 | W |
1968 | NIXON R M | 302 | W |
1968 | HUMPHREY H H | 191 | L |
1968 | WALLACE G C | 46 | L |
1972 | MCGOVERN G S | 17 | L |
1972 | NIXON R M | 520 | W |
1972 | HOSPERS J | 1 | L |
1976 | CARTER J E | 297 | W |
1976 | FORD G R | 240 | L |
1980 | CARTER J E | 49 | L |
1980 | REAGAN R | 489 | W |
1984 | REAGAN R | 523 | W |
1984 | MONDALE W F | 13 | L |
1988 | BUSH G H W | 426 | W |
1988 | DUKAKIS M S | 111 | L |
1992 | BUSH G H W | 168 | L |
1992 | CLINTON W J | 370 | W |
1996 | CLINTON W J | 379 | W |
1996 | DOLE R | 159 | L |
STUDENT Table
The STUDENT table contains student information. Each row describes one student. The columns are
STUDENTNO student number
FIRSTNAME first name
MIDINIT middle initial
LASTNAME last name
MAJOR major subject
SEX M or F
STUDENTNO | FIRSTNAME | MIDINIT | LASTNAME | MAJOR | SEX |
---|---|---|---|---|---|
S10 | GERALD | A | ATTRIX | MATH | M |
S20 | PHYLISS | A | LUTZ | PHYSICS | F |
S30 | DELBERT | A | SPRAY | MATH | M |
S40 | JAMES | E | SHUE | M | |
S50 | BEVERLY | A | BOGGS | HISTORY | F |
S60 | DAVE | I | HICKS | MATH | M |
S70 | JACKIE | I | RICTER | BUSINESS | F |
S80 | MARGE | E | GILHOULY | F | |
S90 | IRMA | A | SPRAY | HOME ECON | F |
Topic 1.3: SQL Concepts
*Introducing SQL ConceptsSQL is a high-level language that allows you to access and maintain data stored in relational databases, such as DB2 and SQL/DS. Products like FOCUS, SAS and Oracle, which create and maintain relational tables, also use SQL. SQL is one interface to products in these relational systems.
SQL is designed for programmers, system administrators and data processing professionals. Yet, it is especially nice for users who have little training in computer programming. With SQL you specify what you want rather than how to get it. You do not have to be concerned with how the data is stored, or how it is retrieved.
*Learning to Use SQL
SQL is rapidly becoming the industry standard for fourth generation query languages. SQL is easy to learn and easy to use.
Power and simplicity are both features of SQL. You can use single SQL statements to perform functions that require many lines of conventional code.
SQL has these three basic components:
- Data Definition Language (DDL)
- Data Manipulation Language (DML)
- Authorization Language
*Data Definition Language
DDL (Data Definition Language) is used to define new tables. The columns to be included in your table are defined using the CREATE statement. Here is a sample CREATE statement for a STUDENT table.
CREATE TABLE STUDENT
(STUDENTNO CHAR(3) NOT NULL,
FIRSTNAME VARCHAR(20),
MIDINIT CHAR(1),
LASTNAME VARCHAR(25) NOT NULL,
MAJOR CHAR(15),
SEX CHAR(1) NOT NULL)
As you can see, the table consists of 6 columns: STUDENTNO, FIRSTNAME, MIDINIT, LASTNAME, MAJOR and SEX.
*Using DDL Statements
DDL statements permit you to create, change and delete various components of the relational database. In fact, you can define and modify the database online without special utility programs.
Thus, you can interactively:
* create and manipulate tables or
* test SQL statements you plan on using in programs
and not affect the ongoing processes and data already established in your database.
*Data Manipulation Language
DML (Data Manipulation Language) has two basic functions:
- Creating reports
- Maintaining tables
Let's look at a sample report request.
*Requesting Reports
For example, the statement below requests a report of all columns and all rows for the table named STUDENT.
SELECT * FROM STUDENT
STUDENTNO | FIRSTNAME | MIDINIT | LASTNAME | MAJOR | SEX |
---|---|---|---|---|---|
S10 | GERALD | A | ATTRIX | MATH | M |
S20 | PHYLISS | A | LUTZ | PHYSICS | F |
S30 | DELBERT | A | SPRAY | MATH | M |
S40 | JAMES | E | SHUE | M | |
S50 | BEVERLY | A | BOGGS | HISTORY | F |
S60 | DAVE | I | HICKS | MATH | M |
S70 | JACKIE | I | RICTER | BUSINESS | F |
S80 | MARGE | E | GILHOULY | F | |
S90 | IRMA | A | SPRAY | HOME ECON | F |
SELECT Request
SELECT * is the simplest retrieval request. Other functions of the SELECT statement include:
- Limit the request to specific columns.
- Limit the request to specific rows.
- Order the rows presented on the request.
- Develop new column values through calculations.
- Accumulate numeric column values within specific groupings.
- Combine data from two or more tables through a common key.
*Maintaining Tables
The other function of DML is to maintain tables.
Here's a simple request to create a new row in the STUDENT table.
INSERT INTO STUDENT
VALUES ('S02', 'RALPH', 'A', 'LARSON' , 'PHYSICS', 'M')
This request adds the new row containing the values listed in parentheses.
*Authorization Language
The function of the Authorization Language is to provide effective control over the database resources.
Security in relational databases like DB2 and SQL/DS can be summarized in one sentence.
Nothing can be done unless you have been given the authority.
Authority is controlled by the data administration area through SQL commands.
*Obtaining Authority
Although you need authority to do anything with the database, it is a relatively simple matter to obtain it.
Authority is given through the GRANT command. For example, the command below gives user PROG1 authority to create a table in the database named SKILLS.
GRANT CREATETAB ON DATABASE SKILLS TO PROG1
Authority may also be removed with the REVOKE command. Here's an example removing CREATE authority for tables in database SKILLS from PROG1.
REVOKE CREATETAB ON DATABASE SKILLS FROM PROG1
*Uses for SQL Statements
You can use SQL statements interactively. For example, via IBM's Query Management Facility (QMF) or an interactive facility provided with the relational database.
You can also use SQL statements embedded in application programs such as APL2, BASIC, COBOL, PL/I, FORTRAN, LISP or assembler language programs
These various implementations of SQL, however, are semantically and syntactically the same. What you learn here can be carried to ANY SQL-based database management system.
*Rules for Coding SQL Statements
When coding SQL statements, follow these syntax rules:
- Variables and keywords must be separated by one or more spaces.
- Parentheses are often required around a keyword's values. Commas must separate values in a list (e.g. column names, table names, data values).
- Spacing around commas, parentheses and operators (=,>,+,*,etc.) is for readability. Spaces are not required.
- Character strings are sequences of characters that begin and end with a string delimiter. A string delimiter is either ' (apostrophe) or " (quotes) depending on the host language and the character chosen when the database system is installed. Use the same character (' or ") for the beginning and ending delimiter.
INSERT INTO GRADE VALUES ('E01',0.000)
INSERT INTO GRADE VALUES('E01',0.000)
INSERT INTO GRADE VALUES ( 'E01', 0.000 )
*Looking at an Example
Above are three valid codings of the same statement showing the syntax rules.
Spaces are required between each keyword or variable (INSERT, INTO, GRADE, VALUES). In the VALUES portion, parentheses are needed around the values and commas must separate the values. The character string E01 is delimited by apostrophes. Spaces around parentheses and commas are optional.
*Following Conventions
In this course, you will see a few conventions followed for readability or emphasis. These conventions are
- Each clause is entered on a separate line, for example
SELECT * FROM ROSTER
WHERE STUDENTNO = 'S10'
rather than
SELECT * FROM ROSTER WHERE STUDENTNO = 'S10' - Information coded between the symbols < > is optional
- ... indicates information that repeats
Topic 1.4: Unit 1 Summary
The following points were covered in this unit:- Databases based on a relational model store data in one or more tables.
- Each table in a relational database consists of a specific number of columns and some number of unordered rows.
- Columns in a relational database appear vertically in a report, contain the same kind of data, and have names that are used as column titles.
- Rows in a relational database appear horizontally in a report, contain one or more columns of data, and do not have names.
- Formal terms to describe SQL concepts are: tables are named relations, columns are called attributes, and rows are known as tuples.
- Structured Query Language (SQL) is a high-level language that allows you to access and maintain data stored in relational databases, such as DB2 and SQL/DS.
- DDL (Data Definition Language) allows you to define new tables in the database. You can define and modify the database online without special utility programs.
- DML (Data Manipulation Language) is used to create reports and maintain tables.
- The Authorization Language provides control of database resources. Authority is controlled by the data administration area through SQL commands.
Unit 2. Combining Information From Two Tables
There may be times when the information you need is contained in two or more tables. To combine information from two tables into a single table, you use a technique called joining.
The objectives in this unit include the following topics:
- Joining two tables
- Types of joins
- Joining a table to itself
- Using other options, such as built-in functions and ORDER BY, in conjunction with a join
After completing this unit, you should be able to:
- Comprehend theory about joining and merging tables
Topic 2.1: Theory
*Joining and Merging TablesThis section introduces the situations in which you need to join or merge tables.
Examples are provided of situations where you need to join or merge information from two tables.
On completion of the section you should be ready to learn how to create a join or merge query — to be covered in the next three objectives.
*Using Your Address Book as an Example
Being an efficient individual, you have automated your personal address book. The format of a standard address book lends itself well to creating a separate table for each section:
- Addresses
- Birthdays
- Holiday card lists
*Birthday List
The first table is the birthday list, which contains only name and birthday.
Before you can mail cards, you also need to know addresses.
*Address Table
The address table contains the rest of the information you need. Therefore, you must combine these two tables.
*Combining Birthdays and Addresses
To produce a single table showing the names, birthdays and addresses of every friend with a January birthday, you have to combine information from the two tables.
To do this, there must be a column that is common to both tables.
*Finding a Common Match
These two tables have NAME in common. The system can match the row for JOHN BAKER in the BIRTHDAY table with the row for JOHN BAKER in the ADDRESS table.
*Viewing the Example
The information is combined in a single row of a temporary table created by the system.
*Combining Rows
All the rows from both tables are compared. For every NAME that appears on both tables, a row is created on the new table. Each row contains columns from both the original tables.
Any name that does not appear in both the original tables isn't included in this table, because of the type of join we chose.
Let's look now at the kinds of joins.
*Inner Joins and Outer Joins
There are two major kinds of joins — inner joins and outer joins.
When you code an inner join, you want the new table to include only rows where the values in the joined columns match.
In the example we just completed, we joined the tables using values in the NAME column. If a NAME appeared in one table but not the other, that row of information was excluded from the joined table.
*Beginning with Outer Join
However, you might want your new table to include all people in both tables, one row for each person.
To do this, you need to use an outer join. The outer join allows you to include the rows that don't have a match.
*Looking at the New Table
If we do a full outer join of the BIRTHDAY and ADDRESS tables, we get this result.
Any rows in the result tables that have missing values contain nulls. Note that the table includes all rows from both tables.
*Using Inner Joins to Create Tables
The table created by joining BIRTHDAY and ADDRESS with an inner join is like other tables: it can contain selected columns and rows (e.g., just for January birthdays) and you can order the data any way you want.
Let's look at an SQL query that could be used to join the BIRTHDAY and ADDRESS tables.
SELECT BIRTHDAY.NAME, BIRTH_DATE, STREET, TOWN, STATE, ZIP
FROM BIRTHDAY, ADDRESS
WHERE BIRTHDAY.NAME = ADDRESS.NAME
ORDER BY BIRTH_DATE
WHERE Clause
The WHERE clause specifies that the rows be matched based on the value in the NAME column of both tables.
The ORDER BY clause orders the output by birth date.
How to code a query to join tables is covered in detail in the next two objectives.
Now try a couple of questions to check your understanding of joining...
*Joining and Merging Tables
Joining tables involves matching the rows of one table with related rows of another table. You produce a table consisting of rows with columns from both the original tables.
Merging tables also involves combining data from two tables, but the rows are not joined together. The rows from each table remain separate in the resulting table.
*Merging Two Tables
Here is an example of merging two tables is the baseball league tables.
By merging the two tables, you create one long table with all the teams on it. Let's see how you merge these two tables.
*Step One
First you select the data you want from the first table.
SELECT *
FROM AMERICAN_LEAGUE
*Step Two
Then you join the first query to the second with the keyword UNION.
SELECT *
FROM AMERICAN_LEAGUE
UNION
*Step Three
Then you select the data from the second table.
SELECT *
FROM AMERICAN_LEAGUE
UNION
SELECT *
FROM NATIONAL_LEAGUE
*Step Four
Then you can order the result by any column you choose.
SELECT *
FROM AMERICAN_LEAGUE
UNION
SELECT *
FROM NATIONAL_LEAGUE
ORDER BY 1
*Viewing the Results
The resulting table, in this example, contains all the rows from both tables in ascending order by team, assuming TEAM is the first column in the table.
*Restrictions for Merging Tables
There are several restrictions on when two tables can be merged.
These restrictions, and all rules for writing a query to merge two tables, are discussed in a later objective.
*Reviewing What You Have Learned
To summarize...
Joining is where two rows from two tables that have the same value in a common column are joined to create a single row.
*Reminder
To summarize...
Merging is mixing the rows from two tables, creating a table consisting of rows from both tables.
Topic 2.2: Unit 2 Summary
The following points were covered in this unit:- There may be times when the information you need is contained in two or more tables. To combine information from two tables into a single table, you use techniques called joining and merging.
- Joining tables involves matching the rows of one table with related rows of another table. You produce a table consisting of rows with columns from both the original tables.
- Inner joins include only rows where the values in the joined columns match. Outer joins allow you to include rows that don't have a match.
- Merging tables also involves combining data from two tables, but the rows are not joined together. The rows from each table remain separate in the resulting table.
Unit 3. Inner Joins
The theory of joining tables was discussed in the first unit. This unit teaches the details of how to join two tables with an inner join. Some of the key points discussed are:
- FROM clause format
- SELECT statement
- Specifying table names
- The joining condition
- Other WHERE clause options
- ORDER BY clause
- Alternatives to joining
After completing this unit, you should be able to:
- Successfully join tables
Topic 3.1: Joining
FROM Clause FormatAnother word for inner join is equi-join. Since any join operation is part of a FROM clause, let's look first at the FROM clause format.
The general format of the FROM clause for any join is
FROM table-expression-1 join-operation table-expression-2
ON join-condition
For example,
FROM PRESIDENT_TABLE INNER JOIN PRES_MARRIAGE
ON PRESIDENT_TABLE.PRES_NAME = PRES_MARRIAGE.PRES_NAME
Note: It's also valid to use just JOIN as the join operation for an inner join.
*Creating an Inner Join
For an inner join, you can optionally use a comma (,) to specify the join operation. When you do so, note that you use a WHERE clause to specify the condition.
FROM table-expression-1 join-operation table-expression-2
WHERE join-condition
Here's an example:
FROM PRESIDENT_TABLE, PRES_MARRIAGE
WHERE PRESIDENT_TABLE.PRES_NAME = PRES_MARRIAGE.PRES_NAME
Most of our examples will use this format since it's more widely in use.
*Viewing the Example
Columns from the following row in PRESIDENT_TABLE...
*Joining Columns in the Database
...are joined with columns from the following row of PRES_MARRIAGE...
*Producing a Single Output Line
...to produce a single output line.
*Exceptions
On the previous screen you saw that the two rows, one from each table, were joined side by side. But what happens in the case of Ronald Reagan, who appears once in the PRESIDENT_TABLE, but twice in the PRES_MARRIAGE table?
*Results
Two rows appear in the output table.
*Studying the Example
Here is the section of the created table for recent presidents.
*Coding to Join Tables
A properly coded query to join two tables, then, does the following:
- It combines related information from two tables into a single row.
- If a value appears once in the joining column of one table, but several times in the other, the higher number of occurrences appear in the output.
- If a value exists in the joining column of one table, but not in the other, no row appears in the output for that value.
*Using SELECT
You can select as many or as few of the columns in either table as you want. All of the following are valid SELECTs joining PRESIDENT_TABLE and PRES_MARRIAGE.
SELECT PRESIDENT_TABLE.PRES_NAME, SPOUSE_NAME, STATE_BORN
- PRES_NAME is the joining column found in both tables; the other two columns come from either table. Because PRES_NAME is in both tables, you must use a qualifier to indicate which table you want to select it from.
SELECT PRESIDENT_TABLE.PRES_NAME, MARRIAGE_YEAR, PARTY, NR_CHILDREN
- This is another mix of columns from both tables.
SELECT *
- All columns from both tables are selected, but the joining column -- the column the two tables have in common -- appears twice, once for each table.
FROM Keyword
The FROM keyword is followed by the two table names, separated by a comma or by the keyword expression INNER JOIN.
SELECT PRESIDENT_TABLE.PRES_NAME, SPOUSE_NAME, NR_CHILDREN, STATE_BORN
FROM
The order in which you enter the table names is not important, as long as they are both there.
Before you go on to the WHERE clause, let's look at another example of joining two tables.
ELECTION Table
A table called ELECTION contains four columns:
ELECTION_YEAR, CANDIDATE, VOTES, WINNER_LOSER_INDIC
By combining this information with the PRESIDENT_TABLE, we can answer the question: "Which election(s) did each president contest?"
The first question to answer is whether these two tables can be joined.
*Using a Joining Condition
You should include a joining condition in every query that joins two tables.
In an inner join, the joining condition ensures that only those presidents that appear in both tables are selected.
WHERE PRESIDENT_TABLE.PRES_NAME = PRES_MARRIAGE.PRES_NAME
But, even more importantly, it saves you from producing a very long table that contains all possible combinations of rows for the tables identified in the FROM clause.
Look first at what happens when the join is done correctly.
*Correct Join
If you join them correctly...
SELECT FIRST_TABLE.ITEM, PRICE, COLOR
FROM FIRST_TABLE, SECOND_TABLE
WHERE FIRST_TABLE.ITEM = SECOND_TABLE.ITEM
*Viewing the Output
... this output is produced.
SELECT FIRST_TABLE.ITEM, PRICE, COLOR
FROM FIRST_TABLE, SECOND_TABLE
WHERE FIRST_TABLE.ITEM = SECOND_TABLE.ITEM
*Don't Forget the Joining Condition!
Should you forget it, as shown here...
SELECT FIRST_TABLE.ITEM, PRICE, COLOR
FROM FIRST_TABLE, SECOND_TABLE
*What Will Happen Without Joining?
The first row of the first table was joined with every row from the second table. Then the second row in the first table was joined with every row from the second table.
And even though there weren't any entries for CRYSTAL in the second table, it was also joined with every row.
Imagine the length of the output if there are 1,000 rows in each table -- 1,000 * 1,000 rows. As well as being long, this data is totally invalid.
*Rules for the Joining Condition
You should be convinced of the importance of the joining condition. Let's go over the rules for coding the joining condition.
SELECT PRESIDENT_TABLE.PRES_NAME,
SPOUSE_NAME,NR_CHILDREN,STATE_BORN
FROM PRESIDENT_TABLE, PRES_MARRIAGE
WHERE
The joining condition for an inner join always checks that the values in the column common to both tables are equal.
The joining condition in this example means: join the information about Washington in the second table to the information about Washington in the first table. If a name appears in one table, but not the other, that row is not selected.
*Studying the Example
Notice that the table name, followed by a period, is put in front of each of the column names. This forces the system to compare PRES_NAME from the first table with
PRES_NAME in the second table.
Let's look at some of the data this query produces.
SELECT PRESIDENT_TABLE.PRES_NAME,
SPOUSE_NAME,NR_CHILDREN,STATE_BORN
FROM PRESIDENT_TABLE, PRES_MARRIAGE
WHERE
*Results of the Join
All recent presidents appear in the report because they all are in both tables. Reagan appears twice because he has two rows in the PRES_MARRIAGE table; the same information from the PRESIDENT_TABLE is joined to each of the rows from PRES_MARRIAGE.
*Limiting the Output
This time there are lots of duplicates because many presidents contested more than one election.
But what if we want to limit this output to showing the elections in which the presidents won, or just the Republican presidents?
That poses no great problem.
*Adding Search Conditions
In addition to the joining condition you can add other search conditions, which can pertain to either table.
For example, let's limit the election data to:
Living presidents - if PRESIDENT_TABLE.DEATH_AGE is NULL
Republicans - if PRESIDENT_TABLE.PARTY is 'REPUBLICAN'
Elections they won - if ELECTION.WINNER_LOSER_INDIC is 'W'
Let's look at how all these conditions are added to a joining query.
*Creating the Query
The query is
SELECT PRES_NAME, ELECTION, VOTES, WINNER_LOSER_INDIC
FROM PRESIDENT_TABLE, ELECTION WHERE PRES_NAME = CANDIDATE
AND DEATH_AGE IS NULL
AND PARTY = 'REPUBLICAN'
AND WINNER_LOSER_INDIC = 'W'
Note that the table name prefix is required only when the column names are identical. When the names are different, SQL knows where to find each column; SQL knows that DEATH_AGE only occurs in PRESIDENT_TABLE, for example.
Only if the same column name appears in both tables do you need to indicate which table is evaluated.
*Viewing the Output
The output from this query is much shorter.
SELECT PRES_NAME, ELECTION, VOTES, WINNER_LOSER_INDIC
FROM PRESIDENT_TABLE, ELECTION
WHERE PRES_NAME = CANDIDATE
AND DEATH_AGE IS NULL
AND PARTY = 'REPUBLICAN'
AND WINNER_LOSER_INDIC = 'W'
All dead presidents and Democrats are gone, as is Gerald Ford, who did not win an election. The query truly answers: "Which election(s) did living Republican presidents win?"
*What Happens Next?
Only rows meeting these two search conditions are selected from
PRESIDENT_TABLE.SELECT PRES_NAME, ELECTION, VOTES, WINNER_LOSER_INDIC
FROM PRESIDENT_TABLE, ELECTION
WHERE PRES_NAME = CANDIDATE
AND WINNER_LOSER_INDIC = 'W'
*Results of the Exercise
And only winning candidates are selected from
ELECTION.SELECT PRES_NAME, ELECTION, VOTES, WINNER_LOSER_INDIC
FROM PRESIDENT_TABLE, ELECTION
WHERE PRES_NAME = CANDIDATE
AND DEATH_AGE IS NULL
AND PARTY = 'REPUBLICAN'
*Using a Joining Query
Like a standard query, a joining query gives you three different ways to enter your order by clause.
SELECT TABLE_A.ITEM, PRICE, COLOR
FROM TABLE_A, TABLE_B
WHERE TABLE_A.ITEM = TABLE_B.ITEM
ORDER BY DESC
or
ORDER BY DESC
or
ORDER BY DESC
*Built-In Functions
The last topic in this lesson addresses the use of built-in functions (AVG, MAX, etc.) when you join tables.
Question: Can you use built-in functions when you join tables?
Answer: Yes. It's easy.
*Creating an Example
Question: How about an example? When would you want to use built-in functions with a join?
Answer: How many years did Republican presidents serve on average -- and what are the maximum, minimum and average number of votes gained by Republican presidents in their winning campaigns?
Let's look at how that query can be coded.
*Coding Queries
Using built-in functions in join queries is no different from standard queries -- all the same rules apply.
SELECT AVG(YRS_SERVED), MAX(VOTES), MIN(VOTES), AVG(VOTES)
-- The built-in functions can refer to either table.
FROM PRESIDENT_TABLE, ELECTION
WHERE PRESIDENT_TABLE.PRES_NAME = ELECTION.CANDIDATE
AND PARTY = 'REPUBLICAN'
AND WINNER_LOSER_INDIC = 'W'
*Joining More Than Two Tables
Question: Okay. You can use built-in functions with joined queries. But can you join more than two tables?
Answer: Yes, but all the tables must have at least one column in common. (e.g., Table A and Table B must have a field in common and Table B and Table C must have a field in common.)
*Alternatives
Question: Are there any alternatives to joining?
Answer: Yes, there are several situations in which joining is not appropriate. Let's look at them.
*Point One
There are three important points to remember about joining.
1. Joining takes a lot of computer resources and can be complicated to code.
You might be able to save yourself and the system some time and effort if you can manage with two separate, non-joined tables.
*Point Two
2. Subselects are a more efficient alternative to join queries in certain circumstances.
For example, a joining query cannot tell you which elections were contested by presidents who served longest; a subselect can.
Subselects are covered later in this course.
*Point Three
3. Joining, along with subselects, can meet most of your your needs for extracting information from two tables.
You may recall that you can use nested table expressions within a join.
Topic 3.2: Unit 3 Summary
The following points were covered in this unit:- A properly coded query to join two tables with an inner join combines related information from two tables into a single row, values appear once in the joining column of one table, but several times in the other, the higher number of occurrences appear in the output, and values exist in the joining column of one table, but not in the other, and no row appears in the output for that value.
- The SELECT for joining tables can select any number of columns in either table.
- When doing an inner join, the FROM keyword is followed by the two table names, separated by a comma or by the keyword expression INNER JOIN.
- You should include a joining condition in every query that joins two tables.
- For inner joins, the joining condition always checks that the values in the column common to both tables are equal.
- In addition to the joining condition, you can add other search conditions, which can pertain to either table.
- Like a standard query, a joining query gives you three different ways to enter your ORDER BY clause.
- Using built-in functions in join queries is no different from standard queries -- all the same rules apply.
Unit 4. Outer Joins
In the prior unit, you saw how to code inner joins.
This unit shows how to use the three types of outer joins:
- Left outer
- Right outer
- Full outer
After completing this unit, you should be able to:
- Use outer joins
Topic 4.1: Outer Joins
*Reviewing What You Have LearnedHere again is the general format of the FROM clause:
FROM table-expression-1 join-operation table-expression-2
ON join-condition
The operands of a join (table expression) can be any result table.
The join operations for outer joins are RIGHT OUTER JOIN, LEFT OUTER JOIN and FULL OUTER JOIN.
The join condition specifies the columns to be compared.
Note: It's also valid to use just JOIN as the join operation for a full outer join. The ON keyword tells SQL it's not an inner join.
*Outer Joins
The example of an outer join shown earlier when we joined the BIRTHDAY and ADDRESS tables was of a full outer join. It included all unmatched rows from both tables.
The other versions of the outer join limit the unmatched rows to be based on only one table.
These types of joins give you a lot of flexibility in creating queries, but this flexibility creates many challenges in identifying the best way to code your queries.
Let's see how to code each type of outer join and what the results table looks like.
*Defining Outer Joins
When you are defining an outer join, you must use
- The ON keyword
- The equal sign (=) when comparing expressions for a full outer join
*Looking at Some Examples
As we explore outer joins, let's first use these example tables.
*Coding a Full Outer Join
If you code a full outer join of the BIRTHDAY and ADDRESS tables...
SELECT BIRTHDAY.NAME, BIRTH_DATE, STREET, CITY, STATE
FROM BIRTHDAY ADDRESS
ON BIRTHDAY.NAME = ADDRESS.NAME;
you get this result...
LEFT OUTER JOIN
The operation LEFT OUTER JOIN includes only unmatched rows from the table named before it, in this case BIRTHDAY.
SELECT BIRTHDAY.NAME, BIRTH_DATE, STREET, CITY, STATE
FROM BIRTHDAY ADDRESS
ON BIRTHDAY.NAME = ADDRESS.NAME;
Any unmatched rows from the ADDRESS table are excluded. Jean Friar and Gladys Howe are unmatched rows in the ADDRESS table, so they don't appear in the result table.
RIGHT OUTER JOIN
The operation RIGHT OUTER JOIN includes only unmatched rows from the table named after it, in this case ADDRESS.
SELECT ADDRESS.NAME, BIRTH_DATE, STREET, CITY, STATE
FROM BIRTHDAY ADDRESS
ON BIRTHDAY.NAME = ADDRESS.NAME;
Any unmatched rows from the BIRTHDAY table are excluded. Kendra Taylor is an unmatched row in the BIRTHDAY table, so she doesn't appear in the result table.
*Specifying the Columns
Now that you have some understanding of the types of outer joins, let's look next at some ways to better control the results you get.
When the columns you choose have the same name, you must add a qualifier to make the column name unique.
When the query is an outer join, you must be careful to specify the correct column in the SELECT. Otherwise, some rows may have nulls you don't want.
*Viewing an Example
Suppose, for example, you want to create a table that includes Buchanan, an unmarried president. If you use this query:
SELECT PRES_MARRIAGE.PRES_NAME, NR_CHILDREN, STATE_BORN
FROM PRESIDENT_TABLE FULL OUTER JOIN PRES_MARRIAGE
ON PRESIDENT_TABLE.PRES_NAME = PRES_MARRIAGE.PRES_NAME;
the row of information for Buchanan is selected, but the NAME information is null because it doesn't appear in the PRES_MARRIAGE table. Consequently, you have a row in the result table that has a value for only the STATE_BORN.
*Changing the Qualifier
If you change the qualifier for PRES_NAME, the query
SELECT , NR_CHILDREN, STATE_BORN
FROM PRESIDENT_TABLE FULL OUTER JOIN PRES_MARRIAGE
ON PRESIDENT_TABLE.PRES_NAME = PRES_MARRIAGE.PRES_NAME;
gives you a row with the NAME Buchanan and only the value for NR_CHILDREN is null.
VALUE Function
In the previous example, it would be convenient to not worry about having unwanted null columns accidentally.
The VALUE (or COALESCE) function is a neat solution.
The VALUE function returns the first non-null value in its arguments. (COALESCE is a synonym for VALUE, so the two can be used interchangeably.)
This function is allowed in full outer joins and provides a way to integrate data from two columns into a single column. The two columns must contain the same data type.
You must assign a name to the resulting column using the AS clause.
Let's see how this works...
*Studying an Example
For example, the query
SELECT COALESCE(PRESIDENT_TABLE.PRES_NAME, PRES_MARRIAGE.PRES_NAME)
AS NAME, NR_CHILDREN, STATE_BORN
FROM PRESIDENT_TABLE FULL OUTER JOIN PRES_MARRIAGE
ON PRESIDENT_TABLE.PRES_NAME = PRES_MARRIAGE.PRES_NAME
ORDER BY NAME;
assures that all names from both tables are shown.
*Using the AS Clause
In the result, notice that the AS clause (AS NAME), provides a name for the result of the COALESCE function and NAME is used to order the results. The AS clause is required for the COALESCE function.
*A Helpful Reminder
By the way, you are not limited to the name of a single table as an operand of a join. The operand may also be a subselect.
When you use a subselect in any FROM clause, it is called a nested table expression. When you use a subselect in a WHERE clause, it is called a subquery.
Both nested table expressions and subqueries are covered later in the course.
Topic 4.2: Unit 4 Summary
The following points were covered in this unit:- The three types of outer joins are left right, and full.
- When you are defining an outer join you must use the keyword ON.
- A full outer join includes all unmatched rows from both tables.
- A left outer join includes only unmatched rows from the table named before the join operation keywords.
- A right outer join includes only unmatched rows from the table named after the join operation keywords.
- When doing an outer join of tables with identical column names, you must be careful to specify the correct column in the SELECT. Otherwise, some rows may have nulls you don't want.
- The COALESCE (or VALUE) function provides a way to merge data from two columns into a single column. You must use the AS clause to name the new column.
Unit 5. Merging Tables
The theory of merging tables was discussed earlier. This unit teaches the details of how to merge two tables.
Some of the key points discussed in this lesson are:
- SELECT from each table
- The UNION and UNION ALL keywords
- Restrictions on merging
- ORDERing merged data
After completing this unit, you should be able to:
- Understand how to merge tables
- Practice merging tables in a workshop
*Illustrating MERGE
To illustrate a MERGE, let's take two very short tables. The query to merge the two tables has three elements.
1. Select data from one table.
2. Specify UNION to indicate a merge.
3. Select data from the second table.
Let's look at an example.
*Viewing an Example
Now let's see how the system handles this query.
SELECT ITEM, PRICE FROM TABLE_A | Select data from first table |
UNION | Union must follow the first SELECT statement |
SELECT ITEM, PRICE FROM TABLE_B | Select data from second table |
*Step One
First the rows selected from the first table are retrieved.
*Step Two
Then the rows selected from the second table are sorted with the rows from the first table. Duplicate rows are removed.
*Reviewing the Steps
To repeat the steps in a merge query:
1. SELECT data from Table 1
2. UNION - The UNION keyword must be entered after the first SELECT statement
3. SELECT data from Table 2
UNION ALL Keyword
So far, you've seen an example of the UNION keyword in action. You can also use the UNION ALL keyword, which merges tables according to different rules.
Look at how the two keywords handle the same query.
*Results of UNION ALL
Remember the results from the merge with UNION as the keyword. Now look at what happens when UNION ALL is used.
*Studying the Example
On the right is the output resulting from the UNION ALL keyword. Notice that the items are in a different order and that duplicates are included.
*Reviewing
Let's go over the differences between UNION and UNION ALL.
- UNION sorts the results table; UNION ALL does not. In other words, UNION mixes rows together in the output. UNION ALL will output all the rows from the first table that met the search condition, followed by all the rows in the second table that met the search condition.
- UNION removes duplicate rows; UNION ALL does not.
- UNION prohibits selecting any LONG VARCHAR columns; UNION ALL allows selecting LONG VARCHAR columns.
*Differences Between UNION and UNION ALL
You may have guessed some of the implications of the difference between UNION and UNION ALL.
- Sorting increases system overhead, especially when you are sorting thousands of rows.
- In addition, UNION may not sort by the columns you want or in the order you want.
Therefore, the recommendations are...
*First Recommendation
1. If you know that the output you are producing has no duplicates,
- Use UNION ALL as the keyword to reduce system overhead, and
- Sort the output by adding an ORDER BY clause
*Second Recommendation
2. Add literals to your query ('WAREHOUSE A' and 'WAREHOUSE B' in the query shown here). These expressions create a column with a fixed value. They show you which table the row is from and, if you do use UNION as the keyword, no rows are duplicates.
SELECT ITEM, 'WAREHOUSE A', PRICE, VALUE FROM TABLE_A
UNION ALL
SELECT ITEM, 'WAREHOUSE B', PRICE, VALUE FROM TABLE_B ORDER BY 4 DESC
*Looking at the Difference
Next, look at some points about merges that are true for both UNION and UNION ALL.
UNION | UNION ALL |
---|---|
Sorts results table. | Does not sort results table. |
Removes duplicate rows. | Keeps duplicate rows. |
Prohibits selecting LONG VARCHAR columns. | Allows selecting LONG VARCHAR columns. |
SELECT NAME, ADDRESS, CITY, BALANCE
FROM EAST_REGION
WHERE BALANCE >= 10000.00
UNION
SELECT NAME, ADDRESS, CITY, BALANCE
FROM WEST_REGION
WHERE BALANCE >= 10000.00
ORDER BY 4 DESC
*Some Questions About Merging
This query addresses some of the key questions you may have about merging:
- Does it matter which columns you select?
- What can be specified in the WHERE clause?
- How do you ORDER merged data?
Let's start with the columns you select.
*Matching the Columns
The columns from each table must match each other.
If your first query specifies:
SELECT NAME, ADDRESS, BALANCE
the second query must select columns in the same order...
SELECT NAME, ADDRESS, BALANCE
...though it is okay if the names of the columns do not match exactly:
SELECT CUSTOMER_NAME, ADDRESS, BALANCE_DUE
*Do Columns Have to Match?
The columns from each table must match each other.
- Corresponding columns must have compatible data types.
- A numeric column must correspond to a numeric column, a character column to a character column, and so on.
- Numeric data types need not be the same. For example, an integer column can correspond to a decimal column.
- Column lengths need not be the same.
SELECT NAME, ADDRESS, CITY, BALANCE
FROM EAST_REGION
WHERE BALANCE >= 10000.00
UNION
SELECT NAME, ADDRESS, CITY, BALANCE
FROM WEST_REGION
WHERE BALANCE >= 10000.00
ORDER BY 4 DESC
*One Question Down!
We've answered the first key question about merging.
- Does it matter which columns you select?
- What can be specified in the WHERE clause?
- How do you ORDER merged data?
Let's go on to the WHERE clause and ordering merged data.
*Moving on to the WHERE Clause
You can think of the two queries linked by UNION or UNION ALL as being completely independent regarding the WHERE clause.
Rows are selected from the first table by the first query, according to whatever search criteria.
Then rows are selected from the second table, again according to whatever search criteria are specified in the second query.
There is no need for any kind of joining condition.
ORDER BY
Always put the order by clause in the last query.
Placing the ORDER BY clause in the last query causes all the output data to be ordered.
*Using a Column Number
Use a column number in the order by clause.
The two tables might not have the same column names, so you must use column numbers to avoid confusion.
The exception to this is that you may use a column name that was defined in a SELECT with an AS clause.
SELECT MAX(SALARY)
FROM WEST_REGION
UNION
SELECT MAX(SALARY)
FROM EAST_REGION
ORDER BY 1 DESC
*Another Question
This query addresses another key question you may have about merging:
How are built-in functions and arithmetic expressions handled? Let's see.
*Finding Answers in the Table
Assume that each region has its own employee table. Who is the highest paid employee in each region?
SELECT MAX(SALARY)
FROM WEST_REGION
UNION
SELECT MAX(SALARY)
FROM EAST_REGION
ORDER BY 1 DESC
The maximum from each table is output.
OUTPUT MAX(SALARY)
-----------------
252000.00
228000.00
But who's who...? Read on.
*Creating Columns
As suggested earlier, you can identify the table that a row is from by creating a column containing an expression.
SELECT MAX(SALARY), 'WEST' -- Outputs 'WEST' in second column
FROM WEST_REGION of rows selected from WEST_REGION
UNION
SELECT MAX(SALARY), 'EAST' -- Outputs 'EAST' as second column
FROM EAST_REGION of all rows selected from EAST_REGION
ORDER BY 1 DESC
Let's look at the output.
*Discovering the Answer
The highest paid employee is in the West Region.
Expressions can be used with any type of query to create a column with a fixed value.
Topic 5.2: Workshop
*Getting Started with a WorkshopThis workshop includes problems and questions that you can resolve by joining or merging the data in two tables.
You will work on the following situations:
- Output information about the different departments, including LOCATION and DEPTNAME (from ORG) and manager NAME and SALARY (from STAFF).
- There is a new job opening. Current employees with less than 2 years of experience and applicants with between 12 and 14 years of education are eligible. This involves merging rows from APPLICANT and STAFF.
*Trying a Simulation
First try a simulation that uses what you learned in this unit to build a complete SQL request.
Normally when you code an SQL query, you enter all the clauses one after another until the SELECT statement is complete. In the following simulation, you'll be prompted to enter one clause at a time.
Good luck!
*Problem 1
Output information about the different departments, including LOCATION and DEPTNAME (from ORG) and manager NAME and SALARY (from STAFF).
The output is to include only departments in the Eastern Division.
Let's look first at the two tables your query joins.
ORG
ORG contains the following columns. The MANAGER column in ORG contains the managers' IDs, not their names.
STAFF
STAFF contains:
*Producing Output
You need to produce the following output:
*Good Job!
Here is the output produced by your query.
Note that the output doesn't contain rows without a match. To get those rows, you can use an outer join, which you'll do in the next problem.
SELECT DEPTNAME, LOCATION, NAME, SALARY
FROM ORG, STAFF
WHERE ORG.MANAGER = STAFF.ID
AND DIVISION = 'EASTERN'
ORDER BY 1
SELECT DEPTNAME, LOCATION, NAME, SALARY
FROM ORG, STAFF
WHERE ORG.MANAGER = STAFF.ID
AND DIVISION = 'EASTERN'
ORDER BY 1
*Making Alterations
For the next few questions, we are going to cut the STAFF table down to half so you will be using only the rows shown.
We will be using the ORG table as is.
*Moving On
Using the same tables, ORG and STAFF, now perform a join that will include the unmatched rows from the ORG table.
Complete the coding on the next screen.
*Results of Your Work
This is the result table from the left outer join.
*Viewing the Full Outer Join
This is the result table from the full outer join.
*Another Question
There is a new job opening. Current employees with less than 2 years of experience and applicants with between 12 and 14 years of education are eligible. This involves merging rows from APPLICANT and STAFF.
A query was already created to select the current employees. The first step is to decide whether applicants from the APPLICANT table can be added at the end.
Note: Like Problem 1, this problem is set up as a simulation, so if your entry is correct, you'll move immediately to the next question.
SELECT NAME, ID
FROM STAFF
WHERE YEARS < 2
UNION ALL
SELECT NAME, TEMPID
FROM APPLICANT
WHERE EDLEVEL BETWEEN 12 AND 14
*Reviewing the Final Version
Here's the final version of the query, merging data from two tables.
The output produced is a list of qualified applicants from both tables.
Topic 5.3: Unit 5 Summary
The following points were covered in this unit:- A query to merge two tables has selects data from one table, specifies UNION to indicate a merge, and selects data from the second table.
- The differences between UNION and UNION ALL are UNION sorts the results table, and UNION ALL does not, UNION removes duplicate rows, and UNION ALL does not, and UNION prohibits selecting any LONG VARCHAR columns, and UNION ALL allows selecting LONG VARCHAR columns.
- If you know that the output you are producing has no duplicates use UNION ALL as the keyword to reduce system overhead, and sort the output by adding an ORDER BY clause.
- Add literals to your query. These expressions create a column with a fixed value. Literals show you which table the row is from and, if you do use UNION as the keyword, no rows are duplicates.
- When you merge two tables the data type of each column selected from the first table must be compatible with that of the corresponding column from the second table, and you must select the same number of columns from each table.
- You can think of the two queries linked by UNION or UNION ALL as being completely independent regarding the WHERE clause.
- Always put the ORDER BY clause in the last query. Always use a column number in the ORDER BY clause, unless you've defined a column name using the AS clause.
Unit 6. Subselects
Subselects are in some ways comparable to joining tables. They let you combine information from more than one table, and they let you compare different rows of the same table.
You will find that some tasks can be done by joins alone, others by either joins or subselects, and yet others by subselects alone.
After completing this unit, you should be able to:
- Code a subquery
- Code a nested table expression
Topic 6.1: What is a Subquery?
*Getting Started with QueriesSome typical queries you run using subselects are:
- Which president served longest?
- Which employees earn more than the company average?
- Which presidents come from the thirteen original states?
- Who is the manager of the Head Office?
- Do any non-managers earn more than any managers?
- Do any non-managers earn more than all managers?
- Do any non-managers earn more than their own managers?
- Does anybody earn more than the manager of the Head Office?
*What Is a Subselect?
A subselect is a second SELECT in a query. Its results are used as part of the search argument of the main SELECT.
When you use a subselect in any FROM clause, it is called a nested table expression.
When you use a subselect in a WHERE clause, it is called a subquery.
We'll look first at subqueries. This concept is best illustrated by an example:
Which president served longest?
Let's look first at a proposed solution.
*Asking a Question
Which president served longest?
Using two queries, you could run one query to find the maximum years served.
You can then note down the value returned by the query.
*Using a Subquery
Using the values from the first query, you can now code a second query.
If you use a subquery, the value of MAX(YRS_SERVED) is automatically passed to the second query.
*Using SELECT
The following is the query to select the name of the president who served longest.
The first SELECT uses the value returned by the second SELECT (the subselect) as part of its search condition.
SELECT PRES_NAME, YRS_SERVED
FROM PRESIDENT_TABLE
WHERE YRS_SERVED = (SELECT MAX(YRS_SERVED)
FROM PRESIDENT_TABLE)
*Adding the WHERE Clause
First the subselect runs, returning a single value.
SELECT PRES_NAME, YRS_SERVED
FROM PRESIDENT_TABLE
WHERE YRS_SERVED =
Then the main query runs, using the value returned by the subselect as the right operand of its WHERE clause.
WHERE YRS_SERVED = Whatever value was returned by the subselect.
*Asking Questions
You've already seen that you must use a subselect whenever you want to use aggregated and non-aggregated values together in a query to answer questions such as:
- Which president served longest?
- Which employees earn above the average salary for their job?
- Which president had the most children?
*Posing a Question
Problem: Which presidents were born in the original thirteen states?
Tables: STATE : STATE_NAME, YEAR_ENTERED
PRESIDENT : PRES_NAME, STATE_BORN
Solution 1. Run two queries.
Solution 2. Use a joining query.
Solution 3. Use a subselect.
a. Use a subselect to select the states from the STATE table, based on the year they entered the Union. b. Then select PRES_NAME from PRESIDENT_TABLE if STATE_BORN matches one of the states selected by the subselect.
Let's see how solution 1 works.
*Solving the Problem
Problem: Which presidents were born in the original thirteen states?
Solution 1. Run two queries.
STEP 1:
SELECT STATE_NAME
FROM STATE_TABLE
WHERE YEAR_ENTERED = 1776
The result of this query is a list of thirteen states.
Note them down, and you are ready for step 2.
*Another Solution
Problem: Which presidents were born in the original thirteen states?
Solution 1. Run two queries.
STEP 2:
SELECT PRES_NAME
FROM PRESIDENT_TABLE
WHERE STATE_BORN IN ('MASSACHUSETTS', 'PENNSYLVANIA',
'VIRGINIA', 'CONNECTICUT',
'SOUTH CAROLINA', 'MARYLAND',
'NEW JERSEY', 'GEORGIA',
'NEW HAMPSHIRE', 'DELAWARE',
'NEW YORK', 'NORTH CAROLINA',
'RHODE ISLAND' )
Provided you spelled all thirteen names correctly, your output will list all presidents born in the original thirteen states.
Solution 2, a joining query, automates this same process.
*An Alternate Solution
Problem: Which presidents were born in the original thirteen states?
Solution 2. Join the two tables:
SELECT PRES_NAME, STATE_BORN
FROM PRESIDENT_TABLE, STATE_TABLE
WHERE STATE_BORN = STATE_NAME
AND YEAR_ENTERED = 1776
Joining tables was covered earlier in this course and will not be covered in detail in this unit.
*Another Solution
Problem: Which presidents were born in the original thirteen states?
Solution 3. Use a subselect:
SELECT PRES_NAME,
FROM PRESIDENT_TABLE
WHERE STATE_BORN IN
The subselect selects the original thirteen states.
*Using a Subselect
Problem: Which presidents were born in the original thirteen states?
Solution 3. Use a subselect:
(SELECT STATE_NAME
FROM STATE_TABLE
WHERE YEAR_ENTERED = 1776)
The main query then selects all presidents born in any one of the thirteen states.
*Summarizing What You Have Learned
The following table summarizes when you might use subqueries.
Note: Another use of subqueries, to select rows from a table based on values contained in different rows from that same table, is covered later.
Topic 6.2: Nested Table Expressions
*Another Type of SubselectNow let's look at another type of subselect, the nested table expression.
While you use a subquery to specify search criteria on a single column, a nested table expression returns more than one column.
*Using Nested Table Expression
A nested table expression is a subselect in a FROM clause.
A nested table expression allows you to select specific rows from tables BEFORE you join them. Using this feature can greatly improve the performance of the join.
In addition, you can use a join where you might otherwise have merged data, a more costly operation.
*Following the Rules
You should follow these rules in defining your nested table expression:
- Enclose the subselect in parentheses.
- Assign a new name to the nested table expression. You can use this name as the qualifier of a column name, but you cannot refer to the name within the same FROM clause.
- Rename the columns of the result table that you plan on referencing, so they are unique.
*Joining Tables
Nested table expressions can be used in any type of join.
To create a table of the number of children that were born to presidents from Ohio, this nested table expression selects only rows for Ohio presidents to be joined with the marriage information:
SELECT NAME, NR_CHILDREN
FROM
AS PRES_OHIO
LEFT OUTER JOIN PRES_MARRIAGE
ON NAME = PRES_MARRIAGE.PRES_NAME;
The WHERE clause in the nested table expression eliminates any presidents not born in Ohio before the join takes place. This allows the join to be as efficient as possible.
*Viewing the Results
Each instance of a President's marriage appears whenever there is a match for an Ohio President.
Here is the result table.
*Studying the Example
Note that in this example a correlation name (PRES_OHIO) is assigned. The AS clause is required for an operand that is a nested table expression, even if the name is not referenced in the query.
SELECT NAME, NR_CHILDREN
FROM (SELECT PRESIDENT_TABLE.PRES_NAME AS NAME
FROM PRESIDENT_TABLE
WHERE STATE_BORN = 'OHIO')
LEFT OUTER JOIN PRES_MARRIAGE
ON NAME = PRES_MARRIAGE.PRES_NAME;
You can also just append the new name after the table name, leaving one or more blanks between them. Using the AS keyword is optional, but it makes your code more readable.
*Joining More Than Two Tables
You can join more than two tables, and you can use different types of joins in the same statement.
Let's look at an example where you need to join more than two tables and you use more than one join type in the FROM clause.
Suppose you wanted a result table that showed which presidents had a hobby of swimming, in addition to the number of children and birth state. This scenario requires you to join three tables to get all the information. The next page shows the result table.
SELECT NAME, NR_CHILDREN, STATE_BORN
FROM (SELECT PRES_HOBBY.PRES_NAME AS NAME FROM PRES_HOBBY
WHERE HOBBY = 'SWIMMING') AS SWIM_TABLE
INNER JOIN PRES_TABLE
ON NAME = PRES_TABLE.PRES_NAME
LEFT OUTER JOIN PRES_MARRIAGE
ON NAME = PRES_MARRIAGE.PRES_NAME
ORDER BY NAME;
Topic 6.3: Unit 6 Summary
The following points were covered in this unit:- Subselects are in some ways comparable to joining tables. They let you combine information from more than one table, and they let you compare different rows of the same table.
- A subquery is a query where the result from one SELECT (the subselect) is used as part of the search argument of another SELECT.
- Subqueries can be used to combine aggregated columns with non-aggregated columns.
- Subqueries can be used to select rows from one table based on values selected from another table.
- A nested table expression is a subselect in a FROM clause. It allows you to select specific rows from tables BEFORE you join them.
- When the join operand is a nested table expression, you must assign a correlation name with the AS clause.
- You can join more than two tables, and you can use different types of joins in the same statement.
Unit 7. Basic Subqueries
In this unit you will learn how to create basic subqueries. You will create several subqueries that access one table. You will also create subqueries that access two tables.
- Select information from PRESIDENT_TABLE about presidents who had more than two children.
- Who is the manager of the Head Office?
In writing these queries, you will also learn the basic rules governing subqueries.
- Format of a SUBSELECT
- How the main SELECT and SUBSELECT are connected
- The WHERE clause
- Use of ORDER BY
After completing this unit, you should be able to:
- Code a simple subquery
- Handle subqueries returning several rows
- Practice what you have learned in a workshop
Topic 7.1: Simple Subquery
*Getting StartedLet's start with the problem introduced in the previous objective — which president served longest.
A subquery is required because you need to compare a column value from individual rows — YRS_SERVED — with an aggregated value — MAX(YRS_SERVED). This can only be done with a subquery.
- The SUBSELECT finds MAX(YRS_SERVED)
- The SELECT compares each president's YRS_SERVED value with the result of the subselect.
*Looking at the Query
The columns selected by this query are PRES_NAME and YRS_SERVED for any president whose YRS_SERVED are equal to MAX(YRS_SERVED).
Let's analyze the elements that make up this query.
SELECT PRES_NAME, YRS_SERVED
FROM PRESIDENT_TABLE
WHERE YRS_SERVED = (SELECT MAX(YRS_SERVED)
FROM PRESIDENT_TABLE)
*Viewing the Example
Though the subselect is entered last, it is executed first by the system. This subselect returns a single value — MAX(YRS_SERVED), which is passed to the main SELECT.
SELECT PRES_NAME, YRS_SERVED
FROM PRESIDENT_TABLE
WHERE YRS_SERVED = (SELECT MAX(YRS_SERVED)
FROM PRESIDENT_TABLE)
*Using the Returned Value
The value returned by the subselect is used as the right operand of the main SELECT's WHERE clause.
SELECT PRES_NAME, YRS_SERVED
FROM PRESIDENT_TABLE
WHERE YRS_SERVED = (SELECT MAX(YRS_SERVED)
FROM PRESIDENT_TABLE)
*Rules for Coding
The subselect replaces the right operand of the main SELECT's WHERE clause. Therefore, it must be entered following the operator, which is = in this example.
SELECT PRES_NAME, YRS_SERVED
FROM PRESIDENT_TABLE
WHERE YRS_SERVED (SELECT MAX(YRS_SERVED)
FROM PRESIDENT_TABLE)
*A Helpful Hint
Remember that the queries are free-form, so you can start the SUBSELECT on a new line if you want to.
SELECT PRES_NAME, YRS_SERVED
FROM PRESIDENT_TABLE
WHERE YRS_SERVED =
*Enclosing the Subselect
The whole subselect must be enclosed in parentheses.
SELECT PRES_NAME, YRS_SERVED
FROM PRESIDENT_TABLE
WHERE YRS_SERVED =
*Using SUBSELECT
A SUBSELECT can select only one column.
SELECT PRES_NAME, YRS_SERVED
FROM PRESIDENT_TABLE
WHERE YRS_SERVED = (SELECT
FROM PRESIDENT_TABLE)
Note: The column selected does not have to be a built-in function.
*One Last Rule
In addition to the three rules you have seen, there is a fourth rule.
Do not use ORDER BY in the subselect. You can, however, order the main SELECT. SELECT PRES_NAME, YRS_SERVED FROM PRESIDENT_TABLE WHERE YRS_SERVED = (SELECT MAX(YRS_SERVED) FROM PRESIDENT_TABLE) |
*Using Main SELECT
The main SELECT of a subquery is like any regular SELECT, except that the right operand of one of its search conditions is a SUBSELECT instead of a value or a column name.
Note: When you use GROUP BY, the right operand of the HAVING clause can also be the subselect.
*Looking at an Example
Here's an example showing a fairly complex main SELECT.
SELECT NAME, SALARY, COMM, SALARY+COMM
FROM STAFF
FROM STAFF)
ORDER BY SALARY DESC
- This query will output information about clerks from department 10 whose salary is above the average salary for the company.
*Another Example
Here's an example showing a fairly complex main SELECT.
SELECT NAME, SALARY, COMM, SALARY+COMM
FROM STAFF
WHERE JOB = 'CLERK'
AND DEPT = 10
AND SALARY > (SELECT AVG(SALARY)
FROM STAFF)
- The output is to be in descending order by SALARY.
*Limiting Values
Here's an example showing a fairly complex main SELECT.
SELECT NAME, SALARY, COMM, SALARY+COMM
FROM STAFF
WHERE JOB = 'CLERK'
AND DEPT = 10
AND SALARY >
ORDER BY SALARY DESC
- There are no limits on the values that can be selected — though you cannot output AVG(SALARY), which was selected by the subselect.
*Using the WHERE Clause
Here's an example showing a fairly complex main SELECT.
SELECT NAME, SALARY, COMM, SALARY+COMM
FROM STAFF
(SELECT AVG(SALARY)
FROM STAFF)
ORDER BY SALARY DESC
- There can be several conditions in the WHERE clause.
Note: When more than one row is returned by the subselect, the standard relational operations (e.g., = or < ) may not be used. More about that in the next objective.
*Using the ORDER BY Clause
Here's an example showing a fairly complex main SELECT.
SELECT NAME, SALARY, COMM, SALARY+COMM
FROM STAFF
WHERE JOB = 'CLERK'
AND DEPT = 10
AND SALARY > (SELECT AVG(SALARY)
FROM STAFF)
- The ORDER BY clause comes after the SUBSELECT, but it refers to the columns selected by the main SELECT.
*Running the Subselect
First the subselect is run to find the lowest salary earned by a manager.
Then the main query is run to find out which non-management employees (WHERE JOB <> 'MGR') have a salary greater than the value returned by the subselect — AND SALARY > (SELECT MIN(SALARY). SELECT *
FROM STAFF
WHERE JOB <> 'MGR'
AND SALARY > (SELECT MIN(SALARY)
FROM STAFF
WHERE JOB = 'MGR')
ORDER BY SALARY DESC
*Modifying the Query
Which presidents served longer than the average?
You already ran a query to select the president who served longest.
You can modify that query to find all the presidents who served longer than average.
*Remember
You saw in the previous query that you could use the > as well as = in the main SELECT to compare with the value returned by the subselect.
WHERE YRS_SERVED > (SELECT AVG(YRS_SERVED)
When a single value is returned by the subselect, you can use any operators in your main SELECT EXCEPT:
- BETWEEN
- LIKE
*Using a Subquery with Multiple Tables
If the SELECT just provides a value to be used in a WHERE clause, it doesn't matter which table it got that value from. Therefore, you can use a subquery with more than one table.
Let's look at an example.
*Posing a Question
How much does the manager who heads up the Home Office earn?
ORG has the employee ID of the department heads, while STAFF contains salary information.
SELECT ID, NAME, SALARY
FROM STAFF
WHERE ID = (SELECT MANAGER
FROM ORG
WHERE DEPTNAME = 'HEAD OFFICE')
The MANAGER value returned by the subselect is the employee ID of the Head Office Manager.
*Coding a Subselect
When coding the subselect...
- Only one column, either aggregated or non-aggregated, can be selected by the subselect.
- The value or values in that column are used by a main query as part of its search condition.
Topic 7.2: Subqueries Returning Several Rows
*Returning More Than One ValueWhich presidents were born in the original 13 states? A subquery can be used to answer this question, but the subselect must return 13 values instead of just one.
How do you handle the WHERE clause of the main SELECT to compare 13 values? That is the only difference between this query and those you saw in the previous objective.
This section covers the use of the relational operators ALL, ANY, SOME and IN.
*Starting with a Query
Here's the query that lists all presidents born in the original thirteen states.
SELECT PRES_NAME
FROM PRESIDENT_TABLE
WHERE STATE_BORN IN (SELECT STATE_NAME
FROM STATE_TABLE
WHERE YEAR_ENTERED = 1776)
Think about what makes this subquery different from the others you have seen.
*Viewing the Results
When the SUBSELECT runs, it returns 13 rows.
(SELECT STATE_NAME
FROM STATE_TABLE
WHERE YEAR_ENTERED = 1776)
The main query must now be able to compare each president's STATE_BORN with all thirteen values in the list. If it matches any one of them, then that row is selected.
*Using the IN Operator
The MAIN QUERY must check for a value included in this list. The IN operator can be used.
SELECT PRES_NAME
FROM PRESIDENT_TABLE
WHERE STATE_BORN IN
The STATE_BORN value is compared with all the values in the list. If it matches any one of them, then that row is selected.
*An Alternative Method
Another way this query can be entered is
SELECT PRES_NAME
FROM PRESIDENT_TABLE
WHERE STATE_BORN = ANY (SELECT STATE_NAME
FROM STATE_TABLE
WHERE YEAR_ENTERED = 1776)
= ANY or = SOME can be used with subselects to mean the same as IN.
The query asks: Is STATE_BORN equal to one of the values in the list of values selected by the subquery? If so, the row is selected.
*Selecting Specific Rows
But you won't always want to check that a value is included in the list. Sometimes you may want to select rows whose value is greater or less than the values in the list. Did any salespeople earn less than clerks? Here are two queries that answer that question.
SELECT NAME, SALARY
FROM STAFF
WHERE JOB = 'SALES'
AND SALARY < ANY (SELECT SALARY
FROM STAFF
WHERE JOB = 'CLERK')
This first version lists all salespeople who earn less than any one or more clerks. Let's see how SQL calculates this.
*Using ANY
The condition stated:
WHERE SALARY < ANY (SELECT SALARY
The first value, 30000.00, is compared with the subquery's values. It is not selected because it is too high.
ANY ONE
The condition stated:
WHERE SALARY < ANY (SELECT SALARY
The second value is compared. To be selected, it only needs to be lower than ANY ONE of the values in the list. Therefore, it is selected.
Then the third value is also selected, since it is less than at least one of the values returned by the subselect.
*Using ALL
The previous version of the query listed all salespeople whose salary was less than ANY ONE of the clerks. But what if you want to know if any salesperson earns less than ALL of the clerks? — Instead of < ANY, you use < ALL.
SELECT NAME, SALARY
FROM STAFF
WHERE JOB = 'SALES'
AND SALARY < ALL (SELECT SALARY
FROM STAFF
WHERE JOB = 'CLERK')
Now let's see how SQL compares the results.
*Comparing the Results
The condition stated:
WHERE SALARY < ALL (SELECT SALARY
The first value, 30000.00, is compared with the subquery's values. It is not selected because it is too high.
*Using ALL
The condition stated:
WHERE SALARY < ALL (SELECT SALARY
The third value will not be selected either. It is not less than ALL the values from the subselect.
Topic 7.3: Workshop
*Getting Started with a WorkshopThis workshop tests your understanding of subqueries. Instead of a single problem, this workshop presents a series of questions that require you to:
- Identify appropriate applications for subqueries
- Correct invalid subqueries
- Complete partially written queries involving subselects
- Show you understand when to use = ANY, = ALL and IN in a WHERE clause
Topic 7.4: Unit 7 Summary
The following points were covered in this unit:- A subquery is required whenever you need to compare a column value from individual rows with an aggregated value.
- Rules for coding a subselect are the subselect replaces the right operand of the main SELECT's WHERE clause, the whole subselect must be enclosed in parentheses, a SUBSELECT can select only one column, and do not use ORDER BY in the subselect.
- The main SELECT of a subquery is like any regular SELECT, except that the right operand of one of its search conditions is a SUBSELECT instead of a value or a column name.
- When a single value is returned by the subselect, you can use any operators in your main SELECT except BETWEEN and LIKE.
- If the SELECT just provides a value to be used in a WHERE clause, it doesn't matter which table it got that value from. Therefore, you can use a subquery with more than one table.
- When a subselect can select more than one row, the main concern is the relational operator comparing the values of the subselect.
- When checking for equality, you can use IN, = ANY, or = SOME.
- When using other operators to check for greater or less than the values in the list, you must add ALL, ANY, or SOME.