SQL: Query Techniques
Unit 1. SQL Overview
This unit presents introductory information that is helpful while you're learning SQL — IBM's Structured Query Language.
You briefly go over concepts about relational databases in this section. In the next section, you see the data used in this course. The final section in this unit covers general topics about SQL.
After completing this unit, you should be able to:
- Explain how a relational database functions and stores data
- Identify the rules that apply to database tables
- Understand the tables that are used in this course
- Explain the basic concepts of the SQL language
Topic 1.1: Relational Concepts
*Data in TablesIn a relational database model, the data is stored in one or more tables.
Each table consists of a specific number of columns and some number of unordered rows.
One row of a sample table named STUDENT is displayed below.
STUDENTNO | FIRSTNAME | MIDINIT | LASTNAME | MAJOR | SEX |
---|---|---|---|---|---|
S10 | GERALD | A | ATTRIX | MATH | M |
STUDENTNO FIRSTNAME MIDINIT LASTNAME MAJOR SEX
————————— ———————————— ——————— ——————————————— —————————— ———
S10 GERALD A ATTRIX MATH M
S20 PHYLISS A LUTZ PHYSICS F
S30 DELBERT A SPRAY MATH M
S40 JAMES E SHUE M
S50 BEVERLY A BOGGS HISTORY F
S60 DAVE I HICKS MATH M
S70 JACKIE I RICTER BUSINESS F
S80 MARGE E GILHOULY F
S90 IRMA A SPRAY HOME ECON F
*Columns and Rows
Above is the complete table. As you can see, the table consists of six columns: STUDENTNO, FIRSTNAME, MIDINIT, LASTNAME, MAJOR and SEX.
It also consists of nine rows, one for each student.
The next section goes over this table and the others used in this course.
Topic 1.2: Rules for Tables
*Characteristics of Columns and RowsColumns of a table have the following characteristics:
- Appear vertically when displayed on a report
- Contain the same kind of data
- Have names which are used as column titles when they are displayed on a report
- Appear horizontally when displayed on a report
- Contain one or more columns of data
- Unlike columns, they have no names
*How Large Can a Table Be?
The tables you work with in this course are relatively small for ease of use and reference. However, this does not mean you are limited to working with small tables.
A table can be virtually any number of rows with one or more columns.
There is a limitation, but it is unlikely to be reached under normal conditions.
A table with zero rows is called an empty table.
*Tables Are Relations
Relational databases are based on a model formulated by Dr. E. F. Codd of IBM in 1968. He used very formal terms to describe the concepts you just saw. Here are the equivalent terms he used:
Formal Relational Term | Informal Equivalents |
---|---|
relation | table |
tuple | row or record |
attribute | column or field |
So you can see where the terms relational model and relational database come from. Tables were called relations.
In this course, you see the less formal terms — table, row and column.
Topic 1.3: Course Tables
*Example TablesThroughout this course several tables are used for examples and exercises. This section describes each and shows the table entries.
Throughout this course, the tables may be modified slightly to illustrate a point. Also, due to the size of the screen, only portions are displayed at times.
*The STUDENT Table
The STUDENT table contains student information. Each row describes one student. The columns are
STUDENTNO — student number
FIRSTNAME — first name
MIDINIT — middle initial
LASTNAME — last name
MAJOR — major subject
SEX — M or F
STUDENTNO | FIRSTNAME | MIDINIT | LASTNAME | MAJOR | SEX |
---|---|---|---|---|---|
S10 | GERALD | A | ATTRIX | MATH | M |
S20 | PHYLISS | A | LUTZ | PHYSICS | F |
S30 | DELBERT | A | SPRAY | MATH | M |
S40 | JAMES | E | SHUE | M | |
S50 | BEVERLY | A | BOGGS | HISTORY | F |
S60 | DAVE | I | HICKS | MATH | M |
S70 | JACKIE | I | RICTER | BUSINESS | F |
S80 | MARGE | E | GILHOULY | F | |
S90 | IRMA | A | SPRAY | HOME ECON | F |
*The INSTRUCT Table
The INSTRUCT table contains information about instructors. Each row describes one instructor. The columns are
INSSN — social security number
DEPTNO — department number
IFIRSTNAME — first name
IMIDINIT — middle initial
ILASTNAME — last name
SALARY — salary
EXEMPTIONS — number of exemptions
INSSN | DEPTNO | IFIRSTNAME | IMIDINIT | ILASTNAME | SALARY | EXEMPTIONS |
---|---|---|---|---|---|---|
334838999 | MTH | ELIZABETH | A | SONS | 35000 | 3 |
393209494 | MTH | ALYCE | P | SEEGERS | 40000 | 3 |
382949898 | ENG | STEWART | I | GOTZ | 31000 | 4 |
893993999 | ENG | JOHN | A | RABBET | 41000 | 0 |
998329399 | HST | THOMAS | A | SEEGERS | 40000 | 2 |
*The GRADE Table
The GRADE table contains numeric and character data for grades. Each row is for one grade. The columns are
GRADE — character grade value
NUMGRADE — numeric grade value
GRADE | NUMGRADE |
---|---|
A | 4.0000 |
B | 3.0000 |
C | 2.0000 |
D | 1.0000 |
F | 0.0000 |
*The CLASS Table
The CLASS table contains data about specific classes. Each row defines one class. The columns are
CLASSNO — class number
CLASSNAME — class name
CLASSCOST — cost for class
COURSEHRS — course hours
CLASSNO | CLASSNAME | CLASSCOST | COURSEHRS |
---|---|---|---|
E01 | ENGLISH 101 | 500.00 | 4 |
E02 | ENGLISH 102 | 450.00 | 3 |
M01 | MATH 101 | 600.00 | 3 |
M02 | MATH 102 | 700.00 | 3 |
M03 | MATH 103 | 800.00 | 3 |
P01 | PHYSICS 101 | 600.00 | 3 |
H01 | HISTORY | 510.00 | 4 |
*The SECTION Table
The SECTION table contains information about each section that is taught for a specific class. Each row describes one section. The columns are
CLASSNO — class number
SECTIONNO — section number
INSSN — instructor's social security number
ROOMNO — room number
CLASSNO | SECTIONNO | INSSN | ROOMNO |
---|---|---|---|
E01 | 001 | 893993999 | 301 |
E01 | 002 | 893993999 | 302 |
E02 | 001 | 382948398 | 205 |
M01 | 001 | 334838999 | 389 |
M02 | 001 | 334838999 | 239 |
M02 | 002 | 393209494 | 222 |
M03 | 001 | 393209494 | 203 |
P01 | 001 | 393209494 | 105 |
H01 | 001 | 998329399 | 333 |
*The ROSTER Table
The ROSTER table contains data that identifies each student signed up for a specific section of a class. Each row describes one class for a student. The columns are
CLASSNO — class number
SECTIONNO — section number
STUDENTNO — student number
GRADE — grade
CLASSNO | SECTIONNO | STUDENTNO | GRADE |
---|---|---|---|
E01 | 001 | S20 | A |
E01 | 001 | S30 | B |
E01 | 001 | S70 | B |
E01 | 002 | S80 | A |
E01 | 002 | S10 | C |
E01 | 002 | S40 | B |
E02 | 001 | S50 | C |
E02 | 001 | S60 | A |
E02 | 001 | S90 | D |
M01 | 001 | S40 | A |
ROSTER Table Continued
*
CLASSNO | SECTIONNO | STUDENTNO | GRADE |
---|---|---|---|
M01 | 001 | S70 | D |
M02 | 001 | S90 | B |
M02 | 001 | S80 | F |
M02 | 001 | S10 | C |
M01 | 001 | S30 | B |
M02 | 002 | S20 | C |
M02 | 002 | S60 | D |
M02 | 002 | S80 | A |
M02 | 002 | S90 | A |
M02 | 002 | S50 | F |
M03 | 001 | S10 | D |
M03 | 001 | S40 | D |
M03 | 001 | S70 | D |
M03 | 001 | S60 | C |
M03 | 001 | S90 | A |
H01 | 001 | S10 | C |
H01 | 001 | S40 | B |
H01 | 001 | S70 | B |
ROSTER Table Continued
*
CLASSNO | SECTIONNO | STUDENTNO | GRADE |
---|---|---|---|
H01 | 001 | S80 | B |
H01 | 001 | S30 | B |
P01 | 001 | S10 | B |
P01 | 001 | S20 | B |
P01 | 001 | S30 | B |
P01 | 001 | S40 | B |
P01 | 001 | S50 | B |
P01 | 001 | S60 | B |
P01 | 001 | S70 | B |
P01 | 001 | S80 | B |
P01 | 001 | S90 | B |
Topic 1.4: SQL Concepts
*A Relational Database InterfaceSQL is a high-level language that allows you to access and maintain data stored in relational databases, such as DB2 and SQL/DS. Products like FOCUS, SAS and Oracle, which create and maintain relational tables, also use SQL. SQL is one interface to products in these relational systems.
SQL is designed for programmers, system administrators and data processing professionals. Yet, it is especially nice for users who have little training in computer programming. With SQL you specify what you want rather than how to get it. You do not have to be concerned with how the data is stored, or how it is retrieved.
*Power and Simplicity
SQL is rapidly becoming the industry standard for fourth generation query languages. SQL is easy to learn and easy to use.
Power and simplicity are both features of SQL. You can use single SQL statements to perform functions that require many lines of conventional code.
SQL has these three basic components:
- Data Definition Language (DDL)
- Data Manipulation Language (DML)
- Authorization Language
Topic 1.4.1: Data Definition Language
*Defining New TablesDDL (Data Definition Language) is used to define new tables. The columns to be included in your table are defined using the CREATE statement. Here is a sample CREATE statement for the STUDENT table used in this course.
CREATE TABLE STUDENT
(STUDENTNO CHAR(3) NOT NULL,
FIRSTNAME VARCHAR(20),
MIDINIT CHAR(1),
LASTNAME VARCHAR(25) NOT NULL,
MAJOR CHAR(15),
SEX CHAR(1) NOT NULL)
As you can see, the table consists of 6 columns: STUDENTNO, FIRSTNAME, MIDINIT, LASTNAME, MAJOR and SEX.
*Define and Modify Online
DDL statements permit you to create, change and delete various components of the relational database. In fact, you can define and modify the database online without special utility programs.
Thus, you can interactively:
- create and manipulate tables or
- test SQL statements you plan on using in programs
Topic 1.4.2: Data Manipulation Language
*DML Basic FunctionsDML (Data Manipulation Language) has two basic functions:
- Creating reports
- Maintaining tables
Let's look at a sample report request.
STUDENT Table Report
For example, the statement below requests a report of all columns and all rows for the table named STUDENT.
SELECT * FROM STUDENT
STUDENTNO | FIRSTNAME | MIDINIT | LASTNAME | MAJOR | SEX |
---|---|---|---|---|---|
S10 | GERALD | A | ATTRIX | MATH | M |
S20 | PHYLISS | A | LUTZ | PHYSICS | F |
S30 | DELBERT | A | SPRAY | MATH | M |
S40 | JAMES | E | SHUE | M | |
S50 | BEVERLY | A | BOGGS | HISTORY | F |
S60 | DAVE | I | HICKS | MATH | M |
S70 | JACKIE | I | RICTER | BUSINESS | F |
S80 | MARGE | E | GILHOULY | F | |
S90 | IRMA | A | SPRAY | HOME ECON | F |
*The SELECT * Retrieval Request
SELECT * is the simplest retrieval request. Other functions of the SELECT statement include:
- Limit the request to specific columns.
- Limit the request to specific rows.
- Order the rows presented on the request.
- Develop new column values through calculations.
- Accumulate numeric column values within specific groupings.
- Combine data from two or more tables through a common key.
*Maintaining Tables
The other function of DML is to maintain tables.
Here's a simple request to create a new row in the STUDENT table.
INSERT INTO STUDENT
VALUES ('S02', 'RALPH', 'A', 'LARSON', 'PHYSICS', 'M')
This request adds the new row containing the values listed in parentheses.
Topic 1.4.3: Authorization Language
*Provides Control over ResourcesThe function of the Authorization Language is to provide effective control over the database resources.
Security in relational databases like DB2 and SQL/DS can be summarized in one sentence.
Nothing can be done unless you have been given the authority.
Authority is controlled by the data administration area through SQL commands.
GRANT and REVOKE Commands
Although you need authority to do anything with the database, it is a relatively simple matter to obtain it.
Authority is given through the GRANT command. For example, the command below gives user PROG1 authority to create a table in the database named SKILLS.
GRANT CREATETAB ON DATABASE SKILLS TO PROG1
Authority may also be removed with the REVOKE command. Here's an example removing CREATE authority for tables in database SKILLS from PROG1.
REVOKE CREATETAB ON DATABASE SKILLS FROM PROG1
Topic 1.4.4: SQL Syntax
*Interactive StatementsYou can use SQL statements interactively. For example, via IBM's Query Management Facility (QMF) or an interactive facility provided with the relational database.
You can also use SQL statements embedded in application programs such as APL2, BASIC, COBOL, PL/I, FORTRAN, LISP or assembler language programs
These various implementations of SQL, however, are semantically and syntactically the same. What you learn here can be carried to any SQL-based database management system.
*Syntax Rules for Coding
When coding SQL statements, follow these syntax rules:
- Variables and keywords must be separated by one or more spaces.
- Parentheses are often required around a keyword's values. Commas must separate values in a list (e.g. column names, table names, data values)
- Spacing around commas, parentheses and operators (=,>,+,*,etc.) is for readability. Spaces are not required.
- Character strings are sequences of characters that begin and end with a string delimiter. A string delimiter is either ' (apostrophe) or " (quotes) depending on the host language and the character chosen when the database system is installed. Use the same character (' or ") for the beginning and ending delimiter.
INSERT INTO GRADE VALUES ('E01',0.000)
INSERT INTO GRADE VALUES('E01',0.000)
INSERT INTO GRADE VALUES ( 'E01', 0.000 )
*An Example of Syntax Rules
Above are three valid codings of the same statement showing the syntax rules.
Spaces are required between each keyword or variable (INSERT, INTO, GRADE, VALUES). In the VALUES portion, parentheses are needed around the values and commas must separate the values. The character string E01 is delimited by apostrophes. Spaces around parentheses and commas are optional.
*Course Conventions
In this course, you will see a few conventions followed for readability or emphasis. These conventions are
- Each clause is entered on a separate line; for example
SELECT * FROM ROSTER
WHERE STUDENTNO = 'S10'
rather than
SELECT * FROM ROSTER WHERE STUDENTNO = 'S10' - Information coded between the symbols < > is optional
- ... indicates information that repeats
Topic 1.5: Unit 1 Summary
This unit covered the following points:- Databases based on a relational model store data in one or more tables.
- Each table in a relational database consists of
- A specific number of columns
- Some number of unordered rows
- Columns in a relational database
- Appear vertically in a report
- Contain the same kind of data
- Have names that are used as column titles
- Rows in a relational database
- Appear horizontally in a report
- Contain one or more columns of data
- Do not have names
- Formal terms to describe SQL concepts are:
- Tables are named relations
- Columns are called attributes
- Rows are known as tuples
- Structured Query Language (SQL) is a high-level language that allows you to access and maintain data stored in relational databases, such as DB2 and SQL/DS. Products like FOCUS, SAS and Oracle, which create and maintain relational tables, also use SQL. With SQL you do not have to be concerned with how the data is stored, or how it is retrieved.
- DDL (Data Definition Language) allows you to define new tables in the database. You can define and modify the database online without special utility programs.
- DML (Data Manipulation Language) is used to create reports and maintain tables.
- The Authorization Language provides control of database resources. Authority is controlled by the data administration area through SQL commands.
- When coding SQL statements, follow these syntax rules:
- Variables and keywords must be separated by one or more spaces.
- Parentheses are often required around a keyword's values. Commas must separate values in a list.
- Spacing around commas, parentheses and operators is for readability. Spaces are not required.
- Character strings are sequences of characters that begin and end with ' or ". Use the same character (' or ") for the beginning and ending delimiter.
Unit 2. Selecting Columns
This unit begins the discussion of the specific details of writing SQL queries.
After completing this unit, you should be able to:
- Write a SQL statement to view a table
- Write a SQL statement that produces a report showing specified columns form a table
- Add a qualifier to a table name to view a table that is not prefixed by your authorization id.
Topic 2.1: Viewing Tables
*A Easy and Frequent QueryOne of the simplest, and most common queries is to view a complete table. To do this, all you need is the name of the table.
For example, the statement
SELECT * FROM STUDENT
requests all columns and all rows for the table named STUDENT.
*The Resulting Report
Here is the report for the query SELECT * FROM STUDENT.
STUDENTNO | FIRSTNAME | MIDINIT | LASTNAME | MAJOR | SEX |
---|---|---|---|---|---|
S10 | GERALD | A | ATTRIX | MATH | M |
S20 | PHYLISS | A | LUTZ | PHYSICS | F |
S30 | DELBERT | A | SPRAY | MATH | M |
S40 | JAMES | E | SHUE | M | |
S50 | BEVERLY | A | BOGGS | HISTORY | F |
S60 | DAVE | I | HICKS | MATH | M |
S70 | JACKIE | I | RICTER | BUSINESS | F |
S80 | MARGE | E | GILHOULY | F | |
S90 | IRMA | A | SPRAY | HOME ECON | F |
INSSN DEPTNO IFIRSTNAME IMIDINIT ILASTNAME SALARY EXEMPTIONS
————————— ————— ———————————— ———————— ————————— —————— ——————————
334838999 MTH ELIZABETH A SONS 35000 3
393209494 MTH ALYCE P SEEGERS 40000 3
382949898 ENG STEWART I GOTZ 31000 4
893993999 ENG JOHN A RABBET 41000 0
998329399 HST THOMAS A SEEGERS 40000 2
*The INSTRUCT Report
The report above is the result of the statement SELECT * FROM INSTRUCT
If the table is very large, you may not want to see it all. Look now at limiting your request to specific columns.
Topic 2.2: Selecting Columns
*Specifying ColumnsColumns are identified by naming each one in the SELECT statement using the following format
SELECT column1, column2, column3, ... FROM tablename
where:
column1, column2 and column3 are the names of the columns you want printed.
tablename is the name of the table that contains the columns.
Each column appears in the order specified, not in the order they are listed in the table.
*Columns in Specified Order
The SQL statement
SELECT INSSN, IFIRSTNAME, ILASTNAME FROM INSTRUCT
gives the result shown. Notice the columns are in the order specified by the request.
INSSN IFIRSTNAME ILASTNAME
————————— ———————————————————— —————————————————————————
334838999 ELIZABETH SONS
393209494 ALYCE SEEGERS
382949898 STEWART GOTZ
893993999 JOHN RABBET
998329399 THOMAS SEEGERS
*The Resulting Report
This report results from the query SELECT FIRSTNAME, LASTNAME FROM STUDENT. As you can see, only two columns get selected.
FIRSTNAME LASTNAME
———————————————————— —————————————————————————
GERALD ATTRIX
PHYLISS LUTZ
DELBERT SPRAY
JAMES SHUE
BEVERLY BOGGS
DAVE HICKS
JACKIE RICTER
MARGE GILHOULY
IRMA SPRAY
*All Rows Are Retrieved
When you enter a SELECT statement, it retrieves the rows that meet the search condition. It retains all those rows in the final result table unless you limit the rows.
For example: SELECT CLASSNO, SECTIONNO FROM ROSTER
gives this result:
CLASSNO | SECTIONNO |
E01 | 001 |
E01 | 001 |
E01 | 001 |
E01 | 002 |
E01 | 002 |
E01 | 002 |
E02 | 001 |
Duplicate rows appear unless you limit the rows.
Topic 2.3: Eliminating Duplicate Rows
*Using DISTINCT
Duplicate rows can be eliminated from the report by simply adding the word DISTINCT in front of the columns selected.
For example:
SELECT DISTINCT CLASSNO, SECTIONNO FROM ROSTER
gives this result:
CLASSNO | SECTIONNO |
E01 | 001 |
E01 | 002 |
E02 | 001 |
M01 | 001 |
M02 | 001 |
M02 | 002 |
M03 | 001 |
H01 | 001 |
P01 | 001 |
In this case, duplicate rows are eliminated.
*Table Qualifiers
Before you move on to the next unit, let's look a little more at table names.
In all the examples so far, you've simply entered a name, like ROSTER. In some cases, you must enter a qualifier to access the table.
For example, the request below uses a qualified name for the table.
SELECT PROJNO, PROJNAME FROM DSN8.TPROJ
*Authorization ID Prefix
A qualifier, like DSN8, is required if the table name is not prefixed by your authorization id. That is, if you leave off the qualifier it is assumed that the table starts with your id.
For example, if your authorization id is ZTCB03, and you entered the following SELECT statement:
SELECT * FROM ROSTER
the table name is assumed to be ZTCB03.ROSTER.
This makes it simpler to access tables that you create. In other words, when accessing a table that has a prefix other than your own, you must prefix the table name.
Topic 2.4: Unit 2 Summary
The following topics were covered in this unit:- The statement SELECT * FROM tablename views a complete table. This statement prints all columns and all rows from the named table.
- You may identify specific columns by naming each one in the SELECT statement using this format
SELECT column1, column2, column3, ...
FROM tablename
column1, column2 and column3 are the names of the columns you want printed. tablename is the name of the table that contains the columns.
- Duplicate rows can be eliminated from a report by simply adding the word DISTINCT in front of the columns selected. For example,
SELECT DISTINCT CLASSNO, SECTIONNO FROM ROSTER
- Table names require a qualifier if they are not prefixed by your authorization id. If you leave off the qualifier, it is assumed that the table name starts with your id.
For example, assume your authorization id is ZTCB03. If you enter the table name ROSTER, it is interpreted as ZTCB03.ROSTER. However, the qualified name DSN8.ROSTER does not have your authorization id added to it.
Unit 3. Selecting and Sorting Rows
In the previous unit, you learned the SQL statements to select columns. Consider now the rows in the table you are viewing. In this unit, you will learn how to sort the rows in your table.
You will also learn how to sort the rows once they're selected.
After completing this unit, you should be able to:
- Write a SQL statement that limits the rows displayed in the resulting report
- Write a SQL statement that compares numeric and non-numeric data
- Explain the requirements for character columns and numeric columns
- Use options with the WHERE clause to further define the data you want to display
- Use clauses with multiple search conditions
- Use the ORDER BY clause to specify the order in which rows appear
- Special multiple sort fields in the ORDER BY clause
Topic 3.1: Selecting Rows
*Indicating Specific RowsIn some cases it is not necessary to list all the rows in your report. Let's see how you indicate specific rows.
Limit the rows selected using the WHERE clause to determine which rows to include in the report. The basic format of the WHERE clause is
WHERE expression1 operator expression2
This clause is a comparison between two values or expressions.
*Valid Operators
The type of comparison is controlled by the operator. The following is a list of valid operators.
Type of Comparison | Operator |
---|---|
Equal to | = |
Not equal to | <> |
Less than | < |
Less than or equal to | <= |
Greater than | > |
Greater than or equal to | >= |
The logical not character can cause errors in statements passed from one DBMS to another. It is therefore preferable to use not equal to.
*Using the WHERE Clause
Here are some examples of valid WHERE clauses.
WHERE DEPTNO = 'MTH'
Only rows that have a department number (DEPTNO) equal to mathematics (MTH) are selected.
Notice when you compare a value that has letters or letters and numbers, it must be enclosed in single quotes.
WHERE SALARY >= 30000
Only rows that have a salary (SALARY) greater than or equal to 30000 are selected.
*Reporting Results
This sample request with the WHERE clause...
SELECT INSSN, IFIRSTNAME, ILASTNAME FROM INSTRUCT
WHERE DEPTNO = 'MTH'
Results in...
INSSN | IFIRSTNAME | ILASTNAME |
334838999 | ELIZABETH | SONS |
393209494 | ALYCE | SEEGERS |
These are the two instructors that have a department number equal to MTH.
Topic 3.2: Comparing Numeric and Non-Numeric Data
*Non-Numeric Values Must Match Exactly
One thing to keep in mind is how SQL does comparisons of non-numeric and numeric data. For example, consider this WHERE clause:
WHERE STATE_BORN = 'N. CAROLINA'
SQL, not knowing the difference, applies all search criteria exactly as you state them. Any data stored in the table as NORTH CAROLINA, and not N. CAROLINA, is not found.
SQL does a character-by-character comparison for non-numeric data.
Numeric values, however, behave a little differently.
*Numeric Values Can Be Equivalent
While non-numeric values have to match character-for-character, numeric values need only to be equivalent. The WHERE clauses
WHERE PRICE = 2000.00 and WHERE PRICE = 2000
select the same data. Any row with 2000 or 2000.00 for PRICE is found by both WHERE clauses. Matching on a numeric column is based on equivalent values and 2000 = 2000.00.
Topic 3.3: Character and Numeric Columns
*Check the Data TypeWhen a table is created, each column is designated as being either a character column or a numeric column. A character column can contain any kind of data, but numeric columns must contain numeric values.
When you code a WHERE clause, therefore, you should be aware of whether a column is a character or numeric column.
It is obvious that the PRES_NAME column is a character column, while a column called SALARY is likely to be numeric. Some columns, such as ACCOUNT, could be either.
It is always advisable to check the data type of questionable columns before you use them.
Topic 3.4: Options for the WHERE Clause
*Special WHERE OptionsSQL provides four special options that can be used in the WHERE clause.
IN | to specify a list of values to be selected from |
BETWEEN | to specify a range of values |
LIKE | to do a partial search for a character string within a column value |
NULL | to search for columns that have (or have not) been assigned values |
Let's see how these options work.
IN
The format of the IN option is
expression IN(value1, value2, ...)
An example is
WHERE DEPTNO IN('ENG', 'MTH', 'PHY')
This example selects only rows that have one of the three listed values of DEPTNO.
BETWEEN
The format of the BETWEEN option is
expression <NOT> BETWEEN expression1 AND expression2
An example is
WHERE SALARY BETWEEN 20000 AND 35000
Notes:
- The symbols < > indicate that the enclosed information may optionally be coded
- The limits specified by expression1 and expression2 are inclusive
- Always put the LOWER value first
- Link the two values with AND
NULL
The format of the NULL option is
columnname IS <NOT> NULL
If a column has no value currently assigned to it, this is called a null value. This option checks for a null value. An example is
WHERE FIRSTNAME IS NOT NULL
This statement selects only rows that do not have a null value assigned to the column FIRSTNAME.
LIKE
LIKE is a special option enabling you to search a column for partial values. The format is
WHERE fieldname <NOT> LIKE 'string'
The quoted string on the right side of LIKE is called the pattern. The pattern may contain any character string, with special meanings for the characters _ and %.
_ (underscore) represents any character.
% (percent sign) represents any string of zero or more characters.
Let's see some examples.
*Special Characters
WHERE LASTNAME LIKE 'BEL%' selects any students whose last name begins with the characters BEL.
WHERE LASTNAME LIKE '%BEL' selects any student whose last name ends with the characters BEL.
WHERE LASTNAME LIKE '%BEL%' selects any students that have the characters BEL anywhere in their name.
WHERE LASTNAME LIKE '_BEL%' selects any students who have the characters BEL in the second, third and fourth positions of their last name.
*Resulting Report
INSSN | DEPTNO | IFIRSTNAME | IMIDINIT | ILASTNAME |
334838999 | MTH | ELIZABETH | A | SONS |
393209494 | MTH | ALYCE | P | SEEGERS |
998329399 | HST | THOMAS | A | SEEGERS |
This is the report produced by the request
SELECT * FROM INSTRUCT WHERE ILASTNAME LIKE 'S%'.
Now, what if you need to search a column for a character string that includes an underscore or a percent sign?
*The ESCAPE Clause
Use the ESCAPE clause and an escape character with the LIKE option to search for the _ or % characters. The format is
WHERE fieldname <NOT> LIKE 'string' <ESCAPE 'character'>
The following WHERE clause selects any salary increase that begins with the string AVERAGE and ends with 15%.
WHERE SALARY_INC LIKE 'AVERAGE%15*%' ESCAPE '*'
- The ESCAPE clause defines the escape character as *.
- The first % represents any string of characters, as usual.
- The second % is preceded by the escape character *, so it is interpreted as part of the search string.
Topic 3.5: Multiple Search Conditions
*Combining Search ConditionsThe WHERE clauses you've seen so far contained a single search condition. Consider now clauses with multiple search conditions. You can combine multiple search conditions by connecting them with an OR or AND.
For example, the following request limits the report to include only instructors in the math department who have a salary greater than 35000.
WHERE DEPTNO = 'MTH' AND SALARY > 35000
The next request limits the report to include only instructors in the math department or instructors in other departments that are making more than 35000.
WHERE DEPTNO = 'MTH' OR SALARY > 35000
STUDENTNO FIRSTNAME LASTNAME
————————— ———————————————————— ————————
S10 GERALD ATTRIX
S30 DELBERT SPRAY
S60 DAVE HICKS
*Resulting Report
Above is the report listing the male students who are MATH majors.
If more than two search conditions are used and specified by an AND or OR keyword, use parentheses to make sure the intended condition is not interpreted another way. The next screen has an example.
*Using Parentheses for Clarity
This WHERE clause
WHERE (DEPTNO='MTH' AND SALARY > 35000)
OR
(EXEMPTIONS > 5)
selects the row of each instructor that satisfies at least one of the following conditions:
- The instructor is in the math department AND has a salary greater than 35000.
- The instructor's exemptions are greater than 5.
If the parentheses are moved or not present, the meaning of the WHERE clause changes significantly.
*Parentheses Placement Gives Different Meanings
WHERE (DEPTNO='MTH' AND SALARY > 35000)
OR
(EXEMPTIONS > 5)
The WHERE clause on the bottom has the same conditions as the one on the top. However, the placement of the parentheses gives the clause a very different meaning.
The clause on the right selects the row of each instructor that satisfies both of the following conditions:
- The instructor is in the math department.
- The instructor's salary is greater than 35000 or the instructor's exemptions are greater than 5.
*Processing Order Without Parentheses
WHERE (DEPTNO='MTH' AND SALARY > 35000)
OR
(EXEMPTIONS > 5)
When you don't use parentheses, SQL uses this order in processing search conditions: NOT
AND
OR
Consequently,
DEPTNO='MTH' AND SALARY > 35000 OR EXEMPTIONS > 5
behaves as if there were parentheses enclosing the first two conditions.
Topic 3.6: Sorting Rows
*The ORDER BY ClauseThe ORDER BY clause allows you to specify the order in which rows appear. If you use ORDER BY, it must be the last clause in the SELECT statement.
The format is
ORDER BY columnname <, columnname ...>
The default order is ascending (ASC). To list the columns in descending order, you must specify DESC following the column name.
*Row Order Is Unpredictable
If the ORDER BY is not used, the order in which the rows appear on the report is unpredictable. In other words, do not assume they will display on the report in the order stored in the table.
When using ORDER BY, sorting rules are applied.
- For a purely alphabetic column, rows are sorted in alphabetical order from A to Z, just as in the dictionary.
- For a column containing alphabetic and numeric characters, alphabetic characters are considered lower than numeric. For example: 'ABC' comes before 'A99', which comes before '001'
- Other characters, such as spaces or dollar signs are considered lowest of all. For example: '$1' is less than 'A1', which is less than '1'.
Note: These sorting rules may vary, depending on the platform where you're using SQL.
*An Example of Row Order
For example, this SQL statement:
SELECT INSSN, IFIRSTNAME, ILASTNAME
FROM INSTRUCT
WHERE DEPTNO='MTH'
gives this result:
INSSN IFIRSTNAME ILASTNAME
-------------- --------------- ------------
334838999 ELIZABETH SONS
393209494 ALYCE SEEGERS
Notice the instructors are listed in ascending order by social security number (INSSN).
*Another Example
SELECT INSSN, IFIRSTNAME, ILASTNAME
FROM INSTRUCT
WHERE DEPTNO='MTH'
gives this result:
INSSN IFIRSTNAME ILASTNAME
-------------- --------------- ------------
393209494 ALYCE SEEGERS
334838999 ELIZABETH SONS
Here the instructors are listed in descending order by social security number (INSSN).
Topic 3.7: Multiple Sort Fields
*Using Multiple Sort FieldsYou can specify multiple sort fields in the ORDER BY clause. For example, the SQL request
SELECT * FROM SECTION
WHERE CLASSNO LIKE 'E__'
ORDER BY CLASSNO, SECTIONNO
gives the result:
CLASSNO | SECTIONNO | INSSN | ROOMNO |
---|---|---|---|
E01 | 001 | 893993999 | 301 |
E01 | 002 | 893993999 | 302 |
E02 | 001 | 382949898 | 205 |
Notice the rows are listed in order by section number (SECTIONNO) within class number (CLASSNO).
*Using Order Numbers
Instead of column names, you can also use numbers that correspond to the order in which rows are selected. The following example orders rows by CLASSNO and SECTIONNO.
SELECT CLASSNO, SECTIONNO, INSSN FROM SECTION
WHERE CLASSNO LIKE 'E__'
The 1 refers to the first column selected by the query. CLASSNO, therefore, becomes the primary sort field.
The 2 refers to the second column selected by the query. SECTIONNO, thus, becomes the secondary sort field.
INSSN DEPTNO IFIRSTNAME IMIDINIT ILASTNAME
————————— —————— ———————————————————— ———————— —————————————————————
382949898 ENG STEWART I GOTZ
893999999 ENG JOHN A RABBET
998329399 HST THOMAS A SEEGERS
334838999 MTH ELIZABETH A SONS
393209494 MTH ALYCE P SEEGERS
*Ascending by Alpha and SSN
The report lists departments in ascending alphabetical order. Within a department, the entries are in ascending order by social security number.
Topic 3.8: Unit 3 Summary
The following points were covered in this unit:- Limit the rows selected in your query using the WHERE clause to determine which rows to include. The basic format of the WHERE clause is
WHERE expression1 operator expression2
- The type of comparison performed by a WHERE clause is controlled by the operator used. Valid operators are
Equal to =
Not equal to <>
Less than <
Less than or equal to <=
Greater than >
Greater than or equal to >=
- Use the IN option of the WHERE clause to specify a list of values to be selected from. The format of the IN option is
WHERE expression IN(value1, value2, ...)
- The BETWEEN option of the WHERE clause specifies a range of values. The format of the BETWEEN option is
WHERE expression <NOT> BETWEEN expression1 AND expression2
- Use the NULL option in the WHERE clause to search for columns that have (or have not) been assigned values. The format for the NULL option is
columnname IS <NOT> NULL
- The LIKE option of the WHERE clause searches a column value for partial values. The format of the LIKE option is
WHERE fieldname <NOT> LIKE 'string'
string may contain any character string. Use a percent sign (%) to represent any string of zero or more characters. Use an underscore (_) to indicate any one character.
- Use the ESCAPE clause and an escape character with the LIKE option to search for the _ or % characters. The format is
WHERE fieldname <NOT> LIKE 'string' <ESCAPE 'character'>
- Combine multiple search conditions by connecting them with AND or OR. All conditions connected with AND must be met to satisfy the search criteria. Meeting any of the conditions connected with OR satisfies the search criteria.
- Use parentheses to group conditions when specifying two or more search conditions. The parentheses ensure that the conditions in the WHERE clause are interpreted as you want.
When you don't use parentheses, SQL uses this order in processing search conditions: NOT, AND, OR.
- The ORDER BY clause allows you to specify the order in which rows appear for your SQL queries. The format is
ORDER BY columnname <, columnname ...>
- Ascending order is assumed for the column specified. DESC following the column name changes the order to descending.
- Multiple sort fields are specified by listing additional column names. For example
ORDER BY CLASSNO, SECTIONNO
lists rows in order by SECTIONNO within CLASSNO.
- Instead of column names, you can also use numbers that correspond to the order in which rows are selected. The following example orders rows by CLASSNO and SECTIONNO.
SELECT CLASSNO, SECTIONNO, INSSN FROM SECTION
WHERE CLASSNO LIKE 'E__'
Unit 4. SQL Built-In Functions
This unit discusses basic query techniques. In this section, you learn about performing calculations with functions that are built into SQL. SQL's built-in functions can be used to calculate values which can then be displayed.
You'll also see how your report information can be grouped.
After completing this unit, you should be able to:
- Use built-in functions to perform calculations on groups of values
- Use the GROUP BY query to create a summary table
- Use the HAVING clause to qualify conditions to groups
Topic 4.1: Built in Functions
*Rows Used for CalculationsColumn values of selected rows only are the basis for the calculations. That is, all rows that satisfy the WHERE clause can be used with these built-in functions. Calculation is done for all
Keywords | Selected Rows |
SUM (columnname) | the total value |
MIN (columnname) | the minimum value |
AVG (columnname) | the average value |
MAX (columnname) | the maximum value |
COUNT (*) | the number of selected rows |
COUNT (DISTINCT columnname) | the number of unique values in the selected set of rows |
*Salary Report
For example, the following request:
SELECT AVG(SALARY), MAX(SALARY), COUNT(*)
FROM INSTRUCT
gives this result:
COL1 COL2 COL3
——————————— ————————————— ———————————
37400.0000000 41000.00 5
This result is a three-column report showing the average salary, highest salary and the total number of instructors in the whole table.
COL1 COL2 COL3
————————————— ————————————— ———————————
37400.0000000 41000.00 5
*Groups of Values
By their nature, built-in functions apply to groups of values.
In the example listed below, a one-line report results even though 5 rows are in the table.
SELECT AVG(SALARY), MAX(SALARY), COUNT(*)
FROM INSTRUCT
*Accuracy of Averages
In addition to NULLS, you need to be concerned about the accuracy of averages calculated by SQL.
For example, the YRS_SERVED column is an INTEGER (i.e., no decimal places). The average of 2, 3, and 3 is 2.667, but SQL calculates it as 2. Instead of rounding 2.667 up to 3, it just drops any decimals.
The solution is to convert YRS_SERVED to a decimal column. This can be done on the SELECT statement.
Multiplying YRS_SERVED by 1.0 produces a decimal column in the table. The decimal average will be 2.6666666666666.
SELECT AVG(YRS_SERVED) = 2
SELECT AVG(1.0*YRS_SERVED) = 2.6666666666666
*Excluding Rows
Built-in functions apply to the rows selected. But how do you exclude rows in the resulting report?
That's right... use your old friend the WHERE clause.
*Using WHERE to Exclude Rows
The WHERE clause can be used to exclude certain rows from the result. For example, if you have some non-salaried instructors that have a zero value in the salary field, you might want to exclude them.
Here's a request that does that.
SELECT AVG(SALARY) FROM INSTRUCT
WHERE SALARY <> 0
The WHERE clause eliminates zero values from being reflected in the resulting report. Remember, null values are automatically excluded.
COL1 COL2 COL3
—————————————————— ————————————— ———————————
75000.00 37500.0000000 2
WHERE Can Control Request Items
Here's another example of using a WHERE clause to control the items involved in the request.
SELECT SUM(SALARY), AVG(SALARY), COUNT(*)
FROM INSTRUCT
WHERE DEPTNO = 'MTH'
The one-line report displays the accumulated salary, average salary and the number of instructors in the math department. The WHERE clause eliminates all other departments from the request.
COL1 COL2
————————————— —————————————
41000.00 31000.00
*A Report with MAX and MIN
This report results from the query. The maximum (41000) and minimum (31000) salaries in the English department display.
When producing reports from built-in functions, you probably feel the urge to
- Change column titles to something more meaningful
- Reduce the number of significant digits to the right of the decimal point
For example, this report...
DEPTNO COL1 COL2 COL3 COL4
—————— ————————————— ————————————— —————————————————— ———————————
ENG 41000.00 31000.00 36000.0000000 2
MTH 40000.00 35000.00 37500.0000000 2
can be changed to look like this...
DEPTNO MAX MIN AVERAGE COUNT
—————— ————————————— ————————————— —————————————————— ———————————
ENG 41000.00 31000.00 36000.00 2
MTH 40000.00 35000.00 37500.00 2
*More Report Changes
These changes are possible, but they are beyond the scope of this course.
Topic 4.2: Grouping Report Information
*Using GROUP BYIn the last section, you learned about the built-in functions available with SQL. You saw that when a built-in function is used, it applies to all rows selected for the report.
If that one-line table really didn't provide enough detail, you can create a summary table using GROUP BY in your query.
*Putting Data in Specific Groups
The information for your report can be put into specific groups using the GROUP BY clause, written as GROUP BY columnname
For example, the GROUP BY clause in this request:
SELECT DEPTNO, SUM(SALARY), AVG(SALARY), COUNT(*) FROM INSTRUCT
GROUP BY DEPTNO
ORDER BY DEPTNO
returns one row for each value of DEPTNO in the INSTRUCT table.
DEPTNO COL1 COL2 COL3
—————— —————————————————— —————————————————— ———————————
ENG 72000.00 36000.0000000 2
HST 40000.00 40000.0000000 1
MTH 75000.00 37500.0000000 2
*Processing a SELECT with GROUP BY
SELECT DEPTNO, SUM(SALARY), AVG(SALARY), COUNT(*) FROM INSTRUCT
GROUP BY DEPTNO
ORDER BY DEPTNO
First all the rows and columns are selected based on the SELECT and any WHERE clause. This intermediate result is kept in a temporary table:
DEPTNO | SALARY |
---|---|
MTH | 35000 |
MTH | 40000 |
ENG | 31000 |
ENG | 41000 |
HST | 40000 |
*Temporary Tables for Each GROUP
SELECT DEPTNO, SUM(SALARY), AVG(SALARY), COUNT(*) FROM INSTRUCT
GROUP BY DEPTNO
ORDER BY DEPTNO
Then the system creates a temporary table for each GROUP and puts the appropriate rows in each table.
DEPTNO | SALARY | DEPTNO | SALARY | DEPTNO | SALARY |
---|---|---|---|---|---|
MTH | 35000 | ENG | 31000 | HST | 40000 |
MTH | 40000 | ENG | 41000 |
*Calculations and Output
SELECT DEPTNO, SUM(SALARY), AVG(SALARY), COUNT(*) FROM INSTRUCT
GROUP BY DEPTNO
ORDER BY DEPTNO
Now the sum, average and count for each group can be calculated and output to the new table.
DEPTNO | COL1 | COL2 | COL3 |
---|---|---|---|
ENG | 72000.00 | 36000.0000000 | 2 |
HST | 40000.00 | 40000.0000000 | 1 |
MTH | 75000.00 | 37500.0000000 | 2 |
The ORDER BY statement is not required. It only ensures the order of the rows is ascending order by DEPTNO.
The order in which the clauses are entered is critical. Specifically, the GROUP BY must come before the ORDER BY and after the FROM.
*Items Must be Built In or Named
When you use the GROUP BY option, all the items included on the SELECT list must be either built-in functions or named in the GROUP BY clause.
For example, this request is invalid.
SELECT DEPTNO, SALARY, AVG(SALARY), COUNT(*)
FROM INSTRUCT
GROUP BY DEPTNO
ORDER BY DEPTNO
SALARY is on the SELECT list, but it is not part of a built-in function. Since SALARY is not part of the GROUP BY, this query generates an error.
Topic 4.3: Qualifying Conditions
*The HAVING ClauseThe HAVING clause allows you to qualify conditions to groups. The purpose of the HAVING clause is to further define the group.
Only table data which satisfies the condition for the specified group is returned.
HAVING condition is the clause's format.
*Result of Using HAVING
For example, the following request:
SELECT DEPTNO, SUM(SALARY), AVG(SALARY), COUNT(*) FROM INSTRUCT
GROUP BY DEPTNO
HAVING AVG(SALARY) > 36000
ORDER BY DEPTNO
results in:
DEPTNO | COL1 | COL2 | COL3 |
---|---|---|---|
HST | 40000.00 | 40000.0000000 | 1 |
MTH | 75000.00 | 37500.0000000 | 2 |
Notice the value of average salary (in COL2) for department is greater than 36000. The English department is not on the report because its average salary of 36000.0000000 does not satisfy the HAVING clause.
*Rules for Using HAVING
When using the HAVING clause with GROUP BY
- HAVING must immediately follow the GROUP BY clause.
- HAVING can only compare built-in functions, not individual columns.
DEPTNO MAX MIN AVERAGE COUNT
—————— ————————————— ————————————— —————————————————— ———————————
ENG 41000.00 31000.00 36000.00 2
MTH 40000.00 35000.00 37500.00 2
*The Resulting Report
Here is the report for the request. Note that the history department does not appear. The HAVING clause qualifies the groups to those with more than one instructor. The history department does not satisfy this condition.
Topic 4.4: Unit 4 Summary
This unit covered the following points:- SQL has built-in functions that you can use to calculate values which can then be displayed. These functions are
SUM (columnname) | the total value |
MIN (columnname) | the minimum value |
AVG (columnname) | the average value |
MAX (columnname) | the maximum value |
COUNT (*) | the number of selected rows |
COUNT (DISTINCT columnname) | the number of unique values in the selected set of rows |
- The calculated values are based on column values of selected rows. By default, all rows in a table are included in the calculations unless a null value was assigned.
- Exclude rows not wanted in the report with the WHERE clause.
- Information selected for your report can be put into specific groups with the GROUP BY clause. The format for this clause is GROUP BY columnname
- When you use both GROUP BY and ORDER BY clauses, GROUP BY must come before ORDER BY and after the FROM clause. You are not required to use an ORDER BY clause.
- When you use the GROUP BY option, all items included on the SELECT list must be either built-in functions or named in the GROUP BY clause.
- The HAVING clause contains a condition to qualify groups. Only results that satisfy the condition for the specified groups are returned. When used, the HAVING clause usually follows the GROUP BY clause.
Unit 5. Combining Table Information
The previous unit showed you the SQL statements to group information generated for your report.
Now it's time to see how you can perform calculations with the values found in the columns.
In this unit, you will also see how to combine information from two or more tables.
After completing this unit, you should be able to:
- Specify column names and values in an arithmetic expression to calculate column values
- Combine column values from more than one table to produce a report
- Code several SQL statements using clauses presented so far in this course
Topic 5.1: Calculating Column Values
*Using Arithmetic ExpressionsYou can specify column names and values in an arithmetic expression.
Form expressions using the following operators for these operations:
+ Addition
- Subtraction
* Multiplication
/ Division
More than one of these operators can be combined in one expression. When you do so, use parentheses to ensure the order in which the operation is performed.
IFIRSTNAME ILASTNAME COL1
—————————— ————————— ————
ELIZABETH SONS 37800.0000
ALYCE SEEGERS 43200.0000
STEWART GOTZ 33480.0000
JOHN RABBET 44280.0000
THOMAS SEEGERS 43200.0000
*Column Calculations
For example this request:
SELECT IFIRSTNAME, ILASTNAME, SALARY * 1.08 FROM INSTRUCT
gives the result shown:
The COL1 value displayed on the report is calculated by multiplying the current salary stored in the table by 1.08.
*Operators Cannot Be Used on Character Data
You cannot use any of these operators on character data even if the field contains only numbers. That is, arithmetic operations can only be used on numeric fields.
To make this clearer, you'll look at the SQL TABLE statement that defines the INSTRUCT table. (The TABLE statement, as well as other SQL statements for building tables, are covered in another SQL course.)
*Arithmetic Operations Only on Numeric Fields
Below is the SQL TABLE statement for the INSTRUCT table.
CHAR(9) describes INSSN in the INSTRUCT table to be 9 characters of data. Even though INSSN contains all numbers, you cannot use it in a calculation. Thus, the rule: Arithmetic operations can only be used on numeric fields.
CREATE TABLE INSTRUCT
(INSSN | CHAR(9) | NOT NULL, |
DEPTNO | CHAR(3) | NOT NULL, |
IFIRSTNAME | VARCHAR(20), | |
IMIDINIT | CHAR(1), | |
ILASTNAME | VARCHAR(25) | NOT NULL, |
SALARY | DECIMAL(10,2) | NOT NULL, |
EXEMPTIONS | SMALLINT | NOT NULL) |
| COMM | SALARY | RESULT |
+————————————+————————————————+——————————————+
| 200.00 | 15000.00 | 15200.00 |
| - | 14520.00 | - |
NULL Values
Another rule to keep in mind when you use arithmetic expressions in a query is:
If any column in an arithmetic expression contains a NULL value, the result will ALWAYS BE NULL.
For example: SALARY + COMM
IFIRSTNAME ILASTNAME COL1
—————————— ————————— ————
ELIZABETH SONS 2916.6667000
ALYCE SEEGERS 3333.3333000
STEWART GOTZ 2583.3333000
JOHN RABBET 3416.6667000
THOMAS SEEGERS 3333.3333000
*Resulting Report
This is the resulting report. Note the heading COL1 is used for the calculated field.
*Using Arithmetic Operations with Built-In Functions
The arithmetic operations can also be used with the built-in functions covered earlier.
For example, the request below:
SELECT DEPTNO, SUM(SALARY * 1.08), AVG(SALARY * 1.08)
FROM INSTRUCT
GROUP BY DEPTNO
ORDER BY DEPTNO
gives this result:
DEPTNO COL1 COL2
—————— —————————————————— ——————————————————
ENG 77760.0000 38880.000000
HST 43200.0000 43200.000000
MTH 81000.0000 40500.000000
*Ordering Output Data
To order the output table data by the calculated column, you enter the number that corresponds to the order in which the ordering column is selected. For example:
SELECT DEPTNO, SUM(SALARY * 1.08), AVG(SALARY * 1.08)
FROM INSTRUCT
GROUP BY DEPTNO
ORDER BY 2
sorts the rows by SUM(SALARY * 1.08) and gives this result:
DEPTNO COL1 COL2
—————— —————————————————— ——————————————————
HST 43200.0000 43200.000000
ENG 77760.0000 38880.000000
MTH 81000.0000 40500.000000
IFIRSTNAME ILASTNAME SALARY
—————————— ————————— ——————
ALYCE SEEGERS 43200.0000
JOHN RABBET 44280.0000
THOMAS SEEGERS 43200.0000
*Using a Calculated Value in a WHERE Clause
You can also use a calculated value in a WHERE clause. For example, this request:
SELECT IFIRSTNAME, ILASTNAME, SALARY * 1.08 FROM INSTRUCT
WHERE SALARY * 1.08 > 40000
gives the result shown.
The SALARY field for each instructor is multiplied by 1.08. If the result is greater than 40000 the instructor is included in the report.
*The AS Clause
The AS clause allows you to give meaningful names to result columns you've created.
The format for the AS clause is
SELECT arithmetic-expression AS new-column-name
*A More Meaningful Name
Using the example shown previously,
SELECT IFIRSTNAME, ILASTNAME, SALARY * 1.08 AS NEWSALARY
FROM INSTRUCT
the expression SALARY * 1.08 has the name NEWSALARY.
IFIRSTNAME | ILASTNAME | NEWSALARY |
---|---|---|
ELIZABETH | SONS | 37800.0000 |
ALYCE | SEEGERS | 43200.0000 |
STEWART | GOTZ | 33480.0000 |
JOHN | RABBET | 44280.0000 |
THOMAS | SEEGERS | 43200.0000 |
*Use New Column Name in ORDER BY
Once the result column has a name, you can use that column in an ORDER BY clause to sort the table data.
You can also use the AS clause to rename the column generated by the GROUP BY clause.
Let's look at a few examples.
*Add ORDER BY
Adding an ORDER BY clause,
SELECT IFIRSTNAME, ILASTNAME, SALARY * 1.08 AS NEWSALARY
FROM INSTRUCT
ORDER BY NEWSALARY
displays the table in NEWSALARY order.
IFIRSTNAME | ILASTNAME | NEWSALARY |
---|---|---|
STEWART | GOTZ | 33480.0000 |
ELIZABETH | SONS | 37800.0000 |
ALYCE | SEEGERS | 43200.0000 |
THOMAS | SEEGERS | 43200.0000 |
JOHN | RABBET | 44280.0000 |
AS More Meaningful than GROUP BY
When you use the GROUP BY clause to create a result table, the derived column name defaults to a generic name like COL1.
When you use the AS clause to name the column, the result table output is more meaningful, as shown in the following example.
*Requested Output
The output of
SELECT DEPTNO, AVG(SALARY) AS AVG_SALARY
FROM INSTRUCT
GROUP BY DEPTNO
ORDER BY DEPTNO
is
DEPTNO | AVG_SALARY |
---|---|
ENG | 36000.00 |
HST | 40000.00 |
MTH | 37500.00 |
Topic 5.2: Combining Tables
*Joining TablesAll of the SELECT clauses covered so far were limited to working with one table. Column values from more than one table can be combined to produce a report.
This is called joining tables.
*Rules for Joining Tables
To join two tables you need to do two things:
1. Enter the names of the tables in the FROM clause. For example:
FROM CLASS, SECTION
2. Enter a WHERE clause that identifies how the tables are to be joined. For example:
WHERE SECTION.SECTIONNO = CLASS.SECTIONNO
The qualifiers are required. SECTIONNO exists in both tables, thus each table must be identified by using the form tablename.columnname in the WHERE clause.
Note that the SELECT works without a WHERE clause. However, use a WHERE clause to make the table joining meaningful.
*Equijoin
Here's a SELECT statement that takes column data from two tables. Class name (CLASSNAME) and class number (CLASSNO) are from the CLASS table. Section number (SECTIONNO) is from the SECTION table.
SELECT CLASSNAME, CLASS.CLASSNO, SECTIONNO
FROM CLASS, SECTION
WHERE CLASS.CLASSNO=SECTION.CLASSNO
This is an example of the most common type of join. It is called an equijoin because the tables are combined based on a common value (CLASSNO) used in an equal comparison.
CLASSNAME CLASSNO SECTIONNO
————————————————————————————————————————————— ——————— —————————
ENGLISH 101 E01 001
ENGLISH 101 E01 002
ENGLISH 102 E02 001
AMERICAN HISTORY H01 001
MATH 101 M01 001
MATH 102 M02 001
MATH 102 M02 002
MATH 103 M03 001
PHYSICS 101 P01 001
*Two Tables, One Report
Above is the report generated by the query
SELECT CLASSNAME, CLASS.CLASSNO, SECTIONNO
FROM CLASS, SECTION
WHERE CLASS.CLASSNO=SECTION.CLASSNO
Information from both tables displays in one report.
*Factors in Joining Tables
Here are several considerations about joining tables.
- Both fields in a join must be numeric data or both fields must be character data.
- There is no requirement that the columns in each table have identical names. However, it is a recommended practice.
- There is no requirement that you use an equal comparison. However, it is the most common.
- The WHERE clause can include multiple conditions. For example,
WHERE STUDENT.STUDENTNO = ROSTER.STUDENTNO AND GRADE = 'A' - It is possible to join more than two tables.
Topic 5.3: Query Practice
*A Recap of SQL ClausesYou learned some basic query techniques in this course. In general, you learned the important features of the SQL SELECT statement.
Here are some of the clauses described in this course. When you specify each of the clauses, you must enter them in this order:
SELECT column1 <,column2>....
FROM tablename1 <,tablename2>...
WHERE expression1 operator expression2
GROUP BY columnname1 <,columnname2>...
HAVING condition
ORDER BY columnname1 <,columnname2>...
*Putting it All Together
Try a simulation that uses what you learned in this course to build a complete SQL request.
Normally when you code an SQL query, you enter all the clauses one after another until the SELECT statement is complete. In the following simulation, you'll be prompted to enter one clause at a time, each clause on a separate screen.
Good luck!
STUDENTNO COL1
————————— ——————————————————
S20 3.000000000
S30 3.000000000
S40 2.800000000
S80 2.800000000
S90 3.000000000
*The Resulting Report
This is the report generated from your query. Students with a grade point average greater than 2.5 display in student number order.
Topic 5.4: Unit 5 Summary
This unit covered the following points:- Column names and constant values can be specified in an arithmetic expression. Expressions can be formed using the following operators.
+ Addition
* Multiplication
- Subtraction
/ Division
- Arithmetic operations can only be used on numeric fields. You cannot use any of these operations on character data even if the field contains numbers. If any column in an arithmetic expression contains a NULL value, the result will always be NULL.
- Arithmetic operations can be used with SQL's built-in functions. For example, SUM(SALARY * 1.08) accumulates each SALARY after being multiplied by 1.08.
- To order the output table data by the calculated column, you enter the number that corresponds to the order in which the ordering column is selected. For example:
SELECT DEPTNO, SUM(SALARY * 1.08), AVG(SALARY * 1.08)
FROM INSTRUCT
GROUP BY DEPTNO
ORDER BY 2
- Calculated values can be used in a WHERE clause. An example is WHERE SALARY * 1.08 > 40000.
- The AS clause allows you to give meaningful names to result columns you've created. The format for the AS clause is
SELECT arithmetic-expression AS new-column-name
- Column values from more than one table can be combined to produce a report. This is known as joining tables.
- Enter the names of the tables to be joined in the FROM clause. The format is
FROM tablename, tablename, ...
- Enter a WHERE clause that identifies how to join the tables. Use qualified names, such as tablename.columnname, when you join tables by the same column.
- An equijoin combines tables based on a common value used in an equal comparison. For example,
WHERE ROSTER.GRADE = GRADE.GRADE
Unit 6. Date And Time Columns
This unit describes how to use date and time data in SQL queries. Topics include
- Selecting date and time columns
- Selecting parts of date and time columns
After completing this unit, you should be able to:
- Select date and time columns to use in SQL statements
Topic 6.1: Selecting Date and Time Columns
DATE and TIME DataIn its most basic form, selecting date and time data is no different than selecting any other kind of data. As you will see, however, columns defined as DATE and TIME provide you with some special capabilities.
For now, let's look at some simple examples that use date and time columns. Then we'll talk about how date and time data differ from numeric and character data.
*The PROJECT Table
Let's take a table called PROJECT. Assume, in addition to columns such as DEPT and PROJNO, that it has three columns of date/time data:
- START_DATE — represents project start date
- END_DATE — represents project end date
- TIMESTAMP — shows when the row of data was entered into the PROJECT table
Note: TIMESTAMP data is primarily used by the system to keep track of when rows are updated. We will not cover TIMESTAMP data in any depth.
Topic 6.1.1: Date Columns
*Date Column FormatSELECT PROJNO,
FROM PROJECT
WHERE PROJNO IN (141, 210, 302)
Here you see two date columns, START_DATE and END_DATE, displayed in their default formats along with the TIMESTAMP column. Dates are stored as 4-byte packed decimal fields,without signs. The internal format is YYYYMMDD. As you'll learn, the external format is selectable.
*The MONTH Keyword
SELECT PROJNO, DEPT, END_DATE
FROM PROJECT
This query answers the question, which projects are scheduled for completion in the current month (October)? As you'll learn, the MONTH keyword extracts only the month value from END_DATE. Other keywords let you extract year or day values.
CURRENT DATE is a system variable that you can access at any time.
*Elapsed Days
SELECT PROJNO, DEPT,
FROM PROJECT
WHERE PROJNO = 255
This query performs simple arithmetic on a date field and tells you how many days Project 255 took to complete.
Other uses include calculating billing due dates and estimating future completion dates if projects go over schedule.
*Built-In Functions for Date Columns
There are a number of built-in functions that work with DATE, TIME and TIMESTAMP data types. Understanding these data types will help you better understand the date and time functions available to you.
Let's talk about DATE columns first.
*Parts of the DATE Column
DATE columns consist of three parts — YEAR, MONTH and DAY in the internal format YYYYMMDD (e.g., 19890814).
Dates are always ascending. That is, today's date is a higher number than yesterday's date. This is good to know for sorting or comparing dates. As you'll see later, you're not bound to the internal format. You can select an external format to suit your needs.
*Single Values
Although dates are stored as single values, they can be treated as three values. You can access the YEAR, MONTH and DAY values separately. You can add and subtract these values from dates or subtract one date from another date.
Topic 6.1.2: Time Columns
*Parts of the TIME ColumnTIME columns consist of three parts — HOURS, MINUTES and SECONDS — in the internal format HHMMSS (e.g., 173516).
For any given day, times are always ascending, meaning that a later time has a higher value than an earlier time. As with dates, this fact aids sorting and comparing times. Also like dates, you are not bound to the internal format and can select an external format to suit your needs.
*Single Values
Although times are stored as single values, they can be treated as three values. You can access the HOUR, MINUTE and SECOND values separately. You can add and subtract these values from times or subtract one time from another time.
TIMESTAMP
TIMESTAMP values tell you the precise moment that each row was written or updated. TIMESTAMP columns consist of seven parts in the internal format YYYYMMDDHHMMSSMICROS.
Additional precision is obtained from the MICROS portion of the TIMESTAMP, which represents microseconds.
As with DATE and TIME columns, you can access the entire value or you can access each part separately. You can subtract TIMESTAMP columns, and any simple duration (such as MM or YYYY), date duration, or time duration can be subtracted from timestamps. You cannot select an output format for timestamp data. The external format is YYYY-MM-DD-HH.MM.SS.MICROS.
*In Summary
- Three data types store date/time values — DATE, TIME, TIMESTAMP
- Date/time columns can be selected
- Date/time columns can be used in WHERE clauses
- A number of built-in functions work with date/time columns
*The Query Result
SELECT DESTINATION,
FROM CRUISES
WHERE DESTINATION = 'BERMUDA'
Here you see the default output formats for date and time columns. Your system default formats may be different.
Of course, you may not want to use the default output formats. Let's see what you can do about this.
Topic 6.1.3: The CHAR Function
*Changing Default FormatsThe CHAR function is a quick and easy way to change the default formats for date/time columns. For example:
SELECT DESTINATION, DEPARTURE_DATE,
FROM CRUISES
WHERE DESTINATION = 'BERMUDA'
produces this:
DEPARTURE
DESTINATION DATE COL1
----------- ---------- ----------
BERMUDA 1989-01-30 01/30/1989
The CHAR function changed the format of DEPARTURE_DATE from YYYY-MM-DD to MM/DD/YYYY. This is called USA format.
*Converts from System Format
The CHAR function allows you to easily convert date and time columns from the system format to one of several international standard formats.
The syntax of CHAR is
CHAR(column_name,format)
Column_name can be any valid DATE or TIME column.
Format can be one of the following: USA, ISO, EUR, JIS or LOCAL. We'll discuss these more in a moment.
If you omit the format type, the date or time format is the one chosen as the installation default.
CHAR Formats
Here are the CHAR formats for your reference.
Date and time output format can be specified in SQL statements, at bind time or when the database is installed. Another course in this series explains the binding process.
We'll use ISO as the default output format in the remainder of this objective.
*In Summary
- The CHAR function changes the default output formats of date and time columns.
- The syntax of the CHAR function is:
CHAR(column_name,format)
- The output format can also be specified at bind time or when the database is installed.
Topic 6.1.4: Date/Time Values in WHERE Clauses
*Don't Worry About the FormatYou can use date and time values in WHERE clauses and not have to worry about the format in which the dates and times are stored. SQL can interpret any of the valid date and time formats. This SELECT statement, for example:
SELECT PROJNO
FROM PROJECTS
WHERE END_DATE = '10/14/1988' OR
END_DATE = '1988-10-31' OR
END_DATE = '29.12.1988'
is perfectly acceptable. As long as the date or time value is enclosed in quotes and written in a valid date/time format, you can mix the different formats in WHERE clauses.
Topic 6.2: Unit 6 Summary
This unit covered the following points:- In its most basic form, selecting date and time data is no different than selecting any other kind of data.
- DATE columns consist of three parts — YEAR, MONTH and DAY in the internal format YYYYMMDD (e.g., 19890814), where
YYYY corresponds to a year value from 0000 to 9999
MM corresponds to a month value from 1 to 12
DD corresponds to a day value from 1 to 31
- TIME columns consist of three parts — HOURS, MINUTES and SECONDS — in the internal format HHMMSS (e.g., 173516), where
HH corresponds to an hour value from 00 to 23
MM corresponds to a minute value from 0 to 59
SS corresponds to a second value from 0 to 59
- TIMESTAMP values tell you the precise moment that each row was written or updated. TIMESTAMP columns consist of seven parts in the internal format YYYYMMDDHHMMSSMICROS.
YYYYMMDD corresponds to YEAR, MONTH and DAY
HHMMSS corresponds to HOUR, MINUTE and SECOND
- The CHAR function allows you to easily convert date and time columns from the system format to one of several international standard formats.
The syntax of CHAR is CHAR(column_name,format)
- You can use date and time values in WHERE clauses and not have to worry about the format in which the dates and times are stored. SQL can interpret any of the valid date and time formats.
Unit 7. Date And Time Data in Queries
This unit describes how to use date and time data in SQL queries. Topics include arithmetic computations on date and time columns.
After completing this unit, you should be able to:
- Select and parse date and time columns
- Perform arithmetic functions on date and time values
Topic 7.1: Parsing Date and Time Columns
*Accessing Parts of Date/Time ColumnsWhile it's all well and good to be able to use and select date and time columns, you can easily imagine situations in which you might want to access just the month of a date column or just the hour of a time column.
Suppose, for example, you want to select all projects that began in December or all projects that were scheduled for completion in 1990. Think about how you might write a WHERE clause to do these tasks when the dates are stored as 19891015, for example. As you'll see, this is easy.
You can access parts of date/time columns. Let's see how.
*Scalar Functions
In an earlier course, you learned about built-in functions like SUM and AVG. These are column functions which produce a single value for a group of rows.
SQL also provides functions called scalar functions. Unlike column functions, scalar functions are based on a single value.
There are several kinds of scalar functions, but we are interested in those that allow you to process date and time columns. We'll call these the date/time functions.
*Using Scalar Functions
The date/time functions are: DAY, MONTH, YEAR, HOUR, MINUTE, SECOND and MICROSECOND.
Scalar functions can be used wherever an expression can be specified in an SQL statement. You'll mostly be using scalar functions in SELECT statements and WHERE clauses, but you can also use them to do date/time arithmetic (covered in the next objective).
Let's look at the date functions DAY, MONTH and YEAR first.
*Parsing Date Columns
The DAY, MONTH and YEAR functions allow you to parse date columns. If, for example, the value in START_DATE is 1989-01-13, you can use DAY, MONTH and YEAR to produce the following:
SELECT PROJNO, DEPT, YEAR(START_DATE)
FROM PROJECT
WHERE MONTH(END_DATE) = MONTH(CURRENT DATE)
*Using Date/Time Functions in WHERE Clauses
Date and time functions can be used in WHERE clauses. You can use them in simple relational expressions such as:
SELECT NAME, ADDRESS
FROM PERSONNEL
which selects everyone born in October.
Or you can write more complex expressions such as:
SELECT NAME, ADDRESS
FROM PERSONNEL
which selects everyone born on October 4th.
*Using Time Functions
The time functions work the same as the date functions.
Simply specify the function followed by a time column name in parentheses. Keep in mind that time columns do not contain microseconds; timestamp columns do.
Given a time column called HIRETIME with a value of 14.15.00, HOUR, MINUTE and SECOND will return results of:
HOUR(HIRETIME) 14
MINUTE(HIRETIME) 15
SECOND(HIRETIME) 00
These results can be output or used in WHERE clauses as a basis for selecting rows.
Topic 7.2: Arithmetic with Date and Time Columns
*Performing CalculationsThe previous sections in this unit showed you how to select and parse date and time columns.
You can also perform arithmetic functions on date and time values. For example, you can add a duration to a date or subtract a duration from a date and get a new date. This can be useful for calculating things like billing due dates, project start and end dates, and so on.
You can also subtract two dates and get the length of time between the two dates.
Let's look at some examples.
*Adding Days
You can add YEARS, MONTHS or DAYS to any date and the result is a new date. For example:
SELECT NAME, ADDRESS, CURRENT DATE + 30 DAYS
gives a date that is 30 days greater than the CURRENT DATE. SQL knows how many days there are in each month, so the results are accurate. Therefore, if :
CURRENT DATE CURRENT DATE + 30 DAYS
1989-08-01 1989-08-31
1989-08-02 1989-09-01
1989-12-02 1990-01-01
The result of adding days to a date always yields an accurate new date.
*Adding Months and Years
In addition to DAYS, you can also use the YEARS and MONTHS functions to calculate dates. You can, for example, add 5 YEARS, 10 MONTHS and 20 DAYS to any date.
If DUEDATE is 1989-04-08 then...
...DUEDATE + 5 YEARS + 10 MONTHS + 20 DAYS yields 1995-02-28
*Subtracting Time
Or, you can subtract 5 YEARS, 10 MONTHS and 20 DAYS from any date.
If DUEDATE is 1989-04-08 then...
...DUEDATE - 5 YEARS - 10 MONTHS - 20 DAYS yields 1983-05-19
The result of adding or subtracting durations from a date is another date.
MONTHS May Not Be Accurate
Be aware, however, that adding and subtracting MONTHS may not be totally accurate.
Because of the inconsistencies in the number of days in each month of the year, adding a month to a given date does not always result in the same day of the next month.
*SQL Adjusts Dates
You've seen that 1989-01-31 + 1 MONTH cannot give 1989-02-31 because February has only 28 or 29 days, depending on the year. SQL knows about leap years. If adding 1 MONTH to a date yields a valid date, that's okay. Otherwise, the result will be the last day of the month.
Thus, 1989-01-31 + 1 MONTH yields 1989-02-28. If you then subtract a month from 1989-02-28, the result is 1989-01-28 and not 1989-01-31.
One way to avoid problems like this is to add and subtract DAYS whenever possible.
Next...Adding and subtracting two dates.
*Subtracting Actual Date Values
You can also subtract actual date values. The result is a date duration, or the number of years, months and days between the two dates. The result is in the form shown.
Let's look at an example.
*Date Duration
SELECT PROJNO, END_DATE, START_DATE,
FROM PROJECT
WHERE PROJNO = 1
Here you see the result of subtracting START_DATE from END_DATE in order to see how long Project 1 should take. The result, in COL1, is 15 years, 2 months and 00 days. The result is called a date duration.
*Days Between Two Dates
Many times you want to know the number of days between two dates. You can do this using the DAYS function (that's DAYS, not DAY).
The DAYS function calculates the number of days between the arbitrary date December 31, 0000 and the date provided to the DAYS function. Thus, given that START_DATE has a value of 1990-06-30, DAYS(START_DATE) will return a value of 726648. The actual value returned by the DAYS function is of little interest by itself.
Now that you have a date duration, however, you can easily subtract dates. Suppose you want to find all accounts that are overdue and how many days overdue each one is. Easy...
*The Query
Your query might look like this:
SELECT ACCOUNTNO, DAYS(CURRENT DATE) - DAYS(DUEDATE)
FROM ACCOUNTS
WHERE DUEDATE < CURRENT DATE
*Figuring Overdue Accounts
To find out how far these accounts are overdue you must subtract the due date from the current date.
DAYS(CURRENT DATE) - DAYS(DUEDATE) first calculates the number of days between December 31, 0000 and the CURRENT DATE and DUEDATE, and then subtracts the two values. The result is the number of days between the two dates.
SELECT ACCOUNTNO, DAYS(CURRENT DATE) - DAYS(DUEDATE)
FROM ACCOUNTS
WHERE DUEDATE < CURRENT DATE
*Checking CURRENT DATE
SELECT ACCOUNTNO, DAYS(CURRENT DATE) - DAYS(DUEDATE)
FROM ACCOUNTS
To limit the report to those accounts that are overdue, you must check that the current date is greater than the due date. Remember, CURRENT DATE is a system field that you can access at any time.
*In Summary
- You can add or subtract durations from date and time columns. The result is a new date or time.
- You can subtract dates. The result is a date duration of the form YYYYMMDD.
- You can subtract times. The result is a time duration of the form HHMMSS.
- The DAYS function converts a date to a number of days since December 31, 0000.
*This Is Just the Beginning!
This lesson only touched on the capabilities and uses of date and time columns. Many of the functions discussed here can be combined to perform more complex tasks.
The next few pages show some varied and more advanced uses of date/time data and the date/time functions.
*Combining Date/Time With Other Built-In Functions
You can combine the date/time built-in functions with other SQL built-in functions. Suppose you want to know the total number of days needed to complete all projects. The following query accomplishes this.
SELECT SUM(DAYS(END)-DAYS(START))
FROM PROJECTS
Or, how about the longest time any project has taken...
SELECT MAX(DAYS(END)-DAYS(START))
FROM PROJECTS
Or, how about the earliest date any employee was hired...
SELECT YEAR(MIN(HIREDATE))
FROM PERSONNEL
*Another Example
Here's an example that determines how long each interview lasted for each applicant in a table called INTERVIEW. The result is calculated in hours.
SELECT MINUTE(ENDTIME - STARTTIME) / 60.00 + HOUR(ENDTIME - STARTTIME)
FROM INTERVIEW
As you can see, there are many ways to combine the built-in functions to produce the results you need.
Topic 7.3: Unit 7 Summary
This unit covered the following topics:- You can use the YEAR, MONTH, DAY functions to parse date columns.
- The data you extract from date or time columns can be output or used in WHERE clauses.
- You can use the HOUR, MINUTE, SECOND functions to parse time columns.
- You can add YEARS, MONTHS or DAYS to any date and the result is a new date. For example:
SELECT NAME, ADDRESS, CURRENT DATE + 30 DAYS
gives a date that is 30 days greater than the CURRENT DATE.
- Because of the inconsistencies in the number of days in each month of the year, adding a month to a given date does not always result in the same day of the next month.
- You can also subtract actual date values. The result is a date duration, or the number of years, months and days between the two dates.
- The DAYS function calculates the number of days between the arbitrary date December 31, 0000 and the date provided to the DAYS function.
- You can combine the date/time built-in functions with other SQL built-in functions.
No comments:
Post a Comment