This article is the first in a new series that helps you build on the fundamentals you learned in the 12-part SQL 101 series in Oracle Magazine. SQL 101 outlines the basics of the relational database and SQL for single Oracle Database table access. That series shows how data is organized in a relational database, the basics of modeling and accessing relational data, and how a SELECT statement helps in querying data and creating result sets for interpreting that data. SQL 101 also outlines components of Oracle queries—the WHERE clause; the ORDER BY clause; and character, date, and numeric functions—to show how these features give SQL developers control over the number of records returned, the order in which they are returned, and how they are ultimately displayed. And finally, SQL 101 introduces ways to aggregate data and how to use analytic functions to obtain result sets without writing complicated code.
The query examples in the SQL 101 series show SQL statements that run against only a single table. However, to support business queries, SQL developers must often join two or more tables together to obtain a meaningful result set.
In this article, you’ll learn about
The equijoin is the most common type of join. Using equijoins correctly, with the correct primary and foreign key relationships, helps you avoid an unnecessary Cartesian product—a join of every row of one table to every row of another table. Table aliases help developers cut down on the amount of code necessary to write SQL join statements.
To try out the examples in this series, you need access to an Oracle Database instance. If necessary, download and install an Oracle Database edition for your operating system. I recommend installing Oracle Database, Enterprise Edition 12c Release 1 (188.8.131.52.0). If you install the Oracle Database software, choose the installation option that enables you to create and configure a database. A new database, including sample user accounts and their associated schemas, will be created for you. (Note that SQL_201 is the user account for the examples in this series; it’s also the schema in which you’ll create database tables and other objects.) When the installation process prompts you to specify schema passwords, enter and confirm passwords for the SYS and SYSTEM users and make a note of them.
Finally—whether you installed the database software from scratch or have access to an existing Oracle Database instance—download, unzip, and execute the SQL script to create the tables for the SQL_201 schema required for this article’s examples. (View the script in a text editor for execution instructions.)Relating on Equal Terms
Suppose your business requirement is to obtain a list of all employees who work in your company’s accounting department. Knowing that your EMPLOYEE table has a DEPARTMENT_ID column, you could query the table that contains a full list of the company’s departments to discover the DEPARTMENT_ID for the accounting department. Then, using the DEPARTMENT_ID you obtained, you could query the employee records in the EMPLOYEE table that match your DEPARTMENT_ID value.
The problem with this approach is that it requires more coding work and more database access than necessary. Consider Listing 1, in which a single query displays all the employees who work in the accounting department. The
where department.name = 'Accounting'
predicate is used to obtain only those employee records associated with the DEPARTMENT_ID value of 10. The DEPARTMENT_ID value of 10 equates to the NAME column value of Accounting in the DEPARTMENT table.
Code Listing 1: Obtaining a list of employees in the accounting department
SQL> set feedback on SQL> set lines 32000 SQL> select rtrim(first_name||' '||last_name) "Employee Name", employee.department_id, department.name 2 from employee, department 3 where employee.department_id = department.department_id 4 and department.name = 'Accounting' 5 order by last_name, first_name; Employee Name DEPARTMENT_ID NAME ———————————————— ————————————— —————————— Lori Dovichi 10 Accounting Emily Eckhardt 10 Accounting Roger Friedli 10 Accounting Betsy James 10 Accounting Matthew Michaels 10 Accounting Donald Newton 10 Accounting 6 rows selected.
After you decide which columns you want to include in the result set, choose the tables to which those columns belong and then determine which columns the tables have in common. These common columns will become your join criteria. Remember that tables (also known as relations) relate to each other with one-to-one, one-to-many, or one-to-zero relationships between their column values (“Modeling and Accessing Relational Data,” Oracle Magazine, November/December 2011). Often the join criteria for two tables consists of the primary key column of one table and the foreign key column of the other table.
Right now, neither of the two example tables is formally defined with primary and foreign key constraints. The result in Listing 1 illustrates that the relationship between the DEPARTMENT table and the EMPLOYEE table is one-to-many. One DEPARTMENT_ID value from the DEPARTMENT table can exist one or many times in the DEPARTMENT_ID column of the EMPLOYEE table. And a given DEPARTMENT_ID in the EMPLOYEE table can exist only once in the DEPARTMENT table.
Listing 1 looks much like other SQL statements you’ve written previously, with three differences. The first difference is that two tables separated by commas are listed in the FROM clause. The second is that one of the predicates in the WHERE clause formulates the join criteria (or join condition) between the two tables by using the common DEPARTMENT_ID column. The values in the common column must equal each other for rows to be returned in the result set, which is why this type of join is called an equijoin.
The third difference is that you differentiate similar column names by prepending the associated table name and a period to qualify which table’s column is being addressed, in either the SELECT list clause or the join condition predicate. Listing 2 illustrates what is returned if you do not qualify like-named columns with their prefixed table names. Note that you could display the DEPARTMENT_ID value from either the DEPARTMENT table or the EMPLOYEE table, and the result would be the same, because the type of join used in this example is an equijoin.
Code Listing 2: A join condition with no table name qualifiers
SQL> select rtrim(first_name||' '||last_name) "Employee Name", name 2 from employee, department 3 where department_id = department_id 4 and department.name = 'Accounting' 5 order by last_name, first_name; where department_id = department_id * ERROR at line 3: ORA-00918: column ambiguously definedA Table by Any Other Name
Because distinguishing each like-named column with the full name of its associated table can add quite a bit of code to your query, you have the option to use a short name for this table qualifier instead. Table aliases are typically short abbreviations for the table names listed in your FROM clause.
Consider the query in Listing 3. This query, which uses table aliases, requires much less code than its counterpart in Listing 1, which uses full table names. You can use any alias you want (although you cannot use any Oracle SQL reserved words), but it is best to keep the names short, as illustrated in the query in Listing 3.
Code Listing 3: A query with a join condition that uses table aliases
SQL> select rtrim(first_name||' '||last_name) "Employee Name", e.department_id, name 2 from employee e, department d 3 where e.department_id = d.department_id 4 and name = 'Accounting' 5 order by last_name, first_name; Employee Name DEPARTMENT_ID NAME ———————————————— ————————————— —————————— Lori Dovichi 10 Accounting Emily Eckhardt 10 Accounting Roger Friedli 10 Accounting Betsy James 10 Accounting Matthew Michaels 10 Accounting Donald Newton 10 Accounting 6 rows selected.
It is generally a good practice to use table aliases for all column names in your queries that use joins. This way, if you add a column to a table used in a FROM clause with the same name as that of a column already existing in one of your other tables listed in the same FROM clause, there will be no ambiguities during query execution. This practice also helps increase readability and helps you and others more easily identify source tables for each column in your queries.
Equijoins rely on the fact that a column value that exists within one column also exists in the column with which it is being joined. In the case of a foreign key column that allows NULL values, the NULL value encountered during an equijoin operation is simply ignored and the row that contains the NULL value is not included in the result set. The reason for this exclusion is that a NULL value cannot be equal to another value—not even another NULL value.
Listings 4 and 5 illustrate how equijoins exclude NULL values during join column value equality checks. The EMPLOYEE table contains 11 employee records, as shown in Listing 4. However, when the DEPARTMENT_ID values of all the employee records are joined with the DEPARTMENT_ID values present in the DEPARTMENT table, only those records that have matching values between the join criteria columns of these two tables are returned. The record for Frances Newton does not have an assigned DEPARTMENT_ID value; it has a NULL value. Therefore, the equijoin operation does not return this record, as shown in Listing 5.
Code Listing 4: The employee table, with 11 employee records
SQL> select rtrim(e.first_name||' '||e.last_name) "Employee Name", e.department_id 2 from employee e 3 order by e.department_id, e.last_name, e.first_name; Employee Name DEPARTMENT_ID ————————————————— ————————————— Lori Dovichi 10 Emily Eckhardt 10 Roger Friedli 10 Betsy James 10 Matthew Michaels 10 Donald Newton 10 mark leblanc 20 michael peterson 20 Thomas Jeffrey 30 Theresa Wong 30 Frances Newton 11 rows selected.
Code Listing 5: The number of employee records returned (10) when the employee table is joined with the department table
SQL> select rtrim(e.first_name||' '||e.last_name) "Employee Name", e.department_id, d.name 2 from employee e, department d 3 where e.department_id = d.department_id 4 order by d.name, e.last_name, e.first_name; Employee Name DEPARTMENT_ID NAME ———————————————— ————————————— —————————— Lori Dovichi 10 Accounting Emily Eckhardt 10 Accounting Roger Friedli 10 Accounting Betsy James 10 Accounting Matthew Michaels 10 Accounting Donald Newton 10 Accounting Thomas Jeffrey 30 IT Theresa Wong 30 IT mark leblanc 20 Payroll michael peterson 20 Payroll 10 rows selected.Meaningful Combinations
If you write a SQL query that lists two tables in the FROM clause but you do not create a join criterion predicate between those two tables, the result set can be a Cartesian product, which indicates a many-to-many relationship. Legitimate many-to-many relationships should be accounted for during the design and modeling phase through the creation of an intersection table that resolves such relationships between two tables (“Modeling and Accessing Relational Data,” Oracle Magazine, November/December 2011).
More often than not, a Cartesian product comes about as a result of missing join criteria. One simple method you can use to avoid missing join criteria is to ensure that your SQL query has n – 1 the number of join criteria as you have tables listed in your FROM clause. For example, if you have three tables listed in your FROM clause, your predicate list should have at least two join criteria. One join criterion joins the common column values between the first and second tables, and another join criterion joins the common column values between the second and third tables. Note: If there are extra predicate criteria that make sense for the answer you seek (for example, common column values between the first and the third tables), you should add them to reduce your result set and receive a more meaningful result.
Listing 6 illustrates the kind of result you receive when you forget to add necessary join criteria to your SQL query. The result is merely the number of records in the EMPLOYEE table (11) multiplied by the number of records in the DEPARTMENT table (3). This is clearly not a meaningful, realistic representation of which employees are assigned to which departments. Compare the result of this query with that in Listing 5.
Code Listing 6: A Cartesian product between the employee and department tables
SQL> select rtrim(e.first_name||' '||e.last_name) "Employee Name", e.department_id, d.name 2 from employee e, department d 3 order by d.name, e.last_name, e.first_name; Employee Name DEPARTMENT_ID NAME ———————————————— ————————————— —————————— Lori Dovichi 10 Accounting Emily Eckhardt 10 Accounting Roger Friedli 10 Accounting Betsy James 10 Accounting Thomas Jeffrey 30 Accounting Matthew Michaels 10 Accounting Donald Newton 10 Accounting Frances Newton Accounting Theresa Wong 30 Accounting mark leblanc 20 Accounting michael peterson 20 Accounting Lori Dovichi 10 IT Emily Eckhardt 10 IT Roger Friedli 10 IT Betsy James 10 IT Thomas Jeffrey 30 IT Matthew Michaels 10 IT Donald Newton 10 IT Frances Newton IT Theresa Wong 30 IT mark leblanc 20 IT michael peterson 20 IT Lori Dovichi 10 Payroll Emily Eckhardt 10 Payroll Roger Friedli 10 Payroll Betsy James 10 Payroll Thomas Jeffrey 30 Payroll Matthew Michaels 10 Payroll Donald Newton 10 Payroll Frances Newton Payroll Theresa Wong 30 Payroll mark leblanc 20 Payroll michael peterson 20 Payroll 33 rows selected.Getting ANSI About Your Join Syntax
Oracle Database is an ANSI-compliant database. If you code your queries with the ANSI-compliant syntax, they might more easily be run against non-Oracle ANSI-compliant databases. This ANSI syntax uses the concept of an inner join, which is synonymous with an equijoin, in that its join condition is based on equality. The ANSI syntax is different in some ways from the syntax you have learned thus far in this article. One example of this difference is that the JOIN keyword placed between tables in the FROM clause replaces the comma delimiter.
Additionally, your join condition can use either the USING clause or the ON clause. Consider Listing 7, which employs ANSI syntax, including the USING clause. A couple of things to note about the query in this listing are that
Code Listing 7: An ANSI inner join that employs the Using clause
SQL> select rtrim(first_name||' '||last_name) "Employee Name", department_id, name 2 from employee JOIN department 3 USING (department_id) 4 order by name, last_name, first_name; Employee Name DEPARTMENT_ID NAME ———————————————— ————————————— —————————— Lori Dovichi 10 Accounting Emily Eckhardt 10 Accounting Roger Friedli 10 Accounting Betsy James 10 Accounting Matthew Michaels 10 Accounting Donald Newton 10 Accounting Thomas Jeffrey 30 IT Theresa Wong 30 IT mark leblanc 20 Payroll michael peterson 20 Payroll 10 rows selected.
Listing 8 illustrates an inner join that specifies the ON clause. In contrast to the USING clause, the ON clause allows table aliases and does not require the parentheses around the join condition. The ON condition can also be used when the column names of the tables being used for the join criteria are different. Otherwise, this syntax is very similar to the old style of join syntax. It should be noted that both styles are 100 percent ANSI-compliant. The new ANSI syntax style is often preferred over the old syntax style for readability. With the new style, it is obvious what is being joined to what and it is virtually impossible to miss or forget a join condition.
Code Listing 8: An ANSI inner join that employs the On clause
SQL> select rtrim(e.first_name||' '||last_name) “Employee Name”, e.department_id, d.name 2 from employee e JOIN department d 3 ON (e.department_id = d.department_id) 4 order by d.name, e.last_name, e.first_name; Employee Name DEPARTMENT_ID NAME ———————————————— ————————————— —————————— Lori Dovichi 10 Accounting Emily Eckhardt 10 Accounting Roger Friedli 10 Accounting Betsy James 10 Accounting Matthew Michaels 10 Accounting Donald Newton 10 Accounting Thomas Jeffrey 30 IT Theresa Wong 30 IT mark leblanc 20 Payroll michael peterson 20 Payroll 10 rows selected.Conclusion
This article introduced you to Oracle SQL table joins and explained the concept of an equijoin. You saw how table aliases can be used to reduce the amount of code necessary to formulate a table join. You also learned how a Cartesian product can result from missing join criteria and how to double-check whether you’ve included the correct number of necessary join conditions against the number of tables you’ve included in your FROM clause. And you were introduced to ANSI syntax with respect to inner joins and how the USING and ON clauses can be employed.
The next article in this series introduces complex joins and subqueries.
READ SQL 101, Parts 1–12.
LEARN more about relational database design and concepts.
DOWNLOAD the sample script for this article.
READ more Beyond SQL 101.
Photography by Meric Dagli, Unsplash