X

Insights into Statistics, Query Optimization and the Oracle Optimizer

Why are some of the tables in my query missing from the plan?

Maria Colgan
Master Product Manager
A query transformation called table elimination (alternately called "join elimination") removes redundant tables from a query. A table is redundant if its columns are only referenced to in join predicates, and it is guaranteed that those joins neither filter nor expand the resulting rows. There are several cases where Oracle will eliminate a redundant table. We will discuss each case in turn.

Primary Key-Foreign Key Table Elimination

The optimizer eliminates tables that are redundant due to primary key-foreign key constraints. Consider the following example tables:

create table jobs 
(
job_id     NUMBER PRIMARY KEY,
job_title  VARCHAR2(35) NOT NULL,
min_salary NUMBER,
max_salary NUMBER
);
create table departments 
(
  department_id NUMBER PRIMARY KEY,
  department_name VARCHAR2(50));

create table employees
( employee_id NUMBER PRIMARY KEY,
  employee_name VARCHAR2(50),
  department_id NUMBER REFERENCES departments(department_id),
  job_id NUMBER REFERENCES jobs(job_id));

and the query:

select e.employee_name
from employees e, departments d
where e.department_id = d.department_id;

In this query, the join to departments is redundant. The only column referenced in the query appears in the join predicate, and the primary key-foreign key constraint guarantees that there is at most one match in departments for each row in employees. Hence, the query is equivalent to:

select e.employee_name
from employees e
where e.department_id is not null;

The optimizer will generate this plan for the query:

-------------------------------------------
 Id   Operation             Name      
-------------------------------------------
   0  SELECT STATEMENT                
*  1   TABLE ACCESS FULL    EMPLOYEES 
-------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
 1 - filter("E"."DEPARTMENT_ID" IS NOT NULL)
Note that the IS NOT NULL predicate is not necessary if the column has a NOT NULL constraint on it. The optimizer will also eliminate tables that are semi-joined or anti-joined. Consider the following query:
select e.employee_id, e.employee_name
from employees e
where not exists (select 1
                  from jobs j
                  where j.job_id = e.job_id);

Since employees.job_id is a foreign key to jobs.job_id, any non-null value in employees.job_id must have a match in jobs. So only employees with null values for employees.job_id will appear in the result. Hence, this query is equivalent to:

select e.employee_id, e.employee_name
from   employees e
where  job_id is null;

and the optimizer can choose this plan:

-------------------------------------------
 Id   Operation             Name      
-------------------------------------------
   0  SELECT STATEMENT                
*  1   TABLE ACCESS FULL    EMPLOYEES 
-------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("E"."JOB_ID" IS NULL)

Suppose employees.job_id has a NOT NULL constraint:

alter table employees modify job_id not null;

In this case, there could not possibly be any rows in EMPLOYEES, and the optimizer could choose this plan:

-------------------------------------------
 Id   Operation             Name      
-------------------------------------------
   0  SELECT STATEMENT                
*  1   FILTER                         
   2    TABLE ACCESS FULL   EMPLOYEES 
-------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
 1 - filter(NULL IS NOT NULL)

The "NULL IS NOT NULL" filter is a false constant predicate, that will prevent the table scan from even taking place.

The optimization is available for ANSI compliant joins. For this query:

select employee_name
from employees e inner join jobs j
on e.job_id = j.job_id;

...the optimizer can eliminate JOBS and produce this plan:

-------------------------------------------
 Id   Operation             Name      
-------------------------------------------
   0  SELECT STATEMENT                
   1   TABLE ACCESS FULL    EMPLOYEES 
-------------------------------------------

Outer Join Table Elimination

Table elimination is available for outer joins, which does not require PK-FK constraints. For the example, we require a new table and an addition to EMPLOYEES:

create table projects
( project_id NUMBER UNIQUE,
  deadline DATE,
  priority NUMBER );

alter table employees add project_id number;

Now consider a query that outer joins employees and projects:

select e.employee_name, e.project_id
from employees e, projects p
where e.project_id = p.project_id (+);

The outer join guarantees that every row in employees will appear at least once in the result. The unique constraint on projects.project_id guarantees that every row in employees will match at most one row in projects. Together, these two properties guarantee that every row in employees will appear in the result exactly once. Since no other columns from projects are referenced, projects can be eliminated, and the optimizer can choose this plan:

-------------------------------------------
 Id   Operation             Name      
-------------------------------------------
   0  SELECT STATEMENT                
   1   TABLE ACCESS FULL    EMPLOYEES 
-------------------------------------------

Why Would I Ever Write Such a Query?

All of the example queries in this post are very simple, and one would be unlikely to write a query where the join is so obviously unnecessary. There are many real world scenarios where table elimination may be helpful, including machine-generated queries and elimination of tables in views. For example, a set of tables might be exposed as a view, which contains a join. The join may be necessary to retrieve all of the columns exposed by the view. But some users of the view may only access a subset of the columns, and in this case, the joined table can be eliminated.

For example, consider the view:

create view employee_directory_v as
select e.employee_name, d.department_name, j.job_title
from employees e, departments d, jobs j
where e.department_id = d.department_id
and e.job_id = j.job_id;

This view might be exposed to a simple employee directory application. To lookup employee names by job title, the application issues a query:

select employee_name
from employee_directory_v
where department_name = 'ACCOUNTING';

Since the job_title column is not referenced, jobs can be eliminated from the query, and the optimizer can choose this plan:

--------------------------------------------
 Id   Operation             Name       
--------------------------------------------
   0  SELECT STATEMENT                 
*  1   HASH JOIN                       
   2    TABLE ACCESS FULL   EMPLOYEES  
*  3    TABLE ACCESS FULL   DEPARTMENTS
--------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
 1 - access("E"."DEPARTMENT_ID"="D"."DEPARTMENT_ID")
 3 - filter("D"."DEPARTMENT_NAME"='ACCOUNTING')

Note that prior to Oracle Database 12c Release 2, multi-column primary key-foreign key constraints were not supported.

Join the discussion

Comments ( 3 )
Please enter your name.Please provide a valid email address.Please enter a comment.CAPTCHA challenge response provided was incorrect. Please try again.Captcha
Oracle

Integrated Cloud Applications & Platform Services