Loading an Excel Spreadsheet into an Oracle Table

Recently we came across a question from the field about how an Excel spreadsheet can be loaded into an Oracle table with the help of OWB 11GR2. There are many ways that this can be achieved. With OWB 11GR2, the introduction of Code Template adds a new way of how a spreadsheet can be loaded. In this article, we’ll review some of the old ways and explore the new way using OWB11GR2.

Let’s consider an example where we have 2 spreadsheets, hr_employees.xls and hr_departments.xls and we want to load only the employees in the Sales department into the Oracle table. Following is the spreadsheet of hr_department. And the columns from A to D are defined as department_id, department_name, manager_id and location_id.

clip_image002

And the hr_employees spreadsheet looks like below. And the columns from A to K are defined as employee_id, first_name, last_name, email, phone_number, hire_date, job_id, salary, commission_pct, manager_id and department_id.

clip_image004

To make the scenario more interesting, we only want the employees in the Sales department in our result table. Below is the structure of the result table sales_employees where target data will be stored:

clip_image006

To load the sales employees into the target table, we have to join the data from 2 spreadsheets by using the reference key department_id. Besides, the column employee_name in the target table sales_employees is designed to include the first name and the last name of an employee. Hence, a transformation operator is needed. For example, we can utilize the Oracle pre-defined function CONCAT to concatenate the names.

Let’s think about how to implement this in OWB 11GR2 release.

1. The first way: access Excel spreadsheets through the Oracle Database Gateway for ODBC and load data by using PL/SQL mapping.

The Oracle Database Gateway for ODBC can be set up to access an Excel file, making each spreadsheet in the Excel file looks as if it were an Oracle table. And we can make use of this to create a PL/SQL mapping in OWB to load the Excel data into an Oracle table. First, we need to set up the ODBC gateway for the Excel file (See "Using Excel Spreadsheets as Source"). In OWB, create an ODBC module corresponding to the ODBC gateway and invoke the Import Metadata Wizard to import the structures of the 2 spreadsheets into OWB. Then construct a PL/SQL mapping as follows.

odbc_map

The Joiner condition is configured as below.

join

When it’s executed, OWB will read the data in spreadsheets through Oracle Database Gateway for ODBC, transform and load them into the target table.

2. The second way: utilize flat files and load data from converted CSV files to staging tables by SQL*Loader mapping, then load the target using PL/SQL mapping.

2.1 First of all, we need to convert the Excel from XLS format into CSV or text format so that OWB flat file can access it. This can be done by third-party tools or the “Save As” feature in Excel. In my case, I open the spreadsheets with Microsoft Excel, and save them as hr_employees.csv and hr_departments.csv respectively. I also perform a Linux shell command dos2unix to convert any Windows characters into Unix format, for my OWB repository is built on a Linux platform.

2.2 Create an OWB flat file module referencing to the directory where the CSV files are, and use the Import Metadata Wizard to import the CSV files.

2.3 Create 2 SQL*Loader mappings to load data into the staging tables as follows.

sqdr1

This mapping will load the data in hr_departments.csv into the staging table STAGING_HR_DEPT. Note that the loading type has been configured as TRUNCATE/INSERT so that no duplicate data will be loaded into the staging table after multi-running.

sqdr2

This mapping is to load the data in hr_employees.csv into staging table STAGING_HR_EMP, and the loading type is also set up as TRUNCATE/INSERT.

2.4 Create a PL/SQL mapping to transform and load data from staging tables into final target as follows.

sqdr3

That’s it. After deploying and executing the 3 mappings, data will be loaded into target.

3. The third way: utilize Oracle External tables, extract and load data from external tables to the target by PL/SQL mapping.

External table helps to expose data in a flat file as if it were a relational table. Thus it’s more flexible to involve external table operator as a source than the flat file operator in a mapping. OWB generates the SQL and PL/SQL codes for the mapping with external table, and more operators can be utilized in such kind of mapping. As a result, only one PL/SQL mapping is needed to do the entire job.

For this example, we create 2 external tables that reference the 2 CSV files respectively, and then we join them together to extract and transform the sales employees into the target table. And the PL/SQL mapping looks like this.

map1

External table operators and PL/SQL mappings allow for more complex mappings without the use of staging tables.

4. The fourth and newest way: extract and load data from the CSV files to the target table by Code Template Mapping.

In 11GR2, Code Template mapping (CT mapping) is introduced for heterogeneous data extraction, transformation and loading by using code templates. This enhances OWB to extract and load data (either with or without transformation) between different database platforms easily. Each CT mapping contains one or more execution units, and each unit is independent and associated with one code template. It utilizes the technology defined in the code template to move data, process data movement unit by unit, and then loads it into the final target. Because of the independency of each unit, CT Mapping allows users to handle data from different sources using different technologies.

For this example, we include the 2 file operators (for the two CSV files) and the target table in a CT mapping, and then handle the ETL in 3 execution units as follows.

ct

Before deployment, we have to assign the code template to each execution unit, and configure the source flat files and locations of the file operators HR_EMPLOYEES_CSV and HR_DEPARTMENTS_CSV.

To assign code template to an execution unit, open the CT mapping and select the execution view tab, and then select the execution unit box. And OWB will populate the available code templates shown in the drop-down list. For example, when above execution unit DEPT_EX_UNIT is selected, several available templates are shown, such as LCT_FILETO_SQL, LCT_FILE_TO_ORACLE_SQLLDR and LCT_FILE_TO_ORACLE_EXTER_TABLE. Though we have several options, we still need to determine which one is appropriate based upon your data and the functionality of the template. Besides, since each execution unit is independent, even if the execution unit DEPT_EX_UNIT and EMP_EX_UNIT are doing similar jobs, we can assign different code template to each of them. For instance, the DEPT_EX_UNIT can have code template LCT_FILE_TO_ORACLE_SQLLDR while EMP_EX_UNIT can use another LCT_FILETO_SQL, as shown below. This greatly augments the flexibility of CT mapping.

g2

g1

If the flat file referenced by file operator is imported through Import Metadata Wizard, the CT mapping knows the location and the file physical name during code generation. However, we can still re-configure them in the mapping configuration editor if we want to change them. This should be done right before the first time code generation.

ct_map_conf2

That’s it, it’s ready to deploy and execute the CT mapping now. See the execution result below. The execution unit EMP_EX_UNIT and SALES_EX_UNIT utilize JDBC to load the data while execution unit DEPT_EX_UNIT uses SQL*Loader technology.

e1

CT mapping allows flexible technology for data movement, and the performance on large data volumes depends on the code template applied. Though CT mapping allows more operators than SQL*Loader mapping, there may be some restrictions with the operators, for example, some operators can only be supported in the Oracle target CT mapping, such as Set Operation. For details, please refer to "Mapping Operators that are Only Supported Directly in Oracle Target CT Mappings".

So far, I’ve illustrated 4 ways to load Excel spreadsheets into Oracle tables, which one is best suited for your needs?

Ways to load Excel spreadsheet

The 1st way: load data through ODBC gateway

The 2nd way: load data from CSV files by SQL*Loader

The 3rd way: load data with external tables

The 4th way: load data by code template

Precondition

ODBC gateway for Excel is set up.

A CSV/ text file converted from Excel is required

Mapping type

PL/SQL mapping

SQL*Loader mapping & PL/SQL Mapping if doing more transform

PL/SQL mapping

CT mapping

Are staging table and extra mapping required?

No.

Yes when there are more than one source files, or mapping is complex.

No.

No.

What mapping operators are supported?

High. It can offer the full range of Warehouse Builder data transformation capabilities.

Low. Only one flat file source is allowed within each mapping, and only a few operators are applicable. See document "Creating SQL*Loader Mappings to Extract Data from Flat Files".

High. It can support the full range of Warehouse Builder data transformation capabilities.

Middle. More than one file operators can be involved within one mapping. CT mappings support only a subset of transformations available in PL/SQL mappings. See "Mapping Operators that are Only Supported Directly in Oracle Target CT Mappings".

Comments:

How to load SQL data to Oracle by scheduling a job that runs every 15 mins, taking the data for the slot of 15 minutes

Posted by guest on July 06, 2011 at 08:18 PM PDT #

how to import excel file to oracle table

Posted by sukhvir on May 24, 2012 at 04:47 AM PDT #

Hi ,

I am having a data in excel sheet (ex:- emp table ).
I want that date to be inserted or copy into Oracle emp or any other table . How can i get this?

Regards,
Rajesh

Posted by guest on July 03, 2013 at 03:29 AM PDT #

Hi Rajesh

Did you read through this in the doc;
http://docs.oracle.com/cd/E11882_01/owb.112/e10582/loading_ms_data.htm

Cheers
David

Posted by David on July 08, 2013 at 01:28 PM PDT #

Post a Comment:
  • HTML Syntax: NOT allowed
About

ETL, CDC, Real-Time DI and Data Quality for the Oracle Database from the inside.

Search

Archives
« April 2014
SunMonTueWedThuFriSat
  
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
   
       
Today