
Move data between Oracle Database and Apache Hadoop using high speed connectors.

May 15, 2017

Move Data between Apache Hadoop and Oracle Database with SQL Developer

SQL Developer is much loved in the developer community. A developer starting on the task of moving data between Apache Hadoop and Oracle Database can save a lot of time by using a tool they already know and like.

Download SQL Developer 4.2 and try it out if you are moving data between Apache Hadoop and Oracle Database.  

Remember that you always start with the target (destination) of the load:

Using Oracle Loader for Hadoop: Start by right-clicking the target table in an Oracle Database connection.

Using Oracle SQL Connector for HDFS:  Start by right-clicking the 'Tables' tab in an Oracle Database connection.

Using Copy to Hadoop: Start by right-clicking on the 'Tables' tab in a Hive connection.

Additional details on getting started are below. They are also available from the SQL Developer Help menu.

Installing and Getting Started with SQL Developer and Connectors for Apache Hadoop

Install SQL Developer

Download SQL Developer 4.2.

Checkboxes in SQL Developer

The following options must be selected in SQL Developer to use the Apache Hadoop connectors:

  • Tools -> Features -> Database. Check the box for Oracle SQL Developer – Hadoop Connectors.
  • In the connections panel on the left, right-click the Oracle Database connections you plan to use, and then select Manage Features -> Hadoop Integration.

Configure Cloudera-Hive JDBC drivers in SQL Developer

The connectors for Apache Hadoop require the Cloudera Hive JDBC drivers. Using Cloudera_HiveJDBC4_*.zip is recommended. Download it from cloudera.com, unzip it to a local directory, and add the extracted jars to SQL Developer as follows:

  • Tools -> Preferences -> Database -> Third Party JDBC Drivers
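A minimal sketch of the download-and-unzip step (the target directory ~/jdbc/hive-jdbc4 is just an example):

# Unzip the Cloudera Hive JDBC driver bundle into a local directory
unzip Cloudera_HiveJDBC4_*.zip -d ~/jdbc/hive-jdbc4
# Then add each jar in ~/jdbc/hive-jdbc4 via Tools -> Preferences ->
# Database -> Third Party JDBC Drivers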

Environment Variables

Oracle Loader for Hadoop, Oracle SQL Connector for HDFS, and the Copy to Hadoop 'directcopy' option require an SSH connection to a Hadoop client/edge node/Hadoop cluster node. The Copy to Hadoop 'stage' option requires an SSH connection to the database node. See the section Create SSH Hosts in SQL Developer for steps to create an SSH connection.

The home directory of the SSH connection (for example, /home/<user>) should contain the following files with environment variables.   See Appendix A for information on creating these files and adding the required environment variables.

Copy to Hadoop: .sqldev_cp2hadoop_env

Oracle Loader for Hadoop: .sqldev_olh_env

Oracle SQL Connector for HDFS: .sqldev_osch_env
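A minimal sketch of creating these files on the edge node, assuming a bash login shell (the contents for each file are given in Appendix A):

# Create the environment files in the SSH user's home directory
touch ~/.sqldev_cp2hadoop_env ~/.sqldev_olh_env ~/.sqldev_osch_env
# Restrict permissions, since the files may reference wallet locations
chmod 600 ~/.sqldev_cp2hadoop_env ~/.sqldev_olh_env ~/.sqldev_osch_env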

Setting up Connections

Create SSH hosts in SQL Developer

To use the SQL Developer wizards for Oracle Loader for Hadoop, Oracle SQL Connector for HDFS, and the Copy to Hadoop 'directcopy' option, you will need to create SSH host connections to a Hadoop client/edge node/Hadoop cluster node. (To run the Copy to Hadoop 'stage' option you will need to create an SSH connection to the Oracle Database node.)

  • View -> SSH to open the SSH Hosts pane
  • SSH Hosts -> New SSH Host -> enter the host name and user name
  • Click OK
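Before running the wizards, it can help to confirm from a terminal that the same host and user accept an SSH connection and have the Hadoop client available (a quick check; the host and user are placeholders):

# Verify SSH access and the Hadoop client on the edge node
ssh <user>@<edge-node-hostname> 'hadoop version'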

Note that the home directory of the SSH connections should contain the files with environment variables as explained in the previous section.  See Appendix A for more details.

To run the Copy to Hadoop 'stage' option you will need to create an SSH connection to the Oracle Database node as the user 'oracle'. It is important that the connection is made as 'oracle'.

Oracle Database Connections

Set up connections to Oracle Database schemas.

Apache Hive Connections

  • Open the new connection dialog box
  • Click on the ‘Hive’ tab. If the ‘Hive’ tab does not appear next to the ‘Oracle’ tab, it is likely that the Hive JDBC drivers were not installed correctly.
  • Username: On the BDA the default is ‘oracle’
  • Host name: Hostname where hiveserver2 is running
  • Port: Usually 10000
  • Database: Hive schema you want to connect to
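Together these fields make up a standard HiveServer2 JDBC URL; for example, a connection to the default schema corresponds to:

jdbc:hive2://<hiveserver2-hostname>:10000/default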

Getting Started

You are now ready to get started using the SQL Developer wizards for the Apache Hadoop connectors. As a rule of thumb, the wizards start from the destination object of the load.

Oracle Loader for Hadoop

Right-click a target table in an Oracle Database connection. If this option is grayed out, you may need to select “Manage Features -> Hadoop Integration” on the database connection you are using. See “Checkboxes in SQL Developer” for more information.

Oracle SQL Connector for HDFS

Right-click the ‘Tables’ tab in the Oracle Database connection. If this option is grayed out, you may need to select “Manage Features -> Hadoop Integration” on the database connection you are using. See “Checkboxes in SQL Developer” for more information.

Copy to Hadoop

Right-click the ‘Tables’ tab in the Apache Hive connection.

Appendix A

Copy to Hadoop environment file

The file .sqldev_cp2hadoop_env must be created in the home directory (for example, /home/<user>) of the SSH host. The contents of the file are:

#!/bin/bash
export CP2HADOOP_HOME=<parent directory of the directory containing the Copy to Hadoop jars. On the BDA this is /opt/oracle/bigdatasql/bdcell-12.1>
export HADOOP_CLASSPATH=${CP2HADOOP_HOME}/jlib/*
# HADOOP_HOME must already be set in the login environment for this PATH entry to resolve
export PATH=${HADOOP_HOME}/bin:${PATH}

# If using Oracle Wallet:
export WALLET_LOCATION=<location of the wallet files. For example, /home/${USER}/wallet>
# TNS_ADMIN should point to the same location as the wallet files
export TNS_ADMIN=${WALLET_LOCATION}
export CLUSTER_WALLET_LOCATION=${WALLET_LOCATION}
export CLUSTER_TNS_ADMIN=${TNS_ADMIN}
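If you use Oracle Wallet, the wallet referenced by WALLET_LOCATION can be created with the mkstore utility from an Oracle client installation. A minimal sketch (the TNS alias 'orcl' and the database user are placeholders):

# Create a wallet and store credentials for a TNS alias
mkstore -wrl /home/${USER}/wallet -create
mkstore -wrl /home/${USER}/wallet -createCredential orcl <db_user>
# TNS_ADMIN must also contain a tnsnames.ora defining 'orcl' and a
# sqlnet.ora pointing WALLET_LOCATION at the same directory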


Oracle Loader for Hadoop environment file

The file .sqldev_olh_env must be created in the home directory (for example, /home/<user>) of the SSH host. The contents of the file are:

#!/bin/bash
export HIVE_HOME=<Example: /opt/cloudera/parcels/CDH/lib/hive>
export HIVE_CONF_DIR=<Example: /etc/hive/conf>
export OLH_HOME=<Example on Oracle Big Data Appliance: /opt/oracle/olh>
export HADOOP_CLASSPATH=${OLH_HOME}/jlib/*:${HIVE_CONF_DIR}:${HIVE_HOME}/lib/*
export OLH_LIB_JARS=${HIVE_HOME}/lib/hive-exec.jar,${HIVE_HOME}/lib/hive-metastore.jar,${HIVE_HOME}/lib/libfb303-0.9.2.jar

# If using Oracle Wallet:
export WALLET_LOCATION=<location of the wallet files. For example, /home/${USER}/wallet>
# TNS_ADMIN should point to the same location as the wallet files
export TNS_ADMIN=${WALLET_LOCATION}
export CLUSTER_WALLET_LOCATION=${WALLET_LOCATION}
export CLUSTER_TNS_ADMIN=${TNS_ADMIN}
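A quick sanity check of the file (a sketch; oraloader.jar is the main Oracle Loader for Hadoop jar under ${OLH_HOME}/jlib, and the Hive jars must exist at the paths above):

# Source the file and confirm the jars it references exist
source ~/.sqldev_olh_env
ls ${OLH_HOME}/jlib/oraloader.jar
echo ${OLH_LIB_JARS} | tr ',' '\n' | xargs ls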


Oracle SQL Connector for HDFS environment file

The file .sqldev_osch_env must be created in the home directory (for example, /home/<user>) of the SSH host. The contents of the file are:

#!/bin/bash
export HIVE_HOME=<Example: /opt/cloudera/parcels/CDH/lib/hive>
export HIVE_CONF_DIR=<Example: /etc/hive/conf>
export OSCH_HOME=<Example on Oracle Big Data Appliance: /opt/oracle/osch>
export HADOOP_CLASSPATH=${OSCH_HOME}/jlib/*:${HIVE_CONF_DIR}:${HIVE_HOME}/lib/*

# If using Oracle Wallet:
export WALLET_LOCATION=<location of the wallet files. For example, /home/${USER}/wallet>
# TNS_ADMIN should point to the same location as the wallet files
export TNS_ADMIN=${WALLET_LOCATION}
export CLUSTER_WALLET_LOCATION=${WALLET_LOCATION}
export CLUSTER_TNS_ADMIN=${TNS_ADMIN}
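For reference, the wizard drives Oracle SQL Connector for HDFS for you; once this file is in place, the same ExternalTable tool can also be invoked by hand (a sketch; the configuration file is a placeholder):

# Source the environment and run the OSCH ExternalTable tool directly
source ~/.sqldev_osch_env
hadoop jar ${OSCH_HOME}/jlib/orahdfs.jar \
       oracle.hadoop.exttab.ExternalTable -conf <your_config.xml> -createTable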


Note: Oracle Loader for Hadoop and Oracle SQL Connector for HDFS are licensed with Oracle Big Data Connectors.   Oracle Copy to Hadoop is licensed with Oracle Big Data SQL. 
