X

Move data between Oracle Database and Apache Hadoop using high speed connectors.

  • April 10, 2013

Accessing a Hive Table from Oracle Database

Hive is a popular tool to query data in HDFS.  Oracle SQL Connector for HDFS can read data directly
from a Hive table in version 2.0. Also,
the command-line tool of Oracle SQL Connector for HDFS has been simplified
greatly. The connector only needs the
Hive table and Hive database name, specified as Hadoop configuration properties,
to access data from a Hive table.

oracle.hadoop.exttab.sourceType=hive

oracle.hadoop.exttab.hive.tableName
=<hive table name>
oracle.hadoop.exttab.hive.databaseName=<hive
database name>

A command-line tool creates an external table in Oracle
Database. This external table is queried
to access data in the Hive table. Since an
external table can be queried like any other table in the database, Oracle SQL
functionality is available to query the Hive table.

The name of the external table and database connection
information are also specified as Hadoop configuration properties.

hadoop jar
$OSCH_HOME/jlib/orahdfs.jar \
oracle.hadoop.exttab.ExternalTable
-D oracle.hadoop.exttab.sourceType=hive
\
-D
oracle.hadoop.exttab.hive.tableName=<hive table name> \
-D
oracle.hadoop.exttab.hive.databaseName=<hive database name> \
-D
oracle.hadoop.connection.url=<database connection url> \
-D oracle.hadoop.exttab.tableName=<external
table name> \
….
-createTable

While creating the external table the Hive table data types are
automatically mapped to Oracle data types in the external table. Below are some sample mappings:

Hive Tinyint, Smallint, Int Oracle NUMBER

Hive Float, Double Oracle NUMBER
(default precision)

Hive Decimal Oracle NUMBER

Hive Boolean Oracle VARCHAR2(5)

Hive String Oracle VARCHAR2(4000)

Hive Timestamp Oracle TIMESTAMP

Be the first to comment

Comments ( 0 )
Please enter your name.Please provide a valid email address.Please enter a comment.CAPTCHA challenge response provided was incorrect. Please try again.