Wednesday Apr 10, 2013

Accessing a Hive Table from Oracle Database

Hive is a popular tool to query data in HDFS.  Oracle SQL Connector for HDFS can read data directly from a Hive table in version 2.0. Also, the command-line tool of Oracle SQL Connector for HDFS has been simplified greatly. The connector only needs the Hive table and Hive database name, specified as Hadoop configuration properties, to access data from a Hive table.

oracle.hadoop.exttab.sourceType=hive
oracle.hadoop.exttab.hive.tableName
=<hive table name>
oracle.hadoop.exttab.hive.databaseName=<hive database name>

A command-line tool creates an external table in Oracle Database. This external table is queried to access data in the Hive table. Since an external table can be queried like any other table in the database, Oracle SQL functionality is available to query the Hive table.

The name of the external table and database connection information are also specified as Hadoop configuration properties.

hadoop jar $OSCH_HOME/jlib/orahdfs.jar \
oracle.hadoop.exttab.ExternalTable
-D oracle.hadoop.exttab.sourceType=hive \
-D oracle.hadoop.exttab.hive.tableName=<hive table name> \
-D oracle.hadoop.exttab.hive.databaseName=<hive database name> \
-D oracle.hadoop.connection.url=<database connection url> \
-D oracle.hadoop.exttab.tableName=<external table name> \
….
-createTable

While creating the external table the Hive table data types are automatically mapped to Oracle data types in the external table. Below are some sample mappings:

Hive Tinyint, Smallint, Int Oracle NUMBER
Hive Float, Double Oracle NUMBER (default precision)
Hive Decimal Oracle NUMBER
Hive Boolean Oracle VARCHAR2(5)
Hive String Oracle VARCHAR2(4000)
Hive Timestamp Oracle TIMESTAMP

About

Oracle Loader for Hadoop and Oracle SQL Connector for HDFS

Search

Categories
Archives
« April 2014
SunMonTueWedThuFriSat
  
1
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
   
       
Today