By Mike.Hallett-Oracle on Jul 15, 2015
You may have noticed that OBI can now analyse Hadoop data directly (without ETL) using Oracle Big Data SQL for Data Access (or Cloudera’s Impala), and can easily join it with other OBI data sources on a single dashboard. To find out more you can review the Big Data documentation below, and try it for yourself by downloading the Demonstration VM for OBI 11g SampleApp v506, which includes Cloudera and Big Data SQL.
Essentially, Oracle Big Data SQL accesses Hadoop through external tables, using one of two access drivers:
ORACLE_HIVE: Enables you to create Oracle external tables over Apache Hive data sources. Use this access driver when you already have Hive tables defined for your HDFS data sources. ORACLE_HIVE can also access data stored in other locations, such as HBase, that have Hive tables defined for them. The DBMS_HADOOP PL/SQL package contains a function named CREATE_EXTDDL_FOR_HIVE. It returns the data definition language (DDL) to create an external table for accessing a Hive table.
ORACLE_HDFS: Enables you to create Oracle external tables directly over files stored in HDFS. This access driver uses Hive syntax to describe a data source, assigning default column names of COL_1, COL_2, and so forth. You do not need to create a Hive table manually as a separate step: you can just define the record format of text data, or you can specify a SerDe for a particular data format.
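To make the difference between the two access drivers more concrete, here is a minimal Python sketch that only assembles the text of a CREATE TABLE statement for each one (Oracle's documentation names the drivers ORACLE_HIVE and ORACLE_HDFS). Nothing here connects to a database. The helper `external_table_ddl`, the table and column names, the Hive table `ratings_db.ratings_hive_table`, and the HDFS path are all hypothetical, chosen for illustration; check the generated DDL against the Big Data SQL documentation for your release before running it.

```python
def external_table_ddl(table, columns, driver, access_params=None, location=None):
    """Build CREATE TABLE text for an Oracle external table (illustrative helper)."""
    if driver not in ("ORACLE_HIVE", "ORACLE_HDFS"):
        raise ValueError(f"unsupported access driver: {driver}")

    # Column list: one "name TYPE" entry per line.
    cols = ",\n".join(f"  {name} {sqltype}" for name, sqltype in columns)

    lines = [
        f"CREATE TABLE {table} (",
        cols,
        ")",
        "ORGANIZATION EXTERNAL (",
        f"  TYPE {driver}",
        "  DEFAULT DIRECTORY DEFAULT_DIR",
    ]
    if access_params:
        # Access parameters are written as name=value pairs, one per line.
        lines.append("  ACCESS PARAMETERS (")
        lines.extend(f"    {key}={value}" for key, value in access_params.items())
        lines.append("  )")
    if location:
        # Only meaningful for ORACLE_HDFS, which reads files directly.
        lines.append(f"  LOCATION ('{location}')")
    lines.append(")")
    lines.append("REJECT LIMIT UNLIMITED")
    return "\n".join(lines)


# ORACLE_HIVE: point at a Hive table that already exists (hypothetical name).
hive_ddl = external_table_ddl(
    "ratings",
    [("user_id", "NUMBER"), ("movie_id", "NUMBER"), ("rating", "NUMBER")],
    "ORACLE_HIVE",
    access_params={"com.oracle.bigdata.tablename": "ratings_db.ratings_hive_table"},
)

# ORACLE_HDFS: point straight at files in HDFS (hypothetical path), no Hive table needed.
hdfs_ddl = external_table_ddl(
    "order_summary",
    [("cust_num", "VARCHAR2(10)"), ("order_num", "VARCHAR2(20)"), ("order_total", "NUMBER(8,2)")],
    "ORACLE_HDFS",
    location="hdfs:/usr/cust/summary/*",
)

print(hive_ddl)
print()
print(hdfs_ddl)
```

The point of the sketch is the contrast: the Hive variant refers to existing Hive metadata through an access parameter, while the HDFS variant names the files themselves through LOCATION.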
External tables do not have traditional indexes, so queries against them typically require a full table scan. However, Oracle Big Data SQL extends SmartScan capabilities, such as filter-predicate offloads, to Oracle external tables with the installation of Exadata storage server software onto an Oracle Big Data Appliance. This technology enables the Oracle Big Data Appliance to discard a huge portion of irrelevant data (often up to 99 percent of the total) and return much smaller result sets to the Oracle Exadata Database Machine. End users therefore obtain the results of their queries significantly faster, as the direct result of a reduced load on Oracle Database and reduced traffic on the network.
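The effect of a filter-predicate offload can be pictured with a toy model in Python. This is not how SmartScan is implemented; it is only a sketch, with made-up data and hypothetical function names, that counts how many rows would have to travel from the storage side to the database side when the filter runs after the transfer versus before it.

```python
def scan_without_offload(storage_rows, predicate):
    """Ship every row to the database tier, then filter there."""
    shipped = list(storage_rows)                      # all rows cross the network
    result = [row for row in shipped if predicate(row)]
    return result, len(shipped)


def scan_with_offload(storage_rows, predicate):
    """Filter on the storage tier, then ship only the matching rows."""
    shipped = [row for row in storage_rows if predicate(row)]
    return shipped, len(shipped)


# Made-up data set: 10,000 log rows, of which 1 in 100 is an error.
rows = [{"id": i, "status": "ERROR" if i % 100 == 0 else "OK"} for i in range(10_000)]


def is_error(row):
    return row["status"] == "ERROR"


plain_result, plain_shipped = scan_without_offload(rows, is_error)
offload_result, offload_shipped = scan_with_offload(rows, is_error)

# Share of rows that never had to leave the storage tier.
saved = 1 - offload_shipped / plain_shipped
print(f"rows shipped without offload: {plain_shipped}")
print(f"rows shipped with offload:    {offload_shipped}")
print(f"rows discarded at storage:    {saved:.0%}")
```

Both scans return the same rows; the only difference is where the filtering happens and therefore how much data moves. In this contrived data set the storage tier discards 99 percent of the rows, which mirrors the upper end of the figure quoted above but says nothing about what a real workload would see.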
Alternatively, Oracle SQL Connector for HDFS provides similar access to Hadoop data for all Oracle Big Data Appliance racks, including those that are not connected to an Oracle Exadata Database Machine. However, it does not offer the performance benefits of Oracle Big Data SQL; see the Oracle Big Data Connectors User's Guide.