By Sherry Lamonica-Oracle on Feb 02, 2012
Analyzing huge data sets presents both a challenge and an opportunity for IT decision makers, who must balance the maintenance and support of existing IT infrastructure against the need to analyze rapidly growing data stores. In many cases, processing this data requires a fresh approach because traditional techniques fail when applied to massive data sets. To extract immediate value from big data, we need tools that efficiently access, organize, analyze and maintain a variety of data types.
Oracle R Enterprise (ORE), a component in the Oracle Advanced Analytics Option of Oracle Database Enterprise Edition, emerges as the clear solution to these challenges. ORE integrates the popular open-source R statistical programming environment with Oracle Database 11g, Oracle Exadata and the Oracle Big Data Appliance, delivering enterprise-level analytics based on R scripts and parallelized, in-database modeling.
How do R and Oracle R Enterprise work together?
The powerful R programming environment enables the creation of sophisticated graphics, statistical analyses, and simulations. It contains a vast set of built-in functions which may be extended to build custom statistical packages. The R engine is limited in capacity and performance when working with large data, but with Oracle R Enterprise, R users bypass these constraints by using the database as the analytics engine directly from their R session.
The components that support Oracle R Enterprise include:
1. The Oracle R Enterprise transparency layer - a collection of R packages with functions to connect to Oracle Database and use R functionality in Oracle Database. This enables R users to work with data too large to fit into the memory of a user's desktop system, and leverage the scalable Oracle Database as a computational engine.
2. The Oracle statistics engine - a collection of statistical functions and procedures corresponding to commonly-used statistical libraries. The statistics engine packages also execute in Oracle Database.
3. SQL extensions supporting embedded R execution, in which the database runs R on the database server. R users can execute R closures (functions) using an R or SQL API, while taking advantage of data parallelism. Using the SQL API for embedded R execution, sophisticated R graphics and results can be exposed in OBIEE dashboards and BI Publisher documents.
4. Oracle R Connector for Hadoop (ORCH) - an R package that interfaces with the Hadoop Distributed File System (HDFS) and enables executing MapReduce jobs. ORCH enables R users to work directly with an Oracle Hadoop cluster, executing computations from the R environment, written in the R language and working on data resident in HDFS, Oracle Database, or local files.
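The data parallelism mentioned in item 3 can be pictured as a partition-and-apply pattern: rows are split into groups, and a user-supplied function runs on each group independently. The sketch below is a conceptual illustration in plain Python, not ORE code; `group_apply` and `mean_delay` are hypothetical names invented here, and a thread pool stands in for whatever parallel execution the database server provides.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def group_apply(rows, key, func, max_workers=4):
    """Partition rows by key(row), then apply func to each partition concurrently.

    Hypothetical helper for illustration only; it is not part of ORE.
    """
    partitions = defaultdict(list)
    for row in rows:
        partitions[key(row)].append(row)

    # Each partition is processed independently, so the work can be spread
    # across workers without any coordination between groups.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {k: pool.submit(func, part) for k, part in partitions.items()}
        return {k: f.result() for k, f in futures.items()}


def mean_delay(partition):
    """User-supplied function: average the second field of each row."""
    return sum(delay for _, delay in partition) / len(partition)


# Made-up sample data: (carrier, delay in minutes).
rows = [("AA", 10.0), ("AA", 30.0), ("UA", 5.0)]
result = group_apply(rows, key=lambda r: r[0], func=mean_delay)
print(result)  # {'AA': 20.0, 'UA': 5.0}
```

The point of the pattern is that the user writes only the per-group function; how the groups are distributed and executed is left to the engine.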
Using a simple R workflow, R users can seamlessly utilize the parallel processing architecture of ORE and ORCH for scalability and better performance. Analytics and reporting tasks are moved to the Oracle Database, eliminating long approval chains for data movement and dramatically increasing processing speed. R users are not required to learn SQL because their R operations are translated to SQL behind the scenes, and the generated SQL is shipped to the database for processing. The significant benefits to IT include improved data security, data maintenance and audit compliance practices.
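To make the idea of behind-the-scenes translation concrete, here is a minimal sketch of a proxy object that records operations and turns them into SQL only when a result is requested. It is an illustration of the general technique, not ORE's implementation: it is written in Python rather than R, SQLite stands in for Oracle Database, and the `LazyTable` class, the `flights` table, and its contents are all invented for the example.

```python
import sqlite3


class LazyTable:
    """Proxy for a database table: operations build SQL instead of moving data.

    Hypothetical class for illustration; it does not mirror any ORE API.
    """

    _OPERATORS = {"=", "!=", "<", "<=", ">", ">="}

    def __init__(self, conn, table, predicates=None):
        self.conn = conn
        self.table = table
        self.predicates = list(predicates or [])

    def filter(self, column, op, value):
        # Record the predicate and return a new proxy; nothing runs yet.
        if op not in self._OPERATORS:
            raise ValueError(f"unsupported operator: {op}")
        return LazyTable(self.conn, self.table, self.predicates + [(column, op, value)])

    def _where(self):
        if not self.predicates:
            return "", []
        clause = " AND ".join(f"{col} {op} ?" for col, op, _ in self.predicates)
        return " WHERE " + clause, [value for _, _, value in self.predicates]

    def mean(self, column):
        # Only here is SQL generated and sent to the database, which does
        # the aggregation. Table and column names are trusted in this
        # sketch; values are passed as bound parameters.
        where, params = self._where()
        sql = f"SELECT AVG({column}) FROM {self.table}{where}"
        return self.conn.execute(sql, params).fetchone()[0]


# Made-up sample data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flights (carrier TEXT, delay REAL)")
conn.executemany(
    "INSERT INTO flights VALUES (?, ?)",
    [("AA", 10.0), ("AA", 30.0), ("UA", 5.0)],
)

aa = LazyTable(conn, "flights").filter("carrier", "=", "AA")
print(aa.mean("delay"))  # 20.0
```

The caller never writes SQL and never pulls the rows into local memory; only the single aggregated value comes back from the database.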
We’re proud to announce Oracle R Enterprise 1.0 and look forward to your comments. To learn more about ORE, visit our product page.