By Mark Hornick on Jan 17, 2012
Welcome to the Oracle R Enterprise blog, brought to you by the Oracle Advanced Analytics group. We'll be sharing best practices, tips, and tricks for applying Oracle R Enterprise and Oracle R Connector for Hadoop in both traditional and new "big data" environments. Oracle R Enterprise and Oracle Data Mining are the two components of the new Oracle Advanced Analytics Option to Oracle Database.
Here's a brief introduction to Oracle's R offerings: Oracle R Distribution, Oracle R Enterprise, and Oracle R Connector for Hadoop.
Oracle R Distribution provides an Oracle-supported distribution of open source R, enhanced with Intel’s MKL libraries for high-performance mathematical computations on x86 hardware. The Oracle R Distribution facilitates enterprise acceptance of R, since the lack of a major corporate sponsor has made some companies hesitant to adopt R fully.
Oracle R Enterprise (ORE) integrates the open-source R statistical environment and language with Oracle Database 11g and with the Oracle engineered systems Oracle Exadata and Oracle Big Data Appliance. ORE delivers enterprise-level advanced analytics based on the R environment, leveraging the database as an analytical compute engine. This allows R users such as data analysts and statisticians to use the R client directly against data stored in Oracle Database 11g, vastly increasing scalability, performance, and security.
As an embedded component of the RDBMS, ORE eliminates R’s memory constraints since it can work on data directly in the database. R users can also execute R scripts in Oracle Database to support enterprise production applications. R's data.frame results and sophisticated graphics can be delivered through Oracle BI Publisher documents and OBIEE dashboards. Since it’s R, users are also able to leverage the latest contributed open source packages.
For data mining, R users can not only build models using any of the algorithms in the CRAN machine learning task view, but also leverage in-database implementations for predictions (e.g., stepwise regression, GLM, SVM), attribute selection, clustering, feature extraction via non-negative matrix factorization, association rules, and anomaly detection.
Oracle R Connector for Hadoop, one of the connectors available for Oracle Big Data Appliance, allows R users to work with the Hadoop Distributed File System (HDFS) and execute MapReduce programs on the Big Data Appliance Hadoop Cluster. R users write mapper and reducer functions in the R language, and invoke MapReduce jobs from the R environment.
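To make the mapper and reducer idea concrete, here is a minimal, language-neutral sketch of the MapReduce model in plain Python. It is not the Oracle R Connector for Hadoop API and it does not touch HDFS: the names mapper, reducer, and run_mapreduce are hypothetical, and the "cluster" is just an in-memory loop. With the connector, the equivalent functions would be written in R and the grouping step would be handled by Hadoop.

```python
from collections import defaultdict


def mapper(line):
    """Emit a (word, 1) pair for every word in one line of input."""
    for word in line.lower().split():
        yield word, 1


def reducer(word, counts):
    """Combine all values emitted for one key into a single result."""
    return word, sum(counts)


def run_mapreduce(lines, mapper, reducer):
    """Hypothetical in-memory stand-in for a MapReduce job (assumption:
    a real job would distribute these steps across the cluster)."""
    groups = defaultdict(list)
    # Map phase: apply the mapper to each input record.
    for line in lines:
        for key, value in mapper(line):
            # Shuffle phase: collect values under their key.
            groups[key].append(value)
    # Reduce phase: apply the reducer once per key.
    return dict(reducer(key, values) for key, values in sorted(groups.items()))


result = run_mapreduce(["big data", "Big Data Appliance"], mapper, reducer)
print(result)
```

The point of the sketch is the division of labor: the user supplies only the two small functions, and the framework takes care of grouping the intermediate pairs by key between the map and reduce phases.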
We'll be exploring these components and their application in future posts.