What is the distinction between Oracle R Distribution and Oracle R Enterprise?
Oracle R Distribution (ORD) is Oracle's redistribution of open source R, with enhancements for dynamically loading high performance libraries like Intel's Math Kernel Library (MKL) and setting R memory limits on database server-side R engine execution. Oracle provides support for ORD to customers of the Oracle Advanced Analytics option (which includes Oracle R Enterprise and Oracle Data Mining), Oracle Enterprise Linux, and Oracle Big Data Appliance. ORD can be used in combination with R packages such as those downloaded from CRAN. Oracle does not, however, provide support for non-Oracle-provided R packages.
Oracle R Enterprise (ORE) is a set of R packages and libraries that allow R users to manipulate data stored in Oracle Database tables and views, leveraging Oracle Database as a high performance compute engine. As noted above, ORE is a component of the Oracle Advanced Analytics option to Oracle Database Enterprise Edition. ORE functionality can be divided into three main areas: the transparency layer, machine learning, and embedded R execution.
The Transparency Layer allows R users to apply standard R syntax to ore.frame objects - a subclass of data.frame - which serve as proxies for database tables and views. Rather than pulling data into R memory, ORE translates R function invocations into Oracle SQL for execution in Oracle Database. This is also referred to as "function pushdown." Eliminating data movement accomplishes three things: 1) fast access - no time is spent moving data, which is significant for larger data sets, 2) scalability - the client R engine no longer needs enough memory to hold the data and to manipulate/transform it, and 3) performance - the work benefits from Oracle Database parallelism, query optimization, column indexes, data partitioning, etc.
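The pushdown idea can be illustrated with a small, self-contained sketch. This is not ORE code and does not use any Oracle API: it is a conceptual analogy in Python using the standard sqlite3 module, and the names TableProxy and mean_by are hypothetical. The proxy object holds only a table name; the method call is translated into SQL, so the aggregation runs in the database and only the small result travels back to the client.

```python
import sqlite3


class TableProxy:
    """Hypothetical proxy: stores a table name, never the rows themselves."""

    def __init__(self, conn, table):
        self.conn = conn
        self.table = table

    def mean_by(self, value_col, group_col):
        # Translate the call into SQL instead of fetching rows to the client.
        # The database engine performs the grouping and averaging.
        sql = (
            f'SELECT "{group_col}", AVG("{value_col}") '
            f'FROM "{self.table}" GROUP BY "{group_col}" ORDER BY "{group_col}"'
        )
        return self.conn.execute(sql).fetchall()


# Set up a tiny in-memory database to stand in for the database server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 10.0), ("east", 30.0), ("west", 5.0)],
)

sales = TableProxy(conn, "sales")
result = sales.mean_by("amount", "region")
print(result)  # [('east', 20.0), ('west', 5.0)]
```

Only one row per group is returned to the client, which is the point of pushdown: the volume of data moved depends on the size of the answer, not the size of the table.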
The ORE Machine Learning, or predictive analytics, capability provides a set of in-database, parallel algorithms that are exposed through an R interface. The set of algorithms was highlighted in my recent blog post. These algorithms include features such as algorithm-specific automatic data preparation and support for integrated text mining. The set of algorithms can be supplemented with open source R packages, such as those available on CRAN, by way of embedded R execution, discussed next.
Embedded R Execution refers to the ability to execute user-defined R functions at the database server, in locally spawned R engines under control of Oracle Database. With the proper database permissions, third party R packages such as those from CRAN can be installed for the database server R engine. Users can store their R functions in the database R Script Repository and invoke them by name, passing data and arguments. Users can also leverage data-parallel and task-parallel execution of their R scripts. In addition to invoking named R scripts from R, users can invoke them from SQL. With SQL invocation, structured results and images can be returned as database tables, and R objects and images can be returned together as an XML string. This makes it easier for enterprises to integrate R results into applications and dashboards, since Oracle Database provides the necessary "plumbing." Further, database components such as DBMS_SCHEDULER can be used to schedule R script execution.
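The pattern of storing a function under a name and then invoking it by name over partitions of the data can be sketched as follows. This is a conceptual analogy only, written in Python rather than R, and it does not reproduce the ORE interface: script_repository, script_create, and group_apply are hypothetical names, a dictionary stands in for the R Script Repository, and worker threads stand in for the database-spawned R engines.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-in for a script repository: name -> function.
script_repository = {}


def script_create(name, func):
    """Register a function under a name so it can be invoked later by name."""
    script_repository[name] = func


def group_apply(rows, key, script_name, **kwargs):
    """Partition rows by `key`, then run the named script on each partition.

    Each partition is submitted to a worker thread (data-parallel execution);
    extra keyword arguments are passed through to the script.
    """
    func = script_repository[script_name]
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[key]].append(row)
    with ThreadPoolExecutor() as pool:
        futures = {k: pool.submit(func, part, **kwargs)
                   for k, part in partitions.items()}
        return {k: f.result() for k, f in futures.items()}


def total_amount(rows, scale=1.0):
    # User-defined function: sum one partition's amounts, optionally scaled.
    return scale * sum(r["amount"] for r in rows)


script_create("total_amount", total_amount)

rows = [
    {"region": "east", "amount": 10.0},
    {"region": "east", "amount": 30.0},
    {"region": "west", "amount": 5.0},
]
totals = group_apply(rows, "region", "total_amount", scale=2.0)
print(totals)  # {'east': 80.0, 'west': 10.0}
```

Invoking by name rather than by passing the function itself is what lets a second interface, SQL in the case described above, call the same stored script.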
For more detailed information on Oracle R Enterprise, see this link.