I posted a new white paper authored by Denny Wong, Principal Member of Technical Staff, User Interfaces and Components, Oracle Data Mining Technologies. You can access the white paper here and the companion files here. Here is an excerpt:
Oracle Data Miner (Extension of SQL Developer 4.0)
Integrate Oracle R Enterprise Mining Algorithms into workflow using the SQL Query node
Oracle R Enterprise (ORE), a component of the Oracle Advanced Analytics Option, makes the open source R statistical programming language and environment ready for the enterprise and big data. Designed for problems involving large amounts of data, Oracle R Enterprise integrates R with the Oracle Database. R users can develop, refine and deploy R scripts that leverage the parallelism and scalability of the database to perform predictive analytics and data analysis.
Oracle Data Miner (ODMr) offers a comprehensive set of in-database algorithms for performing a variety of mining tasks, such as classification, regression, anomaly detection, feature extraction, clustering, and market basket analysis. One of the important capabilities of the new SQL Query node in Data Miner 4.0 is a simplified interface for integrating R scripts registered with the database. This gives R developers the support they need to supply useful mining scripts to data analysts. This synergy provides several additional benefits, as noted below.
· R developers can further extend ODMr mining capabilities by incorporating the extensive R mining algorithms from the open source CRAN packages, or by leveraging any user-developed custom R algorithms, via the SQL interfaces provided by ORE.
· Since the SQL Query node can be part of a workflow process, R scripts can leverage functionality provided by other workflow nodes, which can simplify the overall effort of integrating R capabilities within the database.
· R mining capabilities can be included in the workflow deployment scripts produced by the new SQL script generation feature, so R functionality can be deployed easily within the context of a Data Miner workflow.
· Data and processing are secured and controlled by the Oracle Database. This avoids much of the risk incurred with other providers, where users have to export data out of the database in order to perform advanced analytics.
Oracle Advanced Analytics saves analysts, developers, database administrators and management the headache of trying to integrate R and database analytics. Users can quickly gain the benefit of new R analytics and spend their time and effort on developing business solutions rather than building homegrown analytical platforms.
This paper should be very useful to R developers who want to better understand how to embed R scripts for use by data analysts. Analysts will also find the paper useful for seeing how R features can be surfaced for their use in Data Miner. The specific use case covered demonstrates how to use the SQL Query node to integrate R glm and rpart regression model build, test, and score operations into the workflow, along with nodes that perform data preparation and residual plot graphing. The integration process described here can also be adapted easily to integrate other R operations, such as statistical data analysis and advanced graphing, to expand ODMr functionality.
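To give a rough sense of the R side of that use case, here is a minimal sketch of a glm and rpart regression build, test, and score sequence in plain R. It is not the script from the white paper: the built-in mtcars data set, the formula, the 70/30 split, and the rmse helper are all assumptions chosen for illustration. In the workflow the paper describes, logic like this would live in an R script registered with the database through ORE and be invoked from the SQL Query node, rather than run in a local R session.

```r
# Illustrative sketch only: mtcars stands in for the paper's data,
# and every name below is hypothetical.
library(rpart)

set.seed(42)

# Split the data into build (train) and test sets.
n <- nrow(mtcars)
train_idx <- sample(n, size = floor(0.7 * n))
train <- mtcars[train_idx, ]
test <- mtcars[-train_idx, ]

# Build: a Gaussian GLM and an rpart regression tree on the same formula.
glm_fit <- glm(mpg ~ wt + hp, data = train, family = gaussian())
rpart_fit <- rpart(mpg ~ wt + hp, data = train, method = "anova")

# Score: apply both models to the held-out rows.
scored <- data.frame(
  actual = test$mpg,
  glm_pred = as.numeric(predict(glm_fit, newdata = test)),
  rpart_pred = as.numeric(predict(rpart_fit, newdata = test))
)

# Test: compare the models by root mean squared error on the test set.
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))
metrics <- c(
  glm = rmse(scored$actual, scored$glm_pred),
  rpart = rmse(scored$actual, scored$rpart_pred)
)
print(metrics)

# Residuals (actual minus predicted), the input to a residual plot.
scored$glm_resid <- scored$actual - scored$glm_pred
scored$rpart_resid <- scored$actual - scored$rpart_pred
```

The scored data frame keeps actual values, predictions, and residuals side by side, so it can be handed to a plotting call or returned as a table to a downstream step.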