Monday Nov 19, 2012

Join us at BIWA Summit 2013!

Registration is now open for BIWA Summit 2013.  This event, focused on Business Intelligence, Data Warehousing and Analytics, is hosted by the BIWA SIG of the IOUG on January 9 and 10 at the Hotel Sofitel, near Oracle headquarters in Redwood City, California.

Be sure to check out our featured speakers, including Oracle executives Balaji Yelamanchili, Vaishnavi Sashikanth, and Tom Kyte, sports analyst Ari Kaplan, and many other internationally recognized speakers.  Hands-on labs will give you the opportunity to try out much of the Oracle software for yourself (including Oracle R Enterprise), so be sure to bring a laptop capable of running Windows Remote Desktop.  There will be over 35 sessions on a wide range of BIWA-related topics.  See the BIWA Summit 2013 web site for details and be sure to register soon, while early bird rates still apply.

Monday Sep 17, 2012

Podcast interview with Michael Kane

In this podcast interview, Michael Kane, Data Scientist and Associate Researcher at Yale University, discusses the R statistical programming language, the computational challenges associated with big data, and two data analysis projects he conducted: one on the stock market "flash crash" of May 6, 2010, and one on the tracking of transportation routes of bird flu H5N1. Michael also worked with Oracle on Oracle R Enterprise, a component of the Advanced Analytics option to Oracle Database Enterprise Edition. In the closing segment of the interview, Michael comments on the relationship between the data analyst and the database administrator, and on how Oracle R Enterprise facilitates this relationship by providing secure data management, transparent access to data, and improved performance.

Listen now...

Thursday Feb 16, 2012

R and Database Access

In an enterprise, databases are typically where data reside. So where data analytics are required, it's important for R and the database to work well together. The more seamlessly and naturally R users can access data, the easier it is to produce results. R users may leverage ODBC, JDBC, or similar types of connectivity to access database-resident data. However, this requires writing SQL queries, either to process or filter data in the database or to pull data into the R environment for further processing using R. If R users, statisticians, or data analysts are unfamiliar with SQL or database tasks, or don't have database access, they often consult IT for data extracts.
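To make the two SQL-based options concrete, here is a minimal sketch in Python rather than R, using the standard library's sqlite3 module as a stand-in for an ODBC or JDBC connection. The sales table and its contents are hypothetical and exist only for illustration.

```python
import sqlite3

# Hypothetical table and data, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 120.0), ("west", 80.0), ("east", 45.5), ("north", 200.0)],
)

# Option 1: filter and aggregate in the database; only the result moves.
in_db_total = conn.execute(
    "SELECT SUM(amount) FROM sales WHERE region = ?", ("east",)
).fetchone()[0]

# Option 2: pull every row to the client, then process locally.
all_rows = conn.execute("SELECT region, amount FROM sales").fetchall()
client_total = sum(amount for region, amount in all_rows if region == "east")
```

Both options arrive at the same total, but the second moves the whole table to the client first. Either way, the analyst has to write the SQL.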

Not having direct access to database-resident data introduces delays in obtaining data, and can make near real-time analytics impossible. In some instances, users request data sets much larger than required to avoid multiple requests to IT. Of course, this approach introduces costs of exporting, moving, and storing data, along with the associated backup, recovery, and security risks.

Oracle R Enterprise eliminates the need to know SQL to work with database-resident data. Through the Oracle R Enterprise transparency layer, R users can access data stored in tables and views as virtual data frames. Base R functions performed on these "ore.frames" are overloaded to generate SQL which is transparently sent to Oracle Database for execution - leveraging the database as a high-performance computational engine.
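The general idea behind a transparency layer can be sketched in a few lines. The example below is a conceptual illustration in Python, not the Oracle R Enterprise API: LazyFrame, filter_gt, select, to_sql, and collect are hypothetical names, and sqlite3 stands in for Oracle Database. The object holds no data. Each operation only builds up SQL, and overloading len() sends a COUNT query to the database instead of counting rows on the client.

```python
import sqlite3


class LazyFrame:
    """Toy database-backed frame (hypothetical; not the ORE API)."""

    def __init__(self, conn, table, columns=("*",), where=None, params=()):
        self.conn = conn
        self.table = table
        self.columns = tuple(columns)
        self.where = where
        self.params = tuple(params)

    def select(self, *columns):
        # Column names are trusted identifiers in this sketch.
        return LazyFrame(self.conn, self.table, columns, self.where, self.params)

    def filter_gt(self, column, value):
        # The value is passed as a bound parameter, never pasted into the SQL.
        clause = f"{column} > ?"
        where = clause if self.where is None else f"({self.where}) AND {clause}"
        return LazyFrame(
            self.conn, self.table, self.columns, where, self.params + (value,)
        )

    def to_sql(self):
        sql = f"SELECT {', '.join(self.columns)} FROM {self.table}"
        if self.where:
            sql += f" WHERE {self.where}"
        return sql

    def __len__(self):
        # Overloaded built-in: the row count is computed by the database.
        sql = f"SELECT COUNT(*) FROM ({self.to_sql()})"
        return self.conn.execute(sql, self.params).fetchone()[0]

    def collect(self):
        # Only here do rows actually travel to the client.
        return self.conn.execute(self.to_sql(), self.params).fetchall()


# Hypothetical data for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flights (carrier TEXT, delay INTEGER)")
conn.executemany(
    "INSERT INTO flights VALUES (?, ?)",
    [("AA", 5), ("UA", 42), ("DL", 17), ("AA", 63)],
)

flights = LazyFrame(conn, "flights")
late = flights.filter_gt("delay", 15).select("carrier", "delay")
generated_sql = late.to_sql()
late_count = len(late)
late_rows = late.collect()
```

The user never writes SQL here. The query text in generated_sql is a by-product of ordinary method calls, which is the same division of labor the transparency layer provides for R users.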

Check out Oracle R Enterprise for examples of the interface, documentation, and a link to download Oracle R Enterprise.

Tuesday Jan 17, 2012

Welcome to Oracle R Enterprise!

Welcome to the Oracle R Enterprise blog - brought to you by the Oracle Advanced Analytics group. We'll be sharing best practices, tips, and tricks for applying Oracle R Enterprise and Oracle R Connector for Hadoop in both traditional and new "big data" environments. Oracle R Enterprise and Oracle Data Mining are the two components of the new Oracle Advanced Analytics Option to Oracle Database.

Here's a brief introduction to Oracle's R offerings: Oracle R Distribution, Oracle R Enterprise, and Oracle R Connector for Hadoop.

Oracle R Distribution provides an Oracle-supported distribution of open source R — enhanced with Intel’s MKL libraries for high performance mathematical computations on x86 hardware. The Oracle R Distribution facilitates enterprise acceptance of R, since the lack of a major corporate sponsor has made some companies concerned about fully adopting R.

Oracle R Enterprise (ORE) integrates the open-source R statistical environment and language with Oracle Database 11g, and the Oracle engineered solutions of Oracle Exadata and Oracle Big Data Appliance. ORE delivers enterprise-level advanced analytics based on the R environment, leveraging the database as an analytical compute engine. This allows R users like data analysts and statisticians to use the R client directly against data stored in Oracle Database 11g—vastly increasing scalability, performance, and security.

As an embedded component of the RDBMS, ORE eliminates R’s memory constraints since it can work on data directly in the database. R users can also execute R scripts in Oracle Database to support enterprise production applications. R's data.frame results and sophisticated graphics can be delivered through Oracle BI Publisher documents and OBIEE dashboards. Since it’s R, users are also able to leverage the latest contributed open source packages.

For data mining, R users can not only build models using any of the algorithms in the CRAN machine learning task view, but also leverage in-database implementations for predictions (e.g., stepwise regression, GLM, SVM), attribute selection, clustering, feature extraction via non-negative matrix factorization, association rules, and anomaly detection.

Oracle R Connector for Hadoop, one of the connectors available for Oracle Big Data Appliance, allows R users to work with the Hadoop Distributed File System (HDFS) and execute MapReduce programs on the Big Data Appliance Hadoop Cluster. R users write mapper and reducer functions in the R language, and invoke MapReduce jobs from the R environment.
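For readers new to the programming model, here is a minimal single-process sketch of what writing a mapper and a reducer looks like. It is written in Python rather than R, it does not use the Oracle R Connector for Hadoop API, and run_mapreduce is a hypothetical helper that imitates the map, shuffle, and reduce phases that Hadoop would distribute across a cluster.

```python
from collections import defaultdict


def mapper(line):
    """Emit a (word, 1) pair for each word in one input line."""
    for word in line.lower().split():
        yield word, 1


def reducer(word, counts):
    """Combine all values emitted for one key into a single result."""
    return word, sum(counts)


def run_mapreduce(lines, mapper, reducer):
    """Single-process stand-in for the map, shuffle, and reduce phases."""
    grouped = defaultdict(list)
    for line in lines:
        for key, value in mapper(line):
            grouped[key].append(value)  # shuffle: group values by key
    return dict(reducer(key, values) for key, values in grouped.items())


word_counts = run_mapreduce(["big data", "big cluster"], mapper, reducer)
```

The user supplies only the two small functions. Grouping the intermediate pairs by key and scheduling the work is the framework's job.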

We'll be exploring these components and their application in future posts.




