Wednesday Apr 16, 2014

Oracle's Strategy for Advanced Analytics

At Oracle our goal is to enable you to get timely insight from all of your data. We continuously enhance Oracle Database to allow workloads that have traditionally required extracting data from the database to run in-place. We do this to narrow the gap that exists between insights that can be obtained and available data - because any data movement introduces latencies, complexity due to more moving parts, the ensuing need for data reconciliation and governance, as well as increased cost. The Oracle tool set considers the needs of all types of enterprise users - users preferring GUI based access to analytics with smart defaults and heuristics out of the box, users choosing to work interactively and quantitatively with data using R, and users preferring SQL and focusing on operationalization of models.

Oracle recognized the need to support data analysts, statisticians, and data scientists with a widely used and rapidly growing statistical programming language. Oracle chose R - recognizing it as the new de facto standard for computational statistics and advanced analytics. Oracle supports R in at least 3 ways:


  • R as the language of interaction with the database

  • R as the language in which analytics can be written and executed in the database as a high performance computing platform

  • R as the language in which several native high performance analytics have been written that execute in database


Additionally, of course, you may choose to leverage any of the CRAN algorithms to execute R scripts at the database server, leveraging several forms of data parallelism.

Providing the first and only supported commercial distribution of R from an established company, Oracle released Oracle R Distribution. In 2012 Oracle embarked on the Hadoop journey, acknowledging the alternative data management options emerging in open source for unstructured or not-yet-structured data. In keeping with our strategy of delivering analytics close to where data is stored, Oracle extended Advanced Analytics capabilities to execute on HDFS-resident data in Hadoop environments. R has been integrated into Hadoop in exactly the same manner as it has been with the database.

Realizing that data is stored in both database and non-database environments, Oracle gives users options for where to store their data (in Oracle Database, HDFS, and Spark RDD), where to perform computations (in-database or the Hadoop cluster), and where to store results (Oracle Database or HDFS). Users can write R scripts that can be leveraged across database and Hadoop environments. Oracle Database, as a preferred location for storing R scripts, data, and result objects, provides a real-time scoring and deployment platform. It is also easy to create a model factory environment with authorization, roles, and privileges, combined with auditing, backup, recovery, and security.

Oracle provides a common infrastructure that supports both in-database and custom R algorithms. Oracle also provides an integrated GUI for business users. Oracle provides both R-based access and GUI-based access to in-database analytics. A major part of Oracle's strategy is to maintain agility in our portfolio of supported techniques - being responsive to customer needs.

Thursday Feb 16, 2012

Oracle Announces Availability of Oracle Advanced Analytics for Big Data

Oracle Integrates R Statistical Programming Language into Oracle Database 11g

REDWOOD SHORES, Calif. - February 8, 2012

News Facts

Oracle today announced the availability of Oracle Advanced Analytics, a new option for Oracle Database 11g that bundles Oracle R Enterprise together with Oracle Data Mining.
Oracle R Enterprise delivers enterprise class performance for users of the R statistical programming language, increasing the scale of data that can be analyzed by orders of magnitude using Oracle Database 11g.
R has attracted over two million users since its introduction in 1995, and Oracle R Enterprise dramatically advances capability for R users. Their existing R development skills, tools, and scripts can now also run transparently, and scale against data stored in Oracle Database 11g.
Customer testing of Oracle R Enterprise for Big Data analytics on Oracle Exadata has shown up to 100x increase in performance in comparison to their current environment.
Oracle Data Mining, now part of Oracle Advanced Analytics, helps enable customers to easily build and deploy predictive analytic applications that help deliver new insights into business performance.
Oracle Advanced Analytics, in conjunction with Oracle Big Data Appliance, Oracle Exadata Database Machine and Oracle Exalytics In-Memory Machine, delivers the industry’s most integrated and comprehensive platform for Big Data analytics.

Comprehensive In-Database Platform for Advanced Analytics

Oracle Advanced Analytics brings analytic algorithms to data stored in Oracle Database 11g and Oracle Exadata as opposed to the traditional approach of extracting data to laptops or specialized servers.
With Oracle Advanced Analytics, customers have a comprehensive platform for real-time analytic applications that deliver insight into key business subjects such as churn prediction, product recommendations, and fraud alerting.
By providing direct and controlled access to data stored in Oracle Database 11g, customers can accelerate data analyst productivity while maintaining data security throughout the enterprise.
Powered by decades of Oracle Database innovation, Oracle R Enterprise helps enable analysts to run a variety of sophisticated numerical techniques on billion row data sets in a matter of seconds making iterative, speed of thought, and high-quality numerical analysis on Big Data practical.
Oracle R Enterprise drastically reduces the time to deploy models by eliminating the need to translate the models to other languages before they can be deployed in production.
Oracle R Enterprise integrates the extensive set of Oracle Database data mining algorithms, analytics, and access to Oracle OLAP cubes into the R language for transparent use by R users.
Oracle Data Mining provides an extensive set of in-database data mining algorithms that solve a wide range of business problems. These predictive models can be deployed in Oracle Database 11g and use Oracle Exadata Smart Scan to rapidly score huge volumes of data.
The tight integration between R, Oracle Database 11g, and Hadoop enables R users to write one R script that can run in three different environments: a laptop running open source R, Hadoop running with Oracle Big Data Connectors, and Oracle Database 11g.
Oracle provides single vendor support for the entire Big Data platform spanning the hardware stack, operating system, open source R, Oracle R Enterprise and Oracle Database 11g.
To enable easy enterprise-wide Big Data analysis, results from Oracle Advanced Analytics can be viewed from Oracle Business Intelligence Foundation Suite and Oracle Exalytics In-Memory Machine.

Supporting Quotes

“Oracle is committed to meeting the challenges of Big Data analytics. By building upon the analytical depth of Oracle SQL, Oracle Data Mining and the R environment, Oracle is delivering a scalable and secure Big Data platform to help our customers solve the toughest analytics problems,” said Andrew Mendelsohn, senior vice president, Oracle Server Technologies.
“We work with leading edge customers who rely on us to deliver better BI from their Oracle Databases. The new Oracle R Enterprise functionality allows us to perform deep analytics on Big Data stored in Oracle Databases. By leveraging R and its library of open source contributed CRAN packages combined with the power and scalability of Oracle Database 11g, we can now do that,” said Mark Rittman, co-founder, Rittman Mead.

Supporting Resources

Connect with Oracle Database via Blog, Facebook and Twitter

About Oracle

Oracle engineers hardware and software to work together in the cloud and in your data center. For more information about Oracle (NASDAQ: ORCL), visit http://www.oracle.com.

Trademarks

Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners.

Contact Info

Eloy Ontiveros
Oracle
+1.650.607.6458
eloy.ontiveros@oracle.com

Joan Levy
Blanc & Otus for Oracle
+1.415.856.5110
jlevy@blancandotus.com

Friday Feb 03, 2012

What is R?

For many in the Oracle community, the addition of R through Oracle R Enterprise could leave them wondering "What is R?"

R has been receiving a lot of attention recently, although it’s been around for over 15 years. R is an open-source language and environment for statistical computing and data visualization, supporting data manipulation and transformations, as well as sophisticated graphical displays. It's being taught in colleges and universities in courses on statistics and advanced analytics - even replacing more traditional statistical software tools. Corporate data analysts and statisticians often know R and use it in their daily work, either writing their own R functionality, or leveraging the more than 3400 open source packages. The Comprehensive R Archive Network (CRAN) open source packages support a wide range of statistical and data analysis capabilities. They also focus on analytics specific to individual fields, such as bioinformatics, finance, econometrics, medical image analysis, and others (see CRAN Task Views).

So why do statisticians and data analysts use R?

Well, R is a statistics language similar to SAS or SPSS. It’s a powerful, extensible environment, and as noted above, it has a wide range of statistics and data visualization capabilities. It’s easy to install and use, and it’s free – downloadable from the CRAN R project website.

In contrast, statisticians and data analysts typically don’t know SQL and are not familiar with database tasks. R gives statisticians and data analysts access to a wide range of analytical capabilities in a natural statistical language, allowing them to remain highly productive. For example, writing R functions is simple and can be done quickly. Functions can be made to return R objects that can be easily passed to and manipulated by other R functions. By comparison, traditional statistical tools can make the implementation of functions cumbersome, such that programmers resort to macro-oriented programming constructs instead.
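As a small illustration of that point, here is a minimal sketch in plain open source R. The function name fit_stopping_model is hypothetical and chosen only for this example; cars is a data set that ships with R, and lm, coef, and predict are standard R functions.

```r
# Hypothetical helper: fit a linear model of stopping distance on speed
# and return the fitted model object to the caller.
fit_stopping_model <- function(data) {
  lm(dist ~ speed, data = data)
}

# The returned object is an ordinary R object of class "lm".
model <- fit_stopping_model(cars)

# Other R functions accept that object directly.
coefs <- coef(model)
predicted <- predict(model, newdata = data.frame(speed = c(10, 20)))

print(coefs)
print(predicted)
```

The function is three lines long, and its result flows straight into coef and predict without any conversion step.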

So why do we need anything else?

R was conceived as a single-user tool that is not multi-threaded. The client and server components are bundled together as a single executable, much like Excel.

R is limited by the memory and processing power of the machine where it runs, but in addition, being single threaded, it cannot automatically leverage the CPU capacity on a user’s multi-processor laptop without special packages and programming.
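To give a sense of what "special packages and programming" can look like, here is a minimal sketch using the parallel package that ships with R (from version 2.14.0). It assumes the machine can start two local worker processes; slow_square is a hypothetical stand-in for a real computation.

```r
library(parallel)

# Hypothetical stand-in for an expensive, independent computation.
slow_square <- function(i) {
  i^2
}

# Start two worker R processes; each one is itself single-threaded.
cl <- makeCluster(2)

# Spread the inputs across the workers and collect the results in order.
results <- parLapply(cl, 1:4, slow_square)

# Release the worker processes when finished.
stopCluster(cl)

squares <- unlist(results)
print(squares)
```

Note that the user has to create the workers, divide the work, and clean up explicitly. None of this happens automatically, and each worker still needs its own share of memory.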

However, there is another issue that limits R’s scalability…

R’s approach to passing data between function invocations results in data duplication – this chews up memory faster. So inherently, R is not good for big data, or depending on the machine and tasks, even gigabyte-sized data sets.
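The behavior behind this can be sketched in a few lines of plain R. This is an illustration of R's value semantics only, not a memory measurement; zero_first is a hypothetical function written for this example.

```r
# Hypothetical function that changes its argument.
zero_first <- function(v) {
  v[1] <- 0  # modifying v makes R work on a separate copy of the data
  v
}

x <- c(5, 6, 7)
y <- zero_first(x)

print(x)  # the caller's vector is untouched
print(y)  # the modified copy returned by the function
```

For a vector of three numbers the extra copy is negligible. For a data set that already fills a large part of memory, a copy of comparable size made inside a function call can be enough to exhaust the machine.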

This is where Oracle R Enterprise comes in. As we'll continue to discuss in this blog, Oracle R Enterprise lifts this memory and computational constraint found in R today by executing requested R calculations on data in the database, using the database itself as the computational engine. Oracle R Enterprise allows users to further leverage Oracle's engineered systems, like Exadata, Big Data Appliance, and Exalytics, for enterprise-wide analytics, as well as reporting tools like Oracle Business Intelligence Enterprise Edition dashboards and BI Publisher documents.





About

The place for best practices, tips, and tricks for applying Oracle R Enterprise, Oracle R Distribution, ROracle, and Oracle R Advanced Analytics for Hadoop in both traditional and Big Data environments.
