Tuesday Sep 15, 2015

Download VM Oracle Big Data Lite 4.2.1 Including Big Data Discovery

We have just released the Oracle Big Data Lite 4.2.1 VM.  This VM provides many of the key big data technologies that are part of Oracle's big data platform.  Along with all the great features of the previous version, Big Data Lite now adds Oracle Big Data Discovery 1.1.

Once you have downloaded the VM, you can follow the tutorials at Getting Started with Oracle Big Data Discovery and in the Oracle Big Data Learning Library.

All of these products are pre-configured in the Big Data Lite 4.2.1 VM:

· Oracle Enterprise Linux 6.6

· Oracle Database 12c Release 1 Enterprise Edition, including Oracle Big Data SQL-enabled external tables, Oracle Multitenant, Oracle Advanced Analytics, Oracle OLAP, Oracle Partitioning, Oracle Spatial and Graph, and more

· Cloudera Distribution including Apache Hadoop (CDH5.4.0)

· Cloudera Manager (5.4.0)

· Oracle Big Data Discovery 1.1

· Oracle Big Data Connectors 4.2

· Oracle SQL Connector for HDFS 3.3.0

· Oracle Loader for Hadoop 3.4.0

· Oracle Data Integrator 12c

· Oracle R Advanced Analytics for Hadoop 2.5.0

· Oracle XQuery for Hadoop 4.2.0

· Oracle NoSQL Database Enterprise Edition 12cR1 (3.3.4)

· Oracle Big Data Spatial and Graph 1.0

· Oracle JDeveloper 12c (12.1.3)

· Oracle SQL Developer and Data Modeler 4.1

· Oracle Data Integrator 12cR1

· Oracle GoldenGate 12c

· Oracle R Distribution 3.1.1

· Oracle Perfect Balance 2.4.0

· Oracle CopyToBDA 2.0 

Download it, and check out the tutorials and demos that are available from the Big Data Lite download page.

Wednesday Aug 19, 2015

Oracle Big Data Discovery Version 1.1

Partners and customers can replay our webcast with Cloudera, Unlock Your Analytic Talent with the Visual Face of Hadoop (originally on August 29).

Oracle Big Data Discovery version 1.1 is now available. This release extends Oracle’s innovation in Big Data Analytics by bringing together the best open-source and Oracle technologies, enabling you to reach more of the market and address more of your clients’ analytics challenges.

  • Big Data Discovery now runs on Hortonworks HDP 2.2.4+, in addition to Cloudera CDH 5.3+.  That makes BDD the first Spark-based big data analytics product to run on the top two Hadoop distributions, significantly expanding your market opportunity.
  • Customers can now access enterprise data sources via JDBC, making it easy to mash up trusted corporate data with big data in Hadoop.  BDD 1.1 elegantly handles changes across all this data, enabling full refreshes, incremental updates, and easy sample expansions.  All data is live, which means changes are reflected automatically in visualizations, transformations, enrichments, and joins.
  • Dynamic visualizations fuel discovery – but no product can include every visualization out-of-the-box.  This release includes a custom visualization framework that allows customers and partners to create and publish any visual and have it behave like it is native to BDD.  Combined with new visualizations and simpler configuration, this streamlines the creation of discovery dashboards and rich, reusable discovery applications.
  • Big Data Discovery is unique in allowing partners and customers to find, explore, transform, and analyze big data all within a single product.  This release significantly extends BDD Transform, making it both easier and more powerful.  New UIs make it easy to derive structure from messy Hadoop sources, guiding users through common functions, like extracting entities and locations, without writing code.  Transformation scripts can be shared and published, driving collaboration, and scripts can be scheduled, automating enrichment.  Transform also includes a redesigned custom transformation experience and the ability to call external functions (such as R scripts), providing increased support for sophisticated users. 
  • BDD 1.1 supports Kerberos for authentication (both MIT and Microsoft versions), enables authorization via Studio (including integration with LDAP) to support single sign-on (SSO), and provides security at both project and dataset levels.  These options allow customers to leverage their existing security and extend fine-grained control to big data analytics, ensuring people see exactly what they should.

Also see “Big Data Discovery” Hosted Demonstration Instances for Partners.

If you wish to dive deeper into Spark (this is not needed in order to use BDD), I suggest you check out the Cloudera Developer Training for Apache Spark.

Wednesday Jul 15, 2015

OBI11g Analyses Hadoop data directly (without ETL) using Oracle Big Data SQL

You may have noticed that OBI can now analyse Hadoop data directly (without ETL) using Oracle Big Data SQL for Data Access (or Cloudera’s Impala), and can easily join it with other OBI data sources onto one dashboard. To find out more you can review the Big Data documentation below, and try it for yourself by downloading the Demonstration VM for OBI 11g SampleApp v506 including Cloudera and Big Data SQL.

Oracle Big Data Appliance Software User's Guide

1 Introducing Oracle Big Data Appliance

2 Administering Oracle Big Data Appliance

3 Supporting User Access to Oracle Big Data Appliance

4 Configuring Oracle Exadata Database Machine for Use with Oracle Big Data Appliance

5 Optimizing MapReduce Jobs Using Perfect Balance

6 Using Oracle Big Data SQL for Data Access

7 Oracle Big Data SQL Reference

Essentially, Oracle Big Data SQL accesses Hadoop data through Oracle external tables, using one of two access drivers:

  • ORACLE_HIVE: Enables you to create Oracle external tables over Apache Hive data sources. Use this access driver when you already have Hive tables defined for your HDFS data sources. ORACLE_HIVE can also access data stored in other locations, such as HBase, that have Hive tables defined for them. The DBMS_HADOOP PL/SQL package contains a function named CREATE_EXTDDL_FOR_HIVE, which returns the data definition language (DDL) needed to create an external table for accessing a Hive table.
  • ORACLE_HDFS: Enables you to create Oracle external tables directly over files stored in HDFS. This access driver uses Hive syntax to describe a data source, assigning default column names of COL_1, COL_2, and so forth. You do not need to create a Hive table manually as a separate step: you can just define the record format of text data, or you can specify a SerDe for a particular data format.
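To make the ORACLE_HIVE case above a little more concrete, here is a minimal Python sketch that only assembles the text of such an external table definition. The table name `order_summary_ext`, the columns, and the Hive table `order_db.order_summary` are made-up examples, and the clause layout is an assumption based on the general shape of Oracle external table DDL, so please check it against the Oracle Big Data SQL Reference before running anything against a database.

```python
def oracle_hive_ddl(table_name, columns, hive_table):
    """Build the text of a CREATE TABLE statement for an ORACLE_HIVE external table.

    columns is a list of (column_name, oracle_type) pairs.
    The statement is only assembled as a string; nothing is executed.
    """
    column_list = ",\n    ".join(f"{name} {col_type}" for name, col_type in columns)
    return (
        f"CREATE TABLE {table_name} (\n"
        f"    {column_list}\n"
        f")\n"
        f"ORGANIZATION EXTERNAL (\n"
        f"    TYPE ORACLE_HIVE\n"
        f"    DEFAULT DIRECTORY DEFAULT_DIR\n"
        f"    ACCESS PARAMETERS (\n"
        f"        com.oracle.bigdata.tablename: {hive_table}\n"
        f"    )\n"
        f")\n"
        f"REJECT LIMIT UNLIMITED"
    )


# Hypothetical names, used for illustration only
ddl = oracle_hive_ddl(
    "order_summary_ext",
    [("cust_num", "VARCHAR2(10)"), ("order_total", "NUMBER(8,2)")],
    "order_db.order_summary",
)
print(ddl)
```

In practice you would more likely let CREATE_EXTDDL_FOR_HIVE generate this text for you from the Hive metadata; the sketch is only meant to show roughly what the result looks like.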

External tables do not have traditional indexes, so queries against them typically require a full table scan. However, Oracle Big Data SQL extends SmartScan capabilities, such as filter-predicate offloads, to Oracle external tables with the installation of Exadata storage server software onto an Oracle Big Data Appliance. This technology enables the Oracle Big Data Appliance to discard a huge portion of irrelevant data (often up to 99 percent of the total) and return much smaller result sets to the Oracle Exadata Database Machine. End users therefore obtain the results of their queries significantly faster, as the direct result of a reduced load on Oracle Database and reduced traffic on the network.
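The idea behind filter-predicate offload can be illustrated with a small, self-contained Python sketch. This is not how SmartScan is implemented; it simply compares how many rows would cross the network if the filter runs in the database versus where the data is stored. The `Row` type, the data set, and the function names are all invented for the illustration.

```python
from dataclasses import dataclass


@dataclass
class Row:
    customer_id: int
    country: str
    amount: float


def storage_side_scan(rows, predicate):
    """Apply the filter where the data lives and return only the matching rows."""
    return [row for row in rows if predicate(row)]


def is_uk(row):
    return row.country == "UK"


# Hypothetical data set: 1,000 rows, of which every 100th row is a UK customer
rows = [Row(i, "UK" if i % 100 == 0 else "US", float(i)) for i in range(1000)]

# Without offload, every row is shipped to the database and filtered there
rows_sent_without_offload = len(rows)

# With offload, only the rows that pass the filter are shipped
rows_sent_with_offload = len(storage_side_scan(rows, is_uk))

discarded = 1 - rows_sent_with_offload / rows_sent_without_offload
print(f"Rows sent without offload: {rows_sent_without_offload}")
print(f"Rows sent with offload:    {rows_sent_with_offload}")
print(f"Share discarded at source: {discarded:.0%}")
```

With this made-up data the storage side discards 99 percent of the rows, which is the kind of reduction the paragraph above describes for highly selective queries.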

Note that, as an alternative, Oracle SQL Connector for HDFS provides similar access to Hadoop data for all Oracle Big Data Appliance racks, including those that are not connected to Oracle Exadata Database Machine. However, it does not offer the performance benefits of Oracle Big Data SQL: see the Oracle Big Data Connectors User's Guide.

Tuesday Jun 16, 2015

Big Data Spatial and Graph Analytics for Hadoop

We have just added Oracle Big Data Spatial and Graph support for Hadoop and NoSQL database technologies.  For over a decade, Oracle has offered leading spatial and graph analytic technology for the Oracle Database: we have now applied this expertise to work with social network data and to exploit Big Data architectures.  

Oracle Big Data Spatial and Graph includes two main components:

  1. A distributed property graph database with 35 built-in graph analytics to discover graph patterns in big data, such as communities and influencers within a social graph
  2. A wide range of spatial analysis functions and services to evaluate data based on how near or far things are from one another, or whether something falls within a boundary or region

Property Graph Data Management and Analysis

Property graphs are commonly used to model and analyze relationships, such as communities, influencers and recommendations, and other patterns found in social networks, cyber security, utilities and telecommunications, life sciences and clinical data, and knowledge networks.  

Property graphs model the real world as networks of linked data comprising vertices (entities), edges (relationships), and properties (attributes) for both. Property graphs are flexible and easy to evolve; metadata is stored as part of the graph and new relationships are added by simply adding an edge.
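As a rough illustration of the property graph model described above, here is a minimal in-memory sketch in Python. It is not the Oracle Big Data Spatial and Graph API; the `PropertyGraph` class, its methods, and the sample people and company are all hypothetical, and are there only to show vertices, edges, and properties on both.

```python
class PropertyGraph:
    """A toy property graph: vertices and edges both carry key/value properties."""

    def __init__(self):
        self.vertices = {}  # vertex id -> dict of properties
        self.edges = []     # list of (source id, label, target id, properties)

    def add_vertex(self, vertex_id, **properties):
        self.vertices[vertex_id] = properties

    def add_edge(self, source, label, target, **properties):
        # Both endpoints must already exist in the graph
        if source not in self.vertices or target not in self.vertices:
            raise KeyError("both endpoints must be added as vertices first")
        self.edges.append((source, label, target, properties))

    def neighbors(self, vertex_id, label=None):
        """Return the targets of outgoing edges, optionally filtered by label."""
        return [
            target
            for source, edge_label, target, _ in self.edges
            if source == vertex_id and (label is None or edge_label == label)
        ]


graph = PropertyGraph()
graph.add_vertex("alice", kind="person", city="London")
graph.add_vertex("bob", kind="person", city="Paris")
graph.add_vertex("acme", kind="company")

graph.add_edge("alice", "follows", "bob", since=2014)
graph.add_edge("alice", "works_for", "acme")

# A new kind of relationship needs no schema change: just add another edge
graph.add_edge("bob", "recommends", "acme", rating=5)

print(graph.neighbors("alice"))
print(graph.neighbors("alice", label="follows"))
```

The last `add_edge` call shows the point about evolution: a new relationship type appears in the graph without altering any existing vertex or edge.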

Oracle Big Data Graph provides an industry-leading property graph capability on Apache HBase and Oracle NoSQL Database with a Groovy-based console; parallel bulk load from common graph file formats; text indexing and search; querying graphs in database and in memory; ease of development with open source Java APIs and popular scripting languages; and an in-memory, parallel, multi-user graph analytics engine with 35 standard graph analytics.

Spatial Analysis and Services – Enrich and Categorize Your Big Data with Location

With the spatial capabilities, users can take data with any location information, enrich it, and use it to harmonize their data.  For example, Oracle Big Data Spatial can look at datasets like Twitter feeds that include a zip code or street address, and add or update city, state, and country information. These results can be visualized on a map with the included HTML5-based web mapping tool.  Location can be used as a universal key across disparate data commonly found in Hadoop-based analytic solutions. 
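The enrichment step described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the product's API: the `ZIP_LOOKUP` table, the `enrich_with_location` function, and the sample records are assumptions, and a real deployment would use a proper gazetteer or geocoding service rather than a two-entry dictionary.

```python
# Hypothetical reference data: zip code -> location attributes
ZIP_LOOKUP = {
    "94065": {"city": "Redwood City", "state": "CA", "country": "US"},
    "10001": {"city": "New York", "state": "NY", "country": "US"},
}


def enrich_with_location(record, lookup=ZIP_LOOKUP):
    """Return a copy of the record with city, state and country added when the zip is known."""
    enriched = dict(record)
    location = lookup.get(record.get("zip"))
    if location is not None:
        enriched.update(location)
    return enriched


# Two unrelated, made-up data sets that only share a zip code
tweets = [
    {"user": "@sample_user", "text": "Great coffee here", "zip": "94065"},
    {"user": "@another_user", "text": "Stuck in traffic", "zip": "99999"},
]
store_sales = [{"store_id": 7, "revenue": 1250.0, "zip": "94065"}]

enriched_tweets = [enrich_with_location(t) for t in tweets]
enriched_sales = [enrich_with_location(s) for s in store_sales]

# Location now acts as a common key across the two data sets
redwood_tweets = [t for t in enriched_tweets if t.get("city") == "Redwood City"]
redwood_sales = [s for s in enriched_sales if s.get("city") == "Redwood City"]
print(len(redwood_tweets), len(redwood_sales))
```

Records with an unknown zip code are passed through unchanged, so the enrichment never loses data.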

“Big Data systems are increasingly being used to process large volumes of data from a wide variety of sources. With the introduction of Oracle Big Data Spatial and Graph, Hadoop users will be able to enrich data based on location and use this to harmonize data for further correlation, categorization and analysis. For traditional geospatial workloads, it will provide value-added spatial processing and allow us to support customers with large vector and raster data sets on Hadoop systems.” - Steve Pierce, CEO, Think Huddle

Your Spatial & Graph specialist contact in EMEA is Hans Viehmann (hans.viehmann@oracle.com).

You can attend a live web conference on Spatial & Graph on Tuesday, July 21st at 6:00 PM UK / 7:00 PM CET.

Tuesday May 19, 2015

OBI 11g Release Now Available

This new release of Oracle Business Intelligence 11g includes a number of new features and a focus on overall quality improvement. Significant new features and enhancements include:

Expanded Data Source Support, including support for Cloudera Impala and Hive2.  Users can also access information directly from their Hyperion Planning applications to build and deliver analytical content with OBI. Integration with the Oracle 12c database has been enhanced with support for compression, Exadata Hybrid Columnar Compression (EHCC), and in-memory Oracle database features.

Improved User Experience, including a new Tree Map visualization. Subject area search makes it easier to locate elements to add to an analysis. Users can now save and re-use custom column definitions.  More options are available to users when exporting content.  Customers can now take advantage of HTML5 for rendering charts.

New Capabilities for Exalytics, including support for count distinct aggregations.  Aggregations for levels with non-unique level keys, ragged and skip level hierarchies, and time levels with no chronological keys are also supported.  A new Summary Advisor command line utility is available to assist with scripting maintenance tasks.

New Capabilities for Administration, including Mobile App Designer installed by default.  A database policy store is available for security administration.

BI Publisher Integration with WebCenter Content: This integration will enable customers to deliver BI Publisher output directly to WebCenter Content.


Monday May 18, 2015

Cloudera is Hadoop Market Leader

Have you connected with Cloudera? Are you one of their partners yet? This is a win-win for us all.

 - REPLAY the Oracle & Cloudera webcast (from Friday, May 29): Rapidly Unlock the Value of Big Data for your Organisation

Oracle has a close partnership with Cloudera: we resell their Hadoop distribution on our engineered platform, the Big Data Appliance. Naturally, we therefore also target our big data software stack (e.g. Big Data Discovery, Big Data SQL, Connectors, ODI, ...) to run on Cloudera, as well as on Apache Hadoop and on some other vendors' distributions.

If you have not noticed yet, Hadoop is growing really fast, and Cloudera has the lion’s share of this market. For example, read this analyst report, which says:

“According to Cloudera’s CEO, Tom Reilly, Cloudera’s strategic decision to provide proprietary solutions in an open-source market paid off when Cloudera earned more than $100 million in revenues in 2014, which is more than double the revenues of Hortonworks and MapR.”

You may also have noticed that Intel announced a substantial equity investment in Cloudera ($740 million). Intel also works closely with Oracle on our engineered platforms, including the Big Data Appliance.

Training your people in Hadoop is critical to our combined success in this market, and Cloudera leads the market in high quality Hadoop courses. Also see the Cloudera Product Webinars, tutorials, and Training Webinars. Oracle will of course invest in training our partners on our specific product offerings for Big Data, but for a generic understanding of the base platform, I suggest you plug into Cloudera’s education programmes. They also have a certification programme so you can Become a Cloudera Certified Big Data Professional.

If you need an introduction, please contact me (Mike.Hallett@Oracle.com) or Cloudera’s EMEA Partner Director, Jonathan Cooper (jcooper@cloudera.com).

Friday Feb 20, 2015

Oracle Big Data Discovery Now Available - The Visual Face of Hadoop

There has been a lot of excitement about Oracle Big Data Discovery. It is now generally available to customers and partners, and can be sold and installed today.

This is a stunningly visual, intuitive tool that enables you to leverage the power of Hadoop and turn raw data into business insight in minutes — without learning complex products or relying only on specialists.

From a partner perspective it is perfect for proof of concept work with your clients to reveal the value they can find in their large information reservoirs combining a variety of data and text sources; potentially leading to much more and deeper analysis as part of an overall big data information architecture.


Check out the capabilities of Big Data Discovery to:

  1. Easily find relevant data by browsing a rich, interactive catalog of all data in Hadoop using familiar keyword search and guided navigation.
  2. Explore data to understand its potential by visualizing the shape and quality of unfamiliar data and combining attributes to uncover interesting relationships.
  3. Transform and enrich to make data better with intuitive, user-driven data wrangling.
  4. See rich, interactive visualizations that reveal new patterns by blending diverse data sets for deeper perspectives.


Wednesday Nov 26, 2014

Oracle Enterprise Metadata Management 12c for Big Data and Analytics

One of the best reasons for introducing Hadoop to your clients is as a lower-cost ETL and staging platform for any data warehouse: the so-called “Data Lake” concept. But as Gartner discusses in “Beware of the Data Lake Fallacy”, this still needs governance: “Without descriptive metadata and a mechanism to maintain it, the data lake risks turning into a data swamp.”

So, as a follow-on to my last blog (Partners’ Get-Started-Kit with Oracle Big Data and Analytics), let’s now explore how you can help your clients to better govern and manage a cost-effective data lake that delivers additional value to their data warehouse.

Oracle has expanded its Oracle Data Integration portfolio with the addition of Oracle Enterprise Metadata Management 12c, a comprehensive platform that helps reduce compliance risks and ensure the success of governance programs within organizations by providing much-needed business and data transparency (read the Press Release). Together with Oracle Enterprise Data Quality, Oracle Big Data SQL, and Oracle Database security you can manage and control all aspects of big data stewardship, lifecycle management, data protection, auditing, security, and compliance.

To find out more, start with these two webcasts:

Data Quality and Metadata Management for Big Data Governance webcast:
  • Understand how data governance can be applied to big data
  • Explore new Oracle Enterprise Metadata Management technology
  • Learn how data quality is integral to any data governance initiative

Big Data Integration for the Big Data Reservoir webcast:

  • Keep big data reservoirs accurate and real-time using Oracle Data Integrator and Oracle GoldenGate
  • Leverage Hadoop and Data Integration technologies across heterogeneous environments
  • Implement best practices using big data reservoirs to unlock the most value from your enterprise big data

Then try the tutorial, Tame Big Data with Oracle Data Integration.

And download a full version of the Oracle Enterprise Metadata Management 12c software and documentation.

Tuesday Jul 01, 2014

ODI12c ETL on Hadoop and Oracle Big Data Appliance

One of the best reasons to start using Hadoop is to off-load ETL processing away from a potentially higher cost “Data Warehouse staging system” and deploy it onto a platform with a better performance-to-cost ratio for this ETL load.

If you do this, you will likely still want high productivity ETL tools such as Oracle Data Integrator (ODI12c), and if you are handling large volumes of data in a limited batch window, you need fast processing and most importantly high speed loading into the Data Warehouse.

ODI12c on Hadoop gives you this when combined with the Oracle Big Data Connectors. This works especially well on our engineered systems (Big Data Appliance to Exadata), but it is also the best solution for any ETL work from Hadoop to an Oracle database, even on so-called “commodity hardware”.

Mark installed all the software elements directly, but if you need to get going quickly, you may be able to use our downloadable VM (Demonstration VM “BigDataLite 2.4.1” Available on OTN, although this is now updated to version 3.0), which works on non-BDA hardware.

Monday Feb 03, 2014

Demonstration VM “BigDataLite 2.4.1” Available on OTN

The Demonstration VM “BigDataLite 2.4.1” is now available for download from OTN.  Now, customers and partners can have easy access to many of our big data software products - all configured in an integrated VirtualBox environment.

“BigDataLite” is an Oracle VM VirtualBox that contains many key components of Oracle's big data platform, including:

  • Cloudera Distribution including Apache Hadoop
  • Oracle Database 12c Enterprise Edition
  • Oracle Advanced Analytics and "R"
  • Oracle Data Integrator 12c, Oracle Big Data Connectors
  • Oracle NoSQL Database, and more....

It has been configured to run on at least two cores and about 5 GB of memory (so your computer should have at least 8 GB of total memory). With BigDataLite, you can develop your big data applications and then deploy them to any compatible hardware including the Oracle Big Data Appliance.

To expand this demonstration platform to include OBI and Endeca, you can also download those VMs and interconnect them. For this you will likely need 16 GB or more of RAM and at least 4 cores to get them all running. This is targeted at “BIG data analytics”, so giving this integrated platform 32 GB or more of RAM and more cores will help you to show it in its best light.

Tuesday Nov 26, 2013

Small Steps to Big Data BI&EPM Partner Community Forum January 2014

Open to all OPN partners in EMEA, we are running the Business Analytics Partner Community Forum over two days in London, on 16th and 17th January 2014 - Register Now Here.

This forum entitled “Small Steps to Big Data” will focus on discussing with Partners, how best to exploit the tremendous interest in “Big Data Analytics” and to clarify under what circumstances “R” and “Hadoop” are best deployed, and how these co-exist, integrate with, and extend the capabilities of tools you are already familiar with such as Oracle BI, ODI, Endeca and the Oracle Database. We will consider guidelines to hardware deployments, but the main focus of the forum will be how the software inter-operates: with guest speakers from Cloudera, and other partners who have experience in this field.

I do not think that “Big Data = Hadoop”. At the same time, Oracle whole-heartedly embraces useful Open Source innovations such as “R” and “Hadoop”: we are, after all, a big player in Open Source with, for example, Java, NoSQL and MySQL.

You can download the agenda here.  We will seek to answer questions such as:

· How big is “Big”? At what size is Hadoop’s MPP approach beneficial?

· What about “Variety”? How do we digest “Any Data”?

· Who uses these analytics? A few “Data Scientists” or 100s of “Business Users”?

· How do you spot a “Big Data Analytics” opportunity?

· If someone is already using Hadoop, do they want to talk to Oracle?

· What is NEW, and what is Business-as-Usual?

Audience: This forum will appeal to CTOs, Solution architects and consultants in Oracle Partners familiar with Oracle’s Business Analytics solutions. We will examine the economics and business cases driving “Big Data Analytics” projects, and dive into the pros-and-cons of technology options available to your customers.

· Day 1 – Thursday 16th Jan. 2014: Starts 11:00 am – Sales and Executive briefing.

o Networking Dinner in the evening

· Day 2 – Friday 17th Jan. 2014: Ends 4:00 pm – Deeper dive technical discussions.

Register Now Here


