Friday Jun 24, 2016

Updates to Big Data Lite 4.5

There are some updates to Big Data Lite 4.5 that you will want to take advantage of:
  • Updated Big Data SQL configuration enables predicate pushdown to engage - which can have pretty dramatic performance benefits
  • Update to the Big Data Spatial framework (fixes a configuration issue)
  • New samples for the Oracle Shell for Hadoop Loaders

To get the latest updates, simply double click on the Refresh Samples icon on the desktop:
refresh-samples

As an example of the BDS performance improvement (and - this is a BDL VM - so the actual numbers aren't nec. interesting.  But, it gets the performance concepts across) - I ran a few tests with the following query:

select count(*)  from table
where custid = 1185972
and activity=6
and genreid=7;

I ran this query 3 times against different tables:

  • Avro - not partitioned - 17s
  • Avro - genre partitions - illustrates partition pruning - 1.1s
  • Parquet - not partitioned - illustrates predicate pushdown into parquet - 0.4s

The VM doesn't have Big Data SQL Cells - so you don't see smartscan and storage indexes - which would show even more performance benefits.

BDS 3.0.x is a great release.  Take it for a spin!

Thursday Mar 17, 2016

Big Data SQL 3.0 is now available!

Oracle Big Data SQL 3.0 is now available!  This is an exciting milestone for Oracle.  With support for Cloudera CDH (both on Big Data Appliance and non-Big Data Appliance), Hortonworks HDP and Oracle Database 12c (both Exadata and non-Exadata) - the benefits derived from unified queries across relational, Hadoop and NoSQL stores can now be achieved across a wide breadth of big data deployments.

Hadoop and NoSQL are rapidly becoming key components of today's data management platform, and many Oracle Database customers use Hadoop or NoSQL in their organization. Using multiple data management solutions typically lead to data silos, where different people and applications can only access a subset of the data needed. Big Data SQL offers an industry leading solution to deliver one fast, secure SQL query on all data: in Hadoop, Oracle Database, and NoSQL. Big Data SQL leverages both innovations from Oracle Exadata and specific Hadoop features to push processing down to Hadoop, resulting in minimized data movement and extreme performance for SQL on Hadoop. 

In summary, Big Data SQL 3.0: 

  • Expands support for Hadoop platforms - covering Hortonworks HDP, Cloudera CDH on commodity hardware as well as on Oracle Big Data Appliance
  • Expands support for database platforms - covering Oracle Database 12c on commodity hardware as well as on Oracle Exadata
  • Improves performance through new features like Predicate Push-Down on top of Smart Scan and Storage Indexes 

 

To learn more:

 

Tuesday Mar 15, 2016

Big Data SQL Quick Start. Parallel Query - Part3.

[Read More]

Friday Feb 19, 2016

How Can Graph Databases and Analytics Help Your Business?

Several videos describing the value of graph analyses are now available from our Oracle Big Data Spatial and Graph + Oracle Labs teams.

video

Check out this blog post for details :). 

Big Data Lite 4.4.0 is now available on OTN

big data lite


It's now available for download on OTN.  Check out this VM to help you learn about Oracle's big data platform.[Read More]

Thursday Feb 04, 2016

Using Spark(Scala) and Oracle Big Data Lite VM for Barcode & QR Detection

Big Data and Scalable Image Processing and Analytics

Guest post by Dave Bayard - Oracle's Big Data Pursuit Team 

One of the promises of Big Data is its flexibility to work with large volumes of unstructured types of data such as images and photos. In todayís world, there are many sources of images including social media photos, security cameras, satellite images, and more. There are many kinds of image processing and analytics that are possible from optical character recognition (OCR), license plate detection, bar code detection, face recognition, geological analysis and more. And there are many open source libraries such as OpenCV, Tesseract, ZXing, and others that are available to leverage.

[Read More]

Tuesday Jan 19, 2016

Big Data SQL Quick Start. Introduction - Part1.

Today I am going to explain steps that required to start working with Big Data SQL. It’s really easy!  I hope after this article you all will agree with me. First, if you want to get caught up on what Big Data SQL is, I recommend that you read these blogs: Oracle Big Data SQL: One Fast Query, Big Data SQL 2.0 - Now Available.

The above blogs cover design goals of Big Data SQL. One of the goals of Big Data SQL is transparency. You just define table that links to some directory in HDFS or some table in HCatalog and continue working with it like with general Oracle Database table.It’s also useful to read the product documentation.

Your first query with Big Data SQL

Let’s start with simplest one example and query data that is actually stored in HDFS via Oracle Database using Big Data SQL. I’m going to begin this example by checking of the data that actually lies into HDFS. To accomplish this, I run the hive console and check hive table DDL:

[Read More]

Thursday Jan 07, 2016

Data loading into HDFS - Part1

Today I’m going to start first article that will be devoted by very important topic in Hadoop world – data loading into HDFS. Before all, let me explain different approaches of loading and processing data in different IT systems.

Schema on Read vs Schema on Write

So, when we talking about data loading, usually we do this into system that could belong on one of two types.  One of this is schema on write. With this approach we have to define columns, data formats and so on. During the reading  every user will observe the same data set. As soon as we performed ETL (transform data in format that mostly convenient to some particular system), reading will be pretty fast and overall system performance will be pretty good. But you should keep in mind, that we already paid penalty for this when were loading data. Like example of schema on write system you could consider Relational data base, for example, like Oracle or MySQL.


Schema on Write

Another approach is schema on read. In this case we load data as-is without any changing and transformations.  With this approach we skip ETL (don’t transform data) step and we don’t have any headaches with data format and data structure. Just load file on file system, like coping photos from FlashCard or external storage to your laptop’s disk. How to interpret data you will decide during the data reading. Interesting stuff that the same data (same files) could be read in different manner. For instance, if you have some binary data and you have to define Serialization/Deserialization framework and using it within your select, you will have some structure data, otherwise you will get set of the bytes. Another example, even if you have simplest CSV files you could read the same column like a Numeric or like a String. It will affect on different results for sorting or comparison operations.

Schema on Read

Hadoop Distributed File System is classical example of schema on read system.More details about Schema on Read and Schema on Write approach you could findhere. Now we are going to talk about data loading data into HDFS. I hope after explanation above, you understand that data loading into Hadoop is not equal of ETL (data doesn’t transform).

[Read More]

Thursday Dec 24, 2015

Oracle Big Data Lite 4.3.0 is Now Available on OTN

Big Data Lite 4.3.0 is now available on OTN



This latest release is packed with new features - here's the inventory of what's included:

  • Oracle Enterprise Linux 6.7
  • Oracle Database 12c Release 1 Enterprise Edition (12.1.0.2) - including Oracle Big Data SQL-enabled external tables, Oracle Multitenant, Oracle Advanced Analytics, Oracle OLAP, Oracle Partitioning, Oracle Spatial and Graph, and more.
  • Cloudera Distribution including Apache Hadoop (CDH5.4.7)
  • Cloudera Manager (5.4.7)
  • Oracle Big Data Spatial and Graph 1.1
  • Oracle Big Data Discovery 1.1.1
  • Oracle Big Data Connectors 4.3
    • Oracle SQL Connector for HDFS 3.4.0
    • Oracle Loader for Hadoop 3.5.0
    • Oracle Data Integrator 12c
    • Oracle R Advanced Analytics for Hadoop 2.5.1
    • Oracle XQuery for Hadoop 4.2.1
  • Oracle NoSQL Database Enterprise Edition 12cR1 (3.4.7)
  • Oracle Table Access for Hadoop and Spark 1.0
  • Oracle JDeveloper 12c (12.1.3)
  • Oracle SQL Developer and Data Modeler 4.1.2 with Oracle REST Data Services 3.0
  • Oracle Data Integrator 12cR1 (12.2.1)
  • Oracle GoldenGate 12c
  • Oracle R Distribution 3.2.0
  • Oracle Perfect Balance 2.5.0
Also, this release is using github as the repository for all of our sample code (https://github.com/oracle/BigDataLite).  This gives us a great mechanism for updating the samples/demos between releases.  Users simply double click the "Refresh Samples" icon on the desktop to download the latest collateral.

Friday Nov 20, 2015

Review of Data Warehousing and Big Data at #oow15

DW BigData Review Of OOW15

This year OpenWorld was bigger, more exciting and packed with sessions about the very latest technology and product features. Most importantly, both data warehousing and Big Data were at the heart of this year’s conference across a number of keynotes and a huge number of general sessions. Our hands-on labs were all completely full as people got valuable hands-on time with our most important new features. The key focus areas at this year’s conference were:
  • Database 12c for Data Warehousing
  • Big Data and the Internet of Things 
  • Cloud from on-premise to on the Cloud to running hybrid Cloud systems
  • Analytics and SQL continues to evolve to enable more and more sophisticated analysis. 
All these topics appeared across the main keynote sessions including live on-stage demonstrations of how many of our news features can be used to increase the performance and analytical capability of your data warehouse and big data management system - checkout the on-demand videos for the keynotes and executive interviews....
[Read More]

Thursday Sep 03, 2015

Oracle Big Data Lite 4.2.1 - Includes Big Data Discovery

We just released Oracle Big Data Lite 4.2.1 VM.  This VM provides many of the key big data technologies that are part of Oracle's big data platform.  Along with all the great features of the previous version, Big Data Lite now adds Oracle Big Data Discovery 1.1:

The list of big data capabilities provided by the virtual machine continues to grow.  Here's a list of all the products that are pre-configured:

  • Oracle Enterprise Linux 6.6
  • Oracle Database 12c Release 1 Enterprise Edition (12.1.0.2) - including Oracle Big Data SQL-enabled external tables, Oracle Multitenant, Oracle Advanced Analytics, Oracle OLAP, Oracle Partitioning, Oracle Spatial and Graph, and more.
  • Cloudera Distribution including Apache Hadoop (CDH5.4.0)
  • Cloudera Manager (5.4.0)
  • Oracle Big Data Discovery 1.1
  • Oracle Big Data Connectors 4.2
    • Oracle SQL Connector for HDFS 3.3.0
    • Oracle Loader for Hadoop 3.4.0
    • Oracle Data Integrator 12c
    • Oracle R Advanced Analytics for Hadoop 2.5.0
    • Oracle XQuery for Hadoop 4.2.0
  • Oracle NoSQL Database Enterprise Edition 12cR1 (3.3.4)
  • Oracle Big Data Spatial and Graph 1.0
  • Oracle JDeveloper 12c (12.1.3)
  • Oracle SQL Developer and Data Modeler 4.1
  • Oracle Data Integrator 12cR1 (12.1.3.0.1)
  • Oracle GoldenGate 12c
  • Oracle R Distribution 3.1.1
  • Oracle Perfect Balance 2.4.0
  • Oracle CopyToBDA 2.0 
Take it for a spin - and check out the tutorials and demos that are available from the Big Data Lite download page.

Thursday Jul 02, 2015

Using Oracle Big Data Spatial and Graph

Wondering how to get started with graph analyses?  The latest Oracle Big Data Lite VM includes Oracle's new spatial and graph toolkit for big data.  Check out these two blog posts that describe how to find interesting relationships in data:

 Pretty cool :)

 

Saturday Jun 20, 2015

Oracle Big Data Spatial and Graph - Installing the Image Processing Framework

Oracle Big Data Lite 4.2 was just released - and one of the cool new features is Oracle Spatial and Graph.  In order to use this new feature, there is one more configuration step required.  Normally, we include everything you need in the VM - but this is a component that we couldn't distribute.

For the Big Data Spatial Image Processing Framework, you will need to install and configure Proj.4 - Cartographic Projections Library.  Simply follow these steps: 

  • Start the Big Data Lite VM and log in as user "oracle"
  • Launch firefox and download this tarball (​http://download.osgeo.org/proj/proj-4.9.1.tar.gz) to ~/Downloads
  • Run the following commands at the linux prompt:
    • cd ~/Downloads
    • tar -xvf proj-4.9.1.tar.gz
    • cd proj-4.9.1
    • ./configure
    • make
    • sudo make install

This will create the libproj.so file in directory /usr/local/lib/.  Now that the file has been created, create links to it in the appropriate directories.  At the linux prompt:

Update

For Big Data Lite 4.2 - use these steps

  • sudo ln -s /usr/local/lib/libproj.so /u02/oracle-spatial-graph/shareddir/spatial/demo/imageserver/native/libproj.so
  • sudo ln -s /usr/local/lib/libproj.so /usr/lib/hadoop/lib/native/libproj.so

For releases after Big Data Lite 4.2 - use this step:

  • sudo ln -s /usr/local/lib/libproj.so /opt/oracle/oracle-spatial-graph/spatial/raster/gdal/lib/libproj.so

That's all there is to it.  Big Data Lite is now ready for Orace Big Data Spatial and Graph!

Oracle Big Data Lite 4.2 Now Available!

Oracle Big Data Lite Virtual Machine 4.2 is now available on OTN.  For those of you that are new to the VM - it is a great way to get started with Oracle's big data platform.  It has a ton of products installed and configured - including: 

  • Oracle Enterprise Linux 6.6
  • Oracle Database 12c Release 1 Enterprise Edition (12.1.0.2) - including Oracle Big Data SQL-enabled external tables, Oracle Multitenant, Oracle Advanced Analytics, Oracle OLAP, Oracle Partitioning, Oracle Spatial and Graph, and more.
  • Cloudera Distribution including Apache Hadoop (CDH5.4.0)
  • Cloudera Manager (5.4.0)
  • Oracle Big Data Connectors 4.2
    • Oracle SQL Connector for HDFS 3.3.0
    • Oracle Loader for Hadoop 3.4.0
    • Oracle Data Integrator 12c
    • Oracle R Advanced Analytics for Hadoop 2.5.0
    • Oracle XQuery for Hadoop 4.2.0
  • Oracle NoSQL Database Enterprise Edition 12cR1 (3.3.4)
  • Oracle Big Data Spatial and Graph 1.0
  • Oracle JDeveloper 12c (12.1.3)
  • Oracle SQL Developer and Data Modeler 4.1
  • Oracle Data Integrator 12cR1 (12.1.3)
  • Oracle GoldenGate 12c
  • Oracle R Distribution 3.1.1
  • Oracle Perfect Balance 2.4.0
  • Oracle CopyToBDA 2.0

Check out our new product - Oracle Big Data Spatial and Graph (and don't forget to read the blog post on a small config update you'll need to make to use it).  It's a great way to find relationships in data and query and visualize geographic data.  Speaking of analysis... Oracle R Advanced Analytics for Hadoop now leverages Spark for many of its algorithms for (way) faster processing.

 But, that's just a couple of features... download the VM and check it out for yourself :). 

Wednesday Mar 11, 2015

Oracle Big Data Lite 4.1 VM is available on OTN

Oracle Big Data Lite 4.1 VM is now available for download on OTN.  Big Data Lite includes many of the key capabilities of Oracle's big data platform.  Each of the components have been configure to work together - and there are many hands-on labs and demonstrations to help you get started using the system.  Below is a listing of what's included:

  • Oracle Enterprise Linux 6.5
  • Oracle Database 12c Release 1 Enterprise Edition (12.1.0.2) - including Oracle Big Data SQL-enabled external tables, Oracle Multitenant, Oracle Advanced Analytics, Oracle OLAP, Oracle Partitioning, Oracle Spatial and Graph, and more.
  • Cloudera Distribution including Apache Hadoop (CDH5.3.0)
  • Cloudera Manager (5.3.0)
  • Oracle Big Data Connectors 4.1
    • Oracle SQL Connector for HDFS 3.2.0
    • Oracle Loader for Hadoop 3.3.0
    • Oracle Data Integrator 12c
    • Oracle R Advanced Analytics for Hadoop 2.4.1
    • Oracle XQuery for Hadoop 4.1.0
  • Oracle NoSQL Database Enterprise Edition 12cR1 (3.2.5)
  • Oracle JDeveloper 12c (12.1.3)
  • Oracle SQL Developer and Data Modeler 4.0.3
  • Oracle Data Integrator 12cR1 (12.1.3)
  • Oracle GoldenGate 12c
  • Oracle R Distribution 3.1.1
  • Oracle Perfect Balance 2.3.0
  • Oracle CopyToBDA 1.1 

 

 Enjoy!

About

The data warehouse insider is written by the Oracle product management team and sheds lights on all thing data warehousing and big data.

Search

Archives
« June 2016
SunMonTueWedThuFriSat
   
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
  
       
Today