Tuesday Apr 22, 2014

Announcing: Big Data Appliance 3.0 and Big Data Connectors 3.0

Today we are releasing Big Data Appliance 3.0 (which includes the just-released Oracle NoSQL Database 3.0) and Big Data Connectors 3.0. These releases deliver a large number of interesting and cool features and enhance the overall Oracle Big Data Management System, which we think is going to be the core of information management going forward.


This post highlights a few of the new enhancements across the BDA, NoSQL DB and BDC stack.

Big Data Appliance 3.0:
  • Pre-configured and pre-installed CDH 5.0 with default support for YARN and MR2
  • Upgrade from BDA 2.5 (CDH 4.6) to BDA 3.0 (CDH 5.0)
  • Full encryption (at rest and over the network) from a single vendor in an appliance
  • Kerberos and Apache Sentry pre-configured
  • Partition Pruning through Oracle SQL Connector for Hadoop
  • Apache Spark (incl. Spark Streaming) support
  • More
Oracle NoSQL Database 3.0:
  • Table data model support layered on top of distributed key-value model
  • Support for Secondary Indexing
  • Support for "Data Centers" => Metro zones for DR and secondary zones for read-only workloads
  • Authentication and network encryption
  • More

You can read about all of these features by following the links above and reading the OTN page, data sheets and other relevant information.

While BDA 3.0 immediately delivers an upgrade from BDA 2.5, Oracle will also support the current version, and we fully expect more BDA 2.x releases based on more CDH 4.x releases. As a customer you now have a choice of how to deploy BDA and which version you want to run, while knowing you can upgrade to the latest and greatest in a safe manner.

Thursday Apr 03, 2014

Updated: Price Comparison for Big Data Appliance and Hadoop

It was time to update this post a little. Big Data Appliance grew and got more features, and prices as well as insights changed across the board. So, here is an update.

The post is still aimed at providing a simple apples-to-apples comparison and a clarification of what is, and what is not included in the pricing and packaging of Oracle Big Data Appliance when compared to "I'm doing this myself - DIY style".

Oracle Big Data Appliance Details

A few of the most overlooked items in pricing out a Hadoop cluster are the cost of software, the cost of actual production-ready hardware and the required networking equipment. A Hadoop cluster needs more than just CPUs and disks... For Oracle Big Data Appliance we assume that you would want to run this system as a production system (with hot-pluggable components and redundant components in your system). We also assume you want the leading Hadoop distribution plus support for that software. You'd want to look at securing the cluster and possibly encrypting data at rest and over the network. Speaking of network, InfiniBand will eliminate network saturation issues - which is important for your Hadoop cluster.

With that in mind, Oracle Big Data Appliance is an engineered system built for production clusters. It is pre-installed and pre-configured with Cloudera CDH with all (I emphasize all!) options included, and we (with the help of Cloudera of course) have done the tuning of the system for you. On top of that, the price of the hardware (US$ 525,000 for a full rack system; more configurations and smaller sizes are available) includes the cost of Cloudera CDH, its options and Cloudera Manager (for the life of the machine - so not a subscription).

So, for US$ 525,000 you get the following:

  • Big Data Appliance Hardware (comes with Automatic Service Request upon component failures)
  • Cloudera CDH and Cloudera Manager
  • All Cloudera options as well as Accumulo and Spark (CDH 5.0)
  • Oracle Linux and the Oracle JDK
  • Oracle Distribution of R
  • Oracle NoSQL Database Community Edition
  • Oracle Big Data Appliance Enterprise Manager Plug-In

The support cost for the above is a single line item. The list price for Premier Support for Systems per the Oracle Price list (see source below) is US$ 63,000 per year.

To do a simple 3 year comparison with other systems, the following table shows the details and the totals for Oracle Big Data Appliance. Note that the only additional item is the install and configuration cost, which covers work done on-site by Oracle personnel or partners:


| | Year 1 | Year 2 | Year 3 | 3 Year Total |
| --- | --- | --- | --- | --- |
| BDA Cost | $525,000 | | | |
| Annual Support Cost | $63,000 | $63,000 | $63,000 | |
| On-site Install (approximately) | $14,000 | | | |
| Total | $602,000 | $63,000 | $63,000 | $728,000 |

For this you will get a full rack BDA (18 Sun X4-2L servers, 288 cores (two Intel Xeon E5-2650V2 CPUs per node), 864TB disk (twelve 4TB disks per node)), plus software, plus support, plus on-site setup and configuration. Or in terms of cost per raw TB at purchase and at list pricing: $697.
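For readers who want to check the arithmetic, here is a small sketch that recomputes the BDA numbers from the figures quoted above. The variable names are mine, and the 8 cores per CPU is what 288 cores across 18 dual-socket nodes implies. The $697 per raw TB works out when the year 1 cost (hardware, first-year support and install) is divided by the raw capacity, and the yearly totals add up to $728,000 over 3 years.

```python
# List-price figures (US$) quoted in the BDA table and paragraph above.
BDA_HARDWARE = 525_000
ANNUAL_SUPPORT = 63_000
ONSITE_INSTALL = 14_000  # approximate, per the table
YEARS = 3

# Full rack configuration.
NODES = 18
CPUS_PER_NODE = 2
CORES_PER_CPU = 8        # implied by 288 cores / 18 nodes / 2 CPUs
DISKS_PER_NODE = 12
TB_PER_DISK = 4

# Year 1 carries the purchase, the install and the first year of support.
year1_cost = BDA_HARDWARE + ANNUAL_SUPPORT + ONSITE_INSTALL
three_year_cost = year1_cost + ANNUAL_SUPPORT * (YEARS - 1)

total_cores = NODES * CPUS_PER_NODE * CORES_PER_CPU
raw_tb = NODES * DISKS_PER_NODE * TB_PER_DISK
cost_per_raw_tb = year1_cost / raw_tb

print(f"Year 1 cost:     ${year1_cost:,}")
print(f"3 year cost:     ${three_year_cost:,}")
print(f"Cores / raw TB:  {total_cores} / {raw_tb}")
print(f"Cost per raw TB: ${cost_per_raw_tb:,.0f}")
```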

HP DL-380 Comparative System (this is changed from the original post to the more common DL-380s)

To build a comparative hardware solution to the Big Data Appliance we picked an HP DL380p configuration and built up the servers using the HP.com website for pricing. The following is the price for a single server.

| Model Number | Description | Quantity | Total Price |
| --- | --- | --- | --- |
| 653200-B21 | ProLiant DL380p Gen8 Rackmount Factory Integrated 8 SFF CTO Model (2U) with no processor, 24 DIMM with no memory, open bay (diskless) with 8 SFF drive cage, Smart Array P420i controller with Zero Memory, 3 x PCIe 3.0 slots, 1 FlexibleLOM connector, no power supply, 4 x redundant fans, Integrated HP iLO Management Engine | 1 | $2,051 |
| 715218-L21 | 2.6GHz Xeon E5-2650 v2 processor (1 chip, 8 cores) with 20MB L3 cache - Factory Integrated Only | 2 | $3,118 |
| 684208-B21 | HP 1GbE 4-port 331FLR Adapter - Factory Integrated Only | 1 | $25 |
| 503296-B21 | 460W Common Slot Gold Hot Plug Power Supply | 1 | $229 |
| AF041A | HP Rack 10000 G2 Series - 10842 (42U) 800mm Wide Cabinet - Pallet Universal Rack | 0 | $0 |
| 731765-B21 | 8GB (1 x 8GB) Single Rank x8 PC3L-12800R (DDR3-1600) Registered CAS-11 Low Voltage Memory Kit | 8 | $1,600 |
| 631667-B21 | HP Smart Array P222/512MB FBWC 6Gb 1-port Int/1-port Ext SAS controller | 1 | $599 |
| 695510-B21 | 4TB 6Gb SAS 7.2K LFF hot-plug SmartDrive SC Midline disk drive (3.5") with 1-year warranty | 12 | $12,588 |
| | Grand Total for a single server (list prices) | | $20,210 |
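As a sanity check on the single-server price, the sketch below adds up the line items from the table. It reads the "Total Price" column as a line total (quantity already multiplied in), which is the reading under which the rows add up to the stated grand total. The 18 servers per rack is the full rack BDA node count used throughout this post.

```python
# (model number, quantity, line total in US$) copied from the table above.
# The price is treated as the total for that line, not a per-unit price.
LINE_ITEMS = [
    ("653200-B21", 1, 2_051),    # DL380p Gen8 chassis
    ("715218-L21", 2, 3_118),    # Xeon E5-2650 v2 CPUs
    ("684208-B21", 1, 25),       # 1GbE 4-port adapter
    ("503296-B21", 1, 229),      # 460W power supply
    ("AF041A", 0, 0),            # rack, not priced here
    ("731765-B21", 8, 1_600),    # 8GB memory kits
    ("631667-B21", 1, 599),      # Smart Array P222 controller
    ("695510-B21", 12, 12_588),  # 4TB SAS disks
]

SERVERS_PER_RACK = 18  # matches a full rack BDA

server_total = sum(line_total for _, _, line_total in LINE_ITEMS)
rack_server_total = server_total * SERVERS_PER_RACK

print(f"Single server: ${server_total:,}")
print(f"18 servers:    ${rack_server_total:,}")
```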

On top of this we need InfiniBand switches. Oracle Big Data Appliance comes with 3 IB switches, allowing us to expand the cluster without suddenly requiring extra switches. And we do expect these machines to be part of much larger clusters. The IB switches are somewhere in the neighborhood of US$ 6,000 per switch, so add $18,000 per rack and add a management switch (BDA uses a Cisco switch), which seems to be around $15,000 list. The total switching comes to roughly $33,000.

We will also need a Cloudera Enterprise subscription - and to compare apples to apples, we will do it for all software. Some sources (see this document) peg CDH Core at $3,382 list per node and per year (24*7 support). Since BDA has more software (all options) and that pricing is not public, I am going to make an educated estimate: double the price and round to the nearest nice round number. That gets me to $7,000 per node, per year for 24*7 support.

BDA also comes with on-disk encryption, which is even harder to price out. My somewhat educated guess is around $1,500 list or so per node and per year. Oh, and let's not forget the Linux subscription, which lists at $1,299 per node per year. We also run a MySQL database (enterprise edition with replication), which has a list subscription price of $5,000. We run it replicated over 2 nodes.

This all gets us to roughly $10,000 list price per node per year for all applicable software subscriptions and support and an additional $10,000 for the two MySQL nodes.
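The estimates above can be written down as a short sketch. The `round_to_nearest` helper and the choice of rounding to the nearest $1,000 are my own assumptions about what "nice and round" means here; the input prices are the ones quoted in the text.

```python
def round_to_nearest(value, step):
    """Round value to the nearest multiple of step."""
    return int(round(value / step)) * step

# List prices per node per year (US$), as quoted above.
CDH_CORE = 3_382
ENCRYPTION = 1_500         # educated guess from the text
LINUX = 1_299
MYSQL_PER_NODE = 5_000

NODES = 18
MYSQL_NODES = 2

# "Double the price and round" for CDH with all options.
cdh_all_options = round_to_nearest(2 * CDH_CORE, 1_000)

# Sum of the per-node subscriptions, then rounded for the comparison.
per_node_exact = cdh_all_options + ENCRYPTION + LINUX
per_node_rounded = round_to_nearest(per_node_exact, 1_000)

annual_software = NODES * per_node_rounded + MYSQL_NODES * MYSQL_PER_NODE

print(f"CDH, all options (estimate): ${cdh_all_options:,}")
print(f"Per node, exact / rounded:   ${per_node_exact:,} / ${per_node_rounded:,}")
print(f"Annual software, 18 nodes:   ${annual_software:,}")
```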

HP + Cloudera Do-it-Yourself System

Let's go build our own system. The specs are like a BDA, so we will have 18 servers and all other components included. 


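| | Year 1 | Year 2 | Year 3 | Total |
| --- | --- | --- | --- | --- |
| Servers | $363,780 | | | |
| Networking | $33,000 | | | |
| SW Subscriptions and Support | $190,000 | $190,000 | $190,000 | |
| Installation and Configuration | $15,000 | | | |
| Total | $601,780 | $190,000 | $190,000 | $981,780 |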

Some will argue that the installation and configuration is free (you already pay your data center team), but I would argue that work Oracle completes in a short amount of time costs you at least as much if it takes your own team a lot longer to get all this installed, optimized, and running. Nevertheless, here is some math on how to get to that cost anyway: approximately 150 hours of labor per rack for the pure install work. That adds up to US$ 15,000 if we assume a cost per hour of $100.
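Putting the DIY pieces together, a sketch like the following reproduces the totals in the table. All inputs are the list-price figures used in this post; the variable names are just for illustration.

```python
# DIY inputs (US$, list prices) from the sections above.
SERVER_PRICE = 20_210
NODES = 18

IB_SWITCHES = 3
IB_SWITCH_PRICE = 6_000       # "somewhere in the neighborhood of"
MGMT_SWITCH_PRICE = 15_000    # approximate list price

ANNUAL_SOFTWARE = 190_000     # subscriptions and support, per year

INSTALL_HOURS = 150           # pure install work per rack
HOURLY_RATE = 100
YEARS = 3

servers = SERVER_PRICE * NODES
networking = IB_SWITCHES * IB_SWITCH_PRICE + MGMT_SWITCH_PRICE
install = INSTALL_HOURS * HOURLY_RATE

# Year 1 carries all one-time costs plus the first year of software.
year1_cost = servers + networking + ANNUAL_SOFTWARE + install
three_year_cost = year1_cost + ANNUAL_SOFTWARE * (YEARS - 1)

print(f"Servers:     ${servers:,}")
print(f"Networking:  ${networking:,}")
print(f"Install:     ${install:,}")
print(f"Year 1 cost: ${year1_cost:,}")
print(f"3 year cost: ${three_year_cost:,}")
```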

Note: that $15,000 does NOT include optimizations and tuning to Hadoop, to the OS, to Java and other interesting things like networking settings across all these areas. You will now need to spend time to figure out the number of slots you allocate per node, the file system block size (do you use Apache defaults, or Cloudera's or something else) and many more things at system level. On top of that, we pre-configure for example Kerberos and Apache Sentry, giving you secure authorization and authentication, as well as a one-click setting for on-disk and network encryption. Of course you can contact various other companies to do this for you.

You can also argue that "you want the cheapest hardware possible", because Hadoop is built to deal with failures, so it is OK for things to regularly fail. Yes, Hadoop does deal well with hardware failures, but your data center is probably much less keen on this idea, because someone is going to replace the disks (all the time). So make sure the disks are hot-swappable. And oh, that someone swapping the disks does cost money... The other consideration is failures in important components like power... redundant power in a rack is a good thing to have. All of this is included (and thought about) in Oracle Big Data Appliance.

In other words, do you really want to spend weeks installing, configuring and learning, or would you rather start building applications on top of the Hadoop cluster and thus provide value to your organization?

The Differences

The main differences between Oracle Big Data Appliance and a DIY approach are:

  1. A DIY system - at list price with basic installation but no optimization - is a staggering $220 cheaper as an initial purchase
  2. A DIY system - at list price with basic installation but no optimization - is roughly $250,000 more expensive over 3 years.
    Note to purchasing: you can spend this on building or buying applications on your cluster (or buy some really intriguing Oracle software)
  3. The support for the DIY system includes five (5) vendors. Your hardware support vendor, the OS vendor, your Hadoop vendor, your encryption vendor as well as your database vendor. Oracle Big Data Appliance is supported end-to-end by a single vendor: Oracle
  4. Time to value. While we trust that your IT staff will get the DIY system up and running, the Oracle system allows for a much faster "loading dock to loading data" time. Typically a few days instead of a few weeks (or even months)
  5. Oracle Big Data Appliance is tuned and configured to take advantage of the software stack, the CPUs and InfiniBand network it runs on
  6. Any issue we, you or any other BDA customer finds in the system is fixed for all customers. You do not have a unique configuration, with unique issues on top of the generic issues.
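To see where the first two differences come from, here is a sketch that tracks the cumulative cost of both options year by year, using the per-year totals from the two tables. A negative gap means the DIY system is cheaper at that point, a positive gap means the BDA is cheaper.

```python
from itertools import accumulate

# Per-year totals (US$, list prices) from the BDA and DIY tables.
BDA_YEARLY = [602_000, 63_000, 63_000]
DIY_YEARLY = [601_780, 190_000, 190_000]

# Running totals after year 1, year 2 and year 3.
bda_cumulative = list(accumulate(BDA_YEARLY))
diy_cumulative = list(accumulate(DIY_YEARLY))

# DIY minus BDA: negative while DIY is cheaper, positive once BDA is.
gap = [diy - bda for bda, diy in zip(bda_cumulative, diy_cumulative)]

for year, (bda, diy, diff) in enumerate(
        zip(bda_cumulative, diy_cumulative, gap), start=1):
    print(f"After year {year}: BDA ${bda:,}  DIY ${diy:,}  gap ${diff:,}")
```

With these inputs the DIY system is $220 cheaper after year 1, and the picture flips from year 2 onwards because of the recurring software subscriptions.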

Conclusion

In an apples-to-apples comparison of a production Hadoop cluster, Oracle Big Data Appliance starts off at the same acquisition price and comes out ahead in terms of TCO over 3 years. It allows an organization to enter the Hadoop world with a production-grade system in a very short time, reducing both risk and time to market.

As always, when in doubt, simply contact your friendly Oracle representative for questions, support and detailed quotes.

Sources:

HP and related pricing: http://www.hp.com or http://www.ideasinternational.com/ (the latter is a paid service - sorry!)
Oracle Pricing: http://www.oracle.com/us/corporate/pricing/exadata-pricelist-070598.pdf
MySQL Pricing: http://www.oracle.com/us/corporate/pricing/price-lists/mysql-pricelist-183985.pdf

About

The data warehouse insider is written by the Oracle product management team and sheds light on all things data warehousing and big data.
