By Jean-Pierre Dijcks-Oracle on Jul 07, 2015
As more people pick up big data technologies, we see the workloads run on these big data system evolve and diversify. The initial workloads were all Map Reduce and fit a specific, (micro) batch workload pattern. Over the past couple of years that has changed and that change is reflected in the Hadoop tools - specifically with YARN. While there is still quite a bit of batch work being done, typically using Map Reduce (think Hive, Pig etc) we are seeing our customers move to more mixed workloads where the batch work is augmented with both more online SQL as well as more streaming workloads.
More Horsepower - More Capacity
The change towards a more mixed workload leads us to change the shape of the underlying hardware. Systems now shift away from the once sacred "1-core to 1-disk ratio" and also from the small memory footprints for the worker nodes.
With the BDA X5-2 update in December 2014, BDA doubled the base memory configuration and added 2.25x more CPU resources in every node as well as upgrading to Intel's fastest Xeon E5 CPU. BDA X5-2 now has 2 * 18 Xeon cores to enable CPU intense workloads like analytics SQL queries using Oracle Big Data SQL, machine learning, graph applications etc.
With the processing covered for these more mixed workloads, we looked at other emerging trends or workloads and their impact on the BDA X5-2 hardware. The most prominent trend we see in big data are the large data volumes we expect to see from the Internet of Things (IoT) explosion and the potential cost associated with storing that data.
To address this issue (and storage cost in general) we are now adding 2x more disk space onto each and every BDA disk doubling the total available space on the system while keeping the list price constant. That is correct, 2x capacity but no price change!
And if that isn't enough, we are also changing the way our customers can grow their systems by introducing BDA Elastic Configurations.
As we see customers build out production in large increments, we also see a need to be more flexible in expanding the non-production environments (test, qa and performance environments). BDA X5-2 Elastic Configurations enables expansion of a system in 1-node increments by adding a BDA X5-2 High Capacity (HC) plus InfiniBand Infrastructure into a 6-node Starter Rack.
The increased flexibility enables our customers to start with a production scale cluster of 6 nodes (X5-2 or older) and then increment within the base rack up to 18 nodes, then expand across racks without any additional switching (no top of rack required, all on the same InfiniBand network) to build large(r) clusters. The expansion is of course all supported from the Oracle Mammoth configuration utility and its CLI, greatly simplifying expansion of clusters.
Major Improvement, No Additional Cost
Over the past generations BDA has been quickly adopted to changing usage and workload patterns enabling the adoption of Hadoop into the data ecosystem with minimal infrastructure disruption but with maximum business benefits. The latest update to BDA X5-2 enables flexibility, delivers more storage capacity and runs more workloads then ever before.