Learn about data lakes, machine learning & more innovations

Big Data In The Cloud: Why And How?

Prashant Jha
Director of Product Management

Lowered Total Cost of Ownership, Total Flexibility, Hyper-Scale, and the Oracle Advantage

Algorithms have a huge influence on our daily lives and our future. Everything from our social interactions, news, entertainment, finance, and health to other aspects of our lives is impacted by mathematical computations and nifty algorithms, and big data is a significant part of what makes this possible.

We’re now in the era of machine learning and artificial intelligence, one more time. But unlike our previous attempts in the 1960s and 1980s, things are different this time. Of course they are. Thanks to Moore's Law, transistor density continues to increase while storage costs continue to drop.

But that, by itself, is not enough to ensure success.

This time around, we have distributed computing, which has also come a long way, with new computing paradigms and frameworks such as Hadoop, Spark, and TensorFlow being developed all the time, both within academia and at large corporations.

These advancements have enabled us to build systems that are far more powerful than anything that can be achieved with a single machine, however powerful its processor, which is what was attempted in the 1960s.

They enable us to do more with our big data than we’ve ever been able to before.

But there are two other advances that are playing a huge role in this revolution:

Data is Fueling the Future

Explosive digitization of the physical world is creating an unprecedented amount of data. There’s a deluge of data that needs to get stored and processed. This data comes from:

  • Online social networks
  • User generated content
  • Mobile computing
  • Embedded sensors in everyday objects
  • Automation of routine tasks
  • And much more

Analyzing and processing this data is what makes the algorithms smart and efficient, and those algorithms then get applied to other areas of our lives. The result is a virtuous, self-reinforcing cycle.

According to some estimates, two-thirds of all digital content is user generated and has been created in the last few years. According to Intel, autonomous cars will generate 4,000 GB of data per car per day. Soon, more data will be produced and consumed than humans alone can generate and consume.
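
To put the Intel figure in perspective, here is a back-of-the-envelope calculation in Python. The 4,000 GB per car per day comes from the estimate quoted above; the fleet size is a hypothetical number chosen only for illustration.

```python
# Back-of-the-envelope data volume for a fleet of autonomous cars.
GB_PER_CAR_PER_DAY = 4_000   # Intel estimate quoted in the article
FLEET_SIZE = 1_000           # hypothetical fleet size, for illustration only


def fleet_volume_pb(cars, days, gb_per_car_per_day=GB_PER_CAR_PER_DAY):
    """Return total data generated, in petabytes (decimal: 1 PB = 1,000,000 GB)."""
    return cars * days * gb_per_car_per_day / 1_000_000


daily_pb = fleet_volume_pb(FLEET_SIZE, days=1)
yearly_pb = fleet_volume_pb(FLEET_SIZE, days=365)
print(f"{FLEET_SIZE} cars: {daily_pb:.0f} PB per day, {yearly_pb:.0f} PB per year")
```

Even a modest fleet under these assumptions lands in petabyte territory within a single day, which is the scale the rest of this article is concerned with.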

This is the foundation of a smarter, better, and more efficient planet. If algorithms are the engine that is going to drive us to a better future, then data is the gas in the tank. But how are companies using their gas?

Challenges to Utilizing Big Data

Traditionally, companies have preferred to build out their own server farms and to deploy, run, and manage systems themselves. But as data volumes grow, and as extracting value from these massive data sets comes to involve complex and sophisticated machine learning and AI algorithms, maintaining such a deployment is becoming more challenging in terms of both operations and total cost of ownership (TCO).

In addition to the TCO, there are challenges with agility and flexibility. From a hardware perspective, there is the sunk cost of buying machines and provisioning for peak load, which lowers utilization. Longer procurement cycles mean that growth predictions have to be accurate, with no room for error. This limits the elasticity of the infrastructure and thus curbs experimentation and ad hoc applications.
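
The utilization cost of provisioning for peak load can be made concrete with a small Python sketch. The hourly demand figures are invented for illustration; the point is only that a fixed cluster sized for the peak sits mostly idle when load is spiky.

```python
# Average utilization of a fixed cluster that is sized for peak demand.
# The demand numbers are hypothetical, chosen only to illustrate the effect.
hourly_demand_nodes = [10, 10, 12, 15, 20, 80, 100, 60, 20, 15, 12, 10]


def utilization_when_sized_for_peak(demand):
    """Fraction of provisioned node-hours that are actually used."""
    provisioned = max(demand) * len(demand)  # fixed capacity held for every hour
    used = sum(demand)                       # node-hours actually needed
    return used / provisioned


utilization = utilization_when_sized_for_peak(hourly_demand_nodes)
print(f"Utilization: {utilization:.0%}")
```

In this made-up workload, an elastic cluster that tracked demand would need 364 node-hours, against the 1,200 node-hours a peak-sized cluster holds in reserve.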

In short, here are some major challenges with the traditional model:

  • Low hardware utilization
  • Lack of multi-tenancy support
  • No self-serve model
  • Slow onboarding of new applications and users
  • Low-bandwidth network
  • High OPEX
  • Lack of big data skills and expertise

The Answer: Oracle Big Data Cloud

For organizations managing this growing volume of data and trying to gain insights and value from it, the right answer is turning to public cloud computing using open source software (OSS).

However, in certain cases, due to reasons such as organizational concerns, security issues, regulations in industries, or sovereignty rules such as the EU’s GDPR, not all big data deployments can move to the public cloud as is.

This is where the power, value, and flexibility of Oracle’s Big Data Cloud come in. It is a modern platform for big data management, with support for modern as well as traditional frameworks for:

  • Analytics
  • Business intelligence
  • Machine learning
  • Internet of Things (IoT)
  • Artificial intelligence (AI)

This is the only PaaS service of its kind that addresses the scenarios mentioned previously through two very special offerings:

  • Big Data Cloud: The most comprehensive, secure, performant, scalable, and feature-rich public cloud service for big data in the market today. We have only just started building out this platform, so expect more down the road.
  • Cloud At Customer: For customers who cannot move to the public cloud, Oracle Cloud Machine can bring the public cloud to their own data center and provide the same benefits including having Oracle manage the cloud machine. This is a unique service that no other cloud provider offers.

Oracle Big Data Cloud brings the best of open source software to an easy-to-use and secure environment that is seamlessly integrated with other Oracle PaaS and IaaS services. Customers can get started in no time and do not require in-house expertise to put together a solution stack by assembling open source components and other third-party solutions.

7 Key Features of Oracle Big Data Cloud:

1. Advanced Storage: Build a data lake with all your data in one centralized store with advanced storage options. Smart caching allows for extreme performance. Provide your entire organization with access to all of the data sets in a secure and centralized environment. There is built-in data lineage and governance support. It is the easiest way to scale out storage independent of compute clusters.

2. Advanced Compute: Spin up or down compute clusters (Apache Hadoop, Apache Spark, or any analytic stack) within minutes. Auto-scale your clusters based on triggers or metrics. Use GPUs for deep learning.

3. Built-in ML and AI Tools: Data science tools such as Zeppelin come with the service to enable scientists to experiment and explore data sets. As mentioned earlier, there are compute shapes available with full GPU support for advanced algorithms and training in deep learning. A diverse catalog of machine learning and data libraries such as OpenCV, scikit-learn, and pandas makes it easy to build your next intelligent product.

4. Strong Security: Oracle Identity Cloud Service provides a way to allow granular access on a per-user basis and there are audit facilities built in as well. There is full encryption support for data-in-motion and data-at-rest. Sophisticated SDN allows customers to define their own network segments with advanced capability such as custom VPN, whitelisted IP, etc.

5. Integrated IaaS and PaaS Experience: Easy access to other Oracle Cloud Application Development services such as Oracle Event Hub Cloud Service, Oracle Analytics Cloud, Oracle MySQL Cloud, etc. Customers also have the option of using Oracle Cloud Infrastructure to back up Oracle Storage Cloud or create private VPNs to connect on-premises applications with services running in Oracle Public Cloud.

6. Fully Automated: The entire lifecycle of your infrastructure is automated. Our goal is to help you focus on the real differentiator, your data and your application. The platform will take care of all the undifferentiated work of provisioning, managing, patching, etc., so you can focus on your business.

7. World-Class Support: With an integrated approach, Oracle provides a one-stop shop for all things big data including support for Hadoop. Customers will not have to deal with multiple vendors to manage their stack.
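
The metric-triggered auto-scaling mentioned under Advanced Compute can be pictured with a simple threshold rule. This is a generic Python sketch, not Oracle Big Data Cloud's actual policy or API; the function name, the thresholds, and the node limits are all assumptions made for illustration.

```python
# Generic threshold-based scaling rule. Not Oracle's API: the name
# desired_nodes and every threshold below are hypothetical.
def desired_nodes(current, cpu_utilization,
                  scale_out_above=0.80, scale_in_below=0.30,
                  min_nodes=2, max_nodes=20):
    """Return the node count the cluster should move to."""
    if cpu_utilization > scale_out_above:
        target = current * 2       # busy: double the cluster
    elif cpu_utilization < scale_in_below:
        target = current // 2      # idle: halve the cluster
    else:
        target = current           # within the band: leave it alone
    # Clamp the result to the allowed cluster size.
    return max(min_nodes, min(max_nodes, target))


print(desired_nodes(4, 0.92))    # 8
print(desired_nodes(4, 0.10))    # 2
print(desired_nodes(16, 0.95))   # 20 (capped at max_nodes)
```

A managed service would evaluate a rule like this on a schedule or on a metric trigger, so that compute follows demand instead of being sized once for the peak.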

For more information on Oracle’s Cloud Platform – Big Data offerings, please visit the Oracle Big Data Cloud webpages. And for the most advanced public cloud service, you can visit the Oracle Cloud Platform pages.

Or, sign up for an Oracle free trial to build and populate your own data lake. We have tutorials and guides to help you along. 

Please leave a comment to let us know how we are doing.

