
Disruptive Effects of Cloud Native Machine Learning Systems and Tools

Guest Author

In 1970, E. F. Codd proposed the relational database. Fast forward to today, and completely different cloud native architectures for application development have emerged that take advantage of native cloud properties. One cloud native paradigm is serverless. With a serverless architecture, the entire stack runs in a manner that is inherently distributed and event-driven. This architecture can also be referred to as Functions as a Service (FaaS). Functions written in Python, Go, Java, or another language run in response to events, and resources are elastically provisioned. Additionally, whereas relational databases expect strong consistency, serverless architectures are typically designed around “eventual consistency.”
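
To make the event-driven idea concrete, here is a minimal sketch of what such a function might look like in Python. The `handler(event, context)` signature mirrors a convention common on FaaS platforms, but the event shape (a dict with a `records` list of `amount` values) is purely an assumption for illustration and not any provider's actual payload format.

```python
import json


def handler(event, context=None):
    """Hypothetical FaaS entry point: the platform calls it once per event."""
    # Assumed event shape: {"records": [{"amount": <number>}, ...]}
    records = event.get("records", [])
    total = sum(record.get("amount", 0) for record in records)
    # Return a small HTTP-style response; the function keeps no state
    # between invocations, so the platform can run many copies in parallel.
    return {
        "statusCode": 200,
        "body": json.dumps({"count": len(records), "total": total}),
    }


if __name__ == "__main__":
    # Local simulation of the platform delivering one event.
    sample_event = {"records": [{"amount": 5}, {"amount": 7}]}
    print(handler(sample_event))
```

Because the function is stateless, the provider is free to start as many copies as incoming events require and to stop them when traffic drops, which is where the elastic provisioning comes from.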

With cloud-native machine learning, the cloud foundation allows an enterprise to leverage the operational expertise of the cloud provider and the automation built into the machine learning system. Tasks that add little or no value to an organization’s ML and AI strategy, or that machines can do better, such as feature engineering, ETL (extract, transform, load), and model selection, are automated away. Some examples of this are:

  • Model selection (e.g., random forest vs. decision tree vs. logistic regression)
  • Hyperparameter tuning
  • Splitting test, train, and validation data
  • Cross-validation folding
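
The tasks in this list can be sketched in a few dozen lines of plain Python. The example below is only a toy under stated assumptions: the data is synthetic, the two candidate models (a majority-class baseline and a one-feature k-nearest-neighbours classifier) stand in for the random forests and logistic regressions a real system would try, and every function name is hypothetical. Validation data here comes from the cross-validation folds rather than from a separate validation set.

```python
import random
from collections import Counter


def make_data(n=120, seed=0):
    """Synthetic one-feature data: class 1 is centred at 3.0, class 0 at 0.0."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)
        rows.append((rng.gauss(3.0 * label, 1.0), label))
    return rows


def split(rows, train_frac=0.8, seed=0):
    """Shuffle a copy of the rows, then hold out the tail as a test set."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]


def k_folds(rows, k):
    """Yield (train, validation) pairs for k-fold cross-validation."""
    for i in range(k):
        validation = rows[i::k]
        train = [row for j, row in enumerate(rows) if j % k != i]
        yield train, validation


def majority_model(train, _param):
    """Baseline: always predict the most common training label."""
    label = Counter(y for _, y in train).most_common(1)[0][0]
    return lambda x: label


def knn_model(train, k):
    """k-nearest-neighbours classifier on a single feature."""
    def predict(x):
        nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
        return Counter(y for _, y in nearest).most_common(1)[0][0]
    return predict


def accuracy(model, rows):
    return sum(model(x) == y for x, y in rows) / len(rows)


def cv_score(builder, param, rows, k=5):
    """Mean validation accuracy across the k folds."""
    scores = [accuracy(builder(tr, param), va) for tr, va in k_folds(rows, k)]
    return sum(scores) / len(scores)


def auto_select(train_rows, candidates):
    """Try every (model, hyperparameter) pair; keep the best CV score."""
    best = None
    for name, builder, grid in candidates:
        for param in grid:
            score = cv_score(builder, param, train_rows)
            if best is None or score > best[0]:
                best = (score, name, param, builder)
    return best


CANDIDATES = [
    ("majority", majority_model, [None]),   # no hyperparameter to tune
    ("knn", knn_model, [1, 3, 5, 9]),       # grid over the neighbour count
]

train_rows, test_rows = split(make_data())
cv, name, param, builder = auto_select(train_rows, CANDIDATES)
final_model = builder(train_rows, param)          # refit on all training rows
test_accuracy = accuracy(final_model, test_rows)  # the test set is used once
print(name, param, round(cv, 3), round(test_accuracy, 3))
```

None of these steps requires human judgment once the candidates and the metric are fixed, which is why they are natural targets for automation.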

In a nutshell, there is strong demand for user-friendly machine learning systems that strip away the often unnecessary complexity of training machine learning models.


Cloud Native Machine Learning Tools
Cloud native machine learning tools are tools that have their origins in the cloud. They take advantage of inherent cloud features such as elasticity, scalability, and economies of scale. There are a few emerging descriptions of these tools; two of the most popular are “managed machine learning” and “automated machine learning.”

In managed machine learning systems, scaling model training and managing the entire system are automated. Contrast this with a self-managed production Hadoop or Spark system running on a cluster of thousands of machines. There, errors in machine learning training and inference need to be debugged by Hadoop experts, and the cluster may need to be tuned by hand so that resources scale up and down correctly for each training job and costs stay under control. In a sophisticated cloud-native machine learning system, both training and serving of the inference model scale automatically. The system can also automatically A/B test different model versions in production and then switch to the version that performs best according to customer needs. Data scientists in the organization can then focus on solving enterprise needs instead of implementing and maintaining machine learning infrastructure.
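
The A/B testing and automatic switch-over described above can be illustrated with a small sketch. Everything here is hypothetical: the `ABRouter` class, the version names `v1` and `v2`, the round-robin traffic split, and the use of prediction correctness as the success signal are all assumptions chosen to keep the example deterministic. A real managed service would typically randomize or weight traffic and use a business metric.

```python
import itertools


class ABRouter:
    """Spread traffic across model versions, then promote the best one."""

    def __init__(self, models):
        self.models = dict(models)  # version name -> predict function
        self.stats = {name: [0, 0] for name in self.models}  # [successes, trials]
        # Round-robin keeps the sketch deterministic (an assumption, see above).
        self._rotation = itertools.cycle(sorted(self.models))
        self.champion = None  # set once a version has been promoted

    def predict(self, x):
        """Serve with the champion if one exists, else the next version in turn."""
        name = self.champion or next(self._rotation)
        return name, self.models[name](x)

    def record(self, name, success):
        """Log whether the prediction served by `name` turned out well."""
        self.stats[name][0] += int(success)
        self.stats[name][1] += 1

    def promote(self, min_trials=100):
        """Promote the highest success rate once every version has enough traffic."""
        if any(trials < min_trials for _, trials in self.stats.values()):
            return None
        self.champion = max(
            self.stats, key=lambda n: self.stats[n][0] / self.stats[n][1]
        )
        return self.champion


def truth(x):
    """Ground truth for the simulation: the positive class starts at 0.5."""
    return x >= 0.5


router = ABRouter({
    "v1": lambda x: x >= 0.7,  # older version with a miscalibrated threshold
    "v2": lambda x: x >= 0.5,  # newer version that matches the ground truth
})

for i in range(1000):
    x = i / 1000
    name, prediction = router.predict(x)
    router.record(name, prediction == truth(x))

champion = router.promote()
print(champion, router.stats)
```

After promotion, every call to `router.predict` is served by the winning version, so the switch happens without anyone redeploying a model by hand.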

Automated machine learning (AutoML) goes one step further. It can completely automate training a machine learning model and serving it in production. It accomplishes this by training candidate models on labeled data (say, images) and automatically evaluating them to find the best one. Next, the AutoML system registers an API that allows predictions against that trained model. Finally, the model comes with diagnostic reports that let a user debug it, all without writing a single line of code.

Tools like this drive AI adoption in the enterprise by putting AI within reach of all employees. Often, important business decisions are siloed in the hands of the small group of people who have the technical skills to generate models. AutoML systems put that same ability directly into the hands of decision makers, who can create AI solutions with the same ease with which they use a spreadsheet.

High-level AI and ML systems that are directly accessible to non-technical users have arrived. As William Gibson said, “the future is here; it just isn't evenly distributed.” Any company can benefit immediately from using AutoML and managed ML systems to solve business problems. Even if your company isn’t using them, your competitors most likely are.


Guest author Noah Gift is a lecturer and consultant at both the UC Davis Graduate School of Management MSBA program and the Graduate Data Science program (MSDS) at Northwestern. He also consults for startups and other companies on machine learning, cloud architecture, and CTO-level strategy as the founder of Pragmatic AI Labs. His most recent book is Pragmatic AI: An Introduction to Cloud-Based Machine Learning (Pearson, 2018).

Join the discussion

Comments ( 1 )
  • Charles R Berger Friday, January 25, 2019
    We've been automating machine learning for years inside the Oracle Database as native parallelized SQL functions. We expose them via their native SQL language API but also expose them via an R language API and provide integration with R. The "drag and drop" Oracle Data Miner UI (a SQL Developer extension) and the Oracle Machine Learning (Zeppelin based) notebooks (packaged with Autonomous Databases) provide additional touch points to the powerful in-DB ML functions. We're adding support for Python soon (OML4Py) which also extends our automatic ML capabilities with AutoML for completely automated model builds. See https://blogs.oracle.com/datamining/a-simple-guide-to-oracle%E2%80%99s-machine-learning-and-advanced-analytics for more information and links to everything.