By Peter Jeffcock, Senior Principal Product Marketing Director - Big Data
There are lots of different ideas about the optimal machine learning workflow, and even about exactly how many steps it has. But however many steps you count, we can all agree that there’s a lot involved in getting from a good idea to a machine learning model that is in production and delivering value. So any way to shorten that data science journey is worth investigating. I want to show you exactly how we can shorten that journey using Oracle Cloud Infrastructure Data Science, and let you know how to try it for yourself.
We’ve taken a long look at the machine learning workflow and found several areas where we can automate tasks that are tedious, time-consuming or complex. Let’s take a look.
When you first get a new data set, you need to spend some time exploring it and learning what’s in there, and how it might be useful. We add automation to that process by generating summaries, visualizations and correlations that will take you a long way towards understanding what that data might be able to do for you.
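To make the idea concrete, here is a minimal sketch of the kind of summary and correlation an automated exploration step might produce. It uses only the Python standard library and is not the Oracle Cloud Infrastructure Data Science API; the `summarize` and `pearson` helpers and the toy `age`/`income` columns are assumptions made up for illustration.

```python
import math
import statistics

def summarize(column):
    """Return basic summary statistics for a list of numbers."""
    return {
        "count": len(column),
        "mean": statistics.mean(column),
        "stdev": statistics.stdev(column),
        "min": min(column),
        "max": max(column),
    }

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length columns."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical toy data set with two numeric columns.
data = {
    "age": [23, 35, 47, 51, 62],
    "income": [28, 41, 55, 60, 72],
}

summary = {name: summarize(values) for name, values in data.items()}
correlation = pearson(data["age"], data["income"])

print(summary["age"])
print(f"age vs. income correlation: {correlation:.3f}")
```

A strong correlation like the one in this toy data is the sort of signal that tells you early on which columns are likely to be useful, or redundant, for a model.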
Even the most useful data set needs more work. We provide recommended transformations to the dataset and allow users to pick and choose which transformations to apply to the dataset, so it is ready for model training.
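The recommend-then-choose pattern described above can be sketched roughly as follows. This is a simplified illustration in plain Python, not the product's actual recommendation logic; the two rules (`impute_mean`, `drop_constant`), the function names, and the sample data are all hypothetical.

```python
import statistics

def recommend_transformations(data):
    """Suggest (column, action) pairs; nothing is applied at this point."""
    suggestions = []
    for name, values in data.items():
        present = [v for v in values if v is not None]
        if len(present) < len(values):
            suggestions.append((name, "impute_mean"))
        if len(set(present)) <= 1:
            suggestions.append((name, "drop_constant"))
    return suggestions

def apply_chosen(data, chosen):
    """Apply only the transformations the user selected, on a copy."""
    result = {name: list(values) for name, values in data.items()}
    for name, action in chosen:
        if name not in result:
            continue  # column was already dropped by an earlier choice
        if action == "impute_mean":
            present = [v for v in result[name] if v is not None]
            fill = statistics.mean(present)
            result[name] = [fill if v is None else v for v in result[name]]
        elif action == "drop_constant":
            del result[name]
    return result

# Hypothetical data: one missing value and one constant column.
data = {
    "age": [23, None, 47, 51],
    "country": ["US", "US", "US", "US"],
    "income": [28, 41, 55, 60],
}

suggestions = recommend_transformations(data)
# The user accepts the imputation but decides to keep the constant column.
chosen = [s for s in suggestions if s[1] == "impute_mean"]
cleaned = apply_chosen(data, chosen)

print(suggestions)
print(cleaned["age"])
```

Keeping the recommendation step separate from the apply step is what lets the user stay in control: the tool proposes, and nothing changes until a choice is made.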
Our AutoML is developed in collaboration with the Oracle Labs research team. It will run a variety of experiments to ensure that you get the right algorithm, with the optimal hyper-parameter settings. And it does all this while evaluating model performance to ensure the results will be what you need.
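As a rough mental model of what "running a variety of experiments" means, the sketch below tries two model families and a few hyperparameter values, scores each on held-out data, and keeps the best. It is a deliberately tiny exhaustive search in plain Python, not the Oracle Labs AutoML algorithm; the models, the search space, and the toy data are assumptions for illustration only.

```python
from collections import Counter

def majority_model(train, _params):
    """Baseline: always predict the most common training label."""
    label = Counter(y for _, y in train).most_common(1)[0][0]
    return lambda x: label

def knn_model(train, params):
    """k-nearest neighbours on a single numeric feature."""
    k = params["k"]
    def predict(x):
        nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
        return Counter(y for _, y in nearest).most_common(1)[0][0]
    return predict

def accuracy(predict, rows):
    """Fraction of rows whose label is predicted correctly."""
    return sum(predict(x) == y for x, y in rows) / len(rows)

# Hypothetical toy data: the label is 1 when the feature is 11 or more.
rows = [(x, int(x >= 11)) for x in range(20)]
train = [r for i, r in enumerate(rows) if i % 3 != 0]
valid = [r for i, r in enumerate(rows) if i % 3 == 0]

# Search space: each entry is (name, builder, hyperparameters).
candidates = [("majority", majority_model, {})]
candidates += [("knn", knn_model, {"k": k}) for k in (1, 3, 5)]

results = []
for name, build, params in candidates:
    model = build(train, params)
    results.append({"model": name, "params": params,
                    "score": accuracy(model, valid)})

best = max(results, key=lambda r: r["score"])
print(best["model"], best["score"])
```

A real AutoML system searches far larger spaces and does so much more efficiently, but the loop of "propose a configuration, evaluate it on data the model has not seen, keep the winner" is the core idea.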
Once a model is running and generating results or predictions, you know that somebody is going to ask: “Why did it give that result?” With global and local model explainers, we can show you which attributes contributed to any given result, so you can tell if your model is working as you expect. You can even potentially detect bias in the model caused by underlying bias in the original data – very useful if a lawyer or regulator wants to ensure that there is no discrimination going on.
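To illustrate the difference between a local and a global explanation, here is a small sketch for the simplest possible case, a linear scoring model, where each feature's contribution can be read off directly. This is not how the product's explainers are implemented (general-purpose explainers have to handle arbitrary models); the weights, feature names, and rows are hypothetical.

```python
import statistics

# Hypothetical linear scoring model: score = sum(weight * feature value).
weights = {"income": 0.5, "age": 0.1, "zip_digit": 0.0}

rows = [
    {"income": 40, "age": 30, "zip_digit": 7},
    {"income": 60, "age": 50, "zip_digit": 2},
    {"income": 80, "age": 40, "zip_digit": 5},
]

# Average value of each feature, used as the reference point.
means = {f: statistics.mean(r[f] for r in rows) for f in weights}

def local_explanation(row):
    """Per-feature contribution to this row's score, relative to the average row."""
    return {f: weights[f] * (row[f] - means[f]) for f in weights}

def global_explanation(all_rows):
    """Mean absolute contribution of each feature across all rows."""
    return {
        f: statistics.mean(abs(local_explanation(r)[f]) for r in all_rows)
        for f in weights
    }

local = local_explanation(rows[0])      # why did this one row score as it did?
overall = global_explanation(rows)      # which features matter in general?
ranked = sorted(overall, key=overall.get, reverse=True)

print(local)
print(ranked)
```

If a feature that should be irrelevant, or one that acts as a proxy for a protected attribute, shows up near the top of a ranking like this, that is a prompt to go back and look at the training data.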
Automating all these functions doesn’t just save time (though of course it does) or spare you from tasks that are more tedious than creative (though it does that, too). Since the automation is based on best practices, it can get even experts close to an optimal approach, allowing them to focus on the really hard stuff. And for non-experts, it can often deliver better results than they could achieve on their own.
But don’t take my word for it. There’s a free hands-on lab that you can try for yourself. We will also be running interactive office hours on October 9, 2020, to help users with any problems they run into with the lab. See if automation can help you accelerate your workflow.