By Alexa Weber Morales
You might think of machine learning (ML) as a rarified specialty. Heli Helskyaho argues that you can’t afford not to understand ML.
“First of all, machine learning is productivity, because you have so much data, you can’t manually handle it all. Data is the key for making right decisions,” the CEO of Miracle Finland Oy, an IT consultancy, says. “Right kind of data, right decisions: That’s the equation here. You will not be able to work with data in any way—productive or not—without machine learning.”
Not surprisingly for someone with a background in databases, Helskyaho, the author of Oracle SQL Developer Data Modeler for Database Design Mastery (Oracle Press, 2015), focuses on big data when thinking about ML. She tells a story of visiting a company whose systems were shutting down with no explanation. Finding the cause was a simple matter of analyzing temperature data with ML, she says: The analysis revealed that on the rare occasions when a certain door was left open, the cold air would kill a particular machine and cause a cascading outage.
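To make the idea of spotting a rare cold spell in temperature data concrete, here is a minimal sketch in Python. It is not the method used in the story above, which the article does not describe. It is a simple statistical stand-in (a modified z-score based on the median) that flags readings far from the typical value. The function name, the threshold, and the sample readings are all assumptions made up for illustration.

```python
from statistics import median


def find_anomalies(readings, threshold=3.5):
    """Return the indices of readings whose modified z-score exceeds threshold.

    The modified z-score uses the median and the median absolute deviation
    (MAD), so one extreme reading does not distort the baseline.
    """
    med = median(readings)
    mad = median(abs(x - med) for x in readings)
    if mad == 0:
        # At least half the readings are identical; no usable scale to compare against.
        return []
    return [
        i
        for i, x in enumerate(readings)
        if 0.6745 * abs(x - med) / mad > threshold
    ]


# Hypothetical room temperatures in degrees Celsius (invented sample data).
temps = [21.0, 21.4, 20.8, 21.1, 21.3, 9.5, 21.2, 20.9]
print(find_anomalies(temps))  # prints [5], the index of the 9.5 reading
```

A real project would more likely rely on an anomaly detection algorithm from an ML library, but the principle is the same: Learn what normal looks like from the data, then flag what deviates from it.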
ML is not for answering if-else questions, she says. “Machine learning is for something you have no way to predict yourself. ML is, ‘I have no idea what might happen, but whatever happens, I need to be prepared. The world is changing, so I will adapt to that.’”
Understanding the plethora of ML terminology is the first step, Helskyaho says, noting that her popular presentations explain what these words mean and how they relate to each other. “You start out very confused, because people are using different words with the same meaning and that can sometimes mislead you.
“If you are a developer like me, the next step is choosing the programming language you want to start with, because there’s no point to starting with all of them. Use skills you already have, and add something new,” she says. Although Python is probably the most common language for working with ML, “if you’ve been using R, SQL, C++, Java—start with that.”
Next, ask about algorithms, models, and how to prepare the data. “I think 80% of ML work is preparing the data, because the data is usually not the best quality,” Helskyaho says. Using a drag-and-drop tool such as Oracle SQL Developer enables you to build a real-world model and experiment with neural networks, regression, anomaly detection, and classification algorithms.
Starter projects might be trying to predict tomorrow’s temperature or next week’s stock prices, or checking whether a person is creditworthy or not.
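As a rough illustration of the temperature starter project, the sketch below fits a straight line that maps each day's temperature to the next day's, then uses it to guess tomorrow's value. This is only one simple way to frame the problem. The function names and the sample temperatures are hypothetical, and a real forecast would need far more data and features.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; cannot fit a line")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_tomorrow(history):
    """Predict the next value from (today, tomorrow) pairs in the history."""
    todays, tomorrows = history[:-1], history[1:]
    slope, intercept = fit_line(todays, tomorrows)
    return slope * history[-1] + intercept


# Made-up daily temperatures in degrees Celsius, for illustration only.
daily_temps = [12.0, 13.5, 13.0, 14.2, 15.1, 14.8, 16.0]
print(round(predict_tomorrow(daily_temps), 1))
```

The same pattern of fitting on past data and then predicting an unseen value carries over to the stock price and creditworthiness examples, although those call for different algorithms and much more careful validation.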
As in any educational journey, you can run into dead ends. In ML, Helskyaho says, the biggest risk is bad data: “You think the data you are using is good quality, but if you just start using it without checking it, you find your predictions are very strange. You start asking, ‘What did I do wrong?’ You end up realizing the data was rubbish. I see that quite a lot.”
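Checking the data before using it can start very simply. The sketch below counts missing and implausible values in one field before any model sees the data. The function name, the field name, the plausible range, and the sample rows are assumptions chosen for illustration, not part of any particular tool.

```python
def quality_report(rows, field, low, high):
    """Count missing and out-of-range values for one field in a list of dicts."""
    missing = 0
    out_of_range = 0
    for row in rows:
        value = row.get(field)
        if value is None:
            # Covers both an absent key and an explicit None.
            missing += 1
        elif not (low <= value <= high):
            out_of_range += 1
    return {"rows": len(rows), "missing": missing, "out_of_range": out_of_range}


# Hypothetical sensor readings; 999.0 stands in for a typical bad value.
readings = [
    {"sensor": "a", "temp_c": 21.3},
    {"sensor": "a", "temp_c": None},
    {"sensor": "b", "temp_c": 999.0},
    {"sensor": "b"},
    {"sensor": "b", "temp_c": 20.7},
]
print(quality_report(readings, "temp_c", -40.0, 60.0))
# prints {'rows': 5, 'missing': 2, 'out_of_range': 1}
```

A report like this will not catch every problem, but it surfaces the obvious ones before they turn into strange predictions.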
It’s also worth remembering that, as ML becomes more commonplace, getting started doesn’t require a scary amount of math and statistics. What is necessary, however, is an understanding of the domain-specific questions that ML can answer.
“What computers cannot do is understand the world,” Helskyaho says. “You understand your business, and if the machine learning brings you new knowledge, you understand if it’s valuable or not. It’s a team play, not just one skill set and one person. You need a team to make it really profitable for your company.”
Photography by Oracle Digital Media Production