AI, machine learning, and deep learning - these terms overlap and are easily confused, so let’s start with some short definitions.
Artificial Intelligence (AI) means getting a computer to mimic human behavior in some way.
Machine learning is a subset of AI, and it consists of the techniques that enable computers to figure things out from the data and deliver AI applications.
Deep learning, meanwhile, is a subset of machine learning that enables computers to solve more complex problems.
Those descriptions are correct, but they are a little concise. So I want to explore each of these areas and provide a little more background.
Artificial intelligence as an academic discipline was founded in 1956. The goal then, as now, was to get computers to perform tasks regarded as uniquely human: things that required intelligence. Initially, researchers worked on problems like playing checkers and solving logic problems.
If you looked at the output of one of those checkers playing programs you could see some form of “artificial intelligence” behind those moves, particularly when the computer beat you. Early successes caused the first researchers to exhibit almost boundless enthusiasm for the possibilities of AI, matched only by the extent to which they misjudged just how hard some problems were.
Artificial intelligence, then, refers to the output of a computer. The computer is doing something intelligent, so it’s exhibiting intelligence that is artificial.
The term AI doesn’t say anything about how those problems are solved. There are many different techniques including rule-based or expert systems. And one category of techniques started becoming more widely used in the 1980s: machine learning.
The reason that those early researchers found some problems to be much harder is that those problems simply weren't amenable to the early techniques used for AI. Hard-coded algorithms or fixed, rule-based systems just didn’t work very well for things like image recognition or extracting meaning from text.
The solution turned out to be not just mimicking human behavior (AI) but mimicking how humans learn.
Think about how you learned to read. You didn’t sit down and learn spelling and grammar before picking up your first book. You read simple books, graduating to more complex ones over time. You actually learned the rules (and exceptions) of spelling and grammar from your reading. Put another way, you processed a lot of data and learned from it.
That’s exactly the idea with machine learning. Feed an algorithm (as opposed to your brain) a lot of data and let it figure things out. Feed an algorithm a lot of data on financial transactions, tell it which ones are fraudulent, and let it work out what indicates fraud so it can predict fraud in the future. Or feed it information about your customer base and let it figure out how best to segment them. Find out more about machine learning techniques here.
As these algorithms developed, they could tackle many problems. But some things that humans found easy (like speech or handwriting recognition) were still hard for machines. However, if machine learning is about mimicking how humans learn, why not go all the way and try to mimic the human brain? That’s the idea behind neural networks.
The idea of using artificial neurons (neurons, connected by synapses, are the major elements in your brain) had been around for a while. And neural networks simulated in software started being used for certain problems. They showed a lot of promise and could solve some complex problems that other algorithms couldn’t tackle.
But machine learning still got stuck on many things that elementary school children tackled with ease: how many dogs are in this picture or are they really wolves? Walk over there and bring me the ripe banana. What made this character in the book cry so much?
It turned out that the problem was not with the concept of machine learning. Or even with the idea of mimicking the human brain. It was just that simple neural networks with 100s or even 1000s of neurons, connected in a relatively simple manner, just couldn’t duplicate what the human brain could do. It shouldn't be a surprise if you think about it; human brains have around 86 billion neurons and very complex interconnectivity.
Put simply, deep learning is all about using neural networks with more neurons, layers, and interconnectivity. We’re still a long way off from mimicking the human brain in all its complexity, but we’re moving in that direction.
And when you read about advances in computing from autonomous cars to Go-playing supercomputers to speech recognition, that’s deep learning under the covers. You experience some form of artificial intelligence. Behind the scenes, that AI is powered by some form of deep learning.
Let’s look at a couple of problems to see how deep learning is different from simpler neural networks or other forms of machine learning.
If I give you images of horses, you recognize them as horses, even if you’ve never seen that image before. And it doesn’t matter if the horse is lying on a sofa, or dressed up for Halloween as a hippo. You can recognize a horse because you know about the various elements that define a horse: shape of its muzzle, number and placement of legs, and so on.
Deep learning can do this. And it’s important for many things including autonomous vehicles. Before a car can determine its next action, it needs to know what’s around it. It must be able to recognize people, bikes, other vehicles, road signs, and more. And do so in challenging visual circumstances. Standard machine learning techniques can’t do that.
Take natural language processing, which is used today in chatbots and smartphone voice assistants, to name two. Consider this sentence and work out what the last part should be:
I was born in Italy and, although I lived in Portugal and Brazil most of my life, I still speak fluent ________.
Hopefully you can see that the most likely answer is Italian (though you would also get points for French, Greek, German, Sardinian, Albanian, Occitan, Croatian, Slovene, Ladin, Latin, Friulian, Catalan, Sardinian, Sicilian, Romani and Franco-Provencal and probably several more). But think about what it takes to draw that conclusion.
First you need to know that the missing word is a language. You can do that if you understand “I speak fluent…”. To get Italian you have to go back through that sentence and ignore the red herrings about Portugal and Brazil. “I was born in Italy” implies learning Italian as I grew up (with 93% probability according to Wikipedia), assuming that you understand the implications of born, which go far beyond the day you were delivered. The combination of “although” and “still” makes it clear that I am not talking about Portuguese and brings you back to Italy. So Italian is the likely answer.
Imagine what’s happening in the neural network in your brain. Facts like “born in Italy” and “although…still” are inputs to other parts of your brain as you work things out. And this concept is carried over to deep neural networks via complex feedback loops.
So hopefully that first definition at the beginning of the article makes more sense now. AI refers to devices exhibiting human-like intelligence in some way. There are many techniques for AI, but one subset of that bigger list is machine learning – let the algorithms learn from the data. Finally, deep learning is a subset of machine learning, using many-layered neural networks to solve the hardest (for computers) problems.
If you're still learning about machine learning, download our free ebook, "Demystifying Machine Learning." But if you've decided it might be time to get started, see what a data science and machine learning platform can do for you.