Oracle Data Cloud Blog

Predicting the unobservable: Thoughts on predicted vs. observed data

This week’s guest blog is contributed by David Kelly, Founder & CEO, AnalyticsIQ, and was originally published in the official magazine for the Oracle Data Cloud Summit in June 2016.

For years, philosophers have argued about the nature of reality. Empiricists like David Hume maintained all we know must be rooted in observable facts. Meanwhile, rationalists like René Descartes thought an inkling of innate knowledge fueled our ability to understand the world around us.

I land somewhere in between in this debate. My firm specializes in using analytics to predict future outcomes, such as “household spending on discretionary items.” We often enter the fray when customers ask about the role of predicted data in the online world, given that there is so much observed data.

While no one doubts it is often better to go with observed or known data over predicted data, there is a powerful role for predicted data. In other words, the rationalist philosophers were onto something.

Predicting the unobservable

The role of predicted data is most evident when trying to understand the unobservable. For example, one cannot observe each U.S. household’s budget or capacity to pay for a cruise over the next 12 months. You can observe propensity to go on certain cruises, but that is not the same thing as understanding total capacity. The outcome we care about simply does not appear in the available data. We need to extrapolate from other criteria to predict which couples are ready to make this large-ticket purchase.

The challenge of scalability

One key challenge with observed data is scale and reach. While there is little doubt that known subscribers to a Porsche enthusiast website are great prospects for a Porsche dealer, the reality is that they will be an extremely small group. In the predicted world, we would clone this small group to a larger universe, perhaps resulting in millions of Porsche-buyer clones.
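To make the cloning idea concrete, here is a minimal sketch of one common way to build a lookalike audience: average the known group into a profile, then rank everyone else by how close they sit to it. This is an illustration only, not a description of how any particular vendor builds its models. The feature names, household IDs and scores are all hypothetical, and a real model would use far more attributes and a proper scoring method.

```python
from math import dist  # Euclidean distance, Python 3.8+

def centroid(rows):
    """Average each feature across the seed group to get one profile."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def rank_lookalikes(seed, universe, top_n):
    """Return the top_n household IDs closest to the seed profile."""
    profile = centroid(seed)
    ranked = sorted(universe, key=lambda hid: dist(universe[hid], profile))
    return ranked[:top_n]

# Hypothetical features, each scaled 0 to 1:
# [income index, auto-enthusiast index, luxury-spend index]
seed_group = [
    [0.9, 0.8, 0.9],  # known subscriber 1
    [0.8, 0.9, 0.7],  # known subscriber 2
]

# Hypothetical wider universe with no observed enthusiast behavior
universe = {
    "hh_a": [0.85, 0.9, 0.8],
    "hh_b": [0.2, 0.1, 0.3],
    "hh_c": [0.9, 0.1, 0.2],
}

clones = rank_lookalikes(seed_group, universe, top_n=2)
print(clones)
```

The small observed group supplies the profile, and the predicted step is what extends it to households that were never observed on the enthusiast site.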

Another example is measuring network influencers. Data publishers frequently wish to sell to known, active users of Facebook, Twitter and other social sites. The reality is these individuals only represent a subset of the total universe. A larger market exists, but there is no observable data marketers can use to find them unless a similar cloning exercise is used.

The cost of observed data

Observed data comes with a hefty price tag. Niche data is often two to ten times as expensive as comparable predicted data, which means its subsequent performance must be correspondingly better to justify the cost. Predicted data is simply less expensive. Sorry Hume, but Descartes is simply a better bargain shopper.
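A quick back-of-the-envelope calculation shows what "much better" means here. The sketch below assumes data is priced per thousand impressions (CPM) and that performance is measured as a conversion rate. Both are assumptions made for illustration, and every number is hypothetical.

```python
def cost_per_conversion(cpm, conversion_rate):
    """Cost of one conversion when paying `cpm` per 1,000 impressions."""
    return cpm / (1000 * conversion_rate)

def break_even_rate(base_rate, price_multiple):
    """Conversion rate the pricier data needs to match the cheaper data's
    cost per conversion."""
    return base_rate * price_multiple

# Hypothetical figures: predicted data at a $1 CPM converting at 0.1%
predicted_cpm = 1.0
predicted_rate = 0.001

# Observed data priced at five times as much (inside the two-to-ten range)
price_multiple = 5
observed_cpm = predicted_cpm * price_multiple
needed_rate = break_even_rate(predicted_rate, price_multiple)

print(cost_per_conversion(predicted_cpm, predicted_rate))
print(cost_per_conversion(observed_cpm, needed_rate))
```

Under these assumptions the break-even point scales directly with price: data that costs five times as much has to convert five times as well just to tie.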

Clearly, there is a role for observed data. The first step to understanding our world is to observe it. However, when the light dims and we need to figure out what lies in the shadows, we look for different strategies.

Our recommendation is to use observed data where it is applicable and supplement it with predicted data. Because many digital campaigns depend on millions of impressions to drive the expected results, a hybrid approach provides the best mix of power and scalability.

About David Kelly

David Kelly is an analytics entrepreneur with strong business acumen. After successfully creating and selling Sigma Analytics in the early 2000s, Dave founded AnalyticsIQ in 2007 and was named the ‘Analytic Marketer of the Year’ in 2012. Whether he is taking care of his employees, meeting with clients, or traveling the world, this car-lover is always on the go!


Photo: Dean Drobot/Shutterstock

 
