
News and Views: Drive Smart Decisions with Cloud Analytics, Machine Learning and More

Is Your Forecasting Like Running with Scissors? Feature Friday

Guest Author

Do you ever feel like you're living on the edge with your forecasts? With advanced analytics being built into more and more of the tools we rely on for everyday decision making, are you just the slightest bit uncertain about what you've gotten yourself into? And are you ready to be grilled on your results?

"What does the forecast for the quarter look like?" You: No worries, I've got this.

"Why does the forecast look like that?" You: I know my business (and I sure hope the data backs me up).

"Is this adjusted for seasonality? Have you reviewed the outliers?" You: Whoa now, say what?

"What algorithm did you use for this forecast?" You: I'm gonna have to get back to you on that.

In a world of automatic advanced analytics, sometimes we take the power for granted and forget that "with great power comes great responsibility." (Yeah… I stole that from Spider-Man.) So, let's make sure we know our stuff.

Say you want to forecast order volumes for the next three months. It's easy. All you need to do is: 1) Open Oracle Analytics Cloud; 2) Click on your sales data (or my sales data -- link to data); 3) Drag Quantity Ordered and Order Date > Month onto the canvas.

Toss on a filter (by dragging columns to the top bar) for individual product categories and products. We'll look at the following area for now: Product Category=Furniture, Product Sub Category=Bookcases.

Now, let's flex our analytic muscle with a handy forecast. You can get to this in one of two ways.

1) Hover over the line, right-click, and select Add Statistics > Forecast.

2) From the left-side Data Panel, select the Analytics icon, expand Overlay & Projection, select Forecast, and drag it onto the canvas.

Whether you went left or right, now you should see some forecast values. So now what?

You've successfully executed a time series forecast for order quantity by month. But can you explain it?

What kind of algorithm has been applied? Is it the right one?

If you look at the chart properties pane in the lower left corner, you will see the settings that were automatically applied to the forecast you just created.

In the pane, you see the following information:

  • What is being forecast: "Next 3" means the algorithm forecasts the next 3 periods of whatever timescale you have used.
  • The Method being used: "Next," which is fixed for all current algorithm types.
  • The number of Periods: In this case "3" is the default, but it can be edited.
  • The Model being used: "Seasonal ARIMA" is the default, but ARIMA and ETS are also available.
  • The Prediction Interval: This shows how wide the range of values needs to be to accommodate 95 percent of possible outcomes.
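To make the prediction interval concrete, here's a minimal Python sketch using made-up order counts (not the blog's dataset) and a normality assumption: a 95 percent interval is roughly the point estimate plus or minus 1.96 standard deviations.

```python
import statistics

# Hypothetical monthly order quantities -- illustrative only.
history = [62, 58, 71, 65, 69, 60, 74, 66, 63, 70, 68, 72]

point_forecast = statistics.mean(history)   # naive point estimate
spread = statistics.stdev(history)          # spread of past values

# Under a normality assumption, a 95% interval uses z of about 1.96.
z95 = 1.96
lower = point_forecast - z95 * spread
upper = point_forecast + z95 * spread
print(f"forecast ~ {point_forecast:.1f}, 95% interval ~ [{lower:.1f}, {upper:.1f}]")
```

A real tool derives the band from the fitted model's error variance rather than the raw standard deviation, but the interpretation is the same: a wider band covers more of the possible outcomes.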

What does all that mean?

ARIMA stands for AutoRegressive Integrated Moving Average.

ETS stands for Error, Trend, Seasonality; it's a family of exponential smoothing models (you may also see it expanded as Exponential Triple Smoothing).

Not exactly the help you were looking for?

What do you need to know about ARIMA and ETS?

Both algorithms are used to predict the future value of a variable based on the historic values you provide in your dataset. You need a reasonable amount of data (sample size) to work from, and common sense should apply. For example, it rained yesterday, but does that mean it will rain today? You looked at last week's sales, but does that tell you how the year will go? No, in both cases! You need data covering at least 10-12 periods like the one you are trying to forecast, and if you suspect there are cycles in your data, you will need data that reflects them.

  • ARIMA
    • ARIMA works best with data that has a stable, consistent pattern over time, without too many wild swings or outliers. The underlying assumption is that past data is a key indicator of future data.
    • Every data point in the moving-average window carries equal weight, so the main influence on the result is the window of available data.
  • ETS
    • ETS handles volatility a little better than ARIMA. It smooths the data (making it less sensitive to extremes, exceptions, and outliers), and rather than assigning equal weight to every value, it can weight recent data points higher than older ones, following an exponential decay.
Both algorithms can handle seasonality, i.e., cycles in the data, but again, you need enough data to make it apparent that it's part of a cycle, not just a fluctuation. Take shoe sales, for example. You wouldn't be surprised to sell fewer sandals in New York in January than in June, and to see that pattern recur every year.
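To see the weighting difference in code, here's a toy sketch in pure Python with made-up numbers, contrasting an equal-weight moving average with simple exponential smoothing. It illustrates the weighting idea only; it is not what Oracle Analytics runs under the hood.

```python
# Made-up series with one outlier (90) a few steps back.
history = [50, 52, 90, 49, 51, 53]

# Equal weights: every point in the window counts the same.
window = history[-4:]
ma_next = sum(window) / len(window)

# Exponential smoothing: recent points weigh more, older ones fade.
alpha = 0.5                       # smoothing factor, 0 < alpha < 1
level = history[0]
for y in history[1:]:
    level = alpha * y + (1 - alpha) * level
ets_next = level

print(f"moving average: {ma_next:.2f}, exponential smoothing: {ets_next:.2f}")
```

Because the outlier sits several steps back, exponential smoothing has already down-weighted it, while the equal-weight window still counts it in full.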

Time to Experiment

Let's add an old-fashioned reference line based on the average of the values over time.

A reference line is simply a horizontal line plotted on the graph to indicate the average of all values.

The first forecast shows us probable values of 60-75 orders plotted on the line, with the shaded area showing prediction interval values significantly higher or lower—this is to account for 95 percent of possibilities.

Based on the average, we could be looking at something in the area of 55 orders. 

Now let's change the number of periods forecast.

When doing this, it's important to keep in mind how much data you have to work with. We have four years of data as granular as the day level (that means we should have enough month-level data points), so let's try forecasting six months.

What if we change the Prediction Interval?

Changing the Prediction Interval only affects the shaded area that appears. Instead of accounting for 95 percent of possible outcomes, it now shows the narrower area that represents 90 percent of possible outcomes.
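Under the usual normality assumption, the band's width scales with the z-value for the chosen coverage, so dropping from 95 to 90 percent narrows it. A quick sketch with hypothetical residuals:

```python
import statistics

# Hypothetical forecast errors (residuals) -- illustrative only.
errors = [-8, 5, -3, 7, -6, 4, -2, 6]
s = statistics.stdev(errors)

# Two-sided normal quantiles for each coverage level.
z = {"95%": 1.96, "90%": 1.645}
widths = {level: 2 * zval * s for level, zval in z.items()}
print(widths)  # the 90% band is narrower than the 95% band
```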

Now we'll change the algorithm.

Let's try the ETS option instead of ARIMA; it's supposed to be good for data with "noise," meaning data that is all over the place. Leave the number of periods set to six and the prediction interval at 90 percent and change the model to ETS.

You may notice that this result actually appears to be more consistent in pattern with the historic data than the ARIMA results, but which one is right?

The irony of forecasting and prediction is that only time will tell (not comforting, I know). 

Data scientists continuously evaluate and monitor the algorithms they use to make sure prediction quality is not degrading. As your data becomes more complex, with more variables in play or even as more time passes, it becomes increasingly difficult to predict based on all the data. This is where you and what you know come in.

For example:

  • Use the right data for your problem. Using data from the 1980s to predict demand for hairspray and shoulder pads in 2019 just won't work; you need to be working with data that is both timely and relevant to the question you are trying to answer.
  • Know your data, quirks and all. Using a dataset that doesn't factor out the temporary spike in shoulder pad purchases after the local 80s movie fest won't get you any kudos from the boss (and let's face it, you wouldn't want to be right about that).

So what can you do?

1. Think like a data scientist, pushing the buttons and pulling the levers to see which algorithm best fits your data. You can use the copy and paste functionality to replicate the same visualization multiple times so you can do a side-by-side comparison. You can also keep this on a separate canvas for your own "monitoring" purposes even after your task is complete.

2. Use all the tools in your box. I've shown you the various forecast algorithms and the reference line, but there's more in the box. Trend lines, clusters, and outliers are there just waiting for you to explore. For example, you may want to add an outlier visualization to help other users understand the data when you present it.
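As a rough stand-in for an outlier visualization, here's a minimal z-score flag in Python. The data and the 2-standard-deviation threshold are hypothetical, and the product's outlier detection may use a different method entirely.

```python
import statistics

# Hypothetical order counts with one obvious spike.
values = [60, 62, 59, 61, 63, 120, 58, 60]

mu = statistics.mean(values)
sd = statistics.stdev(values)

# Flag points more than 2 standard deviations from the mean.
outliers = [v for v in values if abs(v - mu) / sd > 2]
print(outliers)
```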

3. Context is king. Be descriptive so that people with whom you share this information get the big picture. 

  1. Filters help users navigate to the areas of data they know best.
  2. Use the automatic visualizations from Explain to describe the data to consumers.
  3. Use the Narrative chart type to provide a plain language explanation of what's being shown.

At the end of the day, quality analysis relies on quality data and your expertise. We build the tools to help you do more, faster and smarter, because that's what augmented analytics is all about.

To learn how you can benefit from the new Oracle Analytics, visit Oracle.com/Analytics, and don't forget to subscribe to the Oracle Analytics Advantage blog and get the latest posts sent to your inbox.

Guest author Rachel Bland is a director of product management for Oracle Analytics.
