What do most effective marketing campaigns have in common? A clear purpose? A unique angle? What about machine learning?
Let us suppose that a company wants to run a direct marketing campaign to get a response (like a subscription or a purchase) from users. It plans to target around 10,000 users, of which only about 1,000 are expected to respond. But the company doesn't have the budget to reach all 10,000 users. To minimize cost, it wants to contact as few customers as possible while still reaching most (a user-defined proportion) of the customers who are likely to respond. The company can create machine learning models to predict which users are likely to respond, and with what probability.
The question then becomes: which model should I choose? Which machine learning model is likely to identify the most respondents while contacting the fewest customers? A Cumulative Gains and Lift chart can answer these questions.
In this technical blog, we will talk about the Cumulative Gains chart and Lift chart created in Oracle Data Visualization for binary classification machine learning models, and how these charts help evaluate the performance of a classification model.
A Cumulative Gains and Lift chart measures the effectiveness of a binary classification predictive model as the ratio between the results obtained with and without the model. Gains and Lift charts are popular techniques in direct marketing. They are visual aids for measuring model performance and contain a lift curve and a baseline. The effectiveness of a model is measured by the area between the lift curve and the baseline: the greater the area, the better the model. One academic reference on how to construct these charts can be found here.
Sample Project for Cumulative Gains and Lift Chart Computation
The Oracle Analytics Library has an example project for this that was built using marketing campaign data from a bank. The charts below demonstrate this.
Scenario:
This marketing campaign aims to identify users who are likely to subscribe to one of the bank's financial services. The bank plans to run the campaign for close to 50,000 individuals, of which only about 5,000 (roughly 10 percent) are likely to subscribe. The marketing campaign data is split into training and testing data. Using the training data, we created a binary classification machine learning model with a Naive Bayes classifier to identify likely subscribers along with a prediction confidence. The actual values (i.e., whether each customer actually subscribed or not) are also available in the dataset. Now the bank wants to find out how good the model is at identifying the greatest number of likely subscribers while selecting a relatively small portion of the 50,000-person campaign base.
The machine learning model is applied to the test data to obtain a Predicted Value and Prediction Confidence for each record. This prediction data, together with the actual outcomes, is used in a data flow to compute cumulative gain and lift values.
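The core of that computation can be sketched in plain Python. This is an illustrative sketch, not the exact logic of the Oracle data flow: it ranks test records by Prediction Confidence for the positive class, then accumulates actual positives per population bucket.

```python
def cumulative_gains(records, n_buckets=10):
    """records: (actual, confidence) pairs, where actual is 1 for a
    responder and 0 otherwise, and confidence is the model's
    Prediction Confidence for the positive class.
    Returns (population_pct, cumulative_gain_pct, lift) per bucket."""
    # Rank customers from most to least likely to respond.
    ranked = sorted(records, key=lambda r: r[1], reverse=True)
    total = len(ranked)
    total_pos = sum(actual for actual, _ in ranked)
    rows = []
    for b in range(1, n_buckets + 1):
        cutoff = round(total * b / n_buckets)
        cum_pos = sum(actual for actual, _ in ranked[:cutoff])
        pop_pct = 100.0 * cutoff / total
        gain_pct = 100.0 * cum_pos / total_pos
        # Lift is the gain relative to random targeting (the baseline).
        rows.append((pop_pct, gain_pct, gain_pct / pop_pct))
    return rows
```

Plotting `gain_pct` against `pop_pct` gives the cumulative gains curve; plotting the third value gives the lift curve.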
How to Interpret These Charts and How to Measure Effectiveness of a Model:
A Cumulative Gains chart plots the cumulative percentage of actual subscribers (Cumulative Actuals) on the Y-axis against the total population (50,000) on the X-axis, alongside a random-prediction line (Gains Chart Baseline) and an ideal-prediction line (Gains Chart Ideal Model Line). The ideal line shows that all 5,000 likely subscribers would be identified by selecting the first 5,000 customers when sorted by Prediction Confidence for "Yes." What the Cumulative Actuals line tells us is that by the time we have covered 40 percent of the population we have already identified 80 percent of the subscribers, and by reaching close to 70 percent of the population we have 90 percent of the subscribers.
If we compare one model with another using a cumulative gains chart, the model with the greater area between its Cumulative Actuals line and the Baseline is the more effective one: it identifies a larger portion of subscribers while selecting a relatively smaller portion of the total population.
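That area can be approximated numerically from the points on each model's gains curve, for example with the trapezoid rule applied to the gap between the curve and the baseline. The helper below is a hypothetical sketch, not part of the product's data flow.

```python
def area_above_baseline(points):
    """points: (population_pct, cumulative_gain_pct) pairs on a gains
    curve. Approximates the area between the curve and the random
    baseline (where gain == population) via the trapezoid rule."""
    pts = [(0.0, 0.0)] + sorted(points)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        gap0, gap1 = y0 - x0, y1 - x1  # vertical distance to the baseline
        area += (gap0 + gap1) / 2 * (x1 - x0)
    return area
```

Computing this for each candidate model and picking the larger value mirrors the visual comparison described above.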
The Lift chart depicts how much more likely we are to reach respondents than if we contacted a random sample of customers. For example, by contacting only 10 percent of customers selected by the predictive model, we reach 3.20 times as many respondents as if we used no model.
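The arithmetic behind that statement is simply the ratio of respondents captured to population contacted (the function name here is illustrative):

```python
def lift_at(population_pct, gain_pct):
    """Lift = cumulative % of respondents captured
            / % of the population contacted."""
    return gain_pct / population_pct

# Capturing 32% of all respondents within the top 10% of the ranked
# population corresponds to a lift of 3.2, as in the example above.
```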
The Max Gain shows the point at which the difference between the cumulative gains curve and the baseline is at its maximum. For the Naive Bayes classifier model, this occurs at a population percentage of 41 percent, where the maximum gain is 83.88 percent.
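Finding that point amounts to scanning the gains curve for the largest vertical distance above the baseline (a hypothetical helper, shown here only to make the definition concrete):

```python
def max_gain_point(points):
    """points: (population_pct, cumulative_gain_pct) pairs.
    Returns (population_pct, gap) at the point where the vertical
    distance between the gains curve and the random baseline
    (gain == population) is largest."""
    return max(((p, g - p) for p, g in points), key=lambda t: t[1])
```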
How to Compare Two Models Using Cumulative Gain and Lift Chart in Oracle Data Visualization:
To compare how well two machine learning models have performed, use the Lift Calculation data flow (included in the .dva project) as a template and plug the output of the Apply Model data flow in as its data source/input. Add the output dataset of the Lift Calculation flow to the same project, and add its columns to the same charts shown above to compare.
Please note that the data flow expects the dataset to contain these columns: ID, ActualValue, Predicted Value, and Prediction Confidence. This is how it looks when we compare two models using the same visualizations:
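Before plugging a new model's output into the data flow, it is worth checking that the required columns are present. A minimal sketch (the function name is illustrative, and the column names match the list above):

```python
# Columns the Lift Calculation data flow expects, per the note above.
REQUIRED_COLUMNS = {"ID", "ActualValue", "Predicted Value",
                    "Prediction Confidence"}

def missing_columns(dataset_columns):
    """Return any required columns absent from the input dataset."""
    return REQUIRED_COLUMNS - set(dataset_columns)
```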
Of course, you can't test these features for yourself until you get your hands on them. To learn more about the machine learning features, download Oracle Data Visualization Desktop and tell your boss how much you like it.