Predicting Likelihood of Click with Multiple Presentations
By Michel Adar on Nov 30, 2011
When using predictive models to predict the likelihood of an ad or a banner to be clicked on it is common to ignore the fact that the same content may have been presented in the past to the same visitor. While the error may be small if the visitors do not often see repeated content, it may be very significant for sites where visitors come repeatedly.
This is a well recognized problem that usually gets handled with presentation thresholds – do not present the same content more than 6 times.
Observations and measurements of visitor behavior provide evidence that something better is needed.
For a specific visitor, during a single session, for a banner in a not too prominent space, the second presentation of the
same content is more likely to be clicked on than the first presentation. The difference can be 30% to 100% higher likelihood for the second presentation when compared to the first.
That is, for example, if the first presentation has an average click rate of 1%, the second presentation may have an average CTR of between 1.3% and 2%.
After the second presentation the CTR stays more or less the same for a few more presentations. The number of presentations in this plateau seems to vary by the location of the content in the page and by the visual attraction of the content.
After these few presentations the CTR starts decaying with a curve that is very well approximated by an exponential decay. For example, the 13th presentation may have 90% the likelihood of the 12th, and the 14th has 90% the likelihood of the 13th. The decay constant seems also to depend on the visibility of the content.
Now that we know the empirical data, we can propose modeling techniques that will correctly predict the likelihood of a click.
Use presentation number as an input to the predictive model
Probably the most straight forward approach is to add the presentation number as an input to the predictive model. While this is certainly a simple solution, it carries with it several problems, among them:
- If the model learns on each case, repeated non-clicks for the same content will reinforce the belief of the model on the non-clicker disproportionately. That is, the weight of a person that does not click for 200 presentations of an offer may be the same as 100 other people that on average click on the second presentation.
- The effect of the presentation number is not a customer characteristic or a piece of contextual data about the interaction with the customer, but it is contextual data about the content presented.
- Models tend to underestimate the effect of the presentation number.
For these reasons it is not advisable to use this approach when the average number of presentations of the same content to the same person is above 3, or when there are cases of having the presentation number be very large, in the tens or hundreds.
Use presentation number as a partitioning attribute to the predictive model
In this approach we essentially build a separate predictive model for each presentation number. This approach overcomes all of the problems in the previous approach, nevertheless, it can be applied only when the volume of data is large enough to have these very specific sub-models converge.
In the next couple of entries we will explore other solutions and a proposed modeling framework.