
  • March 23, 2017

Feature Engineering for Churn Modeling

A churn model can help you determine the most significant reasons customers decide to stop using your product or service, but it’s up to the data scientist building the model to decide which factors to test and ultimately include or exclude, a process called feature engineering.

Feature engineering is the part of the model building process that allows data scientists to tailor the model to capture why churn happens in one specific business. Identifying the factors that are most directly connected to churn in your business will involve testing a mix of features related to engagement, demographics, and customer satisfaction.

Engagement

Engagement features are the first a data scientist will examine when identifying the behaviors that typically precede churn. These features capture how customers interact with your business, like the number of times they log in to your website every month or whether they have unsubscribed from marketing emails. Engagement metrics are an ideal starting point because they are relatively easy to measure with precision and they are generally good indicators of whether a customer intends to continue using a service.
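As a rough illustration of the two engagement features mentioned above, here is a minimal Python sketch. The login records, the set of unsubscribed customers, and the function name engagement_features are all hypothetical and made up for this example; a real pipeline would read these from your own event logs.

```python
from collections import defaultdict
from datetime import date

# Hypothetical login events: (customer_id, login_date). Illustrative data only.
logins = [
    ("c1", date(2017, 1, 3)),
    ("c1", date(2017, 1, 17)),
    ("c1", date(2017, 2, 9)),
    ("c2", date(2017, 1, 5)),
]

# Hypothetical set of customers who unsubscribed from marketing emails.
unsubscribed = {"c2"}


def engagement_features(logins, unsubscribed, months_observed):
    """Build per-customer engagement features from raw events."""
    login_counts = defaultdict(int)
    for customer_id, _login_date in logins:
        login_counts[customer_id] += 1

    # Include customers who unsubscribed even if they never logged in.
    customers = set(login_counts) | set(unsubscribed)
    features = {}
    for customer_id in sorted(customers):
        features[customer_id] = {
            # Average number of logins per month over the observation window.
            "logins_per_month": login_counts[customer_id] / months_observed,
            # 1 if the customer opted out of marketing emails, else 0.
            "unsubscribed_email": int(customer_id in unsubscribed),
        }
    return features


features = engagement_features(logins, unsubscribed, months_observed=2)
```

Each customer ends up with one row of numeric features, which is the shape most modeling tools expect as input.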

Consumer Demographics

Consumer demographic features — such as the region where a customer lives or his or her income — are another feature type that’s often simple to construct. However, there is a lot of variation in which demographic variables ultimately have a measurable impact on churn in any given business. Input from your customer retention or satisfaction team is especially important to steer your data scientists in the right direction.


This graph shows that, for a sample subscription company, customers who churned during the training period typically had a lower income than those who did not churn.
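A comparison like this one can be approximated with a simple group summary. The sketch below uses invented customer records and a hypothetical helper, median_income_by_churn; it does not reproduce the data behind the graph.

```python
from statistics import median

# Hypothetical customer records; the incomes and churn flags are made up.
customers = [
    {"id": "c1", "income": 42000, "churned": True},
    {"id": "c2", "income": 38000, "churned": True},
    {"id": "c3", "income": 61000, "churned": False},
    {"id": "c4", "income": 75000, "churned": False},
    {"id": "c5", "income": 58000, "churned": False},
]


def median_income_by_churn(customers):
    """Return the median income for churned and retained customers."""
    groups = {True: [], False: []}
    for customer in customers:
        groups[customer["churned"]].append(customer["income"])
    # Skip a group entirely if it has no customers, since median() needs data.
    return {
        ("churned" if did_churn else "retained"): median(incomes)
        for did_churn, incomes in groups.items()
        if incomes
    }


summary = median_income_by_churn(customers)
```

The median is used here rather than the mean because income data tends to be skewed by a few very high values; that choice is an assumption of this sketch, not something the article specifies.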

Customer Satisfaction

Customer satisfaction features are usually the most reliable in explaining when and why customers churn, but they are also the hardest to capture. That’s because it’s impossible to ask each customer exactly how satisfied they are with a product or service and expect that they will answer honestly, if they even answer at all. Your data scientists will have to examine your data to identify information that can serve as a proxy for customer satisfaction. Product ratings and customer support calls are good examples. These metrics can clearly indicate an individual consumer’s sentiment, but they’ll need to be aggregated across customers before they are useful for measuring how happy your customers are on the whole.
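One way to turn ratings and support calls into satisfaction proxies is sketched below. The data, the function names, and the rating threshold of 3.0 are all assumptions made for illustration; the right proxies and cutoffs depend on your business.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical product ratings: (customer_id, stars on a 1-5 scale).
ratings = [("c1", 5), ("c1", 4), ("c2", 2), ("c3", 1), ("c3", 2)]

# Hypothetical support call log: one customer_id per call.
support_calls = ["c2", "c3", "c3", "c3"]


def satisfaction_proxies(ratings, support_calls):
    """Build per-customer satisfaction proxy features."""
    stars_by_customer = defaultdict(list)
    for customer_id, stars in ratings:
        stars_by_customer[customer_id].append(stars)

    call_counts = defaultdict(int)
    for customer_id in support_calls:
        call_counts[customer_id] += 1

    customers = sorted(set(stars_by_customer) | set(call_counts))
    proxies = {}
    for customer_id in customers:
        stars = stars_by_customer[customer_id]
        proxies[customer_id] = {
            # None marks customers who never left a rating.
            "avg_rating": mean(stars) if stars else None,
            "support_calls": call_counts[customer_id],
        }
    return proxies


def share_low_rating(proxies, threshold=3.0):
    """Share of rated customers whose average rating is below the threshold."""
    rated = [p["avg_rating"] for p in proxies.values() if p["avg_rating"] is not None]
    if not rated:
        return None
    return sum(r < threshold for r in rated) / len(rated)


proxies = satisfaction_proxies(ratings, support_calls)
low_share = share_low_rating(proxies)
```

The per-customer values can feed the churn model as features, while the overall share gives a single number for tracking sentiment across the customer base.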

The key to successful feature engineering is for your data scientists to combine their domain expertise with input from stakeholders in your business to construct a comprehensive range of possible features to include in the model. The next step is measuring the importance of each feature in explaining why churn occurs in your business, and removing whatever is extraneous. For more information on how the rest of the process works, check out this churn modeling screencast presented by Data Scientist Ruslana Dalinina.
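The article does not say how feature importance is measured, so the following is only one crude screening approach: rank features by the absolute Pearson correlation between each feature and the churn label. The rows, feature names, and the helpers pearson and rank_features are hypothetical, and a real project would more likely rely on the importance measures of the chosen model.

```python
from math import sqrt

# Hypothetical training rows; all values are invented for illustration.
rows = [
    {"logins_per_month": 1.0, "income": 40, "plan_tier": 1, "churned": 1},
    {"logins_per_month": 2.0, "income": 50, "plan_tier": 1, "churned": 1},
    {"logins_per_month": 8.0, "income": 45, "plan_tier": 1, "churned": 0},
    {"logins_per_month": 9.0, "income": 75, "plan_tier": 1, "churned": 0},
]


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_features(rows, feature_names, label="churned"):
    """Sort features by absolute correlation with the label, strongest first."""
    labels = [row[label] for row in rows]
    scores = {
        name: abs(pearson([row[name] for row in rows], labels))
        for name in feature_names
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranked = rank_features(rows, ["logins_per_month", "income", "plan_tier"])
```

In this toy data the constant plan_tier feature scores zero and would be a candidate for removal. Correlation only captures linear, one-feature-at-a-time relationships, so it is a first filter rather than a final verdict on what is extraneous.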
