July 27th, 2016 | Analytics, Big Data, Data Science

Machine Learning for Churn Reduction

Netflix shares tumbled last week after the company announced its growth figures (see eMarketer note). The company grew, but less than expected. The problem? Churn. eMarketer mentions that clients were leaving the company because of the price increase. But let’s be realistic: do you think the churn is due only to an increase of 2 dollars per month?

Churn, or churn rate, is one of the most important metrics for companies that charge for their services monthly, such as TELCOs, CELCOs, cable providers, etc. Every time a client leaves, the company loses that revenue stream, and replacing it costs money if the company wants to maintain the same level of revenue. Most of these companies begin the relationship with each client at a loss, the acquisition cost. The longer the client keeps paying, the higher the LTV (lifetime value); the sooner the client leaves, the lower the LTV, to the point that the company can even lose money on that client.
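To make the LTV argument concrete, here is a minimal Python sketch. It assumes a deliberately simple model that is ours, not a standard formula: a flat monthly fee, no discounting, and a single upfront acquisition cost. The function names and the figures ($10 per month, $120 acquisition cost) are hypothetical.

```python
import math


def lifetime_value(monthly_revenue, months_retained, acquisition_cost):
    """Revenue collected while the client stays, minus the upfront acquisition cost."""
    return monthly_revenue * months_retained - acquisition_cost


def break_even_months(monthly_revenue, acquisition_cost):
    """Whole months a client must keep paying before the acquisition cost is recovered."""
    return math.ceil(acquisition_cost / monthly_revenue)


# Hypothetical figures: $10 per month, $120 spent to acquire the client.
print(lifetime_value(10, 6, 120))    # -60: the client churned before paying back the cost
print(lifetime_value(10, 24, 120))   # 120: two years of payments turn the client profitable
print(break_even_months(10, 120))    # 12
```

Under these assumptions, any client who leaves before month 12 is a net loss, which is why early churn hurts so much.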

Clients are constantly tempted to switch to another company. There are internal reasons (the service itself, customer support, etc.) and external reasons (competitors, price, etc.) that bring hundreds or thousands of variables into play. Those variables change constantly, generating new scenarios all the time. A competitor reducing its price can cause a drop in our customer satisfaction rate, because at the new “market price” our service is more expensive, so it is fair for clients to expect better service. In a constantly changing scenario, machine learning is the answer.

Here we share a simple process to reduce churn based on Machine Learning techniques.

1. Define the objective and tactic.


2. Define the best Machine Learning approach: To understand what drives a person to become inactive (churn), start with a descriptive and exploratory analysis of the transaction database. Then identify which behavioral variables were involved in churn events. One methodology we recommend for its simplicity and actionability is the supervised Machine Learning model “Binary Logistic Regression”, a specific type of regression model where the target variable (also called the response) is a dichotomous variable that represents the event to be explained. The output can be related to a set of predictor variables, which can be categorical, discrete and/or continuous. We recommend running this model in R for its simplicity; you will find plenty of examples out there on how to fit a well-performing model.
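As an illustration of what a binary logistic regression does, here is a minimal sketch. We recommend R above; this sketch uses plain Python with gradient descent instead, only so that it runs without any extra packages. The data is made up (16 invented clients), and the predictor names `low_bill` and `single_product` as well as the function names are our assumptions, not part of any real dataset or library.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def churn_probability(weights, x):
    """Predicted probability of churn; weights[0] is the intercept."""
    return sigmoid(weights[0] + sum(w * xi for w, xi in zip(weights[1:], x)))


def fit_logistic(rows, labels, learning_rate=0.5, steps=5000):
    """Fit a binary logistic regression by batch gradient descent on the log-loss."""
    weights = [0.0] * (len(rows[0]) + 1)
    n = len(rows)
    for _ in range(steps):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            err = churn_probability(weights, x) - y
            grad[0] += err
            for j, xi in enumerate(x):
                grad[j + 1] += err * xi
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad)]
    return weights


# Made-up data: ((low_bill, single_product), clients who churned, clients who stayed)
cells = [((1, 1), 3, 1), ((1, 0), 2, 2), ((0, 1), 2, 2), ((0, 0), 1, 3)]
rows, labels = [], []
for x, churned, stayed in cells:
    rows += [x] * (churned + stayed)
    labels += [1] * churned + [0] * stayed

weights = fit_logistic(rows, labels)
print("low bill, single product:", round(churn_probability(weights, (1, 1)), 2))
print("normal bill, several products:", round(churn_probability(weights, (0, 0)), 2))
```

On this toy data the fitted probabilities should land close to the observed churn rates of the two groups (3 of 4 and 1 of 4), because an intercept plus two effects can reproduce these four groups exactly. Real transaction data will not be this tidy, and a statistics package will also give you standard errors and significance tests that this sketch leaves out.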


3. Identify which specific events are driving people to cancel their subscription. With the above model you can get some of the following clues. The probability of churn:

  • Increases 2.8 times when the bill amount is lower than $10 per month.
  • Is 50% higher when the bill amount decreased during the last three months.
  • Is 35% higher when the client uses only one of the company’s products.
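A note on reading figures like these. A logistic regression coefficient, once exponentiated, is an odds ratio, and an odds ratio is not the same thing as a ratio of probabilities. The sketch below assumes that a multiplier such as 2.8 is an odds ratio (an assumption on our part) and uses a hypothetical 10% baseline churn probability to show the difference. The function names are ours.

```python
import math


def odds_ratio(coefficient):
    """Exponentiating a logistic regression coefficient gives an odds ratio."""
    return math.exp(coefficient)


def apply_odds_ratio(baseline_probability, ratio):
    """Probability obtained after multiplying the baseline odds by the given ratio."""
    odds = baseline_probability / (1.0 - baseline_probability)
    new_odds = odds * ratio
    return new_odds / (1.0 + new_odds)


# A coefficient of ln(2.8), about 1.03, corresponds to an odds ratio of 2.8.
print(round(odds_ratio(math.log(2.8)), 2))

# Hypothetical baseline: 10% of clients churn when the bill is not low.
baseline = 0.10
with_low_bill = apply_odds_ratio(baseline, 2.8)
print(round(with_low_bill, 3))             # about 0.237
print(round(with_low_bill / baseline, 2))  # about 2.37, the ratio of probabilities
```

With a low baseline the two readings are close; as the baseline grows, the probability ratio falls further and further below the odds ratio.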

4. Once you have confirmed the hypotheses, the next step is to set up a marketing automation platform that takes that information and triggers actions for users at risk of churn. Another possibility is to feed the marketing automation database with the output of another supervised Machine Learning technique, Random Forest.
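The hand-off from model to marketing automation can be as simple as turning churn scores into a prioritized action list. This is a minimal sketch: the thresholds, the action names and the customer IDs are hypothetical, and the scores could come from the logistic regression, a Random Forest or any other model that outputs a probability.

```python
def plan_retention_actions(scores, high=0.7, medium=0.4):
    """Map churn probabilities to retention actions, highest risk first.

    scores: dict of customer id -> predicted churn probability.
    Customers below the `medium` threshold get no action.
    """
    actions = []
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    for customer_id, probability in ranked:
        if probability >= high:
            actions.append((customer_id, "call_from_retention_team"))
        elif probability >= medium:
            actions.append((customer_id, "send_discount_email"))
    return actions


# Hypothetical scores for three customers.
scores = {"c-001": 0.15, "c-002": 0.82, "c-003": 0.55}
for customer_id, action in plan_retention_actions(scores):
    print(customer_id, action)
```

In practice the thresholds should be tuned against the cost of each action and the value of the client, so that the most expensive actions go to the clients worth keeping.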


Some of the results we obtained with this methodology were:

[Image: churn reduction]

