December 14th, 2015 | Analytics, Big Data, Data Science

Machine learning will drive predictive analytics to a new level

You’ve probably heard about machine learning and how it is being used for AI (Artificial Intelligence) purposes.

Predictive Analytics, on the other hand, is the part of analytics that uses statistics to answer the question “What might happen in the future?” It was pretty popular a couple of years ago, until managers realised that its real power was weaker than they had thought. Tom Davenport wrote an interesting article for the Harvard Business Review called “A Predictive Analytics Primer”, in which he describes three main pillars of Predictive Analytics:

  1. The Data: Lack of good data is the most common barrier to organizations seeking to employ predictive analytics. To make predictions about what customers will buy in the future, for example, you need to have good data on who they are buying (which may require a loyalty program, or at least a lot of analysis of their credit cards), what they have bought in the past, the attributes of those products (attribute-based predictions are often more accurate than the “people who buy this also buy this” type of model), and perhaps some demographic attributes of the customer (age, gender, residential location, socioeconomic status, etc.). If you have multiple channels or customer touchpoints, you need to make sure that they capture data on customer purchases in the same way your previous channels did. All in all, it’s a fairly tough job to create a single customer data warehouse with unique customer IDs on everyone, and all past purchases customers have made through all channels. If you’ve already done that, you’ve got an incredible asset for predictive customer analytics.
  2. The Statistics: Regression analysis in its various forms is the primary tool that organizations use for predictive analytics. It works like this in general: An analyst hypothesizes that a set of independent variables (say, gender, income, visits to a website) are statistically correlated with the purchase of a product for a sample of customers. The analyst performs a regression analysis to see just how correlated each variable is; this usually requires some iteration to find the right combination of variables and the best model. Let’s say that the analyst succeeds and finds that each variable in the model is important in explaining the product purchase, and together the variables explain a lot of variation in the product’s sales. Using that regression equation, the analyst can then use the regression coefficients—the degree to which each variable affects the purchase behavior—to create a score predicting the likelihood of the purchase. Voila! You have created a predictive model for other customers who weren’t in the sample. All you have to do is compute their score, and offer the product to them if their score exceeds a certain level. It’s quite likely that the high scoring customers will want to buy the product—assuming the analyst did the statistical work well and that the data were of good quality.
  3. The Assumptions: That brings us to the other key factor in any predictive model—the assumptions that underlie it. Every model has them, and it’s important to know what they are and monitor whether they are still true. The big assumption in predictive analytics is that the future will continue to be like the past. As Charles Duhigg describes in his book The Power of Habit, people establish strong patterns of behavior that they usually keep up over time. Sometimes, however, they change those behaviors, and the models that were used to predict them may no longer be valid. What makes assumptions invalid? The most common reason is time. If your model was created several years ago, it may no longer accurately predict current behavior. The greater the elapsed time, the more likely customer behavior has changed. Some Netflix predictive models, for example, that were created on early Internet users had to be retired because later Internet users were substantially different. The pioneers were more technically-focused and relatively young; later users were essentially everyone. Another reason a predictive model’s assumptions may no longer be valid is if the analyst didn’t include a key variable in the model, and that variable has changed substantially over time.
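
To make the second pillar more concrete, here is a minimal sketch of the scoring idea in Python. It is not Davenport's method or any particular vendor's implementation: it assumes logistic regression (one common form of regression for yes/no outcomes) fitted with plain gradient descent, and the sample data, the feature scaling, the names `fit_logistic` and `score`, and the 0.5 cut-off are all made up for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(rows, labels, lr=0.1, epochs=2000):
    """Fit logistic-regression coefficients with batch gradient descent."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            # Difference between the predicted purchase probability and the outcome.
            error = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias

def score(customer, weights, bias):
    """Predicted likelihood of purchase for one customer."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, customer)))

# Hypothetical sample of customers: (income in $100k, website visits in tens).
sample = [(0.2, 0.1), (0.3, 0.2), (0.4, 0.1), (0.5, 0.3),
          (0.6, 0.8), (0.8, 0.6), (0.9, 0.9), (1.0, 0.7)]
purchased = [0, 0, 0, 0, 1, 1, 1, 1]  # 1 = bought the product

weights, bias = fit_logistic(sample, purchased)

# Score customers who were not in the sample and offer the product
# only to those whose score exceeds an (assumed) threshold.
THRESHOLD = 0.5
new_customers = {"A": (0.95, 0.9), "B": (0.25, 0.1)}
offers = {name: score(x, weights, bias) > THRESHOLD
          for name, x in new_customers.items()}
print(offers)
```

The fitted `weights` play the role of the regression coefficients in the quote: each one says how strongly its variable pushes the score up or down. The third pillar still applies: if customer behaviour drifts away from the sample, these coefficients stop being a good guide.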

The points above give a clue as to where Predictive Analytics fails, or at least why it is not good enough for what managers need. Humans, unlike computers, have creative intelligence, so they can, for instance, find solutions to problems or situations that have never happened before. Machine learning tries to emulate this human capacity. Machine learning is a subfield of computer science that evolved from the study of pattern recognition and computational learning theory in artificial intelligence. It explores the study and construction of algorithms that can learn from data and make predictions on it. Such algorithms operate by building a model from example inputs in order to make data-driven predictions or decisions, rather than following strictly static program instructions.
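
The difference between static instructions and a model built from examples can be shown with a toy sketch. Everything here is hypothetical: the labelled examples, the function names and the fixed cut-off of 10 visits are invented for illustration, and the "learning" is just a search for the single threshold that makes the fewest mistakes on the examples.

```python
def static_rule(visits):
    # Static program instruction: the cut-off is fixed by the programmer.
    return visits >= 10

def learn_threshold(examples):
    """Pick the cut-off that misclassifies the fewest labelled examples."""
    candidates = sorted({v for v, _ in examples})

    def errors(t):
        return sum((v >= t) != bought for v, bought in examples)

    return min(candidates, key=errors)

# Hypothetical labelled examples: (website visits, bought the product?)
examples = [(1, False), (2, False), (3, False),
            (4, True), (6, True), (9, True)]

threshold = learn_threshold(examples)

def learned_rule(visits):
    # Data-driven decision: the cut-off comes from the examples.
    return visits >= threshold

print(threshold, static_rule(6), learned_rule(6))
```

If the examples change, `learn_threshold` gives a different cut-off without anyone rewriting the rule, which is the point of the definition above.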

Machine learning, however, has not been very effective in environments with an unlimited number of variables and values for those variables, but it is very effective in the opposite case. One of the best environments for using Machine Learning for Predictive Analytics purposes is Digital Marketing: all the information is digital and limited, and improving its activities can have a huge impact on a company’s bottom line.

