Is your model not predicting as well as it used to? Identify and address model degradation

Trendline Interactive

4 minute read

You’ve designed and built a machine-learning model that predicts customer demand for one of your most popular products at different prices. The model is extremely accurate, leveraging years of historical data on sales, customers, seasonality, and other variables. The decision-makers in your organization are therefore excited to start implementing it to enhance their targeting and improve their ROI. However, because they operate in fluid environments, machine-learning (ML) models require continuous monitoring to maintain their predictive power. This article explains why models become less accurate over time, how you can track their performance to determine when they need to be updated, and strategies for addressing model degradation.

What is model degradation?

Organizations use machine learning, a branch of artificial intelligence (AI), to discern patterns within large amounts of data. Given a sufficiently large data set representing the actionable population (AKA “training data”), a computer learns to recognize such patterns and can then make predictions about previously unseen data. Despite the name, machine learning doesn’t constantly learn. Learning occurs in the first stage of building a model: the training stage. Once deployed, most models are “static” and may start to worsen over time for a variety of reasons. All of these reasons, however, are rooted in the fact that the algorithm receives no new information or feedback after deployment.

Why does model degradation occur?

Model degradation is most often the result of concept drift: unforeseen changes in the statistical properties of the problem that the model solves. The drift, combined with the model’s inability to self-correct (since it receives no new data), causes the model to lose predictive power. Here are some of the most common reasons this drift occurs:

  • Abrupt Changes in Business – Think back to the model designed to predict demand. Now imagine your competitor releases a similar product with better features one month after you deploy yours. Because your training data did not have any information about this new product, the model will not be able to predict the impact this will have on demand. Any acute or sudden changes in business will generally lead to reduced accuracy.
  • Slow Shift in Customer Trends – Not all changes have to happen quickly. Slow shifts over time will also be detrimental to your model. For example, let’s say you built a model five years ago for predicting the best customer segment for cable or satellite TV. That model would be nearly useless today, since customer preferences have shifted toward streaming services.
  • Technical Limitations or Incomplete Data – The variables and data included in the original model will impact its performance in the future. Let’s say you determined that people who participate in contests are more likely to buy than those who don’t. If you include this variable in your model but stop running contests, the variable becomes irrelevant. Thus, the predictive capability of your model decreases.
  • System Migration – Changing systems often involves changing the way organizations capture, store, and use data. Data could also be lost or removed during the migration process. In either case, a system migration could break your model altogether.
  • Intentional Abuse or Fraud – In some cases, people intentionally try to abuse or break a model. Spam filters, for example, use machine learning to determine what goes into the junk folder. Spammers are constantly trying to find new ways to bypass this filter.
  • Unexpected Events – Many unexpected events, such as a worldwide pandemic, impact business. Models built on historical data are no longer as informative about customer behavior as they were prior to the pandemic.
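Several of the causes above show up as a change in the distribution of a model input. One common way to check for that kind of shift is the Population Stability Index (PSI), which compares how a numeric variable was distributed in the training data with how it is distributed in recent data. The sketch below is a minimal illustration rather than a production implementation: the function name, the ten equal-width bins, and the small floor for empty bins are all assumptions made for this example.

```python
import math
from typing import Sequence


def population_stability_index(
    expected: Sequence[float], actual: Sequence[float], bins: int = 10
) -> float:
    """PSI between a baseline sample and a recent sample of one numeric variable."""
    lo, hi = min(expected), max(expected)
    if hi == lo:
        raise ValueError("baseline sample has no spread")
    width = (hi - lo) / bins

    def bin_shares(values: Sequence[float]) -> list:
        counts = [0] * bins
        for v in values:
            # Values outside the baseline range are clamped into the edge bins.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # A small floor (an assumption of this sketch) avoids log(0) for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = bin_shares(expected), bin_shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))
```

A PSI of zero means the two samples fall into the bins in identical proportions. A frequently quoted rule of thumb treats values above roughly 0.25 as a significant shift, but the right threshold depends on your application.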

How do you determine the accuracy of your model?

Models have different shelf lives depending on the application and the level of accuracy needed when making predictions. One of the easiest ways to determine the level of degradation is to record predictions and compare them against the actual, measured results. This allows you to spot any sudden decreases in performance and evaluate the effectiveness of the model. It is particularly effective if a process is set up to continuously monitor accuracy once each implementation has “fully baked” results, that is, once the actual outcomes are final. Keep in mind that a model reflects the time period in which it was built. Understanding why model degradation occurs and measuring the performance of your model against real-world results allows everyone to address loss of model power and make more informed decisions.
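As a rough sketch of what recording predictions and comparing them against actual results can look like, the example below computes the mean absolute percentage error (MAPE) for each period and flags periods whose error is well above an early baseline. The record layout, the function names, the three-period baseline, and the 1.5x tolerance are hypothetical choices for illustration, not a recommended standard.

```python
from collections import defaultdict
from statistics import mean


def mape_by_period(records):
    """records: iterable of (period, predicted, actual). Returns {period: MAPE}."""
    errors = defaultdict(list)
    for period, predicted, actual in records:
        if actual == 0:
            continue  # percentage error is undefined when the actual value is zero
        errors[period].append(abs(predicted - actual) / abs(actual))
    # Sort by period label so the earliest periods come first.
    return {p: mean(e) for p, e in sorted(errors.items())}


def degraded_periods(mape, baseline_periods=3, tolerance=1.5):
    """Return periods whose error exceeds tolerance x the average baseline error."""
    periods = list(mape)
    if len(periods) <= baseline_periods:
        return []  # not enough history to compare against a baseline
    baseline = mean(mape[p] for p in periods[:baseline_periods])
    return [p for p in periods[baseline_periods:] if mape[p] > tolerance * baseline]
```

In this sketch, the first few periods after deployment serve as the reference for "normal" error, so a later period is flagged only when it is clearly worse than the model's own early performance. Period labels are assumed to sort chronologically (for example, "2024-01", "2024-02").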

Addressing model degradation

As your model exhibits signs of concept drift or degradation, you can take steps to repair it.  Which strategy you choose depends on the type of data included, what the model attempts to predict, the variables involved, and what’s at stake for your business if the model starts to lose power. Below are some solutions, but contact us if you’d like help making a strategy based on your data. We give great insights.

Solution 1 – Refreshing models with new data

Your model’s predictive power depends on the quality of the data it’s built upon, including how much of that data is still relevant today. (“Garbage in, garbage out,” as they say.) As your data changes, your model may need tweaking.  

  1. One method (often the simplest method) is to refresh the model with a more current data set. In this case, you’re not changing the variables or methodology. Rather, you’re just updating the weights of those variables.
  2. Another method is to use an incremental updating strategy. In this case, combine the existing model (based on an older data set) and a newer model (based on a newer data set) and/or a set of easily executable business rules. This approach is most useful when dealing with a large quantity of data that does not necessarily need to be reprocessed or when impactful changes in the business suggest an added layer to the model output. 
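The two methods above can be sketched with a deliberately simple model: a straight line relating price to demand. Refreshing means refitting the same model form on newer data, which changes only the weights. The incremental strategy is shown here as a weighted blend of the old and new models followed by a business rule. The helper names, the 0.7 weight on the newer model, and the "demand cannot be negative" rule are assumptions made for illustration only.

```python
from statistics import mean


def fit_line(prices, demand):
    """Ordinary least squares for demand = intercept + slope * price."""
    mx, my = mean(prices), mean(demand)
    sxx = sum((x - mx) ** 2 for x in prices)
    slope = sum((x - mx) * (y - my) for x, y in zip(prices, demand)) / sxx
    return my - slope * mx, slope


def predict(model, price):
    intercept, slope = model
    return intercept + slope * price


def blended_predict(old_model, new_model, price, new_weight=0.7, floor=0.0):
    """Incremental update: weight the newer model more, then apply a business rule."""
    blended = (1 - new_weight) * predict(old_model, price) + new_weight * predict(
        new_model, price
    )
    return max(blended, floor)  # hypothetical rule: demand cannot be negative


# Method 1 (refresh): same variables and methodology, refit on newer data.
old_model = fit_line([10, 20, 30], [100, 80, 60])
refreshed_model = fit_line([10, 20, 30], [90, 60, 30])

# Method 2 (incremental): combine the older and newer models plus a rule.
estimate = blended_predict(old_model, refreshed_model, price=20)
```

The blend keeps some of what the older model learned while letting recent data dominate. How much weight to give each model is a judgment call that depends on how quickly your business is changing.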

Solution 2 – Building a new model 

If your model’s performance is still suffering after being refreshed with recent data, consider the second solution: Build a new model entirely. Rebuilding the model is more work, but it will allow for a reassessment of variables. As the variables may no longer be relevant, this option may end up being even more beneficial than you imagined. This assessment may also reveal where to incorporate new sources of data, which may make your modeling efforts more effective. 

Ensuring accuracy in your machine-learning models

It’s necessary to maintain machine-learning models in order to sustain their predictive power. Although this process can be automated, the kinds of complex problems that cause your model to degrade require the competence of a trained data scientist. How you decide to keep your models in good condition will depend on a variety of factors, each unique in its application, rate of depreciation, and required level of accuracy. As part of the modeling process, it’s imperative that you consider how you will monitor the performance of your models to determine when they’re ripe for re-evaluation. 

Need support creating your strategy or assistance in gleaning insights from data? Contact us. We’d love to help.

About the Author(s)

Trendline Interactive

Built from email marketing, Trendline Interactive is an agency and consultancy that inspires brands to create meaningful engagement through cross-channel communications. Our passion is that every single message sent is not only meaningful to the audience, but drives success for our clients.
