Testing and Monitoring Machine Learning Models


Learn how to test & monitor production machine learning models.

What is model testing?

You’ve taken your model from a Jupyter notebook and rewritten it in your production system. Are you sure there weren’t any mistakes when you moved from the research environment to the production system? How can you control the risk before your deployment? ML-specific unit, integration and differential tests can help you to minimize the risk.
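To give a feel for what a differential test checks, here is a minimal sketch that compares the predictions of two model versions on the same inputs and flags large disagreements. The function name `compare_predictions`, the 5% tolerance and the house price values are all hypothetical, chosen for illustration rather than taken from the course code.

```python
import math


def compare_predictions(previous, current, rel_tol=0.05):
    """Return the indices where two model versions disagree by more than rel_tol.

    `previous` and `current` are predictions made on the same inputs,
    in the same order. The 5% default tolerance is an assumption.
    """
    if len(previous) != len(current):
        raise ValueError("prediction lists must have the same length")
    return [
        i
        for i, (old, new) in enumerate(zip(previous, current))
        if not math.isclose(old, new, rel_tol=rel_tol)
    ]


# Hypothetical house price predictions from the old and new model versions.
previous_predictions = [200_000.0, 315_500.0, 127_250.0]
current_predictions = [201_000.0, 315_400.0, 180_000.0]

mismatches = compare_predictions(previous_predictions, current_predictions)
print(mismatches)  # only the third prediction differs by more than 5%
```

In a test suite, a check like this would typically assert that the list of mismatches is empty (or below an agreed limit) before a new model version is released.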

What is model monitoring?

You’ve deployed your model to production. OK, now what? Is it working as you expect? How do you know? By monitoring models, we can check for unexpected changes in:

  • Incoming data
  • Model quality
  • System operations

When we think about data science, we think about how to build machine learning models, which algorithm will be more predictive, how to engineer our features and which variables to use to make the models more accurate. However, how we are going to actually test & monitor these models in a production system is often neglected. Only when we can effectively monitor our production models can we determine if they are performing as we expect.
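As a rough illustration of monitoring incoming data, the sketch below measures how far the mean of a live feature has moved from its training-time mean, in units of the training standard deviation. The function names, the threshold of 2.0 and the sample values are assumptions made for this example. Real monitoring setups usually track many features and use more robust statistics.

```python
from statistics import mean, stdev


def mean_shift_in_std(reference, live):
    """How many reference standard deviations the live mean has moved."""
    ref_std = stdev(reference)
    if ref_std == 0:
        raise ValueError("reference values have zero variance")
    return abs(mean(live) - mean(reference)) / ref_std


def has_drifted(reference, live, threshold=2.0):
    """Flag a feature whose live mean moved more than `threshold` std devs.

    The threshold of 2.0 is an arbitrary illustrative choice.
    """
    return mean_shift_in_std(reference, live) > threshold


# Hypothetical values of one input feature (e.g. living area in square metres).
training_values = [100, 110, 120, 130, 140]
stable_live_values = [115, 125, 118, 122]
shifted_live_values = [200, 210, 190, 205]

print(has_drifted(training_values, stable_live_values))   # False
print(has_drifted(training_values, shifted_live_values))  # True
```

A check like this could run on a schedule against recent requests, with the result exported as a metric or log entry so that a change in the incoming data shows up before model quality visibly degrades.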

Why take this course?

This is a comprehensive course where you can learn how to test & monitor machine learning models. The course is realistic, yet manageable. Throughout this course you will learn all the steps and techniques required to effectively test & monitor machine learning models professionally.

In this course, you will have at your fingertips the sequence of steps that you need to follow to test & monitor a machine learning model, plus a project template with full code that you can adapt to your own models.

Course Requirements / Who Is This Course Suitable For?

This is an intermediate course.

Never written a line of code before: This course is unsuitable.

Never written a line of Python before: This course is unsuitable.

Never trained a machine learning model before: This course is unsuitable. Ideally, you have already built a few machine learning models, either at work, or for competitions or as a hobby.

Never used Docker before: The second part of the course will be very challenging. You need to be ready to read up on lecture notes & references.

Have only ever operated in the research environment: This course will be challenging, but if you are ready to read up on some of the concepts we will show you, the course will offer you a great deal of value.

Have a little experience writing production code: There may be some unfamiliar tools which we will show you, but generally you should get a lot from the course.

Non-technical: You may get a lot from just the theory lectures, so that you get a feel for the challenges of ML testing & monitoring, as well as the lifecycle of ML models. The rest of the course will be a stretch.


  1. Introduction
    • Course Introduction
    • Course Curriculum
    • Course Requirements
    • Approaching This Course
    • Complete Course Notes
    • All Course Slides
    • FAQ
  2. Course Scenario and Model Lifecycle
    • Deploying a Model to Production
    • Course Scenario - Predicting House Prices
    • Setup A - Python Install (Do not skip)
    • Setup B: Git Install (advanced users can skip)
    • Course Github Repo & Data
    • Download Data and Github Link
    • Setup C - Install Jupyter Notebook (Advanced users can skip)
    • Setup D - Install Initial Dependencies (advanced users can skip)
    • Introduction to the Dataset and Model Pipeline
    • The ML System Lifecycle
  3. Testing Concepts for ML Systems
    • Overview
    • Testing Focus In This Course
    • Why Test
    • Testing ML Systems (Important)
    • Testing Theory
    • Testing Concepts - Exercise 1 Instructions and Solution
    • Exercise 2 Instructions and Solution
    • Exercise 3 Instructions and Solution
    • Exercise 4 Instructions and Solution
    • Summary
  4. Unit Testing ML Systems
    • Overview
    • Python Code Conventions
    • Intro to pytest
    • Download Dataset from Kaggle
    • Using Tox
    • Codebase Overview
    • Preprocessing & Feature Engineering Theory
    • Unit Testing Preprocessing & Feature Engineering Code
    • git hygiene
    • Config Tests Theory
    • Unit Testing Config Code
    • Testing Input Data Theory
    • Unit Testing Input Data Code
    • Testing Model Quality Theory
    • Unit Testing Model Quality Code
    • Repo Tooling
    • Wrap Up
  5. Docker Refresher
    • Section Overview
    • Docker Recap
    • Why Use Docker
    • Introduction to Docker Compose
    • Docker & Docker Compose Installation
    • [Windows Only] Docker Setup
    • Docker Exercise Instructions and Solution
  6. Integration Testing ML Systems
    • Overview
    • API Conceptual Overview
    • Integration Testing Code Base
    • Using the API Part 1 and 2
    • Windows Specific Docker Setup
    • Integration Tests Theory
    • Integration Test Code
    • Benchmark Tests Theory
  7. Differential Testing
    • Overview
    • Differential Testing Theory
    • Differential Testing Implementation
  8. Shadow Mode
    • Shadow Mode Overview
    • Shadow Mode Theory
    • Testing Models In Production
    • Tests in Shadow Deployments
    • Shadow Mode DB Code Overview
    • Shadow Mode Setup Tests
    • Asynchronous Implementation
    • Populate DB with Shadow Predictions
    • Jupyter Demo - Setup and Tests in Shadow Mode
  9. Monitoring Metrics with Prometheus
    • Overview
    • Why Monitor ML Models
    • Monitoring Theory
    • Metrics for ML Systems
    • Prometheus & Grafana Overview
    • Windows Setup
    • Basic Prometheus Setup
    • Adding Prometheus Metrics
    • Setup Grafana
    • Infrastructure Level Metrics
    • Adding Metrics Monitoring to Our Example Project
    • Creating an ML System Grafana Dashboard
  10. Monitoring Logs with Kibana
    • Monitoring Logs for ML
    • Elasticstack overview
    • Kibana Exercise
    • Integrating Kibana into our Example Project
    • Setting Up a Kibana Dashboard for Model Inputs
  11. Conclusion
    • Conclusion
  12. Appendix A: Python Basics
    • String Manipulation Example
    • Dot Product - Numpy Comparison
    • Numpy arrays vs. Python Lists
    • pytest example (WIP)

About Me

Hi, I'm Chris! I'm an experienced software developer who has taught over 25,000 software professionals online. If you've Googled something about FastAPI before, you've probably ended up on my blog.


This is the second course I am completing from Christopher and Soledad. They have the most comprehensive courses on deploying machine learning in industrial settings that I have taken. As a data scientist, I found the material I learned from these courses directly applicable to my daily work, and many of the tools they introduce are either something I use on a daily basis or something I am going to look into using on a daily basis. Perhaps the most valuable course I have done [...] (I have completed 30++ courses)

- Anders Albert

30-Day Money Back Guarantee

If you're unhappy, email me within 30 days and I'll refund your money - no questions asked.


Last updated Nov 22, 2023

Comprehensive understanding of how to test and monitor production ML models

Full code examples
Industry standard tools
Complete ML project integration
Theory & Hands-on exercises
Just enough complexity