You Built a Model. Now What?

When first getting into machine learning, most of the focus goes to model training. Collecting data, preprocessing it, training a model, and improving accuracy takes up the bulk of your time. But what happens after the model is built is rarely discussed in depth.

In reality, operating a model is far harder than building one. You need to deploy it to production, monitor its performance, and retrain it when the underlying data changes. Managing this entire lifecycle in a systematic way is what MLOps is about.

Why DevOps Alone Isn't Enough

DevOps is a mature methodology for automating the build, test, and deployment of software. So why not just apply the same methodology to ML systems?

The problem is that ML systems have fundamentally different characteristics from traditional software. In conventional software, behavior changes when the code changes. In ML systems, behavior can change completely even when the code stays the same, because the data changed. This means you're managing not just code, but two additional variables: data and models.

As a result, the scope of version control expands. Tracking code versions alone is insufficient. You need to track which data was used to train which model in order to build a reproducible system.
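What "tracking which data trained which model" might look like can be sketched with a content hash. This is an illustration, not any particular tool's API; the record schema and function names are invented for the example:

```python
import hashlib
import json

def dataset_fingerprint(rows):
    """Deterministic hash of a dataset (here, a list of dicts)."""
    canonical = json.dumps(rows, sort_keys=True).encode()
    return hashlib.sha256(canonical).hexdigest()[:12]

def training_record(model_name, code_version, rows, hyperparams):
    """Tie a model to the exact code, data, and config that produced it."""
    return {
        "model": model_name,
        "code_version": code_version,  # e.g. a git commit SHA
        "data_fingerprint": dataset_fingerprint(rows),
        "hyperparams": hyperparams,
    }

data = [{"age": 30, "label": 1}, {"age": 45, "label": 0}]
record = training_record("churn-v1", "abc1234", data, {"lr": 0.01})
```

If the data changes in any way, the fingerprint changes, so a stored record immediately reveals whether a model can be reproduced from the data at hand. Dedicated tools (DVC, lakeFS, and others) apply the same idea at scale.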

The Reality of the ML Lifecycle

The ML project lifecycle can be broadly divided into four stages.

First is data collection and preprocessing. This involves gathering raw data, cleaning it, and transforming it into a format suitable for training. Since data quality ultimately determines model performance, this stage typically consumes the most time.
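A toy version of that cleaning-and-transforming step, assuming records arrive as dicts with possibly missing fields (the field names are made up for illustration):

```python
def preprocess(rows, required=("age", "income")):
    """Drop records missing required fields, then min-max scale income."""
    kept = [r for r in rows if all(r.get(k) is not None for k in required)]
    incomes = [r["income"] for r in kept]
    lo, hi = min(incomes), max(incomes)
    for r in kept:
        span = hi - lo
        r["income_scaled"] = (r["income"] - lo) / span if span else 0.0
    return kept

raw = [
    {"age": 30, "income": 40000},
    {"age": None, "income": 55000},  # missing a required field: dropped
    {"age": 52, "income": 90000},
]
clean = preprocess(raw)
```

Real pipelines do this with pandas or Spark rather than hand-rolled loops, but the shape of the work is the same: filter out unusable records, then transform the rest into model-ready features.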

Second is model training and experimentation. This is where you try various combinations of algorithms and hyperparameters to find the optimal model. Dozens to hundreds of experiments may be run, and without a systematic record of each one's configuration and results, it becomes difficult to identify which combination performed best.
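A minimal sketch of such record-keeping, writing one JSON file per run; the schema is an assumption for illustration, and real experiment trackers like MLflow or Weights & Biases do far more:

```python
import json
import pathlib
import tempfile
import uuid

LOG_DIR = pathlib.Path(tempfile.mkdtemp())  # stand-in for a real run store

def log_run(params, metrics):
    """Persist one experiment's configuration and results."""
    run = {"run_id": uuid.uuid4().hex[:8], "params": params, "metrics": metrics}
    (LOG_DIR / f"{run['run_id']}.json").write_text(json.dumps(run))
    return run

def best_run(metric="accuracy"):
    """Scan recorded runs and return the one with the highest metric."""
    runs = [json.loads(p.read_text()) for p in LOG_DIR.glob("*.json")]
    return max(runs, key=lambda r: r["metrics"][metric])

log_run({"lr": 0.1, "depth": 3}, {"accuracy": 0.81})
log_run({"lr": 0.01, "depth": 5}, {"accuracy": 0.87})
```

Because every run is persisted with its parameters, "which combination performed best" becomes a query over the run store rather than an exercise in memory.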

Third is model deployment. This is the process of integrating a trained model into an actual service. The approach varies depending on whether you're serving via REST API, using batch inference, or deploying to edge devices.
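To make the REST option concrete, here is a stdlib-only sketch with a stand-in linear scorer in place of a real trained artifact. The feature names and weights are invented, and production serving would use a proper framework (FastAPI, TorchServe, etc.) rather than raw http.server:

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import Request, urlopen

# Hypothetical "model": a linear scorer standing in for a trained artifact.
WEIGHTS = {"age": 0.3, "income": 0.0001}
BIAS = -1.0

def predict(features):
    """Score one feature dict with the stand-in linear model."""
    return BIAS + sum(WEIGHTS.get(k, 0.0) * v for k, v in features.items())

class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/predict":
            self.send_error(404)
            return
        body = self.rfile.read(int(self.headers["Content-Length"]))
        score = predict(json.loads(body))
        payload = json.dumps({"score": score}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, *args):  # silence per-request logging
        pass

# Bind to an ephemeral port and serve from a background thread.
server = HTTPServer(("127.0.0.1", 0), PredictHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

url = f"http://127.0.0.1:{server.server_port}/predict"
req = Request(url, data=json.dumps({"age": 30, "income": 50000}).encode(),
              headers={"Content-Type": "application/json"})
with urlopen(req) as resp:
    result = json.loads(resp.read())
server.shutdown()
```

The essential pattern is visible even in this toy: the model is loaded once, and each request maps a feature payload to a prediction. Batch inference inverts this shape, running the same `predict` over a stored dataset on a schedule instead of per request.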

Fourth is monitoring and retraining. This involves continuously observing model performance in production and retraining with new data when performance degrades. The phenomenon where data distributions shift over time is called data drift, and detecting and responding to it is one of the core challenges of MLOps.
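One simple way to quantify drift on a single numeric feature is the Population Stability Index (PSI); a common rule of thumb treats values above roughly 0.2 as significant drift. A stdlib-only sketch on synthetic data:

```python
import math
import random

def psi(expected, actual, bins=10):
    """Population Stability Index between a reference and a live sample."""
    lo, hi = min(expected), max(expected)

    def bin_fracs(sample):
        counts = [0] * bins
        for x in sample:
            i = int((x - lo) / (hi - lo) * bins)
            counts[min(max(i, 0), bins - 1)] += 1
        # Laplace smoothing so empty bins don't blow up the log
        return [(c + 1) / (len(sample) + bins) for c in counts]

    e, a = bin_fracs(expected), bin_fracs(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

random.seed(0)
reference = [random.gauss(0, 1) for _ in range(2000)]  # training distribution
stable    = [random.gauss(0, 1) for _ in range(2000)]  # same distribution
drifted   = [random.gauss(1, 1) for _ in range(2000)]  # mean shifted by 1 sd
```

Comparing `psi(reference, stable)` against `psi(reference, drifted)` shows the index staying near zero for the unchanged distribution and jumping well past the alert threshold for the shifted one; in production, this check would run on a schedule and trigger retraining when it fires.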

MLOps Maturity Levels

Google defines three levels of MLOps maturity.

Level     Description                  Characteristics
Level 0   Manual process               Everything done by hand; models trained in notebooks
Level 1   ML pipeline automation       Training pipelines automated; continuous training possible
Level 2   CI/CD pipeline automation    The pipeline itself is built, tested, and deployed automatically

Most organizations remain at Level 0. A data scientist trains a model in a Jupyter notebook, manually hands the results to the engineering team, and an engineer integrates it into the service. Communication errors occur during this handoff, reproducibility suffers, and deployment cycles become lengthy.

The goal of MLOps is to automate and systematize this manual process, enabling models to be deployed to production quickly and reliably.

Key Components

Implementing MLOps requires several components: data pipelines, experiment tracking, model registries, serving infrastructure, and monitoring systems, among others. This series will examine each component one by one, exploring why it's needed and how to build it.

In the next post, we'll look at data pipelines and feature engineering, the starting point of any MLOps implementation.