AI TRAINING
MLOps for Production AI Teams
Build and operate reliable ML pipelines from experimentation to production with modern MLOps tooling.
What it covers
This practitioner-level programme covers the full MLOps lifecycle: CI/CD for models, feature stores, model registries, serving infrastructure, and production monitoring. Participants work through hands-on labs deploying real pipelines using industry-standard tools such as MLflow, Kubeflow, and Feast. The course addresses drift detection, automated retraining triggers, rollback strategies, and governance requirements. By the end, teams can design and operate a production-grade ML platform aligned with their organisation's scale and data maturity.
What you'll be able to do
- Design and implement a CI/CD pipeline that automatically trains, validates, and deploys an ML model on code or data changes
- Configure a feature store to serve low-latency features consistently across training and inference environments
- Set up a model registry with versioning, stage transitions, and approval gates using MLflow (see the sketch after this list)
- Instrument a deployed model with drift detection alerts and an automated retraining trigger
- Execute a safe rollback from a degraded model version using a blue/green or canary deployment strategy
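For a flavour of the labs, here is a minimal sketch of the registry workflow from the MLflow outcome above, assuming a tracking server is configured and a model has already been logged under a run; the run ID and model name are illustrative, not course artefacts.

```python
import mlflow
from mlflow.tracking import MlflowClient

# Assumes MLFLOW_TRACKING_URI points at a running tracking server and
# that "runs:/<run_id>/model" refers to a model logged by a training run.
client = MlflowClient()

# Register the logged model under a shared, discoverable name.
version = mlflow.register_model(
    model_uri="runs:/abc123/model",  # illustrative run ID
    name="churn-classifier",         # illustrative registry name
)

# Promote the new version to Staging; an approval gate would normally
# sit between Staging and Production.
client.transition_model_version_stage(
    name="churn-classifier",
    version=version.version,
    stage="Staging",
)
```

Note that recent MLflow releases deprecate stage transitions in favour of model version aliases; the shape of the workflow is the same.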
Topics covered
- CI/CD pipelines for model training and deployment
- Feature stores: design, ingestion, and serving (Feast, Tecton)
- Model registries and versioning with MLflow and DVC
- Model serving patterns: batch, real-time, shadow, and canary deployments
- Production monitoring: data drift, concept drift, and performance degradation (a minimal drift check is sketched after this list)
- Automated retraining triggers and pipeline orchestration (Airflow, Kubeflow Pipelines)
- Rollback strategies and blue/green deployments
- Governance, lineage tracking, and audit trails
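To illustrate the monitoring topic above, here is a minimal data-drift check using a two-sample Kolmogorov–Smirnov test on a single feature; the threshold, window sizes, and simulated data are all illustrative, and production setups typically reach for a dedicated library such as Evidently.

```python
import numpy as np
from scipy.stats import ks_2samp

def drifted(reference: np.ndarray, live: np.ndarray, alpha: float = 0.05) -> bool:
    """Flag drift when the live serving window's distribution differs
    significantly from the training-time reference sample."""
    result = ks_2samp(reference, live)
    return result.pvalue < alpha  # low p-value: distributions likely differ

# Illustrative check: a training-time sample vs. a shifted live window.
rng = np.random.default_rng(0)
reference = rng.normal(0.0, 1.0, size=5_000)
live = rng.normal(0.4, 1.0, size=1_000)  # simulated input shift

if drifted(reference, live):
    print("Data drift detected: alert and evaluate a retraining trigger")
```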
Delivery
Delivered as a 3–5 day intensive bootcamp, available in-person or remote-live. Each day combines 40% concept sessions with 60% hands-on labs in a shared cloud environment (AWS or GCP). Participants receive a pre-configured lab repo, reference architecture diagrams, and access to a post-bootcamp Slack channel with 30 days of follow-up support. In-person delivery is recommended for teams co-building a shared platform.
What makes it work
- Assign a dedicated ML platform owner who maintains tooling standards and onboards new model owners
- Define and automate model quality gates (accuracy thresholds, bias checks) as part of the CI pipeline from day one (see the gate sketch after this list)
- Start with a single end-to-end reference pipeline on a real use case before generalising to a platform
- Establish a shared model registry and naming convention so all teams discover and reuse existing model assets
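To make the quality-gate point concrete, here is a minimal gate script: any CI system fails the pipeline on a non-zero exit code, which blocks deployment of an underperforming candidate. The metrics file, key, and threshold are illustrative assumptions, not a prescribed format.

```python
import json
import sys

ACCURACY_THRESHOLD = 0.92  # illustrative; set per use case

# Assumes an earlier CI step (training/evaluation) wrote metrics.json.
with open("metrics.json") as f:
    metrics = json.load(f)

accuracy = metrics["accuracy"]
if accuracy < ACCURACY_THRESHOLD:
    print(f"Quality gate FAILED: accuracy {accuracy:.3f} < {ACCURACY_THRESHOLD}")
    sys.exit(1)  # non-zero exit fails the CI job and blocks deployment

print(f"Quality gate passed: accuracy {accuracy:.3f}")
```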
Common mistakes
- Treating model deployment as a one-off script rather than a reproducible, versioned pipeline
- Skipping feature store adoption and duplicating feature logic between training and serving, causing training-serving skew
- Monitoring only infrastructure metrics (CPU, latency) and missing model-level drift until business impact is visible
- Over-engineering the MLOps stack before validating that the use case justifies the operational complexity
When NOT to take this
A team with fewer than two models in production and no dedicated ML engineer should hold off: the overhead of a full MLOps stack will stall delivery rather than accelerate it. A lightweight experiment-tracking setup (MLflow alone) is sufficient at that stage.
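For scale, that lightweight setup can be as small as the following sketch, assuming only a pip-installed MLflow with local file-based tracking; the experiment name and values are illustrative.

```python
import mlflow

# Local file-based tracking: no server, registry, or feature store required.
mlflow.set_experiment("churn-baseline")  # illustrative experiment name

with mlflow.start_run():
    mlflow.log_param("model_type", "logistic_regression")
    mlflow.log_metric("val_accuracy", 0.87)  # illustrative metric
```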