
AI TRAINING

MLflow and W&B for Experiment Tracking

Master experiment tracking, model registries, and hyperparameter sweeps using MLflow and Weights & Biases.

Format
bootcamp
Duration
16–24h
Level
practitioner
Group size
4–16
Price / participant
€2K–€3K
Group price
€12K–€30K
Audience
ML engineers and data scientists who train models regularly and need rigorous experiment management
Prerequisites
Proficiency in Python and practical experience training ML models (scikit-learn, PyTorch, or TensorFlow); basic Git knowledge required

What it covers

This hands-on practitioner bootcamp covers the full ML experiment lifecycle using two industry-standard tools: MLflow and Weights & Biases. Participants learn to instrument training runs, compare experiments, manage model versions, and run automated hyperparameter sweeps. The programme also addresses team collaboration workflows, artifact management, and the trade-offs between self-hosted and SaaS deployments. Format is lab-heavy with real datasets and model training exercises throughout.

What you'll be able to do

  • Instrument any Python-based training script with MLflow or W&B logging in under 15 minutes
  • Configure and run a W&B Sweep or MLflow hyperparameter search over a real model to identify optimal configurations
  • Register, version, and promote models through staging to production using the MLflow Model Registry
  • Design a team collaboration workflow with shared experiment namespaces, tagging conventions, and access control policies
  • Evaluate and justify a self-hosted versus SaaS deployment decision based on data sensitivity, cost, and team size

Topics covered

  • MLflow tracking: logging metrics, params, artifacts, and tags
  • Weights & Biases: runs, sweeps, and the W&B dashboard
  • Model registry: versioning, staging, and promotion workflows
  • Hyperparameter optimisation with W&B Sweeps and MLflow Projects
  • Artifact management and dataset versioning
  • Team collaboration patterns: shared experiments and access controls
  • Self-hosted MLflow vs W&B SaaS: cost, security, scalability trade-offs
  • CI/CD integration for automated experiment pipelines

Delivery

Delivered over two to three days, either on-site or remote via video conferencing with a shared cloud environment (e.g., AWS SageMaker Studio or Google Colab Enterprise). Each session follows a 30% concept / 70% lab ratio. Participants receive pre-configured Docker environments and Jupyter notebooks. A capstone exercise on the final day requires integrating both tools into a mini ML pipeline. Remote delivery uses breakout rooms for pair-lab exercises.

What makes it work

  • Establishing shared naming conventions and tagging standards before the first team experiment run
  • Integrating experiment tracking into CI/CD so every training job is automatically logged without developer effort
  • Nominating a model registry owner who reviews and approves promotions from staging to production
  • Starting with a small reproducibility audit of past experiments to immediately demonstrate business value
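
Naming conventions only stick when they are enforced in code. A hypothetical helper like the one below (the convention itself is invented for illustration) can be called before every run start to reject non-conforming run names.

```python
import re
from datetime import datetime, timezone

# Hypothetical team convention: <project>-<model>-<yyyymmdd>-<short-desc>
RUN_NAME_RE = re.compile(r"^[a-z0-9]+-[a-z0-9]+-\d{8}-[a-z0-9-]+$")

def make_run_name(project: str, model: str, desc: str) -> str:
    """Build a run name and fail fast if it violates the convention."""
    date = datetime.now(timezone.utc).strftime("%Y%m%d")
    name = f"{project}-{model}-{date}-{desc}".lower()
    if not RUN_NAME_RE.match(name):
        raise ValueError(f"run name violates convention: {name}")
    return name
```

Failing fast at run creation is cheaper than cleaning up an experiment namespace full of `test-final-v2` runs later.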

Common mistakes

  • Logging only final metrics rather than per-step metrics, making it impossible to diagnose training instability
  • Skipping the model registry and relying on file paths, leading to broken reproducibility when models move to production
  • Running W&B Sweeps without setting a stopping strategy, resulting in runaway compute costs
  • Choosing self-hosted MLflow without planning storage backend and proxy auth, causing painful migrations later
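
The sweep-cost mistake above is usually fixed with an `early_terminate` block in the sweep configuration. The sketch below shows a Bayesian sweep with Hyperband early stopping; all parameter names, ranges, and counts are illustrative assumptions.

```python
# Sketch of a W&B sweep configuration that includes a stopping strategy.
sweep_config = {
    "method": "bayes",
    "metric": {"name": "val_loss", "goal": "minimize"},
    "parameters": {
        "learning_rate": {
            "distribution": "log_uniform_values", "min": 1e-4, "max": 1e-1,
        },
        "batch_size": {"values": [16, 32, 64]},
    },
    # Hyperband kills underperforming runs early, capping compute spend.
    "early_terminate": {"type": "hyperband", "min_iter": 3},
}

# With wandb installed and authenticated, the sweep would launch as:
#   import wandb
#   sweep_id = wandb.sweep(sweep_config, project="demo")
#   wandb.agent(sweep_id, function=train, count=20)  # `train` is your training fn
```

The `count` argument on the agent is a second cost cap: even with early termination, an unbounded agent will keep drawing configurations indefinitely.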

When NOT to take this

A team that has not yet standardised its training framework (some using TensorFlow, others PyTorch, others AutoML SaaS) will struggle to get value from this training — establish a common modelling stack first.


This training is part of a Data & AI catalog built for leaders serious about execution. Take the free diagnostic to see which trainings your team needs.