
AI TRAINING

Fine-Tuning Small Language Models for Production

Build, evaluate, and deploy fine-tuned LLMs with LoRA and QLoRA on real-world use cases.

Format
Bootcamp
Duration
24–40h
Level
Advanced
Group size
6–16
Price per participant
€2K–€4K
Group price
€18K–€45K
Audience
ML engineers and AI engineers with existing deep learning experience who want to fine-tune open-weight LLMs for production use cases
Prerequisites
Solid Python and PyTorch skills, familiarity with transformer architecture basics, and access to a GPU environment (cloud or local)

What it covers

This hands-on bootcamp covers the full fine-tuning lifecycle for open-weight models such as Llama, Mistral, and Gemma. Participants compare full fine-tuning, LoRA, and QLoRA, prepare specialised datasets, run rigorous evaluation harnesses, and deploy their models to inference endpoints. Sessions alternate between theory and hands-on GPU labs, so every engineer leaves with a working fine-tuned model and a reproducible workflow.
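
A large part of that lifecycle is dataset preparation. As a minimal sketch (function names and the sample pair are illustrative, not course material), converting raw Q&A pairs into the one-JSON-object-per-line chat format commonly consumed by supervised fine-tuning tools looks like this:

```python
import json

def to_chat_record(question, answer, system=None):
    """Convert a raw Q&A pair into a chat-message record:
    a list of role/content messages, one record per training example."""
    messages = []
    if system:
        messages.append({"role": "system", "content": system})
    messages.append({"role": "user", "content": question})
    messages.append({"role": "assistant", "content": answer})
    return {"messages": messages}

def write_jsonl(records, path):
    """Serialise records as JSON Lines, the usual SFT file format."""
    with open(path, "w", encoding="utf-8") as f:
        for rec in records:
            f.write(json.dumps(rec, ensure_ascii=False) + "\n")

# Toy example — real datasets also need cleaning and deduplication.
pairs = [("What is LoRA?",
          "A parameter-efficient fine-tuning method that trains "
          "low-rank adapter matrices.")]
records = [to_chat_record(q, a, system="You are a concise ML tutor.")
           for q, a in pairs]
```

Exact field names vary by toolchain, so check the expected schema of the trainer you use before committing to a format.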

By the end, you will be able to

  • Select the appropriate fine-tuning strategy (full, LoRA, QLoRA) for a given constraint set on memory, compute, and target performance
  • Curate and format a domain-specific instruction or chat dataset ready for supervised fine-tuning
  • Run a complete LoRA or QLoRA training job on a 7B–13B parameter model using Axolotl or TRL with correct hyperparameter choices
  • Build an evaluation harness combining automated benchmarks and preference scoring to detect overfitting and alignment drift
  • Quantise and deploy a fine-tuned model to a production inference endpoint and measure latency/throughput trade-offs
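
To make the third outcome concrete, a QLoRA training job in the style of an Axolotl YAML config might look like the fragment below. All values are illustrative assumptions to adapt, not tuned recommendations, and field availability should be checked against the Axolotl version in use:

```yaml
# Illustrative QLoRA config (Axolotl-style) — values are assumptions.
base_model: mistralai/Mistral-7B-v0.1
load_in_4bit: true            # QLoRA: 4-bit quantised base weights
adapter: qlora
lora_r: 16
lora_alpha: 32
lora_dropout: 0.05
lora_target_modules:
  - q_proj
  - k_proj
  - v_proj
  - o_proj
datasets:
  - path: data/train.jsonl
    type: chat_template
sequence_len: 2048
micro_batch_size: 2
gradient_accumulation_steps: 8   # effective batch size of 16
num_epochs: 3
learning_rate: 2.0e-4
lr_scheduler: cosine
bf16: true
val_set_size: 0.05
output_dir: ./outputs/qlora-mistral
```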

Topics covered

  • Full fine-tuning vs parameter-efficient methods (LoRA, QLoRA, DoRA)
  • Dataset curation: collection, cleaning, deduplication, and formatting (instruction, chat, completion formats)
  • Training configuration: learning rate schedules, batch sizing, gradient accumulation, mixed precision
  • PEFT and Hugging Face TRL / Axolotl / LLaMA-Factory toolchains
  • Evaluation harness design: perplexity, task-specific benchmarks, human-preference scoring
  • Overfitting, catastrophic forgetting, and alignment drift diagnostics
  • Model quantisation (GGUF, GPTQ, AWQ) for efficient inference
  • Deployment to inference endpoints (vLLM, Ollama, HuggingFace Inference, cloud APIs)
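
The latency/throughput trade-offs in the last two topics are easiest to reason about from recorded per-request measurements. A small, self-contained sketch (function names are hypothetical; it assumes requests were issued sequentially, so total wall time is the sum of latencies):

```python
import statistics

def percentile(samples, p):
    """Nearest-rank percentile of a list of numbers (0 < p <= 100)."""
    ordered = sorted(samples)
    k = max(0, round(p / 100 * len(ordered)) - 1)
    return ordered[k]

def summarize_run(latencies_s, tokens_out):
    """Summarise one sequential load-test run.

    latencies_s: per-request wall time in seconds.
    tokens_out:  tokens generated per request.
    """
    total_time = sum(latencies_s)  # valid only for sequential requests
    return {
        "p50_s": percentile(latencies_s, 50),
        "p95_s": percentile(latencies_s, 95),
        "mean_s": statistics.mean(latencies_s),
        "tokens_per_s": sum(tokens_out) / total_time,
    }

# Toy numbers — in practice, collect these from your endpoint client.
stats = summarize_run([0.8, 1.0, 1.2, 2.0], [100, 120, 110, 130])
```

Comparing such summaries before and after quantisation makes the trade-off measurable rather than anecdotal.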

Delivery

Delivered over 3–5 consecutive days, either in-person or live-remote via video call with shared GPU cloud workspace (e.g., Lambda Labs or RunPod). Each day follows an 80/20 hands-on to lecture ratio. Participants receive a starter repo, pre-processed sample datasets, and evaluation scripts. A private Slack or Discord channel provides async support for 30 days post-training. In-person delivery requires a venue with stable internet; cloud GPU costs are typically billed separately or included in the group price tier.

What makes it work

  • Defining a narrow, well-scoped task with clear success metrics before touching any training code
  • Investing at least 40% of total project time in dataset curation and quality checks
  • Using automated evaluation loops (e.g., LM-Eval Harness or custom task suites) from day one to catch regressions early
  • Running a small-scale baseline experiment before committing compute to full training runs
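
The automated evaluation loop recommended above can start very small before graduating to LM-Eval Harness. A minimal exact-match harness, sketched with hypothetical names (the stub dictionary stands in for a real endpoint call):

```python
def exact_match_eval(model_fn, eval_set, normalize=str.strip):
    """Score a model callable against (prompt, reference) pairs.

    model_fn: any function mapping a prompt string to an output string,
    e.g. a thin wrapper around an inference endpoint.
    Returns accuracy plus the failing cases for error analysis.
    """
    hits = 0
    failures = []
    for prompt, reference in eval_set:
        prediction = normalize(model_fn(prompt))
        if prediction == normalize(reference):
            hits += 1
        else:
            failures.append((prompt, reference, prediction))
    return {"accuracy": hits / len(eval_set), "failures": failures}

# Toy usage with a stub "model" — replace with a real endpoint call.
eval_set = [("2+2=", "4"), ("Capital of France?", "Paris")]
stub = {"2+2=": "4", "Capital of France?": "Lyon"}
report = exact_match_eval(lambda p: stub[p], eval_set)
```

Running this on a held-out set after every checkpoint is what turns "catch regressions early" from a principle into a habit.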

Common mistakes

  • Training on too little or poorly cleaned data and attributing poor results to the model architecture rather than the dataset
  • Choosing QLoRA without profiling the actual GPU memory footprint, leading to unexpected OOM errors in production
  • Skipping a rigorous evaluation harness and relying on qualitative spot-checks that miss regression on held-out tasks
  • Deploying the raw adapter weights without merging or quantising, resulting in inference latency far above baseline
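
The QLoRA OOM pitfall above can often be caught with a back-of-the-envelope estimate before launching a run. The sketch below is a crude rule of thumb under stated assumptions (4-bit weights at roughly 0.5 bytes per parameter, bf16 adapters plus Adam state at about 10 bytes per trainable parameter, a linear activation term); it is no substitute for profiling actual usage:

```python
def estimate_qlora_memory_gb(n_params_b, seq_len=2048, micro_batch=2,
                             lora_params_m=40, overhead_gb=2.0):
    """Rough GPU memory estimate (GB) for a QLoRA training run.

    n_params_b:    base model size in billions of parameters.
    lora_params_m: trainable adapter parameters, in millions.
    A rule of thumb only — always confirm by profiling.
    """
    base_weights = n_params_b * 0.5        # 4-bit ≈ 0.5 bytes/param
    # bf16 adapters (2 B/param) + Adam optimiser state (~8 B/param)
    adapters = lora_params_m * 1e6 * (2 + 8) / 1e9
    # Activations scale with batch size and sequence length (crude)
    activations = 0.8 * micro_batch * seq_len / 2048
    return base_weights + adapters + activations + overhead_gb
```

With the defaults, a 7B model comes out around 7.5 GB, which explains why QLoRA fits on a single 24 GB card while full fine-tuning of the same model does not.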

When NOT to take this course

This bootcamp is the wrong fit for teams that have not yet identified a concrete downstream task. Organisations still exploring whether LLMs are relevant to their problem should start with an awareness or literacy programme before investing in fine-tuning infrastructure.


This course is part of a Data & AI catalogue built for leaders serious about execution. Take the free diagnostic to see which courses are priorities for your team.