AI TRAINING
Retrieval-Augmented Generation (RAG) in Production
Build, evaluate, and operate production RAG pipelines that are fast, accurate, and cost-effective.
What it covers
This practitioner-level programme takes engineers from RAG fundamentals through to production deployment: document ingestion strategies, chunking and embedding choices, retriever and reranker architectures, and evaluation frameworks. Participants implement end-to-end pipelines on real data, instrument them for observability, and apply caching and cost-control techniques. The format combines live-coding sessions, architecture reviews, and hands-on workshops built on open-source tools (LangChain, LlamaIndex, Weaviate, RAGAS). By the end of the programme, participants can ship and monitor a RAG system that meets latency, quality, and budget requirements.
By the end, you will be able to
- Design and implement a multi-stage RAG pipeline with chunking, embedding, retrieval, and reranking stages tuned for a real dataset
- Select and justify the right vector store and retriever architecture for a given latency and accuracy trade-off
- Evaluate RAG pipeline quality using RAGAS metrics (faithfulness, context precision, answer relevancy) and iterate systematically
- Instrument a RAG system with distributed tracing and set up alerts for retrieval quality degradation in production
- Apply semantic caching and query routing to reduce LLM API costs by at least 30% without sacrificing answer quality
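The multi-stage pipeline in the first outcome can be sketched in miniature. This is a toy illustration, not course material: the bag-of-words `embed` function stands in for a real embedding model, and the lexical-overlap `rerank` stands in for a cross-encoder.

```python
from collections import Counter
from math import sqrt

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding' standing in for a real embedding model."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str], k: int = 3) -> list[str]:
    """Stage 1: dense-style top-k retrieval over pre-chunked documents."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

def rerank(query: str, candidates: list[str], k: int = 1) -> list[str]:
    """Stage 2: rerank the candidates; a production system would use a
    cross-encoder or LLM-based reranker here."""
    q_terms = set(query.lower().split())
    overlap = lambda c: len(q_terms & set(c.lower().split())) / len(q_terms)
    return sorted(candidates, key=overlap, reverse=True)[:k]

chunks = [
    "RAG pipelines combine retrieval with generation.",
    "Vector stores index embeddings for fast similarity search.",
    "Reranking improves precision on multi-hop questions.",
]
query = "how does reranking improve precision"
top = rerank(query, retrieve(query, chunks))
```

The two-stage shape (cheap wide retrieval, then expensive narrow reranking) is the pattern the programme builds out with real models and vector stores.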
Topics covered
- Document ingestion pipelines and preprocessing strategies
- Chunking strategies: fixed, semantic, recursive, and late chunking
- Embedding model selection and fine-tuning for domain-specific retrieval
- Vector stores, hybrid search, and retriever architectures
- Reranking with cross-encoders and LLM-based rerankers
- RAG evaluation frameworks (RAGAS, TruLens, LangSmith)
- Caching, query routing, and cost control patterns
- Observability, tracing, and production monitoring for RAG systems
Format
Delivered as a blended programme over 3–5 days (on-site or remote), with approximately 60% hands-on labs and 40% instructor-led architecture sessions. Participants work in small teams on a capstone project using their own or provided datasets. All labs run in pre-configured cloud environments; no local GPU required. Printed architecture cheat sheets and a private GitHub repository with all lab code are included. Remote delivery uses Zoom breakout rooms with a lab assistant per group of four.
What makes it work
- Establishing an offline evaluation dataset with ground-truth QA pairs before writing any pipeline code
- Instrumenting retrieval and generation steps from day one with a tracing tool such as LangSmith or Arize Phoenix
- Running chunking and embedding ablations on a representative sample of real production documents before committing to an architecture
- Treating prompt templates and retrieval parameters as versioned artifacts subject to the same CI/CD discipline as application code
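The last practice above — treating retrieval parameters as versioned artifacts — might look like the following sketch. The field names and the hypothetical `example-embed-v1` model name are illustrative assumptions; the point is the stable fingerprint CI can check against.

```python
import hashlib
import json
from dataclasses import asdict, dataclass

@dataclass(frozen=True)
class RetrievalConfig:
    """Retrieval parameters treated as a versioned artifact, not ad-hoc constants."""
    chunk_size: int = 512
    chunk_overlap: int = 64
    top_k: int = 8
    rerank_top_n: int = 3
    embedding_model: str = "example-embed-v1"  # hypothetical model identifier

    def fingerprint(self) -> str:
        """Stable hash of the full config so CI can detect any parameter drift."""
        payload = json.dumps(asdict(self), sort_keys=True).encode()
        return hashlib.sha256(payload).hexdigest()[:12]

base = RetrievalConfig()
tuned = RetrievalConfig(top_k=12)
```

Committing the fingerprint alongside evaluation results ties every quality measurement to the exact configuration that produced it, the same discipline applied to application code.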
Common mistakes
- Using fixed-size chunking for all document types without considering semantic boundaries, leading to poor retrieval precision
- Skipping systematic evaluation and relying on anecdotal spot-checks, so quality regressions go unnoticed in production
- Ignoring reranking entirely and assuming top-k dense retrieval is sufficient for complex, multi-hop questions
- Treating RAG as a one-time build rather than an observable system, leaving latency spikes and cost overruns undetected
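The first mistake — fixed-size chunking that ignores semantic boundaries — is easy to see in a toy comparison. This sketch contrasts a naive character-window splitter with a sentence-aware one; the sizes are arbitrary illustration values.

```python
import re

def fixed_chunks(text: str, size: int = 40) -> list[str]:
    """Fixed-size chunking: splits mid-sentence, hurting retrieval precision."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def sentence_chunks(text: str, max_chars: int = 80) -> list[str]:
    """Sentence-aware chunking: packs whole sentences up to a character budget,
    so each chunk keeps a complete semantic unit."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for s in sentences:
        if current and len(current) + len(s) + 1 > max_chars:
            chunks.append(current)
            current = s
        else:
            current = f"{current} {s}".strip()
    if current:
        chunks.append(current)
    return chunks

doc = ("Embeddings map text to vectors. Retrieval finds nearby vectors. "
       "Reranking reorders them.")
broken = fixed_chunks(doc)[0]       # cut mid-sentence
clean = sentence_chunks(doc)        # every chunk ends at a sentence boundary
```

Recursive and semantic chunkers in LangChain and LlamaIndex generalise this idea to paragraphs, headings, and embedding-based boundaries.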
When NOT to take this training
This programme is not appropriate for teams that have not yet shipped any LLM feature to users — organisations still evaluating whether to use AI at all will find the production-operations depth overwhelming and should start with an LLM literacy or prompt-engineering workshop instead.
This training is part of a Data & AI catalogue built for leaders serious about execution. Launch the free diagnostic to see which trainings should be prioritised for your team.