AI TRAINING
LlamaIndex for RAG-Intensive Applications
Build production retrieval-augmented generation systems with LlamaIndex connectors, indexes, and query engines.
What it covers
This intensive technical program covers the LlamaIndex architecture end to end: data ingestion pipelines, vector and keyword indexes, query engine composition, and multi-document reasoning. Participants implement real RAG systems, learn when LlamaIndex outperforms LangChain for document-heavy workloads, and leave with reusable, production-ready code patterns. The format blends live coding workshops with short concept modules, at roughly 60% hands-on practice.
By the end, you will be able to
- Build an end-to-end RAG pipeline using LlamaIndex data connectors and a vector store index from scratch
- Configure and compare at least three index types and justify the choice for a given retrieval use case
- Implement a multi-document SubQuestion query engine capable of synthesising answers across heterogeneous sources
- Evaluate retrieval quality using hit rate and faithfulness metrics and iterate on chunking and embedding strategies
- Deploy a LlamaIndex-powered query service with logging, caching, and token-cost controls
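The first outcome above follows LlamaIndex's documented quickstart pattern. A minimal sketch, not runnable offline: it assumes `llama-index` is installed and an OpenAI API key is set (LlamaIndex's default embedding and LLM backends), and the folder name and question are placeholders.

```python
# Sketch only: requires `pip install llama-index` and OPENAI_API_KEY in the
# environment; "data" and the query string are placeholders.
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

documents = SimpleDirectoryReader("data").load_data()  # ingest files from a folder
index = VectorStoreIndex.from_documents(documents)     # chunk, embed, and store nodes
query_engine = index.as_query_engine(similarity_top_k=3)
response = query_engine.query("What does the onboarding policy say about laptops?")
print(response)
```

Every stage in this five-line pipeline (loader, splitter, embedding model, retriever, synthesizer) can be swapped out, which is what the rest of the program drills into.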
Topics covered
- LlamaIndex architecture: nodes, documents, and the indexing pipeline
- Data connectors and loaders for PDFs, databases, APIs, and web sources
- Vector store indexes vs. list indexes vs. tree indexes — trade-offs and selection
- Query engine composition and router query engines for multi-index retrieval
- Multi-document reasoning with SubQuestion and knowledge graph query engines
- Retrieval evaluation: hit rate, MRR, and faithfulness scoring
- LlamaIndex vs. LangChain: decision framework for RAG-heavy workloads
- Deploying LlamaIndex pipelines in production with observability and caching
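The evaluation metrics listed above (hit rate, MRR) reduce to simple arithmetic over ranked retrieval results. A dependency-free sketch, assuming each test case pairs a ranked list of retrieved chunk IDs with the single known-relevant ID (function names are illustrative, not LlamaIndex API):

```python
def hit_rate(results, k=5):
    """Fraction of queries whose relevant ID appears in the top-k retrieved IDs."""
    hits = sum(1 for retrieved, relevant in results if relevant in retrieved[:k])
    return hits / len(results)

def mrr(results):
    """Mean Reciprocal Rank: average of 1/rank of the relevant result, 0 on a miss."""
    total = 0.0
    for retrieved, relevant in results:
        if relevant in retrieved:
            total += 1.0 / (retrieved.index(relevant) + 1)
    return total / len(results)

# Each item: (ranked retrieved chunk IDs, the relevant chunk ID)
sample = [
    (["a", "b", "c"], "a"),   # rank 1 -> reciprocal rank 1.0
    (["d", "e", "f"], "e"),   # rank 2 -> reciprocal rank 0.5
    (["g", "h", "i"], "z"),   # miss   -> reciprocal rank 0.0
]
print(hit_rate(sample, k=3))  # 2 hits out of 3 queries
print(mrr(sample))            # (1.0 + 0.5 + 0.0) / 3 = 0.5
```

Faithfulness, by contrast, needs an LLM judge over generated answers, so it is covered in the lab rather than as closed-form arithmetic.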
Format
Delivered over 2–3 days, either in person or live-remote via video call with a shared coding environment (JupyterHub or GitHub Codespaces). Each module follows a pattern of a 20-minute concept walkthrough followed by a 40-minute guided lab. Participants receive a private GitHub repo with starter notebooks, solution branches, and a capstone project brief. A shared Slack or Discord channel provides async support for up to four weeks post-training.
What makes it work
- Start with a real internal document corpus during training so labs are immediately relevant
- Instrument retrieval pipelines with evaluation metrics from day one, not as an afterthought
- Designate a technical owner post-training to maintain the LlamaIndex version and connector dependencies
- Pair the bootcamp with a follow-up architectural review two weeks after deployment
Common mistakes
- Using a flat list index for large corpora, causing slow and expensive full-scan queries
- Skipping retrieval evaluation — teams ship RAG systems without measuring retrieval quality, then blame the LLM
- Over-engineering with LangChain abstractions when LlamaIndex's native document store covers the use case more cleanly
- Ignoring chunk size and overlap tuning, leading to poor context windows and hallucinated summaries
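The chunk-size and overlap mistake above is easiest to see with a plain sliding-window chunker, a simplified stand-in for LlamaIndex's sentence splitter (names and parameters here are illustrative):

```python
def chunk_tokens(tokens, chunk_size=128, overlap=32):
    """Split a token list into fixed-size windows; consecutive chunks share
    `overlap` tokens, so content cut at one boundary appears whole in a
    neighbouring chunk. The final chunk may be shorter than chunk_size."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [tokens[i:i + chunk_size] for i in range(0, len(tokens), step)]

tokens = list(range(10))
chunks = chunk_tokens(tokens, chunk_size=4, overlap=2)
print(chunks)  # windows start every 2 tokens; each chunk repeats the last 2 of the previous
```

With overlap set to zero, a sentence straddling a boundary is split across two chunks and neither retrieval hit carries its full context, which is one common source of the hallucinated summaries mentioned above.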
When NOT to take this training
This training is not the right fit for a team that has not yet chosen an LLM stack or is still evaluating whether RAG is the correct architecture — they need a broader LLM application design workshop first.
This training is part of a Data & AI catalog built for leaders who are serious about execution. Run the free diagnostic to see which trainings should come first for your team.