
AI TRAINING

Edge AI Deployment for Embedded and IoT Teams

Deploy optimised AI models directly on devices, balancing accuracy, latency, power consumption, and thermal constraints.

Format
bootcamp
Duration
24–40h
Level
practitioner
Group size
4–14
Price / participant
€2K–€4K
Group price
€18K–€45K
Audience
Embedded software engineers, firmware developers, and IoT platform architects deploying ML inference on-device
Prerequisites
Solid Python or C/C++ programming skills; working knowledge of basic ML concepts (model training, inference); familiarity with at least one embedded or IoT platform (Raspberry Pi, STM32, ESP32, mobile, or similar)

What it covers

This practitioner-level programme gives embedded and IoT engineers the skills needed to put AI inference into production on constrained hardware. Participants work hands-on with ONNX, TensorFlow Lite, Core ML, and edge LLM runtimes such as Llama.cpp and llamafile, covering quantisation, pruning, and hardware-specific optimisations. Sessions address real-world constraints: battery budgets, thermal throttling, tight memory, and over-the-air (OTA) model updates. The format combines short conceptual modules with lab exercises on physical or emulated devices.
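
As a concrete illustration of the conversion-and-parity workflow, here is a minimal sketch that exports a small PyTorch model to ONNX and checks output parity with onnxruntime on the host before any device deployment. The toy model, file name, and tolerance are illustrative assumptions, not course material.

```python
# Minimal sketch: export a small PyTorch model to ONNX and check output parity
# against the original. Model, file name, and tolerance are illustrative only.
import numpy as np
import torch
import torch.nn as nn
import onnxruntime as ort

model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10)).eval()
dummy = torch.randn(1, 32)

# Export to ONNX (opset 17 is widely supported by current runtimes)
torch.onnx.export(model, dummy, "model.onnx", opset_version=17,
                  input_names=["input"], output_names=["logits"])

# Run the same input through both runtimes and compare
with torch.no_grad():
    ref = model(dummy).numpy()
sess = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
onnx_out = sess.run(None, {"input": dummy.numpy()})[0]

print("max abs diff:", np.max(np.abs(ref - onnx_out)))
assert np.allclose(ref, onnx_out, atol=1e-5)  # parity check before deployment
```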

By the end, you will be able to

  • Convert a trained PyTorch or TensorFlow model to ONNX, TFLite, and Core ML formats and validate parity across runtimes
  • Apply INT8 post-training quantisation and measure accuracy-latency trade-offs on a target device (sketched after this list)
  • Run a quantised LLM (Llama.cpp or llamafile) on an edge device and profile tokens-per-second against thermal and battery budgets
  • Design and implement a power-aware inference pipeline that respects duty-cycle constraints on battery-powered hardware
  • Build and execute an OTA model update workflow with rollback safety on a representative IoT device
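
A minimal sketch of the INT8 post-training quantisation step above, assuming TensorFlow Lite as the toolchain; the toy Keras model, random calibration data, and file name are placeholders, and real calibration must use samples drawn from the production input distribution.

```python
# Minimal sketch of INT8 post-training quantisation with TensorFlow Lite.
# The Keras model and calibration data are placeholders; a real project
# calibrates with representative samples from the target distribution.
import numpy as np
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(32,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(10),
])

def representative_data():
    # A few hundred real samples are usually enough for calibration
    for _ in range(100):
        yield [np.random.rand(1, 32).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data
# Force full-integer quantisation for microcontroller-class targets
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

tflite_model = converter.convert()
with open("model_int8.tflite", "wb") as f:
    f.write(tflite_model)
print(f"Quantised model size: {len(tflite_model) / 1024:.1f} KiB")
```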

Topics covered

  • Model conversion and interoperability: ONNX, TensorFlow Lite, Core ML
  • Quantisation (INT8, FP16) and structured pruning for edge targets
  • Edge LLM runtimes: Llama.cpp, llamafile, MLC LLM (a profiling sketch follows this list)
  • Hardware accelerators: NPUs, DSPs, and embedded GPUs (e.g. Arm Ethos NPUs, Apple Neural Engine)
  • Battery budget analysis and power-aware inference scheduling
  • Thermal management and throttling strategies
  • OTA model updates and versioning on constrained devices
  • Benchmarking latency, throughput, and memory footprint on real hardware
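
For the edge LLM runtime topic, a rough tokens-per-second profiling sketch using the llama-cpp-python bindings might look like the following; the GGUF model path, thread count, and prompt are assumptions, not part of the course kit.

```python
# Minimal sketch: profile tokens-per-second of a quantised GGUF model using the
# llama-cpp-python bindings (assumed installed: pip install llama-cpp-python).
# The model path and generation settings are illustrative.
import time
from llama_cpp import Llama

llm = Llama(model_path="models/tinyllama-q4_k_m.gguf", n_ctx=512, n_threads=4)

prompt = "Explain duty cycling on battery-powered sensors in one sentence."
start = time.perf_counter()
out = llm(prompt, max_tokens=64)
elapsed = time.perf_counter() - start

n_tokens = out["usage"]["completion_tokens"]
print(f"{n_tokens} tokens in {elapsed:.2f} s -> {n_tokens / elapsed:.1f} tok/s")
```

Repeating the run after several minutes of sustained generation shows whether thermal throttling is eating into the measured rate.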

Delivery

Delivered as a 3-5 day intensive bootcamp, on-site or remote with hardware kits shipped to participants in advance. Approximately 60% hands-on lab time, 40% guided instruction. Participants receive a reference board (e.g. Raspberry Pi 5 or STM32 dev kit) or use their own target platform. Labs use Docker-based toolchains to minimise setup friction. Remote delivery uses shared cloud-hosted hardware via SSH where physical shipping is not feasible.

What makes it work

  • Start with a hardware-in-the-loop benchmark early in the project to set realistic constraints before model selection
  • Adopt a model-card discipline that records accuracy, latency, power draw, and thermal behaviour for every candidate model
  • Involve firmware and ML engineers in joint design reviews so power budgets are agreed before training begins
  • Use automated regression tests that run the inference pipeline on the target device in CI/CD, catching regressions before release (a minimal example follows this list)
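
One way such a CI check can look, sketched as a pytest-style test against an assumed quantised TFLite artefact and an assumed per-inference latency budget:

```python
# Minimal sketch of an on-device regression test meant to run in CI on the
# target board. File name and thresholds are project-specific assumptions.
import time
import numpy as np
import tensorflow as tf

LATENCY_BUDGET_MS = 20.0   # assumed per-inference budget on the target
MODEL_PATH = "model_int8.tflite"

def test_latency_and_output_sanity():
    interpreter = tf.lite.Interpreter(model_path=MODEL_PATH)
    interpreter.allocate_tensors()
    inp = interpreter.get_input_details()[0]
    out = interpreter.get_output_details()[0]

    x = np.zeros(inp["shape"], dtype=inp["dtype"])
    interpreter.set_tensor(inp["index"], x)

    start = time.perf_counter()
    interpreter.invoke()
    latency_ms = (time.perf_counter() - start) * 1000.0

    y = interpreter.get_tensor(out["index"])
    assert latency_ms < LATENCY_BUDGET_MS, f"latency {latency_ms:.1f} ms over budget"
    assert np.all(np.isfinite(y.astype(np.float32)))
```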

Common mistakes

  • Attempting to deploy full-precision FP32 models without quantisation, then discovering the device lacks the memory and compute budget at integration time
  • Ignoring thermal throttling during sustained inference, leading to unpredictable latency spikes in production (a defensive loop is sketched after this list)
  • Treating model accuracy on desktop benchmarks as a proxy for on-device accuracy without re-validating after quantisation
  • Skipping OTA update planning until late in the product lifecycle, resulting in fragile manual reflashing processes
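
To illustrate the thermal and duty-cycle points above, here is a sketch of a defensive inference loop for a Linux single-board computer such as a Raspberry Pi; the sysfs thermal path, thresholds, and run_inference placeholder are assumptions and should be replaced with the board vendor's APIs and the real inference call.

```python
# Minimal sketch of a thermally aware, duty-cycled inference loop for a Linux
# SBC. The sysfs path, thresholds, and run_inference() are assumptions.
import time
from pathlib import Path

THERMAL_ZONE = Path("/sys/class/thermal/thermal_zone0/temp")  # millidegrees C
SOFT_LIMIT_C = 70.0     # back off before the SoC's hard throttle point
COOLDOWN_S = 5.0
DUTY_CYCLE_S = 1.0      # minimum interval between inferences (battery budget)

def read_temp_c() -> float:
    return int(THERMAL_ZONE.read_text()) / 1000.0

def run_inference() -> None:
    pass  # placeholder for the actual TFLite / ONNX Runtime call

while True:
    if read_temp_c() >= SOFT_LIMIT_C:
        time.sleep(COOLDOWN_S)   # let the SoC cool instead of throttling mid-inference
        continue
    start = time.perf_counter()
    run_inference()
    # Sleep out the remainder of the duty cycle to stay inside the power budget
    time.sleep(max(0.0, DUTY_CYCLE_S - (time.perf_counter() - start)))
```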

When NOT to take this training

If the team is still experimenting with model architecture and has not yet reached stable accuracy on desktop benchmarks, edge deployment optimisation is premature — the model will need to be retrained, invalidating all quantisation and conversion work done during this bootcamp.


This training is part of a Data & AI catalogue built for leaders who are serious about execution. Launch the free diagnostic to see which trainings are priorities for your team.