
AI TRAINING

Edge AI Deployment for Embedded and IoT Teams

Deploy optimised AI models directly on devices, balancing accuracy, latency, power consumption, and thermal constraints.

Format
bootcamp
Duration
24–40h
Level
practitioner
Group size
4–14
Price / participant
€2K–€4K
Group price
€18K–€45K
Audience
Embedded software engineers, firmware developers, and IoT platform architects deploying ML inference on-device
Prerequisites
Solid Python or C/C++ programming skills; working knowledge of basic ML concepts (model training, inference); familiarity with at least one embedded or IoT platform (Raspberry Pi, STM32, ESP32, mobile, or similar)

What it covers

This practitioner-level programme gives embedded and IoT engineers the skills needed to put AI inference into production on constrained hardware. Participants work hands-on with ONNX, TensorFlow Lite, Core ML, and edge LLM runtimes such as Llama.cpp and llamafile, covering quantisation, pruning, and hardware-specific optimisations. Sessions address real-world constraints: battery budgets, thermal throttling, tight memory, and over-the-air (OTA) model updates. The format combines short conceptual modules with lab exercises on physical or emulated devices.
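
As a concrete illustration of the conversion-and-parity workflow, here is a minimal sketch that exports a small PyTorch model to ONNX and checks output parity with onnxruntime on the host before any device deployment. The toy model, file name, and tolerance are illustrative assumptions, not course material.

```python
# Minimal sketch: export a small PyTorch model to ONNX and check output parity
# against the original. Model, file name, and tolerance are illustrative only.
import numpy as np
import torch
import torch.nn as nn
import onnxruntime as ort

model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10)).eval()
dummy = torch.randn(1, 32)

# Export to ONNX (opset 17 is widely supported by current runtimes)
torch.onnx.export(model, dummy, "model.onnx", opset_version=17,
                  input_names=["input"], output_names=["logits"])

# Run the same input through both runtimes and compare
with torch.no_grad():
    ref = model(dummy).numpy()
sess = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
onnx_out = sess.run(None, {"input": dummy.numpy()})[0]

print("max abs diff:", np.max(np.abs(ref - onnx_out)))
assert np.allclose(ref, onnx_out, atol=1e-5)  # parity check before deployment
```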

By the end, you will be able to

  • Convert a trained PyTorch or TensorFlow model to ONNX, TFLite, and Core ML formats and validate parity across runtimes
  • Apply INT8 post-training quantisation and measure accuracy-latency trade-offs on a target device (sketched after this list)
  • Run a quantised LLM (Llama.cpp or llamafile) on an edge device and profile tokens-per-second against thermal and battery budgets
  • Design and implement a power-aware inference pipeline that respects duty-cycle constraints on battery-powered hardware
  • Build and execute an OTA model update workflow with rollback safety on a representative IoT device
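
A minimal sketch of the INT8 post-training quantisation step above, assuming TensorFlow Lite as the toolchain; the toy Keras model, random calibration data, and file name are placeholders, and real calibration must use samples drawn from the production input distribution.

```python
# Minimal sketch of INT8 post-training quantisation with TensorFlow Lite.
# The Keras model and calibration data are placeholders; a real project
# calibrates with representative samples from the target distribution.
import numpy as np
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(32,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(10),
])

def representative_data():
    # A few hundred real samples are usually enough for calibration
    for _ in range(100):
        yield [np.random.rand(1, 32).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data
# Force full-integer quantisation for microcontroller-class targets
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

tflite_model = converter.convert()
with open("model_int8.tflite", "wb") as f:
    f.write(tflite_model)
print(f"Quantised model size: {len(tflite_model) / 1024:.1f} KiB")
```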

Topics covered

  • Model conversion and interoperability: ONNX, TensorFlow Lite, Core ML
  • Quantisation (INT8, FP16) and structured pruning for edge targets
  • Edge LLM runtimes: Llama.cpp, llamafile, MLC LLM (a profiling sketch follows this list)
  • Hardware accelerators: NPUs, DSPs, and embedded GPUs (e.g. Arm Ethos NPUs, Apple Neural Engine)
  • Battery budget analysis and power-aware inference scheduling
  • Thermal management and throttling strategies
  • OTA model updates and versioning on constrained devices
  • Benchmarking latency, throughput, and memory footprint on real hardware
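
For the edge LLM runtime topic, a rough tokens-per-second profiling sketch using the llama-cpp-python bindings might look like the following; the GGUF model path, thread count, and prompt are assumptions, not part of the course kit.

```python
# Minimal sketch: profile tokens-per-second of a quantised GGUF model using the
# llama-cpp-python bindings (assumed installed: pip install llama-cpp-python).
# The model path and generation settings are illustrative.
import time
from llama_cpp import Llama

llm = Llama(model_path="models/tinyllama-q4_k_m.gguf", n_ctx=512, n_threads=4)

prompt = "Explain duty cycling on battery-powered sensors in one sentence."
start = time.perf_counter()
out = llm(prompt, max_tokens=64)
elapsed = time.perf_counter() - start

n_tokens = out["usage"]["completion_tokens"]
print(f"{n_tokens} tokens in {elapsed:.2f} s -> {n_tokens / elapsed:.1f} tok/s")
```

Repeating the run after several minutes of sustained generation shows whether thermal throttling is eating into the measured rate.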

Delivery

Delivered as a 3-5 day intensive bootcamp, on-site or remote with hardware kits shipped to participants in advance. Approximately 60% hands-on lab time, 40% guided instruction. Participants receive a reference board (e.g. Raspberry Pi 5 or STM32 dev kit) or use their own target platform. Labs use Docker-based toolchains to minimise setup friction. Remote delivery uses shared cloud-hosted hardware via SSH where physical shipping is not feasible.

What makes it work

  • Start with a hardware-in-the-loop benchmark early in the project to set realistic constraints before model selection
  • Adopt a model-card discipline that records accuracy, latency, power draw, and thermal behaviour for every candidate model
  • Involve firmware and ML engineers in joint design reviews so power budgets are agreed before training begins
  • Use automated regression tests that run the inference pipeline on the target device in CI/CD, catching regressions before release (a minimal example follows this list)
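
One way such a CI check can look, sketched as a pytest-style test against an assumed quantised TFLite artefact and an assumed per-inference latency budget:

```python
# Minimal sketch of an on-device regression test meant to run in CI on the
# target board. File name and thresholds are project-specific assumptions.
import time
import numpy as np
import tensorflow as tf

LATENCY_BUDGET_MS = 20.0   # assumed per-inference budget on the target
MODEL_PATH = "model_int8.tflite"

def test_latency_and_output_sanity():
    interpreter = tf.lite.Interpreter(model_path=MODEL_PATH)
    interpreter.allocate_tensors()
    inp = interpreter.get_input_details()[0]
    out = interpreter.get_output_details()[0]

    x = np.zeros(inp["shape"], dtype=inp["dtype"])
    interpreter.set_tensor(inp["index"], x)

    start = time.perf_counter()
    interpreter.invoke()
    latency_ms = (time.perf_counter() - start) * 1000.0

    y = interpreter.get_tensor(out["index"])
    assert latency_ms < LATENCY_BUDGET_MS, f"latency {latency_ms:.1f} ms over budget"
    assert np.all(np.isfinite(y.astype(np.float32)))
```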

Common mistakes

  • Attempting to deploy full-precision FP32 models without quantisation, then discovering the device lacks the memory and compute budget at integration time
  • Ignoring thermal throttling during sustained inference, leading to unpredictable latency spikes in production (a defensive loop is sketched after this list)
  • Treating model accuracy on desktop benchmarks as a proxy for on-device accuracy without re-validating after quantisation
  • Skipping OTA update planning until late in the product lifecycle, resulting in fragile manual reflashing processes
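
To illustrate the thermal and duty-cycle points above, here is a sketch of a defensive inference loop for a Linux single-board computer such as a Raspberry Pi; the sysfs thermal path, thresholds, and run_inference placeholder are assumptions and should be replaced with the board vendor's APIs and the real inference call.

```python
# Minimal sketch of a thermally aware, duty-cycled inference loop for a Linux
# SBC. The sysfs path, thresholds, and run_inference() are assumptions.
import time
from pathlib import Path

THERMAL_ZONE = Path("/sys/class/thermal/thermal_zone0/temp")  # millidegrees C
SOFT_LIMIT_C = 70.0     # back off before the SoC's hard throttle point
COOLDOWN_S = 5.0
DUTY_CYCLE_S = 1.0      # minimum interval between inferences (battery budget)

def read_temp_c() -> float:
    return int(THERMAL_ZONE.read_text()) / 1000.0

def run_inference() -> None:
    pass  # placeholder for the actual TFLite / ONNX Runtime call

while True:
    if read_temp_c() >= SOFT_LIMIT_C:
        time.sleep(COOLDOWN_S)   # let the SoC cool instead of throttling mid-inference
        continue
    start = time.perf_counter()
    run_inference()
    # Sleep out the remainder of the duty cycle to stay inside the power budget
    time.sleep(max(0.0, DUTY_CYCLE_S - (time.perf_counter() - start)))
```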

When NOT to take this training

If the team is still experimenting with model architecture and has not yet reached stable accuracy on desktop benchmarks, edge deployment optimisation is premature — the model will need to be retrained, invalidating all quantisation and conversion work done during this bootcamp.


This training is part of a Data & AI catalogue built for leaders who are serious about execution. Launch the free diagnostic to see which trainings are priorities for your team.