Welcome to the ExecuTorch Documentation
ExecuTorch is PyTorch’s solution for efficient AI inference on edge devices — from mobile phones to embedded systems.
Key Value Propositions
Portability: Run on diverse platforms, from high-end mobile to constrained microcontrollers
Performance: Lightweight runtime with full hardware acceleration (CPU, GPU, NPU, DSP)
Productivity: Use familiar PyTorch tools from authoring to deployment
🗺️ Find Your Path
Not sure where to start? Use the guided pathways to navigate ExecuTorch based on your experience level, goals, and target platform.
Step-by-step learning sequence from installation to your first on-device deployment. Includes concept explanations and worked examples.
Skip the theory — get a model running in 15 minutes. Includes export cheat sheets, backend selection tables, and platform quick starts.
Quantization, custom backends, C++ runtime, LLM deployment, and compiler internals for production-grade systems.
Not sure which pathway fits? The decision matrix routes you by experience level, target platform, model status, and developer role to the exact documentation you need.
🎯 Wins & Success Stories
Explore Documentation
Overview, architecture, and core concepts — Understand how ExecuTorch works and its benefits
Get started with ExecuTorch — Install, export your first model, and run inference
Android, iOS, Desktop, Embedded — Platform-specific deployment guides and examples
CPU, GPU, NPU/Accelerator backends — Hardware acceleration and backend selection
LLM export, optimization, and deployment — Complete LLM workflow for edge devices
Quantization, memory planning, custom passes — Deep customization and optimization
Developer tools, profiling, debugging — Comprehensive development and debugging suite
API reference, usage, and examples — Detailed Python, C++, and Java API references
FAQ, troubleshooting, contributing — Get help and contribute to the project