Comparing changes
- base repository: Arm-Examples/CMSIS-Executorch
- base: main
- head repository: Arm-Examples/CMSIS-Executorch
- compare: feature/pack-based-layer
- 20 commits
- 1,773 files changed
- 2 contributors
Commits on Jan 28, 2026
- e575dce
- d3d50fa
- b8800b6
- Merge pull request #15 from jthuangarm/whole-archive (753d064)
  Increase ExecuTorch thread stack size
Commits on Jan 29, 2026
- bd95fa9
Commits on Feb 2, 2026
- 274d3bf
- Merge branch 'whole-archive' of https://github.com/Arm-Examples/CMSIS-Executorch into whole-archive (40bd49e)
Commits on Feb 16, 2026
- 37cee2f
Commits on Feb 22, 2026
- b3ddf6c
Commits on Feb 24, 2026
- Use tosa-tools binary wheels (81c34eb)
Revert "Use tosa-tools binary wheels"
This reverts commit 81c34eb.
Configuration menu - View commit details
-
Copy full SHA for 225f59e - Browse repository at this point
Copy the full SHA 225f59eView commit details
Commits on Feb 25, 2026
- perf: Optimize Docker build by pre-installing tosa-tools from Test PyPI (713a68a)
  - Install tosa-tools==0.0.4 from Test PyPI using --extra-index-url (pre-built wheel)
  - Avoids the ~3-minute C++ source compilation of the reference_model and serialization libs
  - Patch setup.sh to skip tosa-tools source builds via sed (the git clone still runs)
  - Reduces total Docker build time by ~3 minutes (from ~32 min to ~29 min)
  - Uses --extra-index-url to resolve numpy from main PyPI while fetching tosa-tools from Test PyPI (sketched below)
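A minimal sketch of that install step, assuming it is driven from Python (in the repository it runs inside the Docker build); the flags and version come from the commit message, and the index URL is the standard Test PyPI simple index:

```python
# Sketch only: install the pre-built tosa-tools wheel from Test PyPI.
# --extra-index-url adds Test PyPI as a secondary index, so pip can fetch
# tosa-tools from there while still resolving numpy from the main PyPI.
import subprocess
import sys

subprocess.run(
    [sys.executable, "-m", "pip", "install",
     "--extra-index-url", "https://test.pypi.org/simple/",
     "tosa-tools==0.0.4"],
    check=True,
)
```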
Commits on Feb 27, 2026
- chore: Update tosa-tools to official PyPI release (7be98f5)
  - Replace tosa-tools==0.0.4 from Test PyPI with tosa-tools==2026.2.0 from official PyPI
  - Remove the --extra-index-url workaround (no longer needed)
  - Simplify installation to use main PyPI directly
Commits on Mar 6, 2026
- feat: add pack-based AI layer generation from operator metadata (28112ca)
  Add scripts to translate ExecuTorch operator names (aten::, quantized_decomposed::) into PyTorch::ExecuTorch CMSIS-Pack component references, enabling AI layer generation without Docker builds. See the mapping sketch after this list.
  New files:
  - scripts/generate_pack_clayer.py: Python script with an operator-to-component mapping for 143 portable + 9 quantized operators. Supports input from .pte model files, operator list files, selected_operators.yaml, or command-line lists.
  - scripts/generate_pack_layer.sh: Shell wrapper combining model conversion and pack-based layer generation.
  - documentation/PACK_BASED_LAYER.md: Comprehensive documentation including mapping tables, usage examples, and a comparison with the Docker-based workflow.
  Modified:
  - .vscode/tasks.json: Added 'Pack: Generate AI Layer from Model' and 'Pack: Generate AI Layer from Operators File' VS Code tasks.
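A minimal sketch of the mapping idea behind generate_pack_clayer.py; the component reference strings and helper name below are illustrative assumptions, not entries from the script's actual 143-operator table:

```python
# Hypothetical sketch of the operator-to-component mapping inside
# generate_pack_clayer.py; the component strings are assumptions, not
# the script's actual table entries.
OP_TO_COMPONENT = {
    "aten::add.out": "PyTorch::ExecuTorch:Portable Ops:add",
    "aten::softmax.out": "PyTorch::ExecuTorch:Portable Ops:softmax",
    "quantized_decomposed::dequantize_per_tensor.out":
        "PyTorch::ExecuTorch:Quantized Ops:dequantize_per_tensor",
}

def components_for(operators):
    """Map a model's operator names to the deduplicated, sorted list of
    CMSIS-Pack components the generated clayer.yml must reference."""
    return sorted({OP_TO_COMPONENT[op] for op in operators if op in OP_TO_COMPONENT})

# Operators would come from a .pte file, an operator list file,
# selected_operators.yaml, or the command line.
print(components_for(["aten::add.out", "aten::softmax.out"]))
```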
- refactor: simplify local_workflow.sh to use pack-based layer generation (715d5f7)
  Replace the 7-step source-compilation workflow (stage1/stage2 CMake builds, source layer generation, header patching, artifact packaging) with a 3-step pack-based workflow (sketched below):
  1. Convert the PyTorch model to .pte (aot_model.py)
  2. Convert the .pte to a C header (pte_to_header.py)
  3. Generate the pack-based ai_layer.clayer.yml (generate_pack_clayer.py)
  The generated layer references PyTorch::ExecuTorch pack components instead of compiled source files, eliminating the need for ExecuTorch source compilation.
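A sketch of the three steps as a small Python driver; the script names come from the commit message, but every command-line flag here is an assumption:

```python
# Hypothetical driver mirroring the 3-step workflow; all flags are guesses.
import subprocess
import sys

steps = [
    # 1. Ahead-of-time: convert the PyTorch model to an ExecuTorch .pte file.
    [sys.executable, "scripts/aot_model.py", "--output", "model.pte"],
    # 2. Embed the .pte bytes as a C header for the firmware build.
    [sys.executable, "scripts/pte_to_header.py", "--pte", "model.pte",
     "--out", "model_pte.h"],
    # 3. Generate the pack-based layer from the model's operator metadata.
    [sys.executable, "scripts/generate_pack_clayer.py", "--pte", "model.pte",
     "--output", "ai_layer.clayer.yml"],
]

for step in steps:
    subprocess.run(step, check=True)
```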
- feat: add portable ops to model for pack integration testing (f4db3ab)
  Extend the model from a simple Add (fully delegated to Ethos-U) to AddWithPostProcessing, which includes CPU-side portable operators: view_copy, mul, add, sigmoid, unsqueeze_copy, softmax. These ops stay on the CPU (not delegated to the NPU), so they exercise the pack-based operator component selection in the generated clayer.yml (sketched below).
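A minimal sketch of such a module (shapes and constants are assumptions; the comments map each line to the portable operators named above):

```python
import torch

class AddWithPostProcessing(torch.nn.Module):
    """Illustrative stand-in for the commit's model: an add followed by a
    post-processing chain that lowers to the listed portable CPU ops."""

    def forward(self, x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        s = x + y                        # add (the part delegated to Ethos-U)
        s = s.view(-1)                   # view_copy
        s = s * 0.5                      # mul
        s = s + 1.0                      # add (stays on the CPU)
        s = torch.sigmoid(s)             # sigmoid
        s = s.unsqueeze(0)               # unsqueeze_copy
        return torch.softmax(s, dim=-1)  # softmax

# Example invocation with assumed input shapes.
out = AddWithPostProcessing()(torch.randn(4), torch.randn(4))
```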
- fix: use set_module_name to only quantize inner_add submodule (7cd6cd5)
  The previous model used set_global(), which quantized ALL ops including the post-processing chain (view, mul, sigmoid, softmax). The EthosU partitioner then delegated everything to the NPU, leaving zero CPU ops. Fix: split the add into an InnerAdd submodule and use quantizer.set_module_name('inner_add', config) so only the add gets quantized and delegated. The post-processing ops stay as float and will appear in the .pte as portable CPU operators (sketched below).
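A sketch of the before/after, assuming the Ethos-U quantizer API of the ExecuTorch Arm backend (the import path and config helper are assumptions; set_global and set_module_name are the calls named in the commit message):

```python
# Hypothetical sketch: the import path and config helper are assumptions
# based on the ExecuTorch Arm backend; set_global and set_module_name are
# the calls named in the commit message.
from executorch.backends.arm.quantizer.arm_quantizer import (
    EthosUQuantizer,
    get_symmetric_quantization_config,
)

def configure(quantizer: EthosUQuantizer) -> EthosUQuantizer:
    config = get_symmetric_quantization_config()
    # Before (broken): global annotation quantized the whole graph, so the
    # partitioner delegated every op to the NPU and left zero CPU ops:
    #   quantizer.set_global(config)
    # After (fix): annotate only the InnerAdd submodule; the float
    # post-processing chain stays on the CPU as portable operators.
    return quantizer.set_module_name("inner_add", config)
```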
- feat: add generate_report.py to auto-create REPORT.md from build logs (6889282)
  Parses the model_conversion (Vela) and generate_pack_layer logs to produce a structured report with a selected-operators table, TOSA graphs, an NPU performance summary, a network summary, and the final exported program graph. Integrated as Step 4 in local_workflow.sh (runs after pack layer generation). A sketch of the approach follows.
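A rough sketch of the log-to-report idea; the log file names and section-marker regexes are pure placeholders, since the actual log formats are not shown here:

```python
# Hypothetical report assembler: lift named blocks out of the build logs
# and stitch them into REPORT.md. File names and regexes are placeholders.
import re
from pathlib import Path

SECTIONS = [
    ("Selected operators", "generate_pack_layer.log", r"Selected operators:\n(.+?)\n\n"),
    ("Network summary", "model_conversion.log", r"Network summary.*?\n(.+?)\n\n"),
]

parts = ["# Build Report", ""]
for title, log_name, pattern in SECTIONS:
    log = Path(log_name)
    block = "(log not found)"
    if log.exists():
        match = re.search(pattern, log.read_text(), re.DOTALL)
        block = match.group(1) if match else "(section not found in log)"
    parts += [f"## {title}", "", block, ""]

Path("REPORT.md").write_text("\n".join(parts))
```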
- 63c4c07
- 8879193
This comparison is taking too long to generate.
Unfortunately it looks like we can’t render this comparison for you right now. It might be too big, or there might be something weird with your repository.
You can try running this command locally to see the comparison on your machine:
git diff main...feature/pack-based-layer