Comparing changes

Choose two branches to see what's changed or to start a new pull request.
base repository: Arm-Examples/CMSIS-Executorch
base: main
head repository: Arm-Examples/CMSIS-Executorch
compare: feature/pack-based-layer
  • 20 commits
  • 1,773 files changed
  • 2 contributors

Commits on Jan 28, 2026

  1. e575dce
  2. d3d50fa
  3. b8800b6
  4. Merge pull request #15 from jthuangarm/whole-archive

     Increase ExecuTorch thread stack size
     MatthiasHertelArm authored Jan 28, 2026 · 753d064

Commits on Jan 29, 2026

  1. bd95fa9

Commits on Feb 2, 2026

  1. 274d3bf
  2. 40bd49e

Commits on Feb 16, 2026

  1. 37cee2f

Commits on Feb 22, 2026

  1. b3ddf6c

Commits on Feb 24, 2026

  1. 81c34eb
  2. Revert "Use tosa-tools binary wheels"

     This reverts commit 81c34eb.
     MatthiasHertelArm committed Feb 24, 2026 · 225f59e

Commits on Feb 25, 2026

  1. perf: Optimize Docker build by pre-installing tosa-tools from Test PyPI

    - Install tosa-tools==0.0.4 from Test PyPI using --extra-index-url (pre-built wheel)
    - Avoids ~3 minute C++ source compilation of reference_model and serialization libs
    - Patch setup.sh to skip tosa-tools source builds via sed (git clone still runs)
    - Reduces total Docker build time by ~3 minutes (from ~32min to ~29min)
    - Uses --extra-index-url to resolve numpy from main PyPI while fetching tosa-tools from Test PyPI
    MatthiasHertelArm committed Feb 25, 2026 · 713a68a
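The index-resolution trick in this commit is worth spelling out. A minimal sketch of the pip invocation the Dockerfile step amounts to, assuming the standard Test PyPI simple-index URL; the package name and version come from the commit message, everything else is illustrative:

```python
def tosa_tools_install_cmd(version: str = "0.0.4") -> list[str]:
    """Build the pip command that fetches a pre-built tosa-tools wheel."""
    return [
        "pip", "install",
        # Main PyPI stays the *primary* index, so dependencies such as
        # numpy resolve there; Test PyPI is only an extra index, which
        # is where the tosa-tools wheel lives.
        "--extra-index-url", "https://test.pypi.org/simple/",
        f"tosa-tools=={version}",
    ]

print(" ".join(tosa_tools_install_cmd()))
```

Using `--extra-index-url` rather than `--index-url` is the key design choice: replacing the index entirely would force every dependency to resolve against Test PyPI, which is neither complete nor trustworthy for general packages.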

Commits on Feb 27, 2026

  1. chore: Update tosa-tools to official PyPI release

    - Replace tosa-tools==0.0.4 from Test PyPI with tosa-tools==2026.2.0 from official PyPI
    - Remove --extra-index-url workaround (no longer needed)
    - Simplify installation to use main PyPI directly
    MatthiasHertelArm committed Feb 27, 2026 · 7be98f5

Commits on Mar 6, 2026

  1. feat: add pack-based AI layer generation from operator metadata

    Add scripts to translate ExecuTorch operator names (aten::, quantized_decomposed::)
    into PyTorch::ExecuTorch CMSIS-Pack component references, enabling AI layer
    generation without Docker builds.
    
    New files:
    - scripts/generate_pack_clayer.py: Python script with operator-to-component mapping
      for 143 portable + 9 quantized operators. Supports input from .pte model files,
      operators list files, selected_operators.yaml, or command-line lists.
    - scripts/generate_pack_layer.sh: Shell wrapper combining model conversion and
      pack-based layer generation.
    - documentation/PACK_BASED_LAYER.md: Comprehensive documentation including mapping
      tables, usage examples, and comparison with the Docker-based workflow.
    
    Modified:
    - .vscode/tasks.json: Added 'Pack: Generate AI Layer from Model' and
      'Pack: Generate AI Layer from Operators File' VS Code tasks.
    MatthiasHertelArm committed Mar 6, 2026 · 28112ca
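The core of `generate_pack_clayer.py` is the operator-name-to-component mapping. A hypothetical sketch of that lookup, assuming the two operator namespaces named in the commit (`aten::`, `quantized_decomposed::`); the component-reference strings below are illustrative, not the pack's real `Cclass:Cgroup` identifiers:

```python
def operator_to_component(op_name: str) -> str:
    """Map an ExecuTorch operator name to a pack component reference.

    e.g. "aten::add.out" -> a Portable component for "add".
    """
    # Strip the namespace and the overload suffix:
    # "aten::add.out" -> "add", "quantized_decomposed::quantize_per_tensor.out"
    # -> "quantize_per_tensor".
    base = op_name.split("::", 1)[1].split(".", 1)[0]
    if op_name.startswith("quantized_decomposed::"):
        return f"PyTorch::ExecuTorch:Operators:Quantized&{base}"   # hypothetical id
    if op_name.startswith("aten::"):
        return f"PyTorch::ExecuTorch:Operators:Portable&{base}"    # hypothetical id
    raise ValueError(f"unknown operator namespace: {op_name}")
```

A table of 143 portable plus 9 quantized entries would then drive which of these references are emitted into the generated `ai_layer.clayer.yml`.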
  2. refactor: simplify local_workflow.sh to use pack-based layer generation

    Replace the 7-step source-compilation workflow (stage1/stage2 CMake builds,
    source layer generation, header patching, artifact packaging) with a 3-step
    pack-based workflow:
      1. Convert PyTorch model to .pte (aot_model.py)
      2. Convert .pte to C header (pte_to_header.py)
      3. Generate pack-based ai_layer.clayer.yml (generate_pack_clayer.py)
    
    The generated layer references PyTorch::ExecuTorch pack components instead of
    compiled source files, eliminating the need for ExecuTorch source compilation.
    MatthiasHertelArm committed Mar 6, 2026 · 715d5f7
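The simplified pipeline can be sketched as the three commands `local_workflow.sh` now chains together. The script names are taken from the commit message; the argument shapes and output paths are illustrative, since the scripts' actual CLIs are not shown here:

```python
def pack_workflow(model_path: str, pte_path: str = "build/model.pte") -> list[list[str]]:
    """Return the 3-step pack-based workflow as argv lists (not executed)."""
    return [
        # 1. Convert the PyTorch model to an ExecuTorch .pte file.
        ["python", "scripts/aot_model.py", model_path],
        # 2. Embed the .pte program bytes in a C header for the firmware.
        ["python", "scripts/pte_to_header.py", pte_path],
        # 3. Emit a pack-based ai_layer.clayer.yml referencing
        #    PyTorch::ExecuTorch components instead of compiled sources.
        ["python", "scripts/generate_pack_clayer.py", pte_path],
    ]

for step in pack_workflow("model.py"):
    print(" ".join(step))
```

Because step 3 only selects pack components, the stage1/stage2 CMake builds, header patching, and artifact packaging of the old 7-step flow drop out entirely.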
  3. feat: add portable ops to model for pack integration testing

    Extend the model from a simple Add (fully delegated to Ethos-U) to
    AddWithPostProcessing which includes CPU-side portable operators:
      - view_copy, mul, add, sigmoid, unsqueeze_copy, softmax
    
    These ops stay on the CPU (not delegated to NPU) so they exercise the
    pack-based operator component selection in the generated clayer.yml.
    MatthiasHertelArm committed Mar 6, 2026 · f4db3ab
  4. fix: use set_module_name to only quantize inner_add submodule

    The previous model used set_global() which quantized ALL ops including
    the post-processing chain (view, mul, sigmoid, softmax). The EthosU
    partitioner then delegated everything to the NPU, leaving zero CPU ops.
    
    Fix: split the add into an InnerAdd submodule and use
    quantizer.set_module_name('inner_add', config) so only the add gets
    quantized and delegated. The post-processing ops stay as float and
    will appear in the .pte as portable CPU operators.
    MatthiasHertelArm committed Mar 6, 2026 · 7cd6cd5
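The mechanism behind this fix is name-scoped configuration resolution. A conceptual sketch in plain Python (no torch; the real `set_global`/`set_module_name` live on the PyTorch quantizer API) of why the per-module form leaves the post-processing ops as float:

```python
class QuantizerSketch:
    """Toy model of global vs. per-module-name quantization config."""

    def __init__(self):
        self.global_config = None
        self.module_configs = {}

    def set_global(self, config):
        # Applies to every module -- this is what quantized the whole
        # post-processing chain and let the partitioner delegate it all.
        self.global_config = config

    def set_module_name(self, name, config):
        # Applies only to the named submodule.
        self.module_configs[name] = config

    def config_for(self, module_name):
        # Per-name config wins; with no global config set, every other
        # module gets None and therefore stays float (CPU portable ops).
        return self.module_configs.get(module_name, self.global_config)
```

With `set_module_name("inner_add", ...)` and no global config, only `inner_add` resolves to a quantization config and gets delegated to the Ethos-U; `view`, `mul`, `sigmoid`, and `softmax` resolve to `None` and remain portable CPU operators in the `.pte`.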
  5. feat: add generate_report.py to auto-create REPORT.md from build logs

    Parses model_conversion (Vela) and generate_pack_layer logs to produce
    a structured report with selected operators table, TOSA graphs, NPU
    performance summary, network summary, and final exported program graph.
    
    Integrated as Step 4 in local_workflow.sh (runs after pack layer gen).
    MatthiasHertelArm committed Mar 6, 2026 · 6889282
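A minimal sketch of the log-scraping approach such a report generator takes; the log line format below is invented for illustration and does not claim to match Vela's or the workflow scripts' actual output:

```python
import re

# Hypothetical line format: "selected operator: aten::add.out"
OP_LINE = re.compile(r"^selected operator:\s+(\S+)")

def selected_operators(log_text: str) -> list[str]:
    """Collect operator names from matching lines, ignoring everything else."""
    return [m.group(1)
            for line in log_text.splitlines()
            if (m := OP_LINE.match(line))]

sample_log = (
    "selected operator: aten::add.out\n"
    "unrelated build noise\n"
    "selected operator: aten::_softmax.out\n"
)
print(selected_operators(sample_log))
```

The same pattern, with one regex per section (TOSA graph dumps, NPU performance summary, network summary), is enough to assemble the tables that land in `REPORT.md`.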
  6. 63c4c07
  7. 8879193