Add native Metal backend for macOS on Apple silicon #3709

Open
saarlemo wants to merge 2 commits into arrayfire:master from saarlemo:metal
saarlemo commented Jun 24, 2026

This pull request adds a native Metal compute backend for ArrayFire on Apple silicon using Apple's metal-cpp headers. This integrates Metal with the backend-specific and Unified APIs, with Metal preferred over OpenCL on macOS.

Description

This is a new backend feature for macOS arm64 systems, plus one small portability fix in shared API code.

The PR adds a native Metal backend targeting macOS 12 or later on Apple silicon. It introduces the AF_BUILD_METAL build option, the ArrayFire::afmetal CMake target, the public AF_BACKEND_METAL backend enum value, CMake package export support, and Unified API loading/selection support. The Unified backend priority is updated to:

CUDA -> oneAPI -> Metal -> OpenCL -> CPU

as OpenCL support is deprecated on macOS.

The Metal backend implementation uses modular .metal source files that are embedded and compiled through ArrayFire's source-generation infrastructure. It covers core ArrayFire functionality including array operations and JIT evaluation, reductions, scans, sorting, indexing, FFT and convolution, BLAS and linear algebra, image processing and computer vision, sparse arrays, and random-number generation.
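The Unified priority order given above (CUDA -> oneAPI -> Metal -> OpenCL -> CPU) can be illustrated with a small standalone sketch. This is not the code from this PR: the `Backend` enum, its bit values, and the helpers `bit` and `pickDefaultBackend` are hypothetical stand-ins chosen for illustration, and the real `af_backend` values are not reproduced here.

```cpp
#include <array>
#include <cassert>

// Hypothetical stand-in for ArrayFire's backend identifiers. The bit values
// are illustrative only and do not mirror the real af_backend enum.
enum class Backend : unsigned {
    None   = 0,
    CPU    = 1u << 0,
    CUDA   = 1u << 1,
    OpenCL = 1u << 2,
    OneAPI = 1u << 3,
    Metal  = 1u << 4,
};

// Convert a backend to its bit so that several backends can be OR-ed
// together into an "available backends" mask.
inline unsigned bit(Backend b) { return static_cast<unsigned>(b); }

// Priority order stated in the PR description.
static const std::array<Backend, 5> kPriority = {
    Backend::CUDA, Backend::OneAPI, Backend::Metal, Backend::OpenCL,
    Backend::CPU};

// Return the first backend, in priority order, whose bit is set in
// `available`; Backend::None if the mask is empty.
inline Backend pickDefaultBackend(unsigned available) {
    for (Backend b : kPriority) {
        if (available & bit(b)) { return b; }
    }
    return Backend::None;
}
```

Under these assumptions, a macOS machine that reports Metal, OpenCL, and CPU as available would default to Metal, while a machine without Metal enabled would still fall through to OpenCL and then CPU.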

The PR also updates Metal build, usage, priority, and limitation documentation. Building the backend requires macOS 12 or later, Apple silicon (arm64, M1 or newer), Xcode or Xcode Command Line Tools, C++17, and Apple's header-only metal-cpp distribution.

The shared confidence-connected implementation is adjusted to rely on standard std::abs overload resolution instead of explicit template arguments. This keeps the code portable across Apple clang/libc++ and GCC/libstdc++ without changing the intended numerical behavior.
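To show the kind of change described above, here is a minimal sketch of relying on `std::abs` overload resolution inside a template. The helper `withinBand` is hypothetical and is not taken from the confidence-connected source; it only illustrates the pattern for signed integer and floating-point types.

```cpp
#include <cassert>
#include <cmath>    // floating-point overloads of std::abs
#include <cstdlib>  // integer overloads of std::abs

// Hypothetical helper, not from the PR: reports whether `value` lies within
// `radius` of `center`, as a region-growing criterion might.
template <typename T>
bool withinBand(T value, T center, T radius) {
    // No explicit template argument: the compiler selects the std::abs
    // overload that matches the argument type. Writing std::abs<T>(...)
    // would assume std::abs is a function template for arithmetic types,
    // which the standard library does not promise.
    return std::abs(value - center) <= radius;
}
```

For example, `withinBand(1.5f, 2.0f, 0.75f)` resolves to the `float` overload and `withinBand(10, 3, 5)` to the `int` overload, with no per-library special casing.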

The initial version of the code in this PR was written manually; the final version was reviewed and formatted with the assistance of the GPT-5.5 large language model.

Current Metal backend limitations:

  • Only the system-default Metal device is exposed.
  • Pinned host-memory allocation is not supported.
  • Metal shaders do not provide native FP64 arithmetic. Double-precision paths use host or library fallbacks where implemented.
  • Two-dimensional morphology masks larger than 19 x 19 are unsupported for non-boolean inputs.
  • Initial backend startup includes runtime Metal pipeline compilation.

Resolves #2878.

Changes to Users

  • Adds the AF_BUILD_METAL CMake option.
  • Adds the ArrayFire::afmetal CMake target and Metal backend library.
  • Adds AF_BACKEND_METAL to the public backend enumeration.
  • Adds Metal backend discovery and selection through the Unified API.
  • On macOS systems with Metal enabled, Unified now prefers Metal over OpenCL by default.
  • Users building the Metal backend must provide Apple's metal-cpp headers via METALCPP_INCLUDE_DIR.
  • Existing user code does not need source changes unless it explicitly depends on backend ordering or wants to select the new Metal backend.

Checklist

  • Rebased on latest master
  • Code compiles
  • Tests pass
  • Functions added to unified API
  • Functions documented
