This is an introduction to developing within the MAX open source project. If you plan to contribute changes back to the repo, first read everything in CONTRIBUTING.md.
If you just want to build with MAX and aren't interested in developing in the source code, see the MAX quickstart guide instead.
First, make sure your system meets the
MAX system requirements.
The same requirements that apply to the `modular` package apply to developing in
this repo.
In particular, if you're on macOS, make sure you have the Metal toolchain (required for GPU programming with recent versions of Xcode). You can install it with `xcodebuild -downloadComponent MetalToolchain`.
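To confirm the toolchain is actually present, you can query the Metal compiler through `xcrun` (a common macOS verification step, not a command this repo requires):

```sh
# Fails with an error if the Metal toolchain component is missing.
xcrun -sdk macosx metal --version
```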
Then you can get started:
- Fork the repo, clone it, and create a branch.

- Optionally, install pixi. We use it in our code examples for package management and virtual environments:

  ```sh
  curl -fsSL https://pixi.sh/install.sh | sh -
  ```

- Optionally, install the Mojo extension in VS Code or Cursor.
That's it.
The build system uses Bazel; if you don't have it, the `bazelw` script in the next step installs it.
From the repo root, run this bazelw command to run all the MAX tests:
```sh
./bazelw test //max/...
```

If it's your first time, it starts by installing the Bazel version manager, Bazelisk, which then installs Bazel.
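While iterating, standard Bazel flags help keep local runs manageable. A sketch (the flag is standard Bazel; the target is just an example):

```sh
# Print logs only for failing tests instead of all test output.
./bazelw test --test_output=errors //max/tests/tests:cpu_local_tests
```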
Not every MAX test has the same local requirements. Before running a broad target, check which of these constraints apply:
- Hugging Face auth: Some tests exercise gated Hugging Face repos or fetch remote config files. Prefer signing in once with the Hugging Face CLI so the local cache and library calls can reuse the saved credentials:

  ```sh
  hf auth login
  ```

  Use `HF_TOKEN` when you need non-interactive auth, or when a specific test target explicitly requires that environment variable to be inherited into the Bazel action:

  ```sh
  export HF_TOKEN="hf_..."
  ```
- Model downloads: Some integration tests resolve model snapshots through the local Hugging Face cache and expect the snapshot to be present already. On a fresh machine, warm the cache first with:

  ```sh
  ./bazelw run //max/tests/integration/tools:download_models_for_testing -- \
      meta-llama/Llama-3.2-1B-Instruct
  ```
- GPU requirements: Many integration targets are tagged `gpu` and will not run successfully on a CPU-only machine. Prefer CPU or pure unit targets if your change does not touch GPU execution.

- Networked tests: Targets tagged `requires-network` may contact Hugging Face or other remote endpoints and are more likely to fail in restricted or offline environments.
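Several of these constraints can be handled from the command line with standard Bazel flags (the target patterns below are examples): `--test_tag_filters` skips targets by tag on machines that can't satisfy them, and `--test_env=HF_TOKEN` forwards the variable from your shell into test actions:

```sh
# Skip GPU-only and networked tests on a CPU-only or offline machine.
./bazelw test --test_tag_filters=-gpu,-requires-network //max/tests/...

# Forward HF_TOKEN from your shell into tests that read it.
HF_TOKEN="hf_..." ./bazelw test --test_env=HF_TOKEN //max/tests/integration/pipelines:tests
```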
You can run all the tests within a specific subdirectory by specifying the
subdirectory followed by `/...`. For example:
```sh
./bazelw test //max/tests/integration/graph/...
./bazelw test //max/tests/tests/torch/...
```

To find all the test targets, you can run:

```sh
./bazelw query 'tests(//max/tests/...)'
```

For local iteration, start with a small target set that matches your change, then expand to the full relevant suites before sending a PR. The following commands are good local starting points, not a substitute for broader test coverage:
| Change type | Suggested command | Typical prerequisites |
|---|---|---|
| Core Python logic and lightweight local regression | `./bazelw test //max/tests/tests:cpu_local_tests` | No GPU; usually no `HF_TOKEN` |
| Serve process-control unit tests | `./bazelw test //max/tests/tests/serve/unit:tests` | CPU-only, but slower than the default local suite |
| Pipeline library or architecture logic that should stay CPU-safe | `./bazelw test //max/tests/tests/pipelines/... //max/tests/integration/pipelines:tests` | Network may be needed for some pipeline tests |
| Tokenization or HF-backed pipeline integration | `./bazelw test //max/tests/integration/pipelines/tokenization:tests //max/tests/integration/architectures/internvl_network_tests:tests` | Hugging Face auth, network, and a GPU-capable machine |
| GPU runtime, graph, or kernel-facing changes | `./bazelw test //max/tests/tests:test_interpreter_ops_gpu //max/tests/integration/pipelines:tests_gpu` | GPU required; network often required |
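To zero in on a single failing case inside one of these targets, Bazel's standard `--test_filter` flag narrows the run (exact matching semantics depend on the test runner; the test name below is hypothetical):

```sh
# Run only test cases matching the filter within the target.
./bazelw test //max/tests/tests/serve/unit:tests --test_output=errors \
    --test_filter=test_shutdown
```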
If you are unsure whether a target needs network or GPUs, inspect its Bazel
rule for tags such as `gpu` or `requires-network`, or for `env_inherit = ["HF_TOKEN"]`.
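You can also list tagged targets directly with a `bazel query` expression instead of reading BUILD files. The `attr` and `tests` query functions are standard Bazel; note that `attr` does a regex match, so a pattern like `gpu` may also match longer tags:

```sh
# List all test targets under //max/tests/... that carry a "gpu" tag.
./bazelw query 'attr(tags, "gpu", tests(//max/tests/...))'
```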
When adding new CPU-safe tests with lightweight local prerequisites, prefer
including them in `//max/tests/tests:cpu_local_tests` so contributors have a
fast, shared baseline suite for local iteration.
When developing a new model architecture, or testing MAX API changes against existing models, you can use the following Bazel commands to run inference.
Note
Some models require Hugging Face authentication to load model weights, so prefer signing in once with the Hugging Face CLI:

```sh
hf auth login
```

If you need non-interactive auth in CI or a shell session dedicated to Bazel,
you can still export `HF_TOKEN` instead.
For example, this `entrypoints:pipelines` generate command is equivalent to
running inference with `max generate`:

```sh
./bazelw run //max/python/max/entrypoints:pipelines -- generate \
    --model OpenGVLab/InternVL3-8B-Instruct \
    --prompt "Hello, world!"
```

And this is equivalent to creating an endpoint with `max serve`:
```sh
./bazelw run //max/python/max/entrypoints:pipelines -- serve \
    --model OpenGVLab/InternVL3-8B-Instruct \
    --trust-remote-code
```

Here are some docs to help start developing in the MAX framework:
- Contributing new model architectures
- Benchmarking a MAX endpoint
- Benchmarking Mojo kernels with `kbench`
- Kernel profiling with Nsight Compute
- Contributing changes to the repo
For more documentation, see docs.modular.com.