[cuda graphs] Integrate kernel annotations into inductor cudagraph trees#179769
yushangdi wants to merge 8 commits into gh/yushangdi/25/base
CI status (Dr. CI, as of commit 2b68f16 with merge base a4a69be): 2 new failures, 2 unrelated failures (broken trunk: these jobs also failed on the merge base; rebasing onto the `viable/strict` branch avoids them). Artifacts and rendered test results: hud.pytorch.org/pr/179769
Stack from ghstack (oldest at bottom):
Add `config.triton.cudagraph_kernel_annotations` (default `False`) and wire
up the clear/resolve/remap calls in `CUDAGraphNode._record()` so that
`mark_kernels()` scopes that fire during cudagraph capture are processed
automatically.
**WIP:** the integration works but still has bugs; the annotations are not accurate yet.
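
To make the intended control flow easier to review, here is a minimal pure-Python sketch of a clear/resolve/remap lifecycle gated by a flag. It does not use PyTorch and is not the code in this PR: `ToyAnnotationState`, `record`, `launch`, the `annotations_enabled` argument, and the `cap*`/`exec*` ids are all hypothetical names, and the meaning given to each step (clear stale state before capture, resolve scopes to capture-time kernel ids, remap those to replay-time ids) is one plausible reading of the description, not a confirmed one.

```python
from contextlib import contextmanager
from dataclasses import dataclass, field


@dataclass
class ToyAnnotationState:
    """Hypothetical bookkeeping for kernel annotations; not a PyTorch API."""

    pending: list = field(default_factory=list)   # (scope name, launch index)
    resolved: dict = field(default_factory=dict)  # kernel id -> scope name
    launches: int = 0
    _scopes: list = field(default_factory=list)

    def clear(self):
        # Drop anything left over from a previous recording.
        self.pending.clear()
        self.resolved.clear()
        self.launches = 0

    @contextmanager
    def mark_kernels(self, name):
        # Kernels launched inside this scope are tagged with `name`.
        self._scopes.append(name)
        try:
            yield
        finally:
            self._scopes.pop()

    def launch(self):
        # Stand-in for a kernel launch observed during capture.
        index = self.launches
        self.launches += 1
        if self._scopes:
            self.pending.append((self._scopes[-1], index))
        return index

    def resolve(self, capture_ids):
        # Attach each pending scope to the capture-time id of its kernel.
        self.resolved = {capture_ids[i]: scope for scope, i in self.pending}
        self.pending.clear()

    def remap(self, id_map):
        # Translate capture-time ids to the ids used at replay time.
        self.resolved = {id_map[k]: v for k, v in self.resolved.items()}


def record(model, state, annotations_enabled=False):
    """Toy analogue of a record step; the flag mirrors the new config option."""
    if annotations_enabled:
        state.clear()
    model(state)  # stands in for running the model under graph capture
    if not annotations_enabled:
        return {}
    capture_ids = [f"cap{i}" for i in range(state.launches)]
    state.resolve(capture_ids)
    state.remap({cid: f"exec{i}" for i, cid in enumerate(capture_ids)})
    return dict(state.resolved)


def toy_model(state):
    state.launch()  # launched outside any scope, so it stays unannotated
    with state.mark_kernels("attention"):
        state.launch()
        state.launch()
    with state.mark_kernels("mlp"):
        state.launch()


annotated = record(toy_model, ToyAnnotationState(), annotations_enabled=True)
disabled = record(toy_model, ToyAnnotationState(), annotations_enabled=False)
print(annotated)
print(disabled)
```

With the flag off, the sketch skips all three steps and returns nothing, which matches the default-`False` behaviour described above; with it on, only kernels launched inside a scope end up annotated.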
Authored with Claude.
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @kadeng @muchulee8 @amjames @chauhang @aakhundov @coconutruben @jataylo