[autograd] Thread-safe Python wrapping for Node#179767
[autograd] Thread-safe Python wrapping for Node#179767colesbury wants to merge 11 commits intogh/colesbury/6/basefrom
Conversation
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/179767
Note: Links to docs will display an error until the docs builds have been completed. ❗ 1 Active SEVsThere are 1 currently active SEVs. If your PR is affected, please view them below: ✅ No FailuresAs of commit fb0d22d with merge base 83de8b8 ( This comment was automatically generated by Dr. CI and updates every 15 minutes. |
This PR needs a
|
|
Got Claude to run some benchmarks on this stack of PRs and didn't see a noticeable change, maybe a tiny bit faster for a few cases, but importantly doesn't look like a regression anywhere: https://gist.github.com/colesbury/8ac150f3d6e43d3adff2f888e9aeb5f5 |
Replace the raw PyObject* on Node with an atomic PyObjectSlot and use the same strategy that we already use for Tensor and Storage PyObject wrappers. Authored with Claude. ghstack-source-id: 176276e Pull-Request: pytorch#179767
Replace the raw PyObject* on Node with an atomic PyObjectSlot and use the same strategy that we already use for Tensor and Storage PyObject wrappers. Authored with Claude. ghstack-source-id: 176276e Pull-Request: pytorch#179767
|
|
||
| // Enable PyObject support for intrusive_ptr<Node> so that the refcount | ||
| // machinery calls incref_pyobject/decref_pyobject on transitions. | ||
| namespace c10::detail { |
There was a problem hiding this comment.
Claude seems to take offence at this one:
- Missing #ifndef C10_MOBILE guard on TargetTraits specialization —
torch/csrc/autograd/node.h:707-715. Both TensorImpl.h:3096 and StorageImpl.h:407 wrap their
TargetTraits specialization in #ifndef C10_MOBILE. The intrusive_ptr refcount machinery relies on
this: in mobile builds, has_pyobject paths are disabled and guarded by
TORCH_INTERNAL_ASSERT_DEBUG_ONLY(!detail::has_pyobject(combined)) (see intrusive_ptr.h:1191). If
node.h is ever transitively included in a mobile build, can_have_pyobject = true for Node would fire
that assertion. Even if autograd headers aren't used on mobile today, the guard should be present for
consistency and safety.
There was a problem hiding this comment.
I'll add the C10_MOBILE ifdef guard
Stack from [ghstack](https://github.com/ezyang/ghstack/tree/0.14.0) (oldest at bottom): * #179767 * -> #179766 * #179765 * #179764 Replace std::shared_ptr<Node> with c10::intrusive_ptr<Node> throughout the autograd system. Node now inherits from c10::intrusive_ptr_target instead of std::enable_shared_from_this<Node>. Authored with Claude. Pull Request resolved: #179766 Approved by: https://github.com/albanD ghstack dependencies: #179764, #179765
Replace the raw PyObject* on Node with an atomic PyObjectSlot and use the same strategy that we already use for Tensor and Storage PyObject wrappers. Authored with Claude. [ghstack-poisoned]
Stack from [ghstack](https://github.com/ezyang/ghstack/tree/0.14.0) (oldest at bottom): * #179767 * -> #179766 * #179765 * #179764 Replace std::shared_ptr<Node> with c10::intrusive_ptr<Node> throughout the autograd system. Node now inherits from c10::intrusive_ptr_target instead of std::enable_shared_from_this<Node>. Authored with Claude. cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx kadeng chauhang amjames Lucaskabela jataylo azahed98 xmfan aditvenk [ghstack-poisoned]
Stack from ghstack (oldest at bottom):
Replace the raw PyObject* on Node with an atomic PyObjectSlot
and use the same strategy that we already use for Tensor and Storage
PyObject wrappers.
Authored with Claude.