Add GPU tensor support to decoupled API#154
Merged
Merged
Conversation
87fb69c to
e677a70
Compare
GuanLuo
reviewed
May 13, 2022
| "GPU buffers size does not match the provided buffers: ") + | ||
| std::to_string(gpu_tensors.size()) + | ||
| " != " + std::to_string(*gpu_buffer_count)); | ||
| return; |
Member
Author
There was a problem hiding this comment.
This is more like assert and will never be triggered. I can add a comment for it.
| // limitation in the legacy CUDA IPC API that doesn't allow getting the | ||
| // handle of an exported pointer. If the cuda handle exists, it indicates | ||
| // that the cuda shared memory was used and the input is in a single | ||
| // buffer. [FIXME] for the case where the input is in cuda shared memory |
| reinterpret_cast<ResponseSendMessage*>(send_message.data_.get()); | ||
| std::unique_ptr<PbString> error_message; | ||
| ScopedDefer _([send_message_payload] { | ||
| ScopedDefer _([&send_message_payload] { |
Contributor
There was a problem hiding this comment.
Why take reference of a pointer?
Member
Author
There was a problem hiding this comment.
No particular reason. I'll revert this change.
tanmayv25
requested changes
May 17, 2022
rmccorm4
requested changes
May 20, 2022
rmccorm4
left a comment
Contributor
There was a problem hiding this comment.
Using Go-style defer is cool to see 🙂
724b1ce to
f349db5
Compare
tanmayv25
approved these changes
May 20, 2022
tanmayv25
approved these changes
May 20, 2022
GuanLuo
reviewed
May 20, 2022
| ++index; | ||
| } | ||
|
|
||
| // Additional round trip so that the stub can fill the GPU output buffers. |
Contributor
There was a problem hiding this comment.
Why there is another round trip? the backend process signals back that there is CUDA IPC handle and wait for stub process to copy the data to within the CUDA IPC handle?
Do I summarize the workflow correctly:
- stub requests for output buffer (and passing buffer via shared memory at the same time if CPU tensor)
- backend acquires output buffer, copy if CPU tensor, otherwise, passing the CUDA IPC handle back to stub.
- (only for GPU tensor) stub then copy GPU tensor into the CUDA IPC handle
Member
Author
There was a problem hiding this comment.
Exactly. The third bullet point is what requires the round trip.
rmccorm4
approved these changes
May 21, 2022
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Uh oh!
There was an error while loading. Please reload this page.