74 changes: 71 additions & 3 deletions explainer/index.bs
@@ -268,7 +268,7 @@ the time it was issued.
event on `GPUDevice`<sup>3</sup> (and could issue a console warning as well).

<sup>1</sup>
In the plan to add [[#multithreading]], error scope state would actually be **per-device, per-realm**.
That is, when a GPUDevice is posted to a Worker for the first time, the error scope stack for
that device+realm is always empty.
(If a GPUDevice is copied *back* to an execution context it already existed on, it shares its
@@ -282,7 +282,7 @@
In poorly-formed applications, this mechanism can prevent the events from having a
performance impact on the system.

<sup>3</sup>
More specifically, with [[#multi-threading]], this event would only exists on the *originating*
More specifically, with [[#multithreading]], this event would only exists on the *originating*
`GPUDevice` (the one that came from `createDevice`).
It doesn't exist on `GPUDevice`s produced by sending messages.

@@ -713,7 +713,75 @@ When using advanced methods to transfer data to the GPU (with a rolling list of
</pre>
</div>

## Multithreading ## {#multithreading}

Multithreading is a key part of modern graphics APIs.
Unlike OpenGL, newer APIs allow applications to encode commands, submit work, upload resources, and
so on, from multiple threads at once, alleviating CPU bottlenecks.
This is especially relevant to WebGPU, since IDL bindings are generally much slower than C calls.

WebGPU does not *yet* allow multithreaded use of a single `GPUDevice`, but the API has been
designed from the ground up with this in mind.
This section describes the tentative plan for how it will work.

As described in [[#gpu-process]], most WebGPU objects are actually just "handles" that refer to
objects in the browser's GPU process.
As such, it is relatively straightforward to allow these to be shared among threads.
For example, a `GPUTexture` object can simply be `postMessage()`d to another thread, creating a
new `GPUTexture` JavaScript object containing a handle to the *same* GPU process object.

Several objects, like `GPUBuffer`, have client-side state.
Applications still need to use them from multiple threads without `transfer`ring them back
and forth (which would also create new wrapper objects, breaking old references).
These objects will also be `[Serializable]` but have **shared client-side state**, just like
`SharedArrayBuffer`.
For example, for threads Main and Worker:

- Main: createBuffer &rarr; B1.
- Main: postMessage to Worker.
- Worker: receive message &rarr; B2.
- Worker: `B2.mapAsync()` &rarr; successfully puts the buffer in the "map pending" state.
- Main: `B1.mapAsync()` &rarr; **throws an exception**.
- Main: Encode some command that uses `B1`, like:

```js
encoder.copyBufferToTexture(B1, T);
const commandBuffer = encoder.finish();
```

&rarr; succeeds, because this doesn't depend on the buffer's client-side state.
- Main: `queue.submit(commandBuffer)` &rarr; **asynchronous WebGPU error**,
because the CPU currently owns the buffer.
- Worker: waits for the mapping, writes to it, then calls `B2.unmap()`.
- Main: `queue.submit(commandBuffer)` &rarr; succeeds.
- Main: `B1.mapAsync()` &rarr; successfully puts the buffer in the "map pending" state.

Further discussion can be found in [#354](https://github.com/gpuweb/gpuweb/issues/354)
(note that not all of it reflects current thinking).

### Unsolved: Synchronous Object Transfer ### {#multithreading-transfer}

Some application architectures require objects to be passed between threads without having to
asynchronously wait for a message to arrive on the receiving thread.

The most crucial case is WebAssembly applications:
Programs using native C/C++/Rust/etc. bindings for WebGPU will want to assume object handles
are plain-old-data (e.g. `typedef struct WGPUBufferImpl* WGPUBuffer;`)
that can be passed between threads freely.
Unfortunately, this cannot be implemented in C-on-JS bindings (e.g. Emscripten) without complex,
hidden, and slow asynchronicity (yielding on the receiving thread, interrupting the sending
thread to send a message, then waiting for the object on the receiving thread).

Some alternatives are mentioned in issue [#747](https://github.com/gpuweb/gpuweb/issues/747):

- `SharedObjectTable`, an object with shared-state (like `SharedArrayBuffer`) containing a table of
`[Serializable]` values. Effectively, a store into the table would serialize once, and then any
thread with the `SharedObjectTable` could (synchronously) deserialize the object on demand.
- A synchronous `MessagePort.receiveMessage()` method.
This would be less ideal as it would require any thread that creates one of these objects to
eagerly send it to every thread, just in case they need it later.
- Allow "exporting" a numerical ID for an object that can be used to "import" the object on
another thread. This bypasses the garbage collector and makes it easy to leak memory.


## Command Encoding and Submission ## {#command-encoding}