GPU Web 2025 07 30
Corentin Wallez edited this page Aug 19, 2025 · 1 revision
Chair: CW
Scribe: KR
Location: Google Meet
Agenda
- Administrivia
- CTS Update
- Only 32 timestamp queries allowed on Metal MacOS? #5261
- Should we disallow swizzle with depth textures? #5266
- GPUBindGroupLayoutEntry is accepting an empty dictionary for externalTexture, should accept boolean instead #5249
- (quick) Should we raise an error if non-default swizzle is used without feature enabled? #5264
- WGSL: Widen tanh acceptable range (output values) #5199
- WGSL: add support for Prefix Operators i.e. ++i #5181
- Socialize: wgsl: support swizzle assignment #5268 *
- Investigation reportbacks: primitive ID and barycentrics
- Agenda for next meeting
Attendance
- Apple
- Mike Wyrzykowski
- Google
- Brandon Jones
- Corentin Wallez
- Dan Sinclair
- David Neto
- James Price
- Kai Ninomiya
- Ken Russell
- Peter McNeeley
- Alan Baker
- Microsoft
- Rafael Cintron
- Mozilla
- Jim Blandy
- Nvidia
- Anders Leino
- Mehmet Oguz Derin
Administrivia
- CW: Please add to the F2F doc.
CTS Update
- Gregg has continued implementing tests for new features!
Only 32 timestamp queries allowed on Metal MacOS? #5261
- MW: a few options. Can do what we do for samplers. These can use at most 4 ints per render pass, so could have space for 2048 per sample buffer, and with 32 sample buffers, can support tens of thousands of timestamp queries before we have to resort to the sampler trick.
- CW: sounds reasonable
- JB: sample buffers are not the same as individual queries?
- MW: that's an impl detail. That's how Safari implements them. But nothing stops an impl from allocating a 32 KB counter sample buffer and using that across multiple query sets. Virtualization of that.
- CW: on Dawn can virtualize a bunch per device.
- MW: if my counter sample buffer's 32 KB then I can distribute my indices across it. At least 2048 timestamp queries per counter sample buffer, all in flight at once. Needs to be verified.
- JB: need more investigation about WebGPU timestamp queries on Metal.
- CW: OK, try things and come back to this later.
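The arithmetic behind the numbers quoted above can be sketched as follows. This is a minimal TypeScript sketch, not Safari's or Dawn's implementation. The 16-byte slot size (4 ints of 4 bytes each) is an assumption read from the "4 ints" remark and the 32 KB / 2048 figures, and all names (`SLOT_BYTES`, `slotFor`, and so on) are hypothetical. As MW notes, the real capacity still needs to be verified on Metal.

```typescript
// Hypothetical constants taken from the figures quoted in the discussion.
const SAMPLE_BUFFER_BYTES = 32 * 1024; // one 32 KB counter sample buffer
const SLOT_BYTES = 4 * 4;              // assumption: 4 ints of 4 bytes per slot
const MAX_SAMPLE_BUFFERS = 32;         // "with 32 sample buffers"

const SLOTS_PER_BUFFER = SAMPLE_BUFFER_BYTES / SLOT_BYTES;
const TOTAL_SLOTS = SLOTS_PER_BUFFER * MAX_SAMPLE_BUFFERS;

interface Slot {
  buffer: number; // which counter sample buffer
  index: number;  // slot within that buffer
}

// Map a device-wide "virtual" query number onto (sample buffer, slot index).
function slotFor(virtualQuery: number): Slot {
  if (
    !Number.isInteger(virtualQuery) ||
    virtualQuery < 0 ||
    virtualQuery >= TOTAL_SLOTS
  ) {
    throw new RangeError("query " + virtualQuery + " is outside the available slots");
  }
  return {
    buffer: Math.floor(virtualQuery / SLOTS_PER_BUFFER),
    index: virtualQuery % SLOTS_PER_BUFFER,
  };
}
```

Under these assumptions each buffer holds 2048 slots and 32 buffers hold 65536, which matches the "tens of thousands" estimate.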
Should we disallow swizzle with depth textures? #5266
- KN: more investigation needed, skip for now.
GPUBindGroupLayoutEntry is accepting an empty dictionary for externalTexture, should accept boolean instead #5249
- KN: the claim is that any dictionary with no required members should accept a boolean. Needs citation. These are not the only dictionaries we have that can be passed with no arguments - SamplerDescriptor and a few other things. I assume this doesn't make sense. Unless there's a written guideline, I'll say we're not interested in it.
- CW: We need to ask the reporter where this guideline comes from because this doesn't make sense a priori for these WebGPU APIs. E.g. a boolean instead of a descriptor isn't useful.
(quick) Should we raise an error if non-default swizzle is used without feature enabled? #5264
- CW: e.g. if you don't enable it and pass the default swizzle of RGBA, is that an error?
- KN: if an extension adds a dictionary member, it's generally OK to set it to the default value. Setting it to a non-default value is a validation error.
- JB: gives content a bit more flexibility. Don't have to avoid providing the value at all.
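The rule KN describes can be sketched as a small validation function. This is an illustrative TypeScript sketch only: the four-character string form of the swizzle, the `featureEnabled` flag, and the function name are assumptions, not the spec's API.

```typescript
// Assumption: the swizzle is a four-character string whose default is "rgba".
const DEFAULT_SWIZZLE = "rgba";

// Returns null when the value is acceptable, or an error message otherwise.
function validateSwizzle(featureEnabled: boolean, swizzle?: string): string | null {
  const value = swizzle === undefined ? DEFAULT_SWIZZLE : swizzle;
  if (value === DEFAULT_SWIZZLE) {
    return null; // default value is fine whether or not the feature is enabled
  }
  if (!featureEnabled) {
    return "non-default swizzle requires the feature to be enabled";
  }
  return null;
}
```

Content that always passes the member explicitly therefore keeps working without the feature, as long as the value it passes is the default.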
WGSL: Widen tanh acceptable range (output values) #5199
- PM: this is NVIDIA. Either accept wider range, or polyfill. Polyfill is at least 2x slower. David found that NVIDIA was choosing lower precision.
- DN: this is only on recent GPUs with tensor cores.
- PM: sigmoid function's used in ML, so assume NVIDIA's optimizing its speed.
- DN: have similar things for acos / asin on various Intel GPUs for example. Suggest we relax it.
- JB: users can write (?? a specific approximation?) if they want.
- PM: agree. 2x slower in my benchmarking though.
- CW: consensus via voiced agreement to lower the precision.
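If the suggestion is that authors who need tighter results can supply their own tanh, one generic option is the exponential identity below. This is a sketch in TypeScript for illustration, not the polyfill Tint uses (the notes do not describe it), and `tanhViaExp` is a hypothetical name. A shader version would be written in WGSL with `f32` arithmetic and would have different accuracy.

```typescript
// tanh(x) = 1 - 2 / (exp(2x) + 1)
// The clamp keeps exp() finite. JavaScript doubles would cope without it,
// but it is included because a shader translation should not rely on
// infinities behaving well.
function tanhViaExp(x: number): number {
  const clamped = Math.max(-20, Math.min(20, x));
  return 1 - 2 / (Math.exp(2 * clamped) + 1);
}
```

The extra `exp` and divide are the kind of cost that makes a hand-written version slower than a native instruction.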
WGSL: add support for Prefix Operators i.e. ++i #5181
- JB: think Google, Apple and Mozilla have all thumbs-up'd dneto's proposal that we shouldn't do this.
- CW: think we should say no, and provide justification. I do hope that we can add more useful language features.
- DN: the backward compatibility was the long pole in the tent for me. I'd have no problem doing predecrement and preincrement as statements, but that doesn't buy you anything anyway.
Socialize: wgsl: support swizzle assignment #5268
- (Revised)
- DN: people have wanted swizzle assignment for a long time. We can give them what they want. Can show them what will work and what won't.
- DN: subtle: order of operations in e.g. compound assignment. Eval of RHS can modify memory you're evaluating on the LHS. Have already spoken with Oguz and Alan about it.
- JP: This happens today in assignment to component of a vector.
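The ordering subtlety can be illustrated with a small emulation of `v.yx = rhs`. This TypeScript sketch picks one possible order (evaluate the whole right-hand side once, then store) purely for illustration; it is not the order specified in #5268. Rejecting repeated components is likewise an assumption, and `assignSwizzle` is a hypothetical helper.

```typescript
type Vec = number[];
const COMPONENTS = "xyzw";

// Emulates `target.<swizzle> = rhs()`. The right-hand side is evaluated
// exactly once, before any component of the target is written.
function assignSwizzle(target: Vec, swizzle: string, rhs: () => Vec): void {
  const indices = swizzle.split("").map((c) => COMPONENTS.indexOf(c));
  if (indices.some((i) => i < 0 || i >= target.length)) {
    throw new Error("component not present in target");
  }
  // Assumption: a repeated component such as `v.xx = ...` is rejected.
  if (new Set(indices).size !== indices.length) {
    throw new Error("repeated component in swizzle");
  }
  const value = rhs(); // evaluate RHS first, so it sees the old target
  if (value.length !== indices.length) {
    throw new Error("component count mismatch");
  }
  indices.forEach((dst, k) => {
    target[dst] = value[k];
  });
}

// Usage: the RHS reads the same vector that the LHS writes.
const v: Vec = [1, 2, 3];
assignSwizzle(v, "yx", () => [v[0], v[1]]);
// v is now [2, 1, 3]: a swap, because the RHS was read before any store.
```

If the stores were interleaved with reads of the right-hand side instead, the same statement would smear one component over the other, which is why the order has to be pinned down.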
Investigation reportbacks: primitive ID and barycentrics
- Primitive ID issue: #1786
- https://dawn.googlesource.com/dawn/+/refs/heads/main/docs/tint/extensions/chromium_experimental_primitive_id.md
- https://dawn-review.googlesource.com/c/dawn/+/254434/3/docs/tint/extensions/chromium_experimental_barycentric_coord.md
- BJ: picked up some work Dan did implementing primitive_id in Tint, plumbed up to Dawn/Blink, got CTS tests. Answered some questions about that API that're in the notes Dan has here.
- If primitive_id overflows 2^32-1 it goes to 0. In WebGPU it doesn't matter because we can't draw that many primitives in a single shot, and instances don't count.
- Primitive restart and strip topology. Need to check on Mac. Think everything handles it the same. Doesn't reset primitive_id to 0; just skips over that one.
- Has good support across the board. Can get primitive_id from a few places. Think it's a good candidate to release as a feature.
- BJ: barycentrics. Had hoped these would be similarly well supported. But, no.
- Much smaller range of hardware they're supported on. Vulkan extension only has ~10% device support.
- Next step is to finalize primitive_id then.
- JB: reason barycentric stuff isn't widely available even though everyone's computing them is that sometimes they compute them temporarily and then throw them away.
- KN: can we synthesize in geometry shader?
- BJ: can construct your mesh to contain barycentrics, just duplicate geometry. Don't know what that looks like in tessellation/mesh shader.
- KN: geom shader happens after primitive assembly. Vertex data's been duplicated at that point.
- CW: geom shaders have weird performance characteristics, can be a major deoptimization.
- DS: Metal has barycentric coordinates so wouldn't need to emulate there. Rafael and Mike, can you please look at the primitive_id Markdown w.r.t. HLSL and MSL?
- RC: D3D dev (Jesse) and I agree primitive id experimental feature can be implemented on D3D.
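BJ's fallback of baking barycentrics into the mesh by duplicating geometry can be sketched on the CPU side as below. This is a minimal TypeScript sketch under the assumption of an indexed triangle list; `expandWithBarycentrics` and the output layout are hypothetical, and no WebGPU calls are involved.

```typescript
interface ExpandedMesh {
  positions: number[][];    // one entry per triangle corner, no sharing
  barycentrics: number[][]; // (1,0,0), (0,1,0), (0,0,1) per triangle
}

// Expand an indexed triangle list into a non-indexed one and attach a
// barycentric attribute to every corner. Shared vertices are duplicated,
// since each corner needs its own barycentric value.
function expandWithBarycentrics(positions: number[][], indices: number[]): ExpandedMesh {
  if (indices.length % 3 !== 0) {
    throw new Error("index count must be a multiple of 3");
  }
  const corners = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
  ];
  const out: ExpandedMesh = { positions: [], barycentrics: [] };
  indices.forEach((vertexIndex, i) => {
    out.positions.push(positions[vertexIndex].slice());
    out.barycentrics.push(corners[i % 3].slice());
  });
  return out;
}

// Usage: a quad made of two triangles sharing an edge.
const quad = expandWithBarycentrics(
  [[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0]],
  [0, 1, 2, 0, 2, 3],
);
```

The vertex count grows from 4 to 6 here, which is the duplication cost of this workaround; the fragment shader would then read the interpolated attribute in place of a built-in.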
Agenda for next meeting
- Cancel next week's meeting
- Try to make progress on issues offline
- Register for F2F, even remote attendance