Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
33 changes: 31 additions & 2 deletions correspondence/index.bs
Original file line number Diff line number Diff line change
Expand Up @@ -301,15 +301,15 @@ User agents are not required to use these formulas and may expose whatever they
<td>`viewportBoundsRange[0]` (= -2 &times; `max(maxViewportDimensions[0..1])`)
<td><p class=issue>*No documented limit?*
<td>-32768 = `D3D12_VIEWPORT_BOUNDS_MIN`

Note: equal to -2 &times; `maxTextureDimension2D`
<tr>
<th>Max Viewport Bounds (implied)
<td>[#373](https://github.com/gpuweb/gpuweb/issues/373)
<td>`viewportBoundsRange[1]` (= 2 &times; `max(maxViewportDimensions[0..1])` - 1)
<td><p class=issue>*No documented limit?*
<td>32767 = `D3D12_VIEWPORT_BOUNDS_MAX`

Note: equal to 2 &times; `maxTextureDimension2D` - 1
</table>

Expand All @@ -325,3 +325,32 @@ Warning:
[Imagination](https://github.com/gpuweb/gpuweb/issues/3631#issuecomment-1498747606) drivers.
On these drivers, the combined limit may need to be ignored.
</p>

## `subgroup-size-control` ## {#subgroup-size-control}

The `subgroup-size-control` feature allows the use of the WGSL `subgroup_size` attribute
in compute shaders to request a specific subgroup size for pipeline creation.

No explicit limits (such as min/max subgroup size or max workgroup subgroup count) are exposed
on `GPUAdapterInfo` for this feature
([#6241](https://github.com/gpuweb/gpuweb/issues/6241)).
Instead, if the implementation cannot create a pipeline with the requested subgroup size,
it results in an uncategorized error during pipeline creation.
At least one power-of-two subgroup size between `subgroupMinSize` and `subgroupMaxSize` must
be supported.

Note:
The native APIs do expose limits related to explicit subgroup size control:
- Vulkan: `VkPhysicalDeviceSubgroupSizeControlProperties::{minSubgroupSize, maxSubgroupSize, maxComputeWorkgroupSubgroups}`
- D3D12: `D3D12_FEATURE_DATA_D3D12_OPTIONS1::{waveLaneCountMin, waveLaneCountMax}`

These are not surfaced in WebGPU because:
- The D3D12 `waveLaneCountMax` is not reliable according to [DirectXShaderCompiler Wiki](https://github.com/microsoft/DirectXShaderCompiler/wiki/Wave-Intrinsics/#caps-flags).
- The D3D12 `waveLaneCountMin` may differ from the actual minimum subgroup sizes used in fragment shaders on some Intel GPUs.
- `maxComputeWorkgroupSubgroups` only exists on Vulkan and has no D3D12 equivalent.

Metal does not natively support explicit subgroup size control. According to the
[Metal documentation](https://developer.apple.com/documentation/apple-silicon/porting-your-metal-code-to-apple-silicon#Determine-the-SIMD-Group-Size-at-Runtime):
"The size of a SIMD group varies between different GPUs, particularly Mac GPUs. Don't assume the size of SIMD groups."
Browsers may choose to expose the feature on Metal+Apple Silicon with a single constant subgroup
size, but they do so at their own risk since the SIMD width is not guaranteed.
41 changes: 10 additions & 31 deletions proposals/subgroup-size-control.md
Original file line number Diff line number Diff line change
Expand Up @@ -27,46 +27,25 @@ This feature has not been approved by the working group yet.
| Metal | Not Supported | According to [Metal document](https://developer.apple.com/documentation/apple-silicon/porting-your-metal-code-to-apple-silicon#Determine-the-SIMD-Group-Size-at-Runtime): <br>"The size of a SIMD group varies between different GPUs, particularly Mac GPUs. Don't assume the size of SIMD groups." |


Note that on Vulkan we need `computeFullSubgroups == VK_TRUE` because we should set `VK_PIPELINE_SHADER_STAGE_CREATE_REQUIRE_FULL_SUBGROUPS_BIT` when creating the compute pipelien to ensure the subgroup sizes must be launched with all invocations active in the compute stage.
Note that
* On Vulkan we need `computeFullSubgroups == VK_TRUE` because we should set `VK_PIPELINE_SHADER_STAGE_CREATE_REQUIRE_FULL_SUBGROUPS_BIT` when creating the compute pipeline to ensure the subgroup sizes must be launched with all invocations active in the compute stage.
* Metal does not natively support controlling subgroup size, but the browsers may choose to expose the feature on Metal at their own risk.

TODO: Shall we support this extension on the newer Apple silicons with a single acceptable subgroup size?
2. No additional explicit limits are exposed on `GPUAdapterInfo`.

2. Three new limitations for the WGSL attribute `subgroup_size`.
Previously, three limits were proposed (`explicitComputeSubgroupMinSize`, `explicitComputeSubgroupMaxSize`, `maxComputeWorkgroupSubgroups`) to expose the range of valid subgroup sizes and workgroup subgroup counts. The working group decided not to expose them ([#6241](https://github.com/gpuweb/gpuweb/issues/6241)). The reasons:
* Most users of this feature are expert developers coding for a specific architecture who already know which subgroup sizes to use.
* The native API limits do not always accurately represent the range of usable sizes (e.g. on Intel Gen12, `waveLaneCountMin` is 8 but the minimum requestable compute subgroup size is 16; on D3D12, `waveLaneCountMax` is not reliable).
* `maxComputeWorkgroupSubgroups` only exists on Vulkan and has no D3D12 equivalent.

(1) `minExplicitComputeSubgroupSize` specifies the minimum value that can be used as the attribute `subgroup_size`.

| Platform | Implementation |
|----------|------|
|Vulkan|`VkPhysicalDeviceSubgroupSizeControlPropertiesEXT::minSubgroupSize` |
|D3D12|`D3D12_FEATURE_DATA_D3D12_OPTIONS1::waveLaneCountMin` |

(2) `maxExplicitComputeSubgroupSize` specifies the maximum value that can be used as the attribute `subgroup_size`.

| Platform | Implementation |
|----------|------|
|Vulkan|`VkPhysicalDeviceSubgroupSizeControlPropertiesEXT::maxSubgroupSize` |
|D3D12|`D3D12_FEATURE_DATA_D3D12_OPTIONS1::waveLaneCountMax` |

(3) `maxComputeWorkgroupSubgroups` limits the total workgroup size when the attribute `subgroup_size` is used.

| Platform | Implementation |
|----------|------|
|Vulkan|`VkPhysicalDeviceSubgroupSizeControlProperties.maxComputeWorkgroupSubgroups` |
|D3D12| Not supported |

Note that we need new limitations instead of the existing `subgroupMinSize` and `subgroupMaxSize` is because:
1. D3D12 runtime validates `[WaveSize]` with `waveLaneCountMin` and `waveLaneCountMax`
2. On D3D12 we don't always use `waveLaneCountMin` as `subgroupMinSize` because on some Intel GPUs, it is possible to run some pixel shaders with wave lane count 8, while on that platform `waveLaneCountMin` is 16, meaning in compute shaders the wave lane count will always be at least 16.
3. On D3D12 we don't always use `waveLaneCountMax` as `subgroupMaxSize` because in [D3D12 document](https://github.com/Microsoft/DirectXShaderCompiler/wiki/Wave-Intrinsics#:~:text=UINT%20WaveLaneCountMax) "the WaveLaneCountMax queried from D3D12 API is not reliable and the meaning is unclear.
Instead, if the implementation cannot create a pipeline with the requested subgroup size (e.g. due to the size being outside the supported range, register pressure, or hardware-specific workgroup subgroup count limits), it results in an **uncategorized error** during pipeline creation. The implementation must support at least one power-of-two subgroup size between `subgroupMinSize` and `subgroupMaxSize` (from `GPUAdapterInfo`).


# Behavior
* The attribute `subgroup_size` is restricted to `compute` shaders (in HLSL `[WaveSize()]` is [only supported in compute shaders](https://microsoft.github.io/DirectX-Specs/d3d/HLSL_SM_6_6_WaveSize.html#hlsl-attribute)).
* The parameter must be a const-expression or an override-expression that resolves to an `i32` or `u32`.
* The parameter must be must be a power-of-two (required by [D3D12](https://microsoft.github.io/DirectX-Specs/d3d/HLSL_SM_6_6_WaveSize.html#allowed-wave-sizes)).
* The parameter must be greater than or equal to the `minExplicitComputeSubgroupSize` on the current adapter (required by [D3D12](https://microsoft.github.io/DirectX-Specs/d3d/HLSL_SM_6_6_WaveSize.html#runtime-validation)).
* The parameter must be less than or equal to the `maxExplicitComputeSubgroupSize` on the current adapter (required by [D3D12](https://microsoft.github.io/DirectX-Specs/d3d/HLSL_SM_6_6_WaveSize.html#runtime-validation)).
* The total workgroup size (`workgroupSize.x * workgroupsize.y * workgroupsize.z`) must be less than or equal to the product of the attribute `subgroup_size` and `maxComputeWorkgroupSubgroups` (required by [Vulkan](https://docs.vulkan.org/refpages/latest/refpages/source/VkPipelineShaderStageCreateInfo.html#VUID-VkPipelineShaderStageCreateInfo-pNext-02756)).
* If the implementation cannot create a pipeline with the requested subgroup size (e.g. due to the size being outside the supported range, register pressure, or hardware-specific workgroup subgroup count limits), it results in an uncategorized error during pipeline creation. The implementation must support at least one power-of-two subgroup size between `subgroupMinSize` and `subgroupMaxSize`.
* `workgroupSize.x` must be a multiple of the attribute `subgroup_size` (required by [Vulkan](https://docs.vulkan.org/refpages/latest/refpages/source/VkPipelineShaderStageCreateInfo.html#VUID-VkPipelineShaderStageCreateInfo-pNext-02757)).

# WGSL Specification
Expand Down
25 changes: 24 additions & 1 deletion spec/index.bs
Original file line number Diff line number Diff line change
Expand Up @@ -144,6 +144,9 @@ spec: WGSL; urlPrefix: https://gpuweb.github.io/gpuweb/wgsl/#
text: dual_source_blending; url: extension-dual_source_blending
text: subgroups; url: extension-subgroups
text: primitive_index; url: extension-primitive_index
text: subgroup_size_control; url: extension-subgroup_size_control
for: attribute
text: subgroup_size; url: subgroup-size-attr
for: language-extension
text: readonly_and_readwrite_storage_textures; url: language_extension-readonly_and_readwrite_storage_textures
text: packed_4x8_integer_dot_product; url: language_extension-packed_4x8_integer_dot_product
Expand Down Expand Up @@ -3049,6 +3052,7 @@ enum GPUFeatureName {
"texture-formats-tier2",
"primitive-index",
"texture-component-swizzle",
"subgroup-size-control",
};
</script>

Expand Down Expand Up @@ -8301,6 +8305,8 @@ dictionary GPUComputePipelineDescriptor
[|device|.limits.{{supported limits/maxComputeWorkgroupSizeX}},
|device|.limits.{{supported limits/maxComputeWorkgroupSizeY}},
|device|.limits.{{supported limits/maxComputeWorkgroupSizeZ}}].
1. If |entryPoint| has a `subgroup_size` attribute with value |subgroupSize|:
1. The x-dimension of the |entryPoint|'s `workgroup_size` |must| be a multiple of |subgroupSize|.
</div>

1. If any [=pipeline-creation error|pipeline-creation=] [=uncategorized errors=]
Expand Down Expand Up @@ -11921,7 +11927,7 @@ dictionary GPUComputePassDescriptor
- all of |workgroupCountX|, |workgroupCountY| and |workgroupCountZ| are &le;
|this|.device.limits.{{supported limits/maxComputeWorkgroupsPerDimension}}.
- let |workgroupSize| be the computed workgroup size for
|bindingState|.{{GPUComputePassEncoder/[[pipeline]]}}.
|bindingState|.{{GPUComputePassEncoder/[[pipeline]]}}.
- the entry point uses the [=builtin/workgroup_index=]
built-in value and |workgroupCountX| &times; |workgroupCountY|
&times; |workgroupCountZ|
Expand Down Expand Up @@ -17400,6 +17406,23 @@ This feature adds the following [=optional API surfaces=]:
- New {{GPUTextureViewDescriptor}} dictionary members:
- {{GPUTextureViewDescriptor/swizzle}}

<h3 id=dom-gpufeaturename-subgroup-size-control data-dfn-type=enum-value data-dfn-for=GPUFeatureName>`"subgroup-size-control"`
</h3>

Allows the use of the [=attribute/subgroup_size=] attribute in WGSL to control compute pipeline subgroup size.

This feature adds no [=optional API surfaces=].

Note: Not all subgroup sizes may be supported in all cases.
If the implementation cannot create a pipeline with the requested subgroup size
(for example, due to register pressure or hardware limitations),
it will result in an [=uncategorized error=] during pipeline creation.
At least one subgroup size between {{GPUAdapterInfo/subgroupMinSize}} and
{{GPUAdapterInfo/subgroupMaxSize}} must be supported.

- New WGSL extensions:
- [=extension/subgroup_size_control=]

# Appendices # {#appendices}

## Texture Format Capabilities ## {#texture-format-caps}
Expand Down
36 changes: 36 additions & 0 deletions wgsl/index.bs
Original file line number Diff line number Diff line change
Expand Up @@ -286,6 +286,7 @@ spec: WebGPU; urlPrefix: https://gpuweb.github.io/gpuweb/#
type: enum-value
for: GPUFeatureName
text: "primitive-index"; url: dom-gpufeaturename-primitive-index
text: "subgroup-size-control"; url: dom-gpufeaturename-subgroup-size-control
</pre>

# Introduction # {#intro}
Expand Down Expand Up @@ -1402,6 +1403,7 @@ The [=syntax/attribute=] names are:
* <a for=attribute lt=size>`'size'`</a>
* <a for=attribute lt=vertex>`'vertex'`</a>
* <a for=attribute lt=workgroup_size>`'workgroup_size'`</a>
* <a for=attribute lt=subgroup_size>`'subgroup_size'`</a>

### Built-in Value Names ### {#builtin-value-names}

Expand Down Expand Up @@ -1480,6 +1482,7 @@ The [=enable-extension=] names are:
* <a for=extension lt=dual_source_blending>`'dual_source_blending'`</a>
* <a for=extension lt=subgroups>`'subgroups'`</a>
* <a for=extension lt=primitive_index>`'primitive_index'`</a>
* <a for=extension lt=subgroup_size_control>`'subgroup_size_control'`</a>

The valid [=language extension=] names are listed in [[#language-extensions-sec]] but in general have the same form as an [=identifier=]:

Expand Down Expand Up @@ -1855,6 +1858,10 @@ The valid [=enable-extensions=] are listed in the following table.
<td>The built-in variable [=built-in values/primitive_index=] is valid to use in the WGSL
module. Otherwise, using [=built-in values/primitive_index=] will result in a
[=shader-creation error=].
<tr><td><dfn noexport dfn-for="extension">`subgroup_size_control`</dfn>
<td>{{GPUFeatureName/"subgroup-size-control"}}
<td>The attribute [=attribute/subgroup_size=] is valid to use in the WGSL module.
Otherwise, using [=attribute/subgroup_size=] will result in a [=shader-creation error=].
</table>

<div class='example wgsl using extensions expect-error' heading="Using hypothetical enable-extensions">
Expand Down Expand Up @@ -8480,6 +8487,7 @@ The [=return type=], if specified, [=shader-creation error|must=] be [=construct
WGSL defines the following attributes that can be applied to function declarations:
* the [=shader stage attributes=]: [=attribute/vertex=], [=attribute/fragment=], and [=attribute/compute=]
* [=attribute/workgroup_size=]
* [=attribute/subgroup_size=]

WGSL defines the following attributes that can be applied to function
parameters and return types:
Expand Down Expand Up @@ -9175,6 +9183,30 @@ path: syntax/size_attr.syntax.bs.include
<td>[=shader-creation error|Must=] be a [=const-expression=] that [=type rules|resolves=] to an [=i32=] or [=u32=].<br>
[=shader-creation error|Must=] be positive.

</table>

## `subgroup_size` ## {#subgroup-size-attr}

<table class='data builtin'>
<caption><dfn noexport dfn-for="attribute">`subgroup_size`</dfn> Attribute</caption>
<tr>
<td style="width:10%">Description
<td>Specifies the subgroup size for a compute shader invocation.

[=shader-creation error|Must=] only be applied to a [=compute shader stage|compute shader=] entry point function.

<tr>
<td>Requires
[=shader-creation error|Must=] only be used when the [=extension/subgroup_size_control=] extension is enabled.
Comment thread
Jiawei-Shao marked this conversation as resolved.

<tr>
<td>Parameters
<td>[=shader-creation error|must=] be a [=const-expression=] or an [=override-expression=] that [=type rules|resolves=] to an [=i32=] or [=u32=].<br>
If the value is not a power of two, then:
* It is a [=shader-creation error=] if the expression is a [=const-expression=].
* It is a [=pipeline-creation error=] if the expression is an [=override-expression=].


</table>

## `workgroup_size` ## {#workgroup-size-attr}
Expand Down Expand Up @@ -9347,6 +9379,7 @@ It will stabilize in a finite number of steps.
WGSL defines the following attributes that can be applied to entry point declarations:
* the [=shader stage attributes=]: [=attribute/vertex=], [=attribute/fragment=], and [=attribute/compute=]
* [=attribute/workgroup_size=]
* [=attribute/subgroup_size=]

<div class='example wgsl global-scope' heading='workgroup_size Attribute'>
<xmp highlight=wgsl>
Expand Down Expand Up @@ -10078,6 +10111,9 @@ Each is described in detail in subsequent sections.
<tr><td style="width:10%">Description
<td>
The [=subgroup size=] of current invocation's subgroup.

For compute shaders with a [=attribute/subgroup_size=] attribute, this value
is equal to the specified attribute value.
</table>

##### `subgroup_id` ##### {#subgroup-id-builtin-value}
Expand Down
Loading