This document gives pseudo-code for the parts of the API related to pipeline objects. The naming and the UX of the API are only examples and could change; the important parts are what is available and what goes where. For example, this document uses C structures where a C++ API might use builder objects and a JavaScript API might use dictionaries.
For type safety, compute and graphics pipelines are separate types.
To create a pipeline, a structure containing all the relevant information is passed to Create<TYPE>Pipeline.
To create a compute pipeline the only things needed are some shader code present in a ShaderModule object, and a PipelineLayout object describing how the pipeline interacts with the binding model.
struct ComputePipelineDescriptor {
    ShaderModule module;
    const char* entryPoint;
    PipelineLayout layout;
};
ComputePipeline CreateComputePipeline(Device device, const ComputePipelineDescriptor* descriptor);
Translation to the backing APIs would be the following:
- D3D12: Translates to ID3D12Device::CreateComputePipelineState. A D3D12_SHADER_BYTECODE is created from the (module, entryPoint) pair, and the ID3D12RootSignature is equivalent to the PipelineLayout.
- Metal: Translates to MTLDevice::makeComputePipelineState. The MTLFunction is created from the (module, entryPoint, layout) tuple by adapting the generated MSL to the resource slot allocation done in layout.
- Vulkan: Translates to vkCreateComputePipelines with one pipeline. The VkPipelineShaderStageCreateInfo corresponds to (module, entryPoint) and the VkPipelineLayout corresponds to layout.
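As a minimal sketch of how an implementation might check a descriptor before handing it to any of the three backends, the block below tests that every member the translations above rely on is present. The handle definitions and the function IsComputePipelineDescriptorComplete are hypothetical and exist only for this illustration; the document does not define them.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in handle types: the document does not define their contents. */
struct ShaderModuleImpl { int unused; };
struct PipelineLayoutImpl { int unused; };
typedef struct ShaderModuleImpl* ShaderModule;
typedef struct PipelineLayoutImpl* PipelineLayout;

typedef struct ComputePipelineDescriptor {
    ShaderModule module;
    const char* entryPoint;
    PipelineLayout layout;
} ComputePipelineDescriptor;

/* Hypothetical helper: returns 1 when the (module, entryPoint) pair and the
 * layout needed by the D3D12, Metal and Vulkan translations are all present,
 * and 0 otherwise. */
int IsComputePipelineDescriptorComplete(const ComputePipelineDescriptor* d) {
    assert(d != NULL); /* passing no descriptor at all is a caller bug */
    return d->module != NULL
        && d->entryPoint != NULL && d->entryPoint[0] != '\0'
        && d->layout != NULL;
}
```

A real implementation would also have to check that entryPoint names a compute entry point in the module, which this sketch does not attempt.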
Question: How do we take advantage of the pipeline caching present in D3D12 and Vulkan? Do we expose it to the application or is it done magically in the WebGPU implementation?
Answer: deferred to post-MVP.
Like compute pipelines, render pipelines need ShaderModule objects and a PipelineLayout, and in addition require information about:
- Layout for vertex inputs
- Layout for fragment outputs
- All the fixed-function state
For simplicity we assume most fixed-function state is created in separate objects.
For example, a DepthStencilState object would be allocated and a pointer to it stored in the RenderPipelineDescriptor. This is part of the UX of the API and could be replaced with chained structures as in Vulkan or member structures as in D3D12.
Mismatch:
- Metal has primitive restart always enabled.
- D3D12 needs to know whether the primitive restart index is 0xFFFF or 0xFFFFFFFF at pipeline creation time.
- Metal doesn’t have a sample mask.
- Vulkan can have some state, like scissor and viewport, set on the pipeline as an optimization on some GPUs.
- Vulkan allows creating pipelines in bulk. This is not only a UX matter: it also allows reusing some results for faster creation.
enum IndexFormat {
    IndexFormatUint16,
    IndexFormatUint32,
};
struct RenderPipelineDescriptor {
    // Same translation as for compute pipelines
    ShaderModule vsModule;
    const char* vsEntryPoint;
    ShaderModule fsModule;
    const char* fsEntryPoint;
    PipelineLayout layout;
    // Pipeline input / outputs
    InputState* inputState;
    IndexFormat indexFormat;
    RenderPass* renderPass;
    int subpassIndex;
    // Fixed function state
    DepthStencilState* depthStencil;
    BlendState* blend[kMaxColorAttachments];
    PrimitiveTopology topology;
    // TODO other state: rasterizer state, “multisample state”
};
RenderPipeline CreateRenderPipeline(Device device, const RenderPipelineDescriptor* descriptor);
Translation to the backing APIs would be the following:
- D3D12: Translates to ID3D12Device::CreateGraphicsPipelineState. IBStripCutValue will always be set, with its value chosen depending on indexFormat.
- Metal: Translates to MTLDevice::makeRenderPipelineState.
- Vulkan: Translates to vkCreateGraphicsPipelines. VkPipelineInputAssemblyStateCreateInfo's primitiveRestartEnable is always set to true. All dynamic states are set on all pipelines.
Question: Should the type of the indices be set in RenderPipelineDescriptor? If not, how is the D3D12 IBStripCutValue chosen?
Answer: While indexFormat isn't necessary in any of the three APIs, we chose to include it in the pipeline state because primitive restart must always be enabled (because of Metal) and D3D12 needs to choose the correct IBStripCutValue. The alternative would have been to compile two D3D12 pipelines for every WebGPU pipeline, or to defer compilation.
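The choice of strip cut value described in this answer can be sketched as a small mapping from indexFormat. The StripCutValue enum below is a local stand-in for D3D12's D3D12_INDEX_BUFFER_STRIP_CUT_VALUE, not the real header, and both function names are assumptions made for this illustration.

```c
#include <assert.h>
#include <stdint.h>

typedef enum IndexFormat {
    IndexFormatUint16,
    IndexFormatUint32,
} IndexFormat;

/* Local stand-in for D3D12_INDEX_BUFFER_STRIP_CUT_VALUE. */
typedef enum StripCutValue {
    StripCutValueDisabled,
    StripCutValue0xFFFF,
    StripCutValue0xFFFFFFFF,
} StripCutValue;

/* Primitive restart is always enabled, so Disabled is never returned
 * for a valid format. */
StripCutValue ChooseStripCutValue(IndexFormat format) {
    switch (format) {
        case IndexFormatUint16: return StripCutValue0xFFFF;
        case IndexFormatUint32: return StripCutValue0xFFFFFFFF;
    }
    assert(0 && "unknown IndexFormat");
    return StripCutValueDisabled;
}

/* The index value that restarts a strip for the given index format. */
uint32_t PrimitiveRestartIndex(IndexFormat format) {
    return format == IndexFormatUint16 ? 0xFFFFu : 0xFFFFFFFFu;
}
```

Because the format is known at pipeline creation time, a single D3D12 pipeline per WebGPU pipeline is enough.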
The translation of individual members of RenderPipelineDescriptor is described below.
The InputState describes how the vertex buffers are stepped through (stride, instance vs. vertex, instance divisor), and how the attributes are extracted from the buffers (buffer index, format, offset).
Mismatches:
- D3D12 takes the stride along with the vertex buffers in ID3D12GraphicsCommandList::IASetVertexBuffers, whereas Vulkan and Metal take it at pipeline compilation time.
- Vulkan doesn’t support a divisor for its step rate.
enum StepRate {
    StepRateVertex,
    StepRateInstance,
};
enum VertexFormat {
    // TODO make a list of portable vertex formats
};
struct InputStateDescriptor {
    struct {
        bool enabled;
        VertexFormat format;
        int offsetInBuffer;
        int bufferIndex;
    } attributes[MAX_ATTRIBUTES];
    struct {
        StepRate rate;
        int stride;
    } buffers[MAX_VERTEX_BUFFERS];
};
InputState* CreateInputState(Device* device, InputStateDescriptor* descriptor);
Translation to the backing APIs would be the following:
- D3D12: Translates to a D3D12_INPUT_LAYOUT_DESC. Each enabled attribute corresponds to a D3D12_INPUT_ELEMENT_DESC with InputSlot being the index of the attribute. Other members of the D3D12_INPUT_ELEMENT_DESC are translated trivially. The stride is looked up in the pipeline state before calls to ID3D12GraphicsCommandList::IASetVertexBuffers. IASetVertexBuffers might be deferred until before a draw, and vertex buffers might be invalidated by pipeline changes.
- Metal: Translates to a MTLVertexDescriptor, with attributes corresponding to MTLVertexDescriptor::attributes and buffers corresponding to MTLVertexDescriptor::layouts. Attributes translate trivially to MTLVertexAttributeDescriptor structures and buffers to MTLVertexBufferLayoutDescriptor structures. Extra care is needed only to translate a zero stride to a constant step rate.
- Vulkan: Translates to a VkPipelineVertexInputStateCreateInfo. Buffers translate trivially to VkVertexInputBindingDescription and attributes to VkVertexInputAttributeDescription.
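The zero-stride special case mentioned for Metal can be sketched as follows. MetalStepFunction is a local stand-in for Metal's MTLVertexStepFunction (constant, per-vertex, per-instance), and ToMetalStepFunction is a hypothetical helper name; only the rule "zero stride becomes a constant step" comes from the text above.

```c
#include <assert.h>

typedef enum StepRate {
    StepRateVertex,
    StepRateInstance,
} StepRate;

/* Local stand-in for MTLVertexStepFunction; not the real Metal enum. */
typedef enum MetalStepFunction {
    MetalStepFunctionConstant,
    MetalStepFunctionPerVertex,
    MetalStepFunctionPerInstance,
} MetalStepFunction;

/* A zero stride means every vertex reads the same data, which Metal
 * expresses with a constant step function rather than a stride of 0. */
MetalStepFunction ToMetalStepFunction(StepRate rate, int stride) {
    assert(stride >= 0);
    if (stride == 0) {
        return MetalStepFunctionConstant;
    }
    return rate == StepRateInstance ? MetalStepFunctionPerInstance
                                    : MetalStepFunctionPerVertex;
}
```

The sketch only picks the step function; what stride value to hand to Metal in the constant case is left open here.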
Question: Should the vertex attributes somehow be included in the PipelineLayout so vertex buffers are treated as other resources and changed in bulk with them?
Answer: We decided against innovating in this area.
For each subpass, the RenderPass will contain a list of the attachment formats for the color attachments and the depth-stencil attachment.
Information from the RenderPass is used to fill the following:
- D3D12: RTVFormats, DSVFormat and NumRenderTargets in D3D12_GRAPHICS_PIPELINE_STATE_DESC.
- Metal: colorAttachments[N].pixelFormat, depthAttachmentPixelFormat and stencilAttachmentPixelFormat in MTLRenderPipelineDescriptor.
- Vulkan: renderPass and subpass in VkGraphicsPipelineCreateInfo.
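A rough sketch of the D3D12 case is given below. Since the document does not define RenderPass, everything here is assumed for illustration: the Format enum, the SubpassInfo structure, the PipelineFormats stand-in for the relevant members of D3D12_GRAPHICS_PIPELINE_STATE_DESC, and the value 8 for kMaxColorAttachments (chosen to match the size of D3D12's RTVFormats array).

```c
#include <assert.h>
#include <stddef.h>

/* Assumed value; the document leaves kMaxColorAttachments undefined. */
enum { kMaxColorAttachments = 8 };

/* Hypothetical portable format enum, invented for this sketch. */
typedef enum Format {
    FormatUndefined,
    FormatRGBA8Unorm,
    FormatD32Float,
} Format;

/* Hypothetical per-subpass data that a RenderPass could expose. */
typedef struct SubpassInfo {
    int colorCount;
    Format colorFormats[kMaxColorAttachments];
    Format depthStencilFormat; /* FormatUndefined when absent */
} SubpassInfo;

/* Stand-in for RTVFormats, DSVFormat and NumRenderTargets. */
typedef struct PipelineFormats {
    unsigned numRenderTargets;
    Format rtvFormats[kMaxColorAttachments];
    Format dsvFormat;
} PipelineFormats;

void FillPipelineFormats(const SubpassInfo* subpass, PipelineFormats* out) {
    assert(subpass != NULL && out != NULL);
    assert(subpass->colorCount >= 0 &&
           subpass->colorCount <= kMaxColorAttachments);
    out->numRenderTargets = (unsigned)subpass->colorCount;
    for (int i = 0; i < kMaxColorAttachments; ++i) {
        /* Unused slots are explicitly marked as having no format. */
        out->rtvFormats[i] = (i < subpass->colorCount)
                                 ? subpass->colorFormats[i]
                                 : FormatUndefined;
    }
    out->dsvFormat = subpass->depthStencilFormat;
}
```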
Question: Does the sample count of the pipeline state come from the RenderPass too?
Answer: deferred to post-MVP.
Mismatch:
- Metal and D3D12 only require “point vs. line vs. triangle” at pipeline compilation time; the exact topology is set via ID3D12GraphicsCommandList::IASetPrimitiveTopology or passed in the MTLRenderCommandEncoder::draw* calls. Vulkan requires the exact topology at compilation time.
- Vulkan supports triangle fans but Metal and D3D12 don’t.
enum PrimitiveTopology {
    PrimitiveTopologyPoints,
    PrimitiveTopologyLineList,
    PrimitiveTopologyLineStrip,
    PrimitiveTopologyTriangleList,
    PrimitiveTopologyTriangleStrip,
};
Translation to the backing APIs would be the following:
- D3D12 and Metal: The primitive topology type is set on the D3D12_GRAPHICS_PIPELINE_STATE_DESC and MTLRenderPipelineDescriptor. At draw time, the exact topology is queried from the pipeline.
- Vulkan: The exact primitive topology is set in the VkPipelineInputAssemblyStateCreateInfo referenced by the VkGraphicsPipelineCreateInfo.
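The "point vs. line vs. triangle" reduction that D3D12 and Metal need at pipeline creation can be sketched as below. TopologyType is a local stand-in for the backend enums (for example D3D12_PRIMITIVE_TOPOLOGY_TYPE), and ToTopologyType is a hypothetical helper name.

```c
#include <assert.h>

typedef enum PrimitiveTopology {
    PrimitiveTopologyPoints,
    PrimitiveTopologyLineList,
    PrimitiveTopologyLineStrip,
    PrimitiveTopologyTriangleList,
    PrimitiveTopologyTriangleStrip,
} PrimitiveTopology;

/* Local stand-in for the coarse topology class used by D3D12 and Metal. */
typedef enum TopologyType {
    TopologyTypePoint,
    TopologyTypeLine,
    TopologyTypeTriangle,
} TopologyType;

/* Collapses the exact topology to the class needed at pipeline creation.
 * The exact topology stays in the pipeline and is applied at draw time. */
TopologyType ToTopologyType(PrimitiveTopology topology) {
    switch (topology) {
        case PrimitiveTopologyPoints:
            return TopologyTypePoint;
        case PrimitiveTopologyLineList:
        case PrimitiveTopologyLineStrip:
            return TopologyTypeLine;
        case PrimitiveTopologyTriangleList:
        case PrimitiveTopologyTriangleStrip:
            return TopologyTypeTriangle;
    }
    assert(0 && "unknown PrimitiveTopology");
    return TopologyTypePoint;
}
```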
Mismatch:
- In Vulkan, per-attachment blending and dual-source blending are exposed as optional features. independentBlend is supported almost everywhere except Adreno 4XX, while dualSrcBlend is additionally unsupported on Mali GPUs.
- Metal doesn’t have logic ops.
enum BlendOperation {
    BlendOperationAdd,
    BlendOperationSubtract,
    BlendOperationReverseSubtract,
    BlendOperationMin,
    BlendOperationMax,
};
enum BlendFactor {
    BlendFactorOne,
    BlendFactorSrcColor,
    BlendFactorOneMinusSrcColor,
    BlendFactorSrcAlpha,
    BlendFactorOneMinusSrcAlpha,
    BlendFactorDstColor,
    BlendFactorOneMinusDstColor,
    BlendFactorDstAlpha,
    BlendFactorOneMinusDstAlpha,
    BlendFactorSrcAlphaSaturated,
    BlendFactorBlendColor,
    BlendFactorOneMinusBlendColor,
};
struct BlendStateDescriptor {
    bool enabled;
    BlendFactor srcColorFactor;
    BlendFactor dstColorFactor;
    BlendFactor srcAlphaFactor;
    BlendFactor dstAlphaFactor;
    BlendOperation colorOperation;
    BlendOperation alphaOperation;
    int writeMask;
};
BlendState* CreateBlendState(Device* device, BlendStateDescriptor* descriptor);
Translation to backing APIs would be the following:
- D3D12: When filling the D3D12_GRAPHICS_PIPELINE_STATE_DESC, BlendState will be filled with data coming from the BlendStates referenced in the RenderPipelineDescriptor. Translation from a BlendState to a D3D12_RENDER_TARGET_BLEND_DESC is trivial.
- Metal: The BlendStates will be used to fill all of the data for a MTLRenderPipelineColorAttachmentDescriptor except pixelFormat. Translation of individual members is trivial.
- Vulkan: The BlendStates will be translated to elements of pAttachments in the VkPipelineColorBlendStateCreateInfo. Translation of individual members is trivial.
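Because per-attachment blending is optional in Vulkan, an implementation may want to detect whether a pipeline actually uses different blend states per attachment. The block below is one possible sketch of such a check; NeedsIndependentBlend and SameBlendState are hypothetical helpers, and the document does not say that the implementation works this way.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum BlendOperation {
    BlendOperationAdd, BlendOperationSubtract, BlendOperationReverseSubtract,
    BlendOperationMin, BlendOperationMax,
} BlendOperation;

typedef enum BlendFactor {
    BlendFactorOne, BlendFactorSrcColor, BlendFactorOneMinusSrcColor,
    BlendFactorSrcAlpha, BlendFactorOneMinusSrcAlpha, BlendFactorDstColor,
    BlendFactorOneMinusDstColor, BlendFactorDstAlpha,
    BlendFactorOneMinusDstAlpha, BlendFactorSrcAlphaSaturated,
    BlendFactorBlendColor, BlendFactorOneMinusBlendColor,
} BlendFactor;

typedef struct BlendStateDescriptor {
    bool enabled;
    BlendFactor srcColorFactor, dstColorFactor;
    BlendFactor srcAlphaFactor, dstAlphaFactor;
    BlendOperation colorOperation, alphaOperation;
    int writeMask;
} BlendStateDescriptor;

/* Member-wise comparison; memcmp is avoided because of struct padding. */
static bool SameBlendState(const BlendStateDescriptor* a,
                           const BlendStateDescriptor* b) {
    return a->enabled == b->enabled
        && a->srcColorFactor == b->srcColorFactor
        && a->dstColorFactor == b->dstColorFactor
        && a->srcAlphaFactor == b->srcAlphaFactor
        && a->dstAlphaFactor == b->dstAlphaFactor
        && a->colorOperation == b->colorOperation
        && a->alphaOperation == b->alphaOperation
        && a->writeMask == b->writeMask;
}

/* True when at least one attachment blends differently from attachment 0,
 * which on Vulkan would require the optional independentBlend feature. */
bool NeedsIndependentBlend(const BlendStateDescriptor* states, int count) {
    assert(count >= 0);
    for (int i = 1; i < count; ++i) {
        if (!SameBlendState(&states[0], &states[i])) {
            return true;
        }
    }
    return false;
}
```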
Open question: Should enablement of independent attachment blend state be explicit as in D3D12 or implicit?
Open question: Should alpha to coverage be part of the multisample state or the blend state?
Mismatch:
- D3D12 doesn’t have per-face stencil read and write masks.
- In Metal the depth stencil state is built and bound separately from the pipeline state.
enum CompareFunction {
    CompareFunctionNever,
    CompareFunctionLess,
    CompareFunctionLessEqual,
    CompareFunctionGreater,
    CompareFunctionGreaterEqual,
    CompareFunctionEqual,
    CompareFunctionNotEqual,
    CompareFunctionAlways,
};
enum StencilOperation {
    StencilOperationKeep,
    StencilOperationZero,
    StencilOperationReplace,
    StencilOperationInvert,
    StencilOperationIncrementClamp,
    StencilOperationDecrementClamp,
    StencilOperationIncrementWrap,
    StencilOperationDecrementWrap,
};
struct StencilFaceDescriptor {
    CompareFunction stencilCompare;
    StencilOperation stencilPass;
    StencilOperation stencilFail;
    StencilOperation depthFail;
};
struct DepthStencilStateDescriptor {
    CompareFunction depthCompare;
    StencilFaceDescriptor front;
    StencilFaceDescriptor back;
    int stencilReadMask;
    int stencilWriteMask;
};
DepthStencilState* CreateDepthStencilState(Device* device, DepthStencilStateDescriptor* descriptor);
Translation to backing APIs would be the following:
- D3D12: DepthStencilState translates trivially to a D3D12_DEPTH_STENCIL_DESC. DepthEnable would be set to depthCompare != Always.
- Metal: DepthStencilState translates trivially to MTLDepthStencilDescriptor, except that front and back stencil masks have to be set to the single stencil mask value from WebGPU. When a pipeline is bound, the corresponding depth-stencil state is bound at the same time.
- Vulkan: DepthStencilState translates trivially to VkPipelineDepthStencilStateCreateInfo, except that front and back stencil masks have to be set to the single stencil mask value from WebGPU. depthTestEnable would be set to depthCompare != Always.
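The two non-trivial rules above (deriving the depth test enable from depthCompare, and copying the single stencil masks to both faces) can be sketched as follows. BackendDepthStencil is a stand-in for the Vulkan or Metal structures rather than a real API type, the per-face stencil operations are omitted for brevity, and TranslateDepthStencil is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum CompareFunction {
    CompareFunctionNever, CompareFunctionLess, CompareFunctionLessEqual,
    CompareFunctionGreater, CompareFunctionGreaterEqual, CompareFunctionEqual,
    CompareFunctionNotEqual, CompareFunctionAlways,
} CompareFunction;

/* Reduced descriptor: the front and back StencilFaceDescriptor members
 * are left out of this sketch. */
typedef struct DepthStencilStateDescriptor {
    CompareFunction depthCompare;
    int stencilReadMask;
    int stencilWriteMask;
} DepthStencilStateDescriptor;

/* Stand-in for a backend structure with per-face stencil masks. */
typedef struct BackendDepthStencil {
    bool depthTestEnable;
    CompareFunction depthCompare;
    unsigned frontReadMask, frontWriteMask;
    unsigned backReadMask, backWriteMask;
} BackendDepthStencil;

BackendDepthStencil TranslateDepthStencil(const DepthStencilStateDescriptor* desc) {
    assert(desc != NULL);
    BackendDepthStencil out;
    /* The depth test is implicitly disabled when it would always pass. */
    out.depthTestEnable = (desc->depthCompare != CompareFunctionAlways);
    out.depthCompare = desc->depthCompare;
    /* The single mask values are duplicated onto both faces. */
    out.frontReadMask = out.backReadMask = (unsigned)desc->stencilReadMask;
    out.frontWriteMask = out.backWriteMask = (unsigned)desc->stencilWriteMask;
    return out;
}
```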
Question: What about Vulkan’s VkPipelineDepthStencilStateCreateInfo::depthBoundsTestEnable and D3D12's D3D12_DEPTH_STENCIL_DESC1::DepthBoundsTestEnable?
Answer: deferred to post-MVP.
Open question: Should “depth test enable” be implicit or explicit?