Khronos Finalizes Vulkan Video Extensions for Accelerated H.264 and H.265 Encode with the Release of Vulkan 1.3.274



In April 2021, the Vulkan Working Group at Khronos released a set of provisional extensions, collectively referred to as ‘Vulkan Video’, which provide seamless encoding and decoding of video streams using a variety of video coding standards. The December 2022 release of Vulkan 1.3.238 saw the finalization of the extensions to decode H.264 and H.265, and today, with the release of Vulkan 1.3.274, Khronos has finalized their counterparts: the extensions to enable encoding of H.264 and H.265 video streams. Leveraging the Vulkan framework, they provide a standardized, seamless, low-overhead, and highly controllable way to produce H.264 and H.265 video via hardware accelerators, with applications ranging from real-time, low-latency streaming to offline server-scale transcoding.

Incorporating industry feedback, the extensions have seen many improvements since their introduction, from a bidirectional interface (overrides) that helps with coding and exposes advanced hardware capabilities, to rate control configuration parameters and an interface to aid with quality vs. performance trade-offs. This feedback also prompted the release of the first video maintenance extension. In addition, given the high industry demand for AV1 codec support, the release of an AV1 decode extension is imminent, with development of an AV1 encode extension also underway. Figure 1 depicts the Vulkan Video extensions along with their status and relations.

Image: Khronos

The encode extensions grant low-level control over much of the encoding process, while still keeping the efficiency and performance of hardware encoding acceleration. Implementers have the freedom to tweak details such as quantization index, per-slice bit allocation, arithmetic coder, deblocking, and more. Given this flexibility and complexity, a balanced programming interface for rate control gives users a choice between more automated operation and low-level tweaking of frame parameters.

Changes to Video Encode

Encoder Rate Control

Often the most important aspect of encoder configuration for applications, the encoder rate control API was given special attention in Vulkan Video. From exposing parameters for standard rate control modes (e.g. CBR/VBR), to allowing applications to provide hints about other intended stream encoding parameters (e.g. picture/reference patterns), to providing the ability to configure per-layer rate control parameters (e.g. for streams with multiple temporal layers), the rate control API offers a rich set of features for various use cases and lays a solid foundation for future extensions. Encoder rate control configuration is performed using the vkCmdControlVideoCodingKHR command.
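As a rough illustration of how the standard rate control parameters relate to one another, the C sketch below computes an average per-frame bit budget and a virtual buffer capacity from an average bitrate, a frame rate given as a numerator/denominator pair, and a buffer duration in milliseconds. The struct and function names are hypothetical and not part of the Vulkan API; they only mirror the kind of values an application supplies when configuring rate control, and a real encoder is free to distribute bits across frames differently.

```c
#include <stdint.h>

/* Hypothetical container for illustration only; NOT a Vulkan structure. */
typedef struct RateControlSketch {
    uint64_t average_bitrate;        /* target average, in bits per second */
    uint32_t frame_rate_numerator;   /* e.g. 30 for 30 fps, 30000 for 29.97 fps */
    uint32_t frame_rate_denominator; /* e.g. 1 for 30 fps, 1001 for 29.97 fps */
    uint32_t virtual_buffer_ms;      /* buffer duration in milliseconds */
} RateControlSketch;

/* Average number of bits available per frame at the target bitrate.
 * Returns 0 when the frame rate is not set, to avoid dividing by zero. */
uint64_t average_bits_per_frame(const RateControlSketch *rc)
{
    if (rc->frame_rate_numerator == 0) {
        return 0;
    }
    return rc->average_bitrate * rc->frame_rate_denominator
           / rc->frame_rate_numerator;
}

/* Capacity, in bits, of a buffer that holds virtual_buffer_ms
 * milliseconds of stream data at the target bitrate. */
uint64_t virtual_buffer_bits(const RateControlSketch *rc)
{
    return rc->average_bitrate * rc->virtual_buffer_ms / 1000u;
}
```

For a 6 Mbit/s stream at 30 frames per second with a one-second buffer, this works out to an average of 200,000 bits per frame and a 6,000,000-bit buffer.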

Encoder Quality Levels

Video encoder implementations often fine tune the use of various encoding tools and rate control parameters depending on the desired quality versus performance/latency trade-offs of different use cases. Now implementations report the number of quality levels supported for a given video profile and usage. A new API vkGetPhysicalDeviceVideoEncodeQualityLevelPropertiesKHR may be used to retrieve implementation recommendations for various encoding parameters and configurations (e.g. rate control).

Implementation Overrides

Due to the complex nature of video encoding, and the ever-changing nature of hardware encoders and their capabilities, an interface known as overrides permits bidirectional communication between the application and the implementation, which guarantees that the output video stream will be compliant. In addition, applications may opt in to optimization overrides to allow implementations more flexibility to optimize for the specified usage and hints. Whether overrides were applied to video session parameters or frame parameters is also reported, for developers interested in a more detailed analysis of such overrides.

Retrieval of Encoded Video Session Parameters Bitstream Segments

To facilitate implementation overrides for bitstream compliance and optimizations, applications are expected to retrieve the encoded video session parameter bitstream segments (e.g. H.264 SPS/PPS) from the implementation using the new API call vkGetEncodedVideoSessionParametersKHR against the given VkVideoSessionParametersKHR object.

Encoder Feedback Query

To allow future extension of encoder feedback statistics in a manner similar to pipeline statistics, the new VK_QUERY_TYPE_VIDEO_ENCODE_FEEDBACK_KHR is now used to retrieve the video bitstream offset and size.
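The following C sketch shows one way an application might interpret the values read back from such a query. It assumes, and this should be checked against the Vulkan specification, that the results were requested as 32-bit words with the bitstream offset first, the number of bytes written second, and a trailing status word where a positive value means the operation completed, zero means the result is not yet available, and a negative value signals an error. The struct and function names are hypothetical and not part of the Vulkan API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of one encode feedback result; NOT a Vulkan structure. */
typedef struct EncodeFeedbackSketch {
    uint32_t bitstream_offset; /* byte offset of the encoded data */
    uint32_t bytes_written;    /* size of the encoded data in bytes */
    int32_t  status;           /* assumed: >0 complete, 0 not ready, <0 error */
} EncodeFeedbackSketch;

/* Copies three result words into *out.
 * Returns 1 only when the input is well-formed and the status word is
 * positive (completed); returns 0 otherwise. */
int parse_encode_feedback(const uint32_t *words, size_t word_count,
                          EncodeFeedbackSketch *out)
{
    if (words == NULL || out == NULL || word_count < 3) {
        return 0;
    }
    out->bitstream_offset = words[0];
    out->bytes_written    = words[1];
    out->status           = (int32_t)words[2];
    return out->status > 0;
}
```

Only when the helper reports success would the application go on to read bytes_written bytes starting at bitstream_offset from its bitstream buffer.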

Figure 2 depicts the Vulkan Video encoding process, which remains largely unchanged compared to previous descriptions.

Image: Khronos

Changes to Video Decode & Encode

VK_KHR_video_maintenance1

Along with the video encoding extensions, Khronos is releasing a maintenance extension incorporating community and industry feedback, which improves flexibility for both decoding and encoding. This extension permits decoding implementations to create images usable with video decoding without the need to explicitly specify the video profiles they will be used with. The same applies for encoding, where an attached per-image video profile limits usability with large and complex transcoding frameworks.

In addition to flexibility improvements, a new, simpler interface for specifying video queries inline with video decode and encode operation commands has been added, known as inline queries.

Requiring pSetupReferenceSlot for Non-Reference Pictures

When the Vulkan Video decode extensions were finalized, applications were required to provide a reconstructed picture resource and DPB slot (via VkVideoDecodeInfoKHR::pSetupReferenceSlot) only if the picture being decoded would become a reference. However, no shipping implementation actually supported specifying NULL for pSetupReferenceSlot, and some implementations discovered cases that require the reconstructed picture resource and/or DPB slot as transient storage while decoding a non-reference picture. A similar situation applies to encoding non-reference pictures. As a result, the Vulkan Video extensions were updated to require providing pSetupReferenceSlot for non-reference pictures as well.

More information can be found in the Khronos Group’s full press release here.


Tsing Mui
News poster at The FPS Review.
