HDR10+ AV1 Metadata Handling Specification

v1.0.0

AOM Final Deliverable

This version:
https://aomediacodec.github.io/av1-hdr10plus/v1.0.0.html
Latest version:
https://AOMediaCodec.github.io/av1-hdr10plus
Issue Tracking:
GitHub
Editors:
Paul Hearty (Samsung)
Bill Mandel (Samsung)
Cyril Concolato (Netflix)

Copyright 2022, AOM

Licensing information is available at http://aomedia.org/license/

The MATERIALS ARE PROVIDED “AS IS.” The Alliance for Open Media, its members, and its contributors expressly disclaim any warranties (express, implied, or otherwise), including implied warranties of merchantability, non-infringement, fitness for a particular purpose, or title, related to the materials. The entire risk as to implementing or otherwise using the materials is assumed by the implementer and user. IN NO EVENT WILL THE ALLIANCE FOR OPEN MEDIA, ITS MEMBERS, OR CONTRIBUTORS BE LIABLE TO ANY OTHER PARTY FOR LOST PROFITS OR ANY FORM OF INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES OF ANY CHARACTER FROM ANY CAUSES OF ACTION OF ANY KIND WITH RESPECT TO THIS DELIVERABLE OR ITS GOVERNING AGREEMENT, WHETHER BASED ON BREACH OF CONTRACT, TORT (INCLUDING NEGLIGENCE), OR OTHERWISE, AND WHETHER OR NOT THE OTHER MEMBER HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.


Abstract

This document specifies how to use HDR10+ metadata within [AV1] bitstreams, including when carried in [CMAF].

1. Introduction

This document specifies how to use HDR10+ metadata within [AV1] bitstreams, including when carried in [CMAF]. HDR10+ metadata enables devices to optimize rendering of HDR content based on the display capabilities and on a scene-by-scene and frame-by-frame basis.

Various tools, services and devices support the creation and use of HDR10+ metadata, which can be used directly in [AV1] systems. Carriage of HDR10+ metadata in [AV1] leverages mechanisms specified in [T35] and [CTA-861]. HDR10+ metadata is placed in Metadata OBUs with metadata_type equal to METADATA_TYPE_ITUT_T35. This document specifies where these OBUs are placed in [AV1] bitstreams.

2. Use of HDR10+ in AV1 bitstreams

2.1. HDR10+ Metadata

In the context of this specification, the syntax and semantics of the HDR10+ Metadata are defined in [CTA-861] and [ST-2094-40] respectively.

An HDR10+ Metadata OBU is defined as HDR10+ Metadata carried in a Metadata OBU. The metadata_type of such Metadata OBU is set to METADATA_TYPE_ITUT_T35 and the itu_t_t35_country_code of the corresponding Metadata ITUT T35 element is set to 0xB5. The remaining syntax element of Metadata ITUT T35, itu_t_t35_payload_bytes, is interpreted using the syntax defined in Annex S of [CTA-861], starting with the itu_t_t35_terminal_provider_code, and the semantics defined in [ST-2094-40].

According to the definition of the HDR10+ Metadata, the first 6 bytes of the itu_t_t35_payload_bytes of the HDR10+ Metadata OBU are set as follows:

All the remaining bytes of the itu_t_t35_payload_bytes carry the HDR10+ Metadata and correspond to the syntax elements of the user_data_registered_itu_t_t35 structure defined in Annex S of [CTA-861]. For convenience, the structure of the HDR10+ Metadata OBU is illustrated in Figure 1.

Figure 1. HDR10+ Metadata OBU
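The identification steps described in this section can be sketched as follows. This is a minimal, non-normative illustration rather than a reference parser. It assumes the input is the payload of a Metadata OBU (the bytes after the OBU header and size field), that METADATA_TYPE_ITUT_T35 has the value 4 as assigned in [AV1], and that the caller supplies the six prefix bytes required by this section. The function names read_leb128 and hdr10plus_payload and the parameter expected_prefix are hypothetical.

```python
from typing import Optional, Tuple

METADATA_TYPE_ITUT_T35 = 4     # metadata_type value assigned in [AV1]
HDR10PLUS_COUNTRY_CODE = 0xB5  # itu_t_t35_country_code required by this section


def read_leb128(data: bytes, pos: int) -> Tuple[int, int]:
    """Decode an AV1 leb128() value; return (value, position after it)."""
    value = 0
    for i in range(8):
        byte = data[pos + i]
        value |= (byte & 0x7F) << (7 * i)
        if not byte & 0x80:
            return value, pos + i + 1
    raise ValueError("leb128 value longer than 8 bytes")


def hdr10plus_payload(metadata_obu_payload: bytes,
                      expected_prefix: bytes) -> Optional[bytes]:
    """Return the bytes following the 6-byte prefix, or None if the
    payload does not look like an HDR10+ Metadata OBU.

    expected_prefix is an assumption of this sketch: the caller passes
    the six bytes that this section requires at the start of
    itu_t_t35_payload_bytes. The returned bytes still include the OBU
    trailing bits.
    """
    if len(expected_prefix) != 6:
        raise ValueError("expected_prefix must be exactly 6 bytes")
    try:
        metadata_type, pos = read_leb128(metadata_obu_payload, 0)
    except (IndexError, ValueError):
        return None
    if metadata_type != METADATA_TYPE_ITUT_T35:
        return None
    if pos >= len(metadata_obu_payload):
        return None
    if metadata_obu_payload[pos] != HDR10PLUS_COUNTRY_CODE:
        return None
    t35_payload = metadata_obu_payload[pos + 1:]
    if not t35_payload.startswith(expected_prefix):
        return None
    return t35_payload[len(expected_prefix):]
```

Passing the prefix as a parameter keeps the sketch independent of any particular byte values; an implementation would normally hard-code the six bytes taken from this section and from Annex S of [CTA-861].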

NOTE: [AV1] defines the general Metadata OBU syntax for HDR10 Static Metadata and ITU-T T.35 Metadata.

HDR10 Static Metadata is defined as a combination of three types of HDR related metadata, including MDCV, MaxCLL and MaxFALL. MDCV shall be present while MaxCLL and/or MaxFALL may be present.

2.2. HDR10+ bitstream constraints

The following sections define constraints that apply to [AV1] bitstreams when carrying HDR10+ Metadata.

2.2.1. Color Configuration

Streams suitable for incorporating HDR10+ metadata as described in this specification shall use the following values for the [AV1] color_config:

Additionally, the following recommendations apply:

2.2.2. Placement of HDR10+ Metadata OBUs

As defined in [AV1], an AV1 coded video sequence consists of one or more temporal units. A temporal unit contains a series of OBUs starting from a Temporal Delimiter OBU, optional Sequence Header OBUs, optional Metadata OBUs, a sequence of one or more Frame Header OBUs, each followed by zero or more Tile Group OBUs as well as optional Padding OBUs.

Consequently, for each frame with show_frame = 1 or show_existing_frame = 1, there shall be one and only one HDR10+ Metadata OBU. This OBU shall precede the Frame Header OBU for this frame and shall be located after the last OBU of the previous frame (if any), or after the Sequence Header OBU (if any), or after the start of the temporal unit (e.g. after the Temporal Delimiter OBU, for storage formats where Temporal Delimiter OBUs are preserved).

HDR10+ Metadata OBUs are not provided when show_frame = 0. For non-layered streams, there is only one HDR10+ Metadata OBU per temporal unit. For [AV1] bitstreams encoded with multiple layers, HDR10+ Metadata may apply to one or more layers. However, the details are out of scope of this version of the specification.

Figure 2 shows a simplified example of placement of HDR10+ Metadata OBUs in an AV1 bitstream.

Figure 2. Example of placement of HDR10+ Metadata OBUs in an AV1 bitstream
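The placement rule of this section can be sketched as a small checker. This is a non-normative illustration that works on an already-parsed, simplified list of OBUs rather than on bitstream bytes. The kind names, the Obu tuple and the function check_placement are hypothetical; the shown flag stands for show_frame = 1 or show_existing_frame = 1 on a Frame Header OBU. Layered streams and redundant frame headers are not modelled.

```python
from typing import List, Tuple

# Simplified OBU kinds; illustrative labels, not AV1 syntax elements.
TD, SEQ_HDR, HDR10PLUS, FRAME_HDR, TILE_GROUP, PADDING = (
    "TD", "SEQ_HDR", "HDR10PLUS", "FRAME_HDR", "TILE_GROUP", "PADDING")

Obu = Tuple[str, bool]  # (kind, shown); shown only matters for FRAME_HDR


def check_placement(obus: List[Obu]) -> List[str]:
    """Return a list of human-readable violations (empty if none)."""
    errors: List[str] = []
    pending = 0  # HDR10+ Metadata OBUs seen since the previous frame ended
    for index, (kind, shown) in enumerate(obus):
        if kind == HDR10PLUS:
            pending += 1
        elif kind == TD:
            # Metadata must follow the start of its temporal unit.
            if pending:
                errors.append(f"OBU {index}: metadata before temporal unit start")
            pending = 0
        elif kind == TILE_GROUP and pending:
            # Metadata arrived before the last OBU of the previous frame.
            errors.append(f"OBU {index}: metadata inside the previous frame")
        elif kind == FRAME_HDR:
            if shown and pending != 1:
                errors.append(
                    f"OBU {index}: shown frame has {pending} HDR10+ OBUs, expected 1")
            elif not shown and pending != 0:
                errors.append(
                    f"OBU {index}: hidden frame has {pending} HDR10+ OBUs, expected 0")
            pending = 0
    if pending:
        errors.append("HDR10+ metadata at end of input without a following frame")
    return errors


# A temporal unit with a hidden frame followed by a shown frame.
unit = [(TD, False), (FRAME_HDR, False), (TILE_GROUP, False),
        (HDR10PLUS, False), (FRAME_HDR, True), (TILE_GROUP, False)]
print(check_placement(unit))
```

Counting metadata OBUs between frame boundaries, instead of searching backwards from each Frame Header OBU, lets the check run in a single pass over the OBU list.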

2.2.3. Provision for Film Grain Processing

Some [AV1] bitstreams may contain both HDR10+ Metadata and film grain synthesis information. In such scenarios, it is recommended that decoders perform film grain synthesis before any HDR10+ Metadata processing.

3. Storage and Transport considerations

3.1. Constraints on AV1CodecConfigurationRecord

For formats that use the AV1CodecConfigurationRecord when storing [AV1] bitstreams (e.g. ISOBMFF and MPEG-2 TS), HDR10+ Metadata OBUs shall not be present in the configOBUs field of the AV1CodecConfigurationRecord.
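A packager could verify this constraint with a check along the following lines. This is a non-normative sketch that assumes every OBU in configOBUs carries obu_has_size_field = 1 and uses the OBU header layout and the values OBU_METADATA = 5 and METADATA_TYPE_ITUT_T35 = 4 from [AV1]. The function names are hypothetical. The check is deliberately conservative: it flags any ITU-T T.35 Metadata OBU whose country code is 0xB5, which may include metadata other than HDR10+, so a positive result calls for a closer look at the payload.

```python
from typing import Iterator, Tuple

OBU_METADATA = 5               # obu_type value assigned in [AV1]
METADATA_TYPE_ITUT_T35 = 4     # metadata_type value assigned in [AV1]
HDR10PLUS_COUNTRY_CODE = 0xB5  # itu_t_t35_country_code used for HDR10+


def read_leb128(data: bytes, pos: int) -> Tuple[int, int]:
    """Decode an AV1 leb128() value; return (value, position after it)."""
    value = 0
    for i in range(8):
        byte = data[pos + i]
        value |= (byte & 0x7F) << (7 * i)
        if not byte & 0x80:
            return value, pos + i + 1
    raise ValueError("leb128 value longer than 8 bytes")


def iter_obus(data: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (obu_type, payload) for each OBU in a byte string."""
    pos = 0
    while pos < len(data):
        header = data[pos]
        obu_type = (header >> 3) & 0x0F
        has_extension = (header >> 2) & 1
        has_size = (header >> 1) & 1
        if not has_size:
            raise ValueError("this sketch requires obu_has_size_field = 1")
        pos += 1 + has_extension  # skip header and optional extension byte
        size, pos = read_leb128(data, pos)
        if pos + size > len(data):
            raise ValueError("truncated OBU")
        yield obu_type, data[pos:pos + size]
        pos += size


def config_obus_may_contain_hdr10plus(config_obus: bytes) -> bool:
    """True if configOBUs holds a T.35 Metadata OBU with country code 0xB5."""
    for obu_type, payload in iter_obus(config_obus):
        if obu_type != OBU_METADATA or not payload:
            continue
        metadata_type, pos = read_leb128(payload, 0)
        if (metadata_type == METADATA_TYPE_ITUT_T35
                and payload[pos:pos + 1] == bytes([HDR10PLUS_COUNTRY_CODE])):
            return True
    return False
```

A writer would typically run such a check just before serializing the AV1CodecConfigurationRecord and move any flagged OBU into the samples instead.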

3.2. ISOBMFF Constraints

The AV1 Metadata sample group defined in [AV1-ISOBMFF] shall not be used.

[AV1-ISOBMFF] indicates that Metadata OBUs may be protected. This specification requires that HDR10 Static Metadata and HDR10+ Metadata OBUs are unprotected.

An ISOBMFF file or CMAF AV1 track as defined in [AV1-ISOBMFF] that also conforms to this specification (i.e. that contains HDR10+ Metadata OBUs and complies with the constraints of this specification) should use the brand cdm4 defined in [CTA-5001] in addition to the brand av01. If the brand cdm4 is used in conjunction with [AV1] bitstreams, the constraints defined in this specification shall be respected.
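A reader can detect the brand signalling described above by inspecting the file type box. The following non-normative sketch assumes the input is a complete FileTypeBox ('ftyp') with a 32-bit size field; the function names ftyp_brands and signals_hdr10plus_av1 are hypothetical.

```python
import struct
from typing import List


def ftyp_brands(box: bytes) -> List[str]:
    """Return the major brand followed by the compatible brands."""
    if len(box) < 16:
        raise ValueError("box too short for an ftyp box")
    size, box_type = struct.unpack_from(">I4s", box, 0)
    if box_type != b"ftyp":
        raise ValueError("not an ftyp box")
    # Sizes 0 and 1 (box runs to end of file, 64-bit size) are not handled.
    if size < 16 or size > len(box) or (size - 16) % 4:
        raise ValueError("unsupported or malformed ftyp size")
    major_brand = box[8:12]
    # Bytes 12..15 hold minor_version; compatible brands follow.
    compatible = [box[i:i + 4] for i in range(16, size, 4)]
    return [brand.decode("ascii") for brand in [major_brand] + compatible]


def signals_hdr10plus_av1(box: bytes) -> bool:
    """True if both the av01 and cdm4 brands are listed."""
    return {"av01", "cdm4"} <= set(ftyp_brands(box))
```

The presence of cdm4 only signals the intent to conform; a validator would still need to check the constraints of this specification against the bitstream itself.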

3.3. HTTP Streaming Constraints

The value of the Codecs Parameter String for [AV1] bitstreams used with HTTP streaming technologies shall remain unchanged when HDR10+ Metadata OBUs are included in the [AV1] stream.

Additionally, [DASH] content following [DASH-IOP] should include a Supplemental Descriptor with @schemeIdUri set to "http://dashif.org/metadata/hdr" and @value set to "SMPTE2094-40" in manifest files. This helps players identify tracks containing HDR10+ Metadata OBUs.
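As a non-normative illustration, the descriptor could be added to a manifest as sketched below. The sketch assumes the descriptor is written as a SupplementalProperty element on an AdaptationSet, which is one possible location; XML namespaces and all other manifest attributes are omitted for brevity. The function name add_hdr10plus_descriptor is hypothetical.

```python
import xml.etree.ElementTree as ET

HDR_SCHEME_ID_URI = "http://dashif.org/metadata/hdr"  # from this section
HDR10PLUS_VALUE = "SMPTE2094-40"                      # from this section


def add_hdr10plus_descriptor(parent: ET.Element) -> ET.Element:
    """Append the HDR10+ Supplemental Descriptor to a manifest element."""
    return ET.SubElement(
        parent,
        "SupplementalProperty",
        schemeIdUri=HDR_SCHEME_ID_URI,
        value=HDR10PLUS_VALUE,
    )


# Placement on an AdaptationSet is an assumption of this sketch.
adaptation_set = ET.Element("AdaptationSet", mimeType="video/mp4")
add_hdr10plus_descriptor(adaptation_set)
print(ET.tostring(adaptation_set, encoding="unicode"))
```

Because the descriptor is supplemental, players that do not recognize the scheme can ignore it and still play the stream.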

4. Example Streams and Tools

Information on this topic is found in the Wiki for this project.

Conformance

Conformance requirements are expressed with a combination of descriptive assertions and RFC 2119 terminology. The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in the normative parts of this document are to be interpreted as described in RFC 2119. However, for readability, these words do not appear in all uppercase letters in this specification.

All of the text of this specification is normative except sections explicitly marked as non-normative, examples, and notes. [RFC2119]

Examples in this specification are introduced with the words “for example” or are set apart from the normative text with class="example", like this:

This is an example of an informative example.

Informative notes begin with the word “Note” and are set apart from the normative text with class="note", like this:

Note, this is an informative note.

References

Normative References

[AV1]
AV1 Bitstream & Decoding Process Specification. Standard. URL: https://aomediacodec.github.io/av1-spec/av1-spec.pdf
[AV1-ISOBMFF]
AV1 Codec ISO Media File Format Binding. Standard. URL: https://aomediacodec.github.io/av1-isobmff/
[CICP]
Recommendation ITU-T H.273 | ISO/IEC 23091-2, Coding-independent code points for video signal type identification. Standard. URL: https://www.itu.int/rec/T-REC-H.273
[CMAF]
ISO/IEC 23000-19, Information technology — Multimedia application format (MPEG-A) — Part 19: Common media application format (CMAF) for segmented media. Standard. URL: https://www.iso.org/standard/71975.html
[CTA-5001]
CTA-5001-C, Web Application Video Ecosystem - Content Specification. Standard. URL: https://shop.cta.tech/products/web-application-video-ecosystem-content-specification
[CTA-861]
ANSI/CTA-861-H, A DTV Profile for Uncompressed High Speed Digital Interfaces. Standard. URL: https://shop.cta.tech/products/a-dtv-profile-for-uncompressed-high-speed-digital-interfaces-cta-861-h
[DASH]
ISO/IEC 23009-1, Information technology — Dynamic adaptive streaming over HTTP (DASH) — Part 1: Media presentation description and segment formats. Standard. URL: https://www.iso.org/standard/79329.html
[RFC2119]
S. Bradner. Key words for use in RFCs to Indicate Requirement Levels. March 1997. Best Current Practice. URL: https://datatracker.ietf.org/doc/html/rfc2119
[ST-2086]
SMPTE ST 2086:2018, Mastering Display Color Volume Metadata supporting High Luminance and Wide Color Gamut Images. Standard. URL: https://ieeexplore.ieee.org/document/8353899
[ST-2094-40]
SMPTE ST 2094-40:2020, Dynamic Metadata for Color Volume Transform - Application #4. Standard. URL: https://ieeexplore.ieee.org/document/9095450
[T35]
Recommendation ITU-T T.35 (02/2000), Procedure for the allocation of ITU-T defined codes for non standard facilities. Standard. URL: https://www.itu.int/rec/T-REC-T.35-200002-I

Informative References

[BT-2020]
Recommendation ITU-R BT.2020-2 (10/2015), Parameter values for ultra-high definition television systems for production and international programme exchange. Standard. URL: https://www.itu.int/rec/R-REC-BT.2020
[BT-2100]
Recommendation ITU-R BT.2100-2 (07/2018), Image parameter values for high dynamic range television for use in production and international programme exchange. Standard. URL: https://www.itu.int/rec/R-REC-BT.2100
[DASH-IOP]
Guideline for Implementation: DASH-IF Interoperability Points V4.3: On-Demand and Mixed Services, HDR Dynamic Metadata and other Improvements. Guidelines. URL: https://dashif.org
[ST-2084]
SMPTE ST 2084:2014, High Dynamic Range Electro-Optical Transfer Function of Mastering Reference Displays. Standard. URL: https://ieeexplore.ieee.org/document/7291452