1. Scope
[AV1] defines the syntax and semantics of an AV1 bitstream. The AV1 Image File Format (AVIF) defined in this document supports the storage of a subset of the syntax and semantics of an AV1 bitstream in a [HEIF] file. The AV1 Image File Format defines multiple profiles, which restrict the allowed syntax and semantics of the AV1 bitstream with the goal to improve interoperability, especially for hardware implementations. The profiles defined in this specification follow the conventions of the [MIAF] specification. Images encoded with AV1 and not meeting the restrictions of the defined profiles may still be compliant to this AV1 Image File Format if they adhere to the general AVIF requirements.
AV1 Image File Format supports High Dynamic Range (HDR) and Wide Color Gamut (WCG) images as well as Standard Dynamic Range (SDR). It supports monochrome images as well as multi-channel images with all the bit depths and color spaces specified in [AV1].
AV1 Image File Format also supports multi-layer images as specified in [AV1] to be stored both in image items and image sequences.
An AVIF file is designed to be a conformant [HEIF] file for both image items and image sequences. Specifically, this specification follows the recommendations given in "Annex I: Guidelines On Defining New Image Formats and Brands" of [HEIF].
This specification reuses syntax and semantics used in [AV1-ISOBMFF].
2. Image Items and properties
2.1. AV1 Image Item
When an item is of type av01, it is called an AV1 Image Item, and shall obey the following constraints:
-
The AV1 Image Item shall be a conformant MIAF image item.
-
The AV1 Image Item shall be associated with an AV1 Item Configuration Property.
-
The content of an AV1 Image Item is called the AV1 Image Item Data and shall obey the following constraints:
-
The AV1 Image Item Data shall be identical to the content of an AV1 Sample marked as sync, as defined in [AV1-ISOBMFF].
-
The AV1 Image Item Data shall have exactly one Sequence Header OBU.
-
If the AV1 Image Item Data consists of a single frame (i.e. when using a single layer),
-
It should have its still_picture flag set to 1.
-
It should have its reduced_still_picture_header flag set to 1.
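The constraint on the number of Sequence Header OBUs can be checked without decoding the image. The following is a minimal sketch, assuming the item payload is a plain concatenation of OBUs laid out as in an AV1 Sample, and that an OBU without a size field extends to the end of the payload. The function names are hypothetical and are not part of this specification.

```python
OBU_SEQUENCE_HEADER = 1  # obu_type value of a Sequence Header OBU in [AV1]


def read_leb128(data: bytes, pos: int):
    """Decode an unsigned leb128 value; return (value, next position)."""
    value = 0
    for i in range(8):  # [AV1] limits a leb128 value to 8 bytes
        byte = data[pos + i]
        value |= (byte & 0x7F) << (7 * i)
        if not byte & 0x80:
            return value, pos + i + 1
    raise ValueError("leb128 value longer than 8 bytes")


def obu_types(item_data: bytes):
    """List the obu_type of every OBU in the payload, in order."""
    types = []
    pos = 0
    while pos < len(item_data):
        header = item_data[pos]
        obu_type = (header >> 3) & 0x0F
        has_extension = bool(header & 0x04)
        has_size_field = bool(header & 0x02)
        pos += 2 if has_extension else 1  # skip header and optional extension byte
        if has_size_field:
            size, pos = read_leb128(item_data, pos)
        else:
            size = len(item_data) - pos  # assumption: OBU runs to end of payload
        types.append(obu_type)
        pos += size
    return types


def has_exactly_one_sequence_header(item_data: bytes) -> bool:
    return obu_types(item_data).count(OBU_SEQUENCE_HEADER) == 1
```

This sketch only walks OBU headers; it does not verify the still_picture or reduced_still_picture_header flags, which would require parsing the Sequence Header OBU itself.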
2.2. Image Item Properties
2.2.1. AV1 Item Configuration Property
Box Type: av1C
Property type: Descriptive item property
Container: ItemPropertyContainerBox
Mandatory (per item): Yes, for an image item of type 'av01'
Quantity: One for an image item of type 'av01'
The syntax and semantics of the AV1 Item Configuration Property are identical to those of the AV1CodecConfigurationBox defined in [AV1-ISOBMFF], with the following constraints:
-
Sequence Header OBUs should not be present in the AV1CodecConfigurationBox.
-
If a Sequence Header OBU is present in the AV1CodecConfigurationBox, it shall match the Sequence Header OBU in the AV1 Image Item Data.
-
The values of the fields in the AV1CodecConfigurationBox shall match those of the Sequence Header OBU in the AV1 Image Item Data.
-
Metadata OBUs, if present, shall match the values given in other item properties, such as the PixelInformationProperty or ColourInformationBox.
This property should be marked as essential.
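To compare the property fields with the Sequence Header OBU, a reader first needs to unpack them. The following is a minimal sketch, assuming the bit layout of the AV1CodecConfigurationBox given in [AV1-ISOBMFF] (marker and version in the first byte, followed by the profile, level, tier and color format bits, with configOBUs starting at the fifth byte). The names Av1Config and parse_av1c are hypothetical, and the layout should be verified against [AV1-ISOBMFF] before relying on it.

```python
from dataclasses import dataclass


@dataclass
class Av1Config:
    seq_profile: int
    seq_level_idx_0: int
    seq_tier_0: int
    high_bitdepth: int
    twelve_bit: int
    monochrome: int
    chroma_subsampling_x: int
    chroma_subsampling_y: int
    chroma_sample_position: int
    config_obus: bytes


def parse_av1c(payload: bytes) -> Av1Config:
    """Parse the body of an av1C property (box header already removed)."""
    if len(payload) < 4:
        raise ValueError("av1C payload shorter than 4 bytes")
    marker = payload[0] >> 7
    version = payload[0] & 0x7F
    if marker != 1 or version != 1:
        raise ValueError("unexpected marker or version")
    b1, b2 = payload[1], payload[2]
    return Av1Config(
        seq_profile=b1 >> 5,
        seq_level_idx_0=b1 & 0x1F,
        seq_tier_0=b2 >> 7,
        high_bitdepth=(b2 >> 6) & 1,
        twelve_bit=(b2 >> 5) & 1,
        monochrome=(b2 >> 4) & 1,
        chroma_subsampling_x=(b2 >> 3) & 1,
        chroma_subsampling_y=(b2 >> 2) & 1,
        chroma_sample_position=b2 & 3,
        config_obus=payload[4:],  # optional OBUs; should not hold a Sequence Header
    )
```

A validator could then compare each field with the value of the same name read from the Sequence Header OBU in the AV1 Image Item Data.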
2.2.2. Image Spatial Extents Property
The semantics of the ispe property as defined in [HEIF] apply. More specifically, for AV1 images, the values of image_width and image_height shall respectively equal the values of FrameWidth and FrameHeight as defined in [AV1] but for a specific frame in the item payload. The exact frame depends on the presence and content of the lsel and OperatingPointSelectorProperty properties as follows:
-
In the absence of a lsel property associated with the item, or if it is present and its layer_id value is set to 0xFFFF:
-
If no OperatingPointSelectorProperty is associated with the item, the ispe shall document the dimensions of the last frame decoded when processing the operating point whose index is 0.
-
If an OperatingPointSelectorProperty is associated with the item, the ispe property shall document the dimensions of the last frame decoded when processing the corresponding operating point.
NOTE: The dimensions of possible intermediate output images might not match the ones given in the ispe property. If they display these intermediate images, renderers are expected to scale the output image to match the ispe property.
-
If a lsel property is associated with an item and its layer_id is different from 0xFFFF, the ispe property documents the dimensions of the output frame produced by decoding the corresponding layer.
NOTE: The dimensions indicated in the ispe property might not match the values max_frame_width_minus1+1 and max_frame_height_minus1+1 indicated in the AV1 bitstream.
NOTE: The values of render_width_minus1 and render_height_minus1 possibly present in the AV1 bitstream are not exposed at the AVIF container level.
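The rules above for choosing the frame documented by ispe can be summarized as a small decision function. This is an illustrative sketch only: the function name and the returned tuples are hypothetical, and None is used here to represent a property that is not associated with the item.

```python
NO_SPECIFIC_LAYER = 0xFFFF  # special layer_id value of the lsel property


def ispe_reference(lsel_layer_id=None, op_index=None):
    """Return which frame the ispe dimensions describe.

    ("layer", n): the output frame of spatial layer n.
    ("operating_point", i): the last frame decoded for operating point i.
    """
    if lsel_layer_id is not None and lsel_layer_id != NO_SPECIFIC_LAYER:
        return ("layer", lsel_layer_id)
    # No lsel, or lsel with layer_id 0xFFFF: fall back to the operating point,
    # which is index 0 when no OperatingPointSelectorProperty is present.
    return ("operating_point", 0 if op_index is None else op_index)
```

For example, an item with no lsel and no OperatingPointSelectorProperty yields ("operating_point", 0), while an item whose lsel has layer_id 1 yields ("layer", 1) regardless of op_index.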
2.2.3. Other Item Properties
In addition to the Image Properties defined in [HEIF], such as colr, pixi or pasp, AV1 image items MAY also be associated with clli, cclv and mdcv introduced in [MIAF].
In general, it is recommended to use properties instead of Metadata OBUs in the AV1 Item Configuration Property.
NOTE: Although the clean aperture property (clap) defined in [HEIF] is applicable to AVIF, implementers of authoring tools should be aware of the possibility of unintended consequences since users may not realize image data outside the clap region is still in the file. A future revision of this specification may place normative restrictions on how clap can be used.
2.3. AV1 Layered Image Items
2.3.1. Overview
[AV1] supports encoding a frame using multiple spatial layers. A spatial layer may improve the resolution or quality of the image decoded based on one or more of the previous layers. A layer may also provide an image that does not depend on the previous layers. Additionally, not all layers are expected to produce an image meant to be rendered. Some decoded images may be used only as intermediate decodes. Finally, layers are grouped into one or more Operating Points. The Sequence Header OBU defines the list of Operating Points, provides required decoding capabilities, and indicates which layers form each Operating Point.
[AV1] delegates the selection of which Operating Point to process to the application, by means of a function called choose_operating_point(). AVIF defines the OperatingPointSelectorProperty to control this selection. In the absence of an OperatingPointSelectorProperty associated with an AV1 Image Item, the AVIF renderer is free to process any Operating Point present in the AV1 Image Item Data. In particular, when the AV1 Image Item is composed of a unique Operating Point, the OperatingPointSelectorProperty should not be present. If an OperatingPointSelectorProperty is associated with an AV1 Image Item, the op_index field indicates which Operating Point is expected to be processed for this item.
NOTE: When an author wants to offer the ability to render multiple Operating Points from the same AV1 image (e.g. in the case of multi-view images), multiple AV1 Image Items can be created that share the same AV1 Image Item Data but are associated with different OperatingPointSelectorProperty properties.
[AV1] expects the renderer to display only one frame within the selected Operating Point, which should be the highest spatial layer that is both within the Operating Point and present within the temporal unit, but [AV1] leaves the option for other applications to set their own policy about which frames are output, as defined in the general output process. AVIF sets a different policy, and defines how the lsel property (mandated by [HEIF] for layered images) is used to control which layer is rendered. According to [HEIF], the interpretation of the layer_id field in the lsel property is codec specific. In this specification, the value 0xFFFF is reserved for a special meaning. If a lsel property is associated with an AV1 Image Item but its layer_id value is set to 0xFFFF, the renderer is free to render either only the output image of the highest spatial layer, or to render all output images of all the intermediate layers and the highest spatial layer, resulting in a form of progressive decoding. If a lsel property is associated with an AV1 Image Item and the value of layer_id is not 0xFFFF, the renderer is expected to render only the output image for that layer.
NOTE: When such a progressive decoding of the layers within an Operating Point is not desired or when an author wants to expose each layer as a specific item, multiple AV1 Image Items sharing the same AV1 Image Item Data can be created and associated with different lsel properties, each with a different value of layer_id.
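The rendering policy driven by the lsel property can be sketched as follows. The function and parameter names are hypothetical: output_layers stands for the spatial_id values that produce an output frame in the selected Operating Point, and progressive is a renderer preference that only matters when layer_id is 0xFFFF. The accepted range of 0 to 3 is the one given for layer_id in § 2.3.2.2.

```python
PROGRESSIVE_ALLOWED = 0xFFFF  # special layer_id value reserved by this specification


def layers_to_render(layer_id: int, output_layers, progressive: bool):
    """Return the spatial layers whose output images are rendered, in order."""
    if layer_id == PROGRESSIVE_ALLOWED:
        ordered = sorted(output_layers)
        # The renderer may show every intermediate layer, or only the highest one.
        return ordered if progressive else ordered[-1:]
    if not 0 <= layer_id <= 3:
        raise ValueError("layer_id must be 0..3 or 0xFFFF")
    if layer_id not in output_layers:
        raise ValueError("selected layer does not produce an output frame")
    return [layer_id]  # only the selected layer is rendered
```

Note that decoding the selected layer may still require decoding lower layers; the function only describes which output images reach the display.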
2.3.2. Properties
2.3.2.1. Operating Point Selector Property
2.3.2.1.1. Definition
Box Type: a1op
Property type: Descriptive item property
Container: ItemPropertyContainerBox
Mandatory: No
Quantity: Zero or one
2.3.2.1.2. Description
An OperatingPointSelectorProperty may be associated with an AV1 Image Item to provide the index of the operating point to be processed for this item. If associated, it shall be marked as essential.
2.3.2.1.3. Syntax
class OperatingPointSelectorProperty extends ItemProperty('a1op') {
    unsigned int(8) op_index;
}
2.3.2.1.4. Semantics
op_index indicates the index of the operating point to be processed for this item. Its value shall be between 0 and operating_points_cnt_minus_1, inclusive.
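Since the property carries a single byte, reading and writing it is straightforward. The sketch below assumes that ItemProperty is a plain box with an 8-byte header (32-bit size followed by the four-character type) and no version or flags fields. The function names are hypothetical.

```python
import struct


def build_a1op(op_index: int) -> bytes:
    """Serialize a complete a1op box: 8-byte box header plus one payload byte."""
    if not 0 <= op_index <= 0xFF:
        raise ValueError("op_index must fit in 8 bits")
    return struct.pack(">I4sB", 9, b"a1op", op_index)


def parse_a1op(payload: bytes, operating_points_cnt_minus_1: int) -> int:
    """Read op_index from the box body and check it against the Sequence Header."""
    if len(payload) != 1:
        raise ValueError("a1op payload must be exactly one byte")
    op_index = payload[0]
    if op_index > operating_points_cnt_minus_1:
        raise ValueError("op_index exceeds operating_points_cnt_minus_1")
    return op_index
```

The value of operating_points_cnt_minus_1 has to come from the Sequence Header OBU in the AV1 Image Item Data, so the range check can only be done once that OBU has been parsed.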
2.3.2.2. Layer Selector Property
The lsel property defined in [HEIF] may be associated with an AV1 Image Item. The layer_id indicates the value of the spatial_id to render. The value shall be between 0 and 3, or the special value 0xFFFF. When a value between 0 and 3 is used, the corresponding spatial layer shall be present in the bitstream and shall produce an output frame. Other layers may be needed to decode the indicated layer. When the special value 0xFFFF is used, progressive decoding is allowed as described in § 2.3.1 Overview.
2.3.2.3. Layered Image Indexing Property
2.3.2.3.1. Definition
Box Type: a1lx
Property type: Descriptive item property
Container: ItemPropertyContainerBox
Mandatory: No
Quantity: Zero or one
2.3.2.3.2. Description
The AV1LayeredImageIndexingProperty property may be associated with an AV1 Image Item. It should not be associated with AV1 Image Items consisting of only one layer.
The AV1LayeredImageIndexingProperty documents the size in bytes of each layer (except the last one) in the AV1 Image Item Data, and enables determining the byte ranges required to process one or more layers of an Operating Point. If associated, it shall not be marked as essential.
2.3.2.3.3. Syntax
class AV1LayeredImageIndexingProperty extends ItemProperty('a1lx') {
    unsigned int(7) reserved = 0;
    unsigned int(1) large_size;
    FieldLength = (large_size + 1) * 16;
    unsigned int(FieldLength) layer_size[3];
}
2.3.2.3.4. Semantics
layer_size indicates the number of bytes corresponding to each layer in the item payload, except for the last layer. Values are provided in increasing order of spatial_id. A value of zero means that all the layers except the last one have been documented and following values shall be 0. The number of non-zero values shall match the number of layers in the image minus one.
NOTE: The size of the last layer can be determined by subtracting the sum of the sizes of all layers indicated in this property from the entire item size.
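The syntax and semantics above are enough to derive the byte range of every layer in the item payload. The following is a minimal sketch, assuming big-endian fields (as is usual in [ISOBMFF]) and a payload that starts right after the box header. The function names are hypothetical.

```python
import struct


def parse_a1lx(payload: bytes):
    """Return the three layer_size values stored in an a1lx property body."""
    large_size = payload[0] & 0x01
    fmt = ">3I" if large_size else ">3H"  # FieldLength is 32 or 16 bits
    if len(payload) != 1 + struct.calcsize(fmt):
        raise ValueError("unexpected a1lx payload length")
    return list(struct.unpack_from(fmt, payload, 1))


def layer_byte_ranges(layer_size, item_size: int):
    """Return (start, end) byte offsets of each layer, end exclusive."""
    sizes = []
    for size in layer_size:
        if size == 0:
            break  # all layers except the last one have been documented
        sizes.append(size)
    if any(layer_size[len(sizes):]):
        raise ValueError("non-zero layer_size after a zero value")
    documented = sum(sizes)
    if documented >= item_size:
        raise ValueError("documented layers leave no room for the last layer")
    sizes.append(item_size - documented)  # last layer: remainder of the item
    ranges = []
    offset = 0
    for size in sizes:
        ranges.append((offset, offset + size))
        offset += size
    return ranges
```

With layer_size values of 100, 250 and 0 in a 1000-byte item, the three layers occupy bytes 0 to 100, 100 to 350 and 350 to 1000, so a reader that only wants the first two layers can stop fetching at byte 350.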
3. Image Sequences
An AV1 Image Sequence is defined as a set of AV1 Temporal Units stored in an AV1 track as defined in [AV1-ISOBMFF] with the following constraints:
-
The track shall be a valid MIAF image sequence.
-
The track handler for an AV1 Image Sequence shall be pict.
-
The track shall have only one sample description entry.
-
If multiple Sequence Header OBUs are present in the track payload, they shall be identical.
4. Auxiliary Image Items and Sequences
An AV1 Auxiliary Image Item (respectively an AV1 Auxiliary Image Sequence) is an AV1 Image Item (respectively AV1 Image Sequence) with the following additional constraints:
-
It shall be a compliant MIAF Auxiliary Image Item (respectively MIAF Auxiliary Image Sequence).
-
The mono_chrome field in the Sequence Header OBU shall be set to 1.
-
The color_range field in the Sequence Header OBU shall be set to 1.
An AV1 Alpha Image Item (respectively an AV1 Alpha Image Sequence) is an AV1 Auxiliary Image Item (respectively an AV1 Auxiliary Image Sequence) as defined in [MIAF], with the aux_type field of the AuxiliaryTypeProperty (respectively AuxiliaryTypeInfoBox) set to urn:mpeg:mpegB:cicp:systems:auxiliary:alpha. An AV1 Alpha Image Item (respectively an AV1 Alpha Image Sequence) shall be encoded with the same bit depth as the associated master AV1 Image Item (respectively AV1 Image Sequence).
For AV1 Alpha Image Item and AV1 Alpha Image Sequence, the ColourInformationBox should be omitted. If present, readers shall ignore it.
An AV1 Depth Image Item (respectively an AV1 Depth Image Sequence) is an AV1 Auxiliary Image Item (respectively an AV1 Auxiliary Image Sequence) as defined in [MIAF], with the aux_type field of the AuxiliaryTypeProperty (respectively AuxiliaryTypeInfoBox) set to urn:mpeg:mpegB:cicp:systems:auxiliary:depth.
NOTE: [AV1] supports encoding either 3-component images (whose semantics are given by the matrix_coefficients element), or 1-component images (monochrome). When an image requires a different number of components, multiple auxiliary images may be used, each providing additional component(s), according to the semantics of their aux_type field. In such a case, the maximum number of components is restricted by the number of possible items in a file, coded on 16 or 32 bits.
5. Brands, Internet media types and file extensions
5.1. Brands overview
As defined by [ISOBMFF], the presence of a brand in the compatible_brands list in the FileTypeBox can be interpreted as the permission for those AV1 Image File Format readers/parsers and AV1 Image File Format renderers that only implement the features required by the brand, to process the corresponding file and only the parts (e.g. items or sequences) that comply with the brand.
An AV1 Image File Format file may conform to multiple brands. Similarly, an AV1 Image File Format reader/parser or AV1 Image File Format renderer may be capable of processing the features associated with one or more brands.
If any of the brands defined in this document is specified in the major_brand field of the FileTypeBox, the file extension and Internet Media Type should respectively be ".avif" and "image/avif" as defined in § 8 AVIF Media Type Registration.
5.2. AVIF image and image collection brand
The brand to identify AV1 image items is avif. Files that indicate this brand in the compatible_brands field of the FileTypeBox shall comply with the following:
-
The primary item shall be an AV1 Image Item or be a derived image that references directly or indirectly one or more items that all are AV1 Image Items.
-
AV1 auxiliary image items may be present in the file.
Files that conform with these constraints should include the brand avif in the compatible_brands field of the FileTypeBox.
Additionally, the brand avio is defined. If the file indicates the brand avio in the compatible_brands field of the FileTypeBox, then the primary item or all the items referenced by the primary item shall be AV1 image items made only of Intra Frames. Conversely, if the previous constraint applies, the brand avio should be used in the compatible_brands field of the FileTypeBox.
5.3. AVIF image sequence brands
The brand to identify AVIF image sequences is avis. Files that indicate this brand in the compatible_brands field of the FileTypeBox shall comply with the following:
-
they shall contain one or more AV1 image sequences.
-
they may contain AV1 auxiliary image sequences.
Files that conform with these constraints should include the brand avis in the compatible_brands field of the FileTypeBox.
Additionally, if a file contains AV1 image sequences and the brand avio is used in the compatible_brands field of the FileTypeBox, the item constraints for this brand shall be met and at least one of the AV1 image sequences shall be made only of AV1 Samples marked as sync. Conversely, if such a track exists and the constraints of the brand avio on AV1 image items are met, the brand should be used.
NOTE: As defined in [MIAF], a file that is primarily an image sequence still has at least an image item. Hence, it can also declare brands for signaling the image item.
6. General constraints
The following constraints are common to files compliant with this specification:
-
The file shall be compliant with the [MIAF] specification and list miaf in the compatible_brands field of the FileTypeBox.
-
The file shall list 'avif' or 'avis' in the compatible_brands field of the FileTypeBox.
-
If transformative properties are used in derivation chains (as defined in [MIAF]), they shall only be associated with items that are not referenced by another derived item. For example, if a file contains a grid item and its referenced coded image items, cropping, mirroring or rotation transformations are only permitted on the grid item itself.
NOTE: This constraint further restricts files compared to [MIAF].
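The two brand-related constraints above can be checked from the FileTypeBox alone. The following is a minimal sketch, assuming the box uses a 32-bit size (no largesize form) and that the bytes passed in contain exactly one complete ftyp box. The function names are hypothetical.

```python
import struct


def parse_ftyp(box: bytes):
    """Return (major_brand, compatible_brands) from a complete ftyp box."""
    size, box_type = struct.unpack_from(">I4s", box, 0)
    if box_type != b"ftyp":
        raise ValueError("not a FileTypeBox")
    if size != len(box) or size < 16 or (size - 16) % 4:
        raise ValueError("unexpected FileTypeBox size")
    major_brand = box[8:12].decode("ascii")
    # Bytes 12..16 hold minor_version; compatible brands follow, 4 bytes each.
    compatible = [box[i:i + 4].decode("ascii") for i in range(16, size, 4)]
    return major_brand, compatible


def meets_general_brand_constraints(compatible_brands) -> bool:
    """True when miaf is listed together with at least one of avif or avis."""
    brands = set(compatible_brands)
    return "miaf" in brands and bool(brands & {"avif", "avis"})
```

This covers only the brand listing; compliance with [MIAF] itself and the restriction on transformative properties require inspecting the rest of the file.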
7. Profiles
7.1. Overview
The profiles defined in this section are for enabling interoperability between AV1 Image File Format files and AV1 Image File Format readers/parsers. A profile imposes a set of specific restrictions and is signaled by brands defined in this specification.
The FileTypeBox should declare at least one profile that enables decoding of the primary image item. It is not an error for the encoder to include an auxiliary image that is not allowed by the specified profile(s).
If 'avis' is declared in the FileTypeBox and a profile is declared in the FileTypeBox, the profile shall also enable decoding of at least one image sequence track. The profile should allow decoding of any associated auxiliary image sequence tracks, unless it is acceptable to decode the image sequence without its auxiliary image sequence tracks.
It is possible for a file compliant to this AV1 Image File Format to not be able to declare an AVIF profile, if the corresponding AV1 encoding characteristics do not match any of the defined profiles.
NOTE: [AV1] supports 3 bit depths: 8, 10 and 12 bits, and the maximum dimensions of a coded image are 65536x65536, when seq_level_idx is set to 31 (maximum parameters level).
7.2. AVIF Baseline Profile
This section defines the MIAF AV1 Baseline profile of [HEIF], specifically for [AV1] bitstreams, based on the constraints specified in [MIAF] and identified by the brand MA1B.
If the brand MA1B is in the list of compatible_brands of the FileTypeBox, the common constraints in the section § 5 Brands, Internet media types and file extensions shall apply.
The following additional constraints apply to all AV1 Image Items and all AV1 Image Sequences:
-
The AV1 profile shall be the Main Profile and the level shall be 5.1 or lower.
NOTE: AV1 tiers are not constrained because timing is optional in image sequences and is not relevant in image items or collections.
NOTE: Level 5.1 is chosen for the Baseline profile to ensure that no single coded image exceeds 4k resolution, as some decoders may not be able to handle larger images. More precisely, following [AV1] level definitions, coded image items compliant with the AVIF Baseline profile may not have a number of pixels greater than 8912896, a width greater than 8192 or a height greater than 4352. It is still possible to use the Baseline profile to create larger images using grid derivation.
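A file containing items compliant with this profile is expected to list the following brands, in any order, in the compatible_brands of the FileTypeBox:
avif, mif1, miaf, MA1B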
A file containing a pict track compliant with this profile is expected to list the following brands, in any order, in the compatible_brands of the FileTypeBox:
avis, msf1, miaf, MA1B
A file containing a pict track compliant with this profile and made only of samples marked sync is expected to list the following brands, in any order, in the compatible_brands of the FileTypeBox:
avis, avio, msf1, miaf, MA1B
7.3. AVIF Advanced Profile
This section defines the MIAF AV1 Advanced profile of [HEIF], specifically for [AV1] bitstreams, based on the constraints specified in [MIAF] and identified by the brand MA1A.
If the brand MA1A is in the list of compatible_brands of the FileTypeBox, the common constraints in the section § 5 Brands, Internet media types and file extensions shall apply.
The following additional constraints apply to all AV1 Image Items:
-
The AV1 profile shall be the High Profile and the level shall be 6.0 or lower.
NOTE: Following [AV1] level definitions, coded image items compliant to the AVIF Advanced profile may not have a number of pixels greater than 35651584, a width greater than 16384 or a height greater than 8704. It is still possible to use the Advanced profile to create larger images using grid derivation.
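The size limits quoted in the NOTEs of § 7.2 and § 7.3 lend themselves to a simple pre-encoding check for coded image items. The sketch below is illustrative only: the table and function names are hypothetical, and the numbers are copied from those NOTEs rather than derived from the [AV1] level tables.

```python
# (max pixel count, max width, max height) for a single coded image item
PROFILE_ITEM_LIMITS = {
    "MA1B": (8912896, 8192, 4352),     # Baseline: Main Profile, level 5.1
    "MA1A": (35651584, 16384, 8704),   # Advanced: High Profile, level 6.0
}


def item_fits_profile(brand: str, width: int, height: int) -> bool:
    """True when a coded image item of this size is within the profile limits."""
    max_pixels, max_width, max_height = PROFILE_ITEM_LIMITS[brand]
    return (
        width * height <= max_pixels
        and width <= max_width
        and height <= max_height
    )
```

An authoring tool could use such a check to decide when to split a large image into a grid of smaller coded items. Note that all three limits apply together: an 8192x4352 item respects the Baseline width and height limits but exceeds its pixel count limit.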
The following additional constraints apply only to AV1 Image Sequences:
-
The AV1 profile shall be either Main Profile or High Profile.
-
The AV1 level for Main Profile shall be 5.1 or lower.
-
The AV1 level for High Profile shall be 5.1 or lower.
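A file containing items compliant with this profile is expected to list the following brands, in any order, in the compatible_brands of the FileTypeBox:
avif, mif1, miaf, MA1A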
A file containing a pict track compliant with this profile is expected to list the following brands, in any order, in the compatible_brands of the FileTypeBox:
avis, msf1, miaf, MA1A
8. AVIF Media Type Registration
The media type "image/avif" is officially registered with IANA and available at: https://www.iana.org/assignments/media-types/image/avif.
9. Changes since v1.0.0 release
-
Constrain image sequence to one sample description entry and constant sequence header.
-
Clarify that constraints on still picture flags apply to non-layered images.
-
Replace Media Type section with link to IANA official registration.
-
Define properties for layered images to allow selective or progressive decoding of layers.
-
Add restriction on transformative properties in derivation chains.
-
Extend semantics of avio brand to image items and clarify brand usage.
-
Remove wrong recommendations regarding still picture flags in image sequences.
-
Constrain auxiliary images to be full range, and ignore colr for alpha planes.
-
Rephrase statement about auxiliary images and profiles. (Editorial change)