Carriage of ID3 Timed Metadata in the Common Media Application Format (CMAF)

v1.0.0, 6 April 2020


Krasimir Kolarov; John Simmons

Copyright 2020, The Alliance for Open Media

Licensing information is available at

The MATERIALS ARE PROVIDED “AS IS.” The Alliance for Open Media, its members, and its contributors expressly disclaim any warranties (express, implied, or otherwise), including implied warranties of merchantability, non-infringement, fitness for a particular purpose, or title, related to the materials. The entire risk as to implementing or otherwise using the materials is assumed by the implementer and user. IN NO EVENT WILL THE ALLIANCE FOR OPEN MEDIA, ITS MEMBERS, OR CONTRIBUTORS BE LIABLE TO ANY OTHER PARTY FOR LOST PROFITS OR ANY FORM OF INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES OF ANY CHARACTER FROM ANY CAUSES OF ACTION OF ANY KIND WITH RESPECT TO THIS DELIVERABLE OR ITS GOVERNING AGREEMENT, WHETHER BASED ON BREACH OF CONTRACT, TORT (INCLUDING NEGLIGENCE), OR OTHERWISE, AND WHETHER OR NOT THE OTHER MEMBER HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.


This specification defines how ID3 metadata can be carried as timed metadata in Common Media Application Format (CMAF) compatible fragmented MP4 streams using Event Message (‘emsg’) boxes.



HTTP Live Streaming (HLS) [HLS] supports the inclusion of timed metadata in ID3 format [ID3] in various container formats, as described in [TM-HLS].

A large ecosystem has built up around carrying timed ID3 metadata in HLS for applications such as ad delivery and audience measurement. Adopting CMAF for HLS media delivery has many benefits, but without a specification for carrying ID3 as sparse timed metadata in CMAF, companies in this ecosystem cannot deploy it.

This specification describes how such ID3 metadata can be carried as timed metadata in a CMAF-compatible fragmented MP4 (fMP4) stream [CMAF] as used by the HLS protocol.

CMAF-compatible fragmented MP4 can also be used in DASH. The elements defined in this specification may also be used with DASH.


Conformance requirements are expressed with a combination of descriptive assertions and RFC 2119 terminology. The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in the normative parts of this document are to be interpreted as described in [RFC2119]. For readability, these words do not appear in all uppercase letters in this specification.

Timed Metadata in a CMAF-compatible stream


Timed Metadata in a CMAF-compatible stream is signaled via one or more Event Message boxes (‘emsg’) [CMAF] per segment.

Event messages with the scheme specified in this document will identify boxes that carry ID3v2 metadata [ID3].

ID3 Metadata in an Event Message Box


One or more Event Message boxes (‘emsg’) [CMAF] can be included per segment. Version 1 of the Event Message box [DASH] must be used.


For convenience, the following box definition is reproduced from [DASH], section

aligned(8) class DASHEventMessageBox extends FullBox('emsg', version, flags = 0) {
    if (version==0) {
        string              scheme_id_uri;
        string              value;
        unsigned int(32)    timescale;
        unsigned int(32)    presentation_time_delta;
        unsigned int(32)    event_duration;
        unsigned int(32)    id;
    } else if (version==1) {
        unsigned int(32)    timescale;
        unsigned int(64)    presentation_time;
        unsigned int(32)    event_duration;
        unsigned int(32)    id;
        string              scheme_id_uri;
        string              value;
    }
    unsigned int(8) message_data[];
}
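As an illustration of the version 1 layout above, the following is a minimal Python sketch that reads one complete 'emsg' box from a byte buffer. The function name parse_emsg_v1 and the returned dictionary keys are hypothetical and are not defined by this specification. The sketch assumes the buffer holds exactly one box with a 32-bit size field, and that both strings are null-terminated UTF-8.

```python
import struct

def parse_emsg_v1(box: bytes) -> dict:
    """Parse one complete version 1 'emsg' box (box header included)."""
    size, box_type = struct.unpack_from(">I4s", box, 0)
    if box_type != b"emsg":
        raise ValueError("not an 'emsg' box")
    if size != len(box):
        # Assumption: no 64-bit largesize and no trailing bytes in the buffer.
        raise ValueError("box size does not match buffer length")
    version = box[8]  # FullBox header: 1 byte version, then 3 bytes of flags
    if version != 1:
        raise ValueError("only version 1 is handled in this sketch")
    # Fixed-size fields, big-endian: 32 + 64 + 32 + 32 bits = 20 bytes.
    timescale, presentation_time, event_duration, event_id = struct.unpack_from(
        ">IQII", box, 12
    )
    pos = 12 + 20
    # Two null-terminated strings follow the fixed-size fields.
    end = box.index(b"\x00", pos)
    scheme_id_uri = box[pos:end].decode("utf-8")
    pos = end + 1
    end = box.index(b"\x00", pos)
    value = box[pos:end].decode("utf-8")
    pos = end + 1
    return {
        "timescale": timescale,
        "presentation_time": presentation_time,
        "event_duration": event_duration,
        "id": event_id,
        "scheme_id_uri": scheme_id_uri,
        "value": value,
        "message_data": box[pos:],  # everything up to the end of the box
    }
```

A caller would normally locate the box inside a segment first; walking the surrounding box structure is outside the scope of this sketch.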

scheme_id_uri MUST be set to to identify ID3v2 metadata [ID3].

value may be either an absolute or a relative user-specified URI that defines the semantics of the id field. Any relative URI is considered to be relative to the scheme_id_uri.

message_data MUST contain complete ID3 version 2.4 data [ID3].
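One way a packager might sanity-check a payload before writing it into message_data is sketched below in Python. The function name is hypothetical. The sketch relies on the ID3v2.4 header layout: the "ID3" identifier, a major version byte, a revision byte, a flags byte, and a four-byte synchsafe size. It also assumes that "complete" means a single tag whose declared size matches the payload length, which is an interpretation rather than a statement of this specification. Frame contents are not inspected.

```python
def is_complete_id3v24(data: bytes) -> bool:
    """Return True if data looks like exactly one whole ID3v2.4 tag."""
    if len(data) < 10 or data[:3] != b"ID3":
        return False
    major_version, flags = data[3], data[5]
    if major_version != 4:
        return False
    size_bytes = data[6:10]
    # Synchsafe integer: the top bit of every byte must be zero.
    if any(b & 0x80 for b in size_bytes):
        return False
    body_size = (
        (size_bytes[0] << 21)
        | (size_bytes[1] << 14)
        | (size_bytes[2] << 7)
        | size_bytes[3]
    )
    # The declared size excludes the 10-byte header and, when the footer
    # flag (0x10) is set, the 10-byte footer as well.
    footer_size = 10 if flags & 0x10 else 0
    return len(data) == 10 + body_size + footer_size
```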

In general, ID3 tags do not carry a duration; in those cases the event_duration field should be set to 0xFFFFFFFF. If a particular ID3 message does carry a duration, that duration should be reflected in the event_duration field.
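The choice of event_duration described above can be sketched in Python as follows. The helper name and the idea of supplying the duration in seconds are assumptions made for illustration; an implementation may already hold the duration in timescale units.

```python
UNKNOWN_DURATION = 0xFFFFFFFF  # value used when the ID3 message has no duration

def event_duration_ticks(duration_seconds, timescale):
    """Map an optional duration in seconds to an event_duration value."""
    if duration_seconds is None:
        return UNKNOWN_DURATION
    ticks = round(duration_seconds * timescale)
    # A real duration must fit in 32 bits without colliding with the
    # reserved "unknown" value.
    if not 0 <= ticks < UNKNOWN_DURATION:
        raise ValueError("duration does not fit in event_duration")
    return ticks
```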

The presentation_time must be within the time interval of the fragment.
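The check below is a small Python sketch of this constraint. Because the event message and the track can use different timescales, both are converted to exact fractions of a second before comparing. The function and parameter names are hypothetical, and treating the fragment interval as half-open (start included, end excluded) is an assumption; this specification does not state how the interval boundaries are handled.

```python
from fractions import Fraction

def presentation_time_in_fragment(
    presentation_time, emsg_timescale,
    fragment_start, fragment_duration, track_timescale,
):
    """Check that an event's presentation_time lies inside a fragment."""
    event_time = Fraction(presentation_time, emsg_timescale)
    start = Fraction(fragment_start, track_timescale)
    end = start + Fraction(fragment_duration, track_timescale)
    # Assumption: the interval is [start, end).
    return start <= event_time < end
```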

The id field is not restricted in this version of the specification.
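Putting the field requirements above together, a version 1 box carrying an ID3 tag could be serialized roughly as in the following Python sketch. The function name and its defaults are hypothetical. The scheme URI is taken as a parameter rather than hard-coded, and the caller is assumed to have validated the ID3 payload and the presentation time already.

```python
import struct

def build_id3_emsg(
    scheme_id_uri, id3_tag, timescale, presentation_time,
    event_duration=0xFFFFFFFF, event_id=0, value="",
):
    """Serialize a version 1 'emsg' box whose message_data is an ID3 tag."""
    if not scheme_id_uri:
        raise ValueError("scheme_id_uri is required")
    payload = struct.pack(
        ">IQII", timescale, presentation_time, event_duration, event_id
    )
    # Strings are written as null-terminated UTF-8.
    payload += scheme_id_uri.encode("utf-8") + b"\x00"
    payload += value.encode("utf-8") + b"\x00"
    payload += id3_tag
    version_and_flags = bytes([1, 0, 0, 0])  # version = 1, flags = 0
    size = 8 + len(version_and_flags) + len(payload)
    return struct.pack(">I4s", size, b"emsg") + version_and_flags + payload
```

The default event_duration of 0xFFFFFFFF matches the common case in which the ID3 message carries no duration.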


Files compliant with this specification should signal this using the brand aid3 in the list of compatible brands in the file type box. Manifest formats using files compliant with this specification may signal these files using the following URN: urn:aomedia:cmaf:id3.
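A reader could test for this signaling along the lines of the following Python sketch, which lists the compatible brands of a file type box and looks for aid3. The function names are hypothetical, and the sketch assumes the buffer holds exactly one 'ftyp' box with a 32-bit size field.

```python
import struct

def compatible_brands(ftyp_box: bytes) -> list:
    """Return the compatible brands listed in a complete 'ftyp' box."""
    if len(ftyp_box) < 16:
        raise ValueError("buffer too short for a file type box")
    size, box_type = struct.unpack_from(">I4s", ftyp_box, 0)
    if box_type != b"ftyp":
        raise ValueError("not a file type box")
    if size != len(ftyp_box) or (size - 16) % 4 != 0:
        raise ValueError("unexpected file type box size")
    # Layout: size, type, major_brand, minor_version, then 4-byte brands.
    return [ftyp_box[i:i + 4] for i in range(16, size, 4)]

def signals_id3_carriage(ftyp_box: bytes) -> bool:
    """True if the brand 'aid3' appears among the compatible brands."""
    return b"aid3" in compatible_brands(ftyp_box)
```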


The following documents are cited in this specification.

Normative References

  • [CMAF]
    International Organization for Standardization, “Information technology – Multimedia application format (MPEG-A) – Part 19: Common media application format (CMAF) for segmented media”, ISO/IEC 23000-19:2018(E), 2018,

  • [DASH]
    International Organization for Standardization, “Information technology – Dynamic adaptive streaming over HTTP (DASH) – Part 1: Media presentation description and segment formats”, ISO/IEC 23009-1:2014(E): Draft third edition, 2018-07-26,

  • [RFC2119]
    Internet Engineering Task Force, “Key words for use in RFCs to Indicate Requirement Levels”, S. Bradner, RFC2119, March 1997,

Informative References