Carriage of ID3 Timed Metadata in the Common Media Application Format (CMAF)

v1.0.0

AOM Final Deliverable,

This version:
https://AOMediaCodec.github.io/id3-emsg
Issue Tracking:
GitHub
Editors:

Copyright 2020, AOM

Licensing information is available at http://aomedia.org/license/

The MATERIALS ARE PROVIDED “AS IS.” The Alliance for Open Media, its members, and its contributors expressly disclaim any warranties (express, implied, or otherwise), including implied warranties of merchantability, non-infringement, fitness for a particular purpose, or title, related to the materials. The entire risk as to implementing or otherwise using the materials is assumed by the implementer and user. IN NO EVENT WILL THE ALLIANCE FOR OPEN MEDIA, ITS MEMBERS, OR CONTRIBUTORS BE LIABLE TO ANY OTHER PARTY FOR LOST PROFITS OR ANY FORM OF INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES OF ANY CHARACTER FROM ANY CAUSES OF ACTION OF ANY KIND WITH RESPECT TO THIS DELIVERABLE OR ITS GOVERNING AGREEMENT, WHETHER BASED ON BREACH OF CONTRACT, TORT (INCLUDING NEGLIGENCE), OR OTHERWISE, AND WHETHER OR NOT THE OTHER MEMBER HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.


Abstract

This specification defines how ID3 metadata can be carried as timed metadata in Common Media Application Format (CMAF) compatible fragmented MP4 streams using Event Message ('emsg') boxes.

1. Introduction

HTTP Live Streaming (HLS) [HLS] supports the inclusion of timed metadata in ID3 format [ID3] in various container formats, as described in [TM-HLS].

A large ecosystem has built up around carrying timed ID3 metadata in HLS for applications such as ad delivery and audience measurement. There are many benefits to adopting CMAF for HLS media delivery, but without a specification for carrying ID3 as sparse timed metadata in CMAF, companies in this ecosystem cannot deploy it.

This specification describes how such ID3 metadata can be carried as timed metadata in a CMAF-compatible fragmented MP4 (fMP4) stream [CMAF] as used by the HLS protocol.

CMAF-compatible fragmented MP4 can also be used in DASH. The elements defined in this specification may also be used with DASH.

2. Timed Metadata in a CMAF-compatible stream

2.1. Overview

Timed Metadata in a CMAF-compatible stream is signaled via one or more Event Message boxes (emsg) [CMAF] per segment.

The scheme specified in this document identifies Event Message boxes that carry ID3v2 metadata [ID3].

2.2. ID3 Metadata in an Event Message Box

2.2.1. Introduction

One or more Event Message boxes (emsg) [CMAF] can be included per segment. Version 1 of the Event Message box [DASH] must be used.

2.2.2. Syntax

For convenience, the following box definition is reproduced from [DASH], section 5.10.3.3.3.

aligned(8) class DASHEventMessageBox extends FullBox('emsg', version, flags = 0)
{
  if (version==0) {
    string              scheme_id_uri;
    string              value;
    unsigned int(32)    timescale;
    unsigned int(32)    presentation_time_delta;
    unsigned int(32)    event_duration;
    unsigned int(32)    id;
  } else if (version==1) {
    unsigned int(32)    timescale;
    unsigned int(64)    presentation_time;
    unsigned int(32)    event_duration;
    unsigned int(32)    id;
    string              scheme_id_uri;
    string              value;
  }
  unsigned int(8) message_data[];
}
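
As an informal illustration of the layout above, the following Python sketch reads both box versions from a byte buffer. It is not part of the specification: the names EmsgBox and parse_emsg are hypothetical, and the sketch assumes the buffer holds exactly one box with a 32-bit size field (the 64-bit largesize form is not handled).

```python
import struct
from dataclasses import dataclass


@dataclass
class EmsgBox:
    version: int
    scheme_id_uri: str
    value: str
    timescale: int
    presentation_time: int  # version 0 stores presentation_time_delta here
    event_duration: int
    id: int
    message_data: bytes


def _read_cstring(buf, pos):
    """Read a null-terminated UTF-8 string; return it and the next offset."""
    end = buf.index(b"\x00", pos)
    return buf[pos:end].decode("utf-8"), end + 1


def parse_emsg(box):
    """Parse one complete 'emsg' box, including its 8-byte box header."""
    size, box_type = struct.unpack_from(">I4s", box, 0)
    if box_type != b"emsg":
        raise ValueError("not an emsg box")
    if size != len(box):
        raise ValueError("box size does not match buffer length")
    version = box[8]  # FullBox header: 1 byte version, 3 bytes flags
    pos = 12
    if version == 0:
        scheme, pos = _read_cstring(box, pos)
        value, pos = _read_cstring(box, pos)
        timescale, time, duration, event_id = struct.unpack_from(">IIII", box, pos)
        pos += 16
    elif version == 1:
        timescale, time, duration, event_id = struct.unpack_from(">IQII", box, pos)
        pos += 20
        scheme, pos = _read_cstring(box, pos)
        value, pos = _read_cstring(box, pos)
    else:
        raise ValueError(f"unsupported emsg version {version}")
    # Everything after the fixed fields and strings is message_data.
    return EmsgBox(version, scheme, value, timescale, time,
                   duration, event_id, box[pos:])
```

Note that the two strings precede the integer fields in version 0 but follow them in version 1, so a reader has to branch on the version before touching any other field.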

2.2.3. Semantics

scheme_id_uri MUST be set to https://aomedia.org/emsg/ID3 to identify ID3v2 metadata [ID3].

value may be either an absolute or a relative user-specified URI which defines the semantics of the id field. Any relative URI is considered to be relative to the scheme_id_uri.

message_data MUST contain complete ID3 version 2.4 data [ID3].
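
One way a packager or player might sanity-check this requirement is sketched below in Python. The function names are hypothetical, and the sketch assumes that "complete" means message_data holds exactly one ID3v2.4 tag whose declared size matches the payload length; it checks only the 10-byte tag header and does not validate individual frames.

```python
def synchsafe_to_int(b):
    """Decode a 4-byte synchsafe integer (7 significant bits per byte)."""
    if len(b) != 4 or any(x & 0x80 for x in b):
        raise ValueError("invalid synchsafe integer")
    return (b[0] << 21) | (b[1] << 14) | (b[2] << 7) | b[3]


def is_complete_id3v24(data):
    """Return True if data looks like exactly one complete ID3v2.4 tag."""
    if len(data) < 10 or data[:3] != b"ID3":
        return False
    major, flags = data[3], data[5]
    if major != 4:  # only ID3 version 2.4 is accepted
        return False
    try:
        body_size = synchsafe_to_int(data[6:10])
    except ValueError:
        return False
    # The size field excludes the 10-byte header and the optional
    # 10-byte footer, which is signaled by flag bit 0x10.
    footer = 10 if flags & 0x10 else 0
    return len(data) == 10 + body_size + footer
```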

In general, ID3 messages do not carry a duration; in those cases the event_duration field should be set to 0xFFFFFFFF. If, in a particular case, the ID3 message carries a duration, it should be reflected in the event_duration field.

The presentation_time must be within the time interval of the fragment.
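
Because the emsg timescale may differ from the track timescale, comparing the two requires a common unit. The Python sketch below is illustrative only: the function name and parameters are hypothetical, and it assumes the fragment interval is half-open, [start, start + duration), with the fragment start and duration supplied by the caller in track timescale units.

```python
from fractions import Fraction


def event_in_fragment(presentation_time, emsg_timescale,
                      fragment_start, fragment_duration, track_timescale):
    """Check that an event time falls inside the fragment interval.

    Exact rational arithmetic avoids rounding errors when the two
    timescales are not multiples of each other.
    """
    t = Fraction(presentation_time, emsg_timescale)
    start = Fraction(fragment_start, track_timescale)
    end = start + Fraction(fragment_duration, track_timescale)
    return start <= t < end  # assumption: half-open interval
```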

The id field is not restricted in this version of the specification.
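
Taken together, the semantics above could be applied when writing a box roughly as follows. This Python sketch is an assumption-laden illustration, not a reference implementation: build_id3_emsg and its parameters are hypothetical names, the value field defaults to an empty string, and the caller is expected to supply a complete ID3v2.4 tag.

```python
import struct

ID3_SCHEME_ID_URI = "https://aomedia.org/emsg/ID3"
UNKNOWN_DURATION = 0xFFFFFFFF  # used when the ID3 message has no duration


def build_id3_emsg(id3_tag, timescale, presentation_time,
                   event_id=0, value="", event_duration=UNKNOWN_DURATION):
    """Serialize a version 1 'emsg' box carrying one ID3 tag."""
    payload = struct.pack(">IQII", timescale, presentation_time,
                          event_duration, event_id)
    payload += ID3_SCHEME_ID_URI.encode("utf-8") + b"\x00"
    payload += value.encode("utf-8") + b"\x00"
    payload += id3_tag  # message_data
    body = b"\x01" + b"\x00\x00\x00" + payload  # version = 1, flags = 0
    return struct.pack(">I4s", 8 + len(body), b"emsg") + body
```

For example, build_id3_emsg(tag, 90000, 180000) would place the event two seconds into the presentation on a 90 kHz timescale, with event_duration left at 0xFFFFFFFF.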

2.3. Signaling

Files compliant with this specification should signal it using the brand aid3 as part of the list of compatible brands in the file type box. Manifest formats using files compliant with this specification may signal these files using the following URN: urn:aomedia:cmaf:id3.
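
A reader could detect the brand with a check along these lines. The Python sketch below uses hypothetical function names and assumes the caller has already isolated a single 'ftyp' box with a 32-bit size field; locating that box within a file is out of scope here.

```python
import struct


def ftyp_compatible_brands(ftyp_box):
    """Return the compatible_brands list of one complete 'ftyp' box."""
    if len(ftyp_box) < 16:
        raise ValueError("ftyp box too short")
    size, box_type = struct.unpack_from(">I4s", ftyp_box, 0)
    if box_type != b"ftyp" or size != len(ftyp_box):
        raise ValueError("not a well-formed ftyp box")
    # Skip box header (8), major_brand (4), and minor_version (4).
    brands = ftyp_box[16:]
    if len(brands) % 4:
        raise ValueError("compatible_brands is not a multiple of 4 bytes")
    return [brands[i:i + 4] for i in range(0, len(brands), 4)]


def signals_id3_emsg(ftyp_box):
    """True if the 'aid3' brand appears among the compatible brands."""
    return b"aid3" in ftyp_compatible_brands(ftyp_box)
```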

Conformance

Conformance requirements are expressed with a combination of descriptive assertions and RFC 2119 terminology. The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in the normative parts of this document are to be interpreted as described in RFC 2119. However, for readability, these words do not appear in all uppercase letters in this specification.

All of the text of this specification is normative except sections explicitly marked as non-normative, examples, and notes. [RFC2119]

Examples in this specification are introduced with the words “for example” or are set apart from the normative text with class="example", like this:

This is an example of an informative example.

Informative notes begin with the word “Note” and are set apart from the normative text with class="note", like this:

Note, this is an informative note.

References

Normative References

[CMAF]
Information technology — Multimedia application format (MPEG-A) — Part 19: Common media application format (CMAF) for segmented media. Standard. URL: http://www.iso.org/iso/catalogue_detail?csnumber=71975
[DASH]
Information technology — Dynamic adaptive streaming over HTTP (DASH) — Part 1: Media presentation description and segment formats. Standard. URL: https://www.iso.org/standard/65274.html
[RFC2119]
S. Bradner. Key words for use in RFCs to Indicate Requirement Levels. March 1997. Best Current Practice. URL: https://datatracker.ietf.org/doc/html/rfc2119

Informative References

[HLS]
HTTP Live Streaming. Standard. URL: https://tools.ietf.org/html/rfc8216
[ID3]
The ID3 audio file data tagging format. Standard. URL: http://www.id3.org/Developer_Information
[TM-HLS]
Timed Metadata for HTTP Live Streaming. Documentation. URL: https://developer.apple.com/library/archive/documentation/AudioVideo/Conceptual/HTTP_Live_Streaming_Metadata_Spec/Introduction/Introduction.html