[ogg-dev] Skeletal relations
giles at xiph.org
Fri Feb 15 15:56:42 PST 2008
We have new drafts of CMML 4.0 as a text codec and ROE as an xml
stream abstract, subsuming the authoring support in CMML 3.1 and
Another thing we talked about at LCA is how to specify
relationships between the various streams in Ogg so that a server,
muxer or player can make intelligent decisions about the contained
tracks. The general idea is to use the (http-style) Message Headers
in the Skeleton track to describe each logical bitstream, but no one
has ever written anything down. This is a proposal to get the ball
* Distinguish alternates based on language
* Distinguish among subtitles for translation,
for hearing impaired
* Distinguish commentary tracks
* Distinguish overlays from alternates from primaries
= Self description =
The following message headers describe the corresponding track,
Machine parsable locale string describing the language the track is
in. Used for example to choose the default audio track based on user
Free form qualifier used to mark the category of the track content.
We will define a basic set of role-types with standard meaning for
machine interpretation. Example role-types:
commentary (e.g. Director's commentary on a film)
transcription (e.g. detailed record)
interpretation (contains additional information)
slides (visual aids accompanying another track)
Of these, commentary is the only one I'd really like to have. Some
other ideas: logo, ticker, credits, translation. The last is
effectively the default though. Logo or ticker might be useful so that
such tracks can have a different default for whether they are overlaid.
Human readable description of the track, intended for display in a
user interface. This can be localized by appending '.<locale>' to
We could also copy general metadata here, e.g. title, creator, date,
location, license. That's perhaps more interesting in the fishead
packet which describes the stream as a whole rather than the
Arbitrary tag for distinguishing a group of tracks from an unrelated
group they happen to be multiplexed with. For example, three separate
programs might be sent over the same link multiplexed together, but
only audio and video tracks with the same value for the Program
message header should be played together. The default 'empty' program
is a valid program; every fishbone without this message header marks
the corresponding track as being in the default program.
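
A minimal sketch of the grouping rule above, assuming each fishbone's
message headers have already been parsed into a dict keyed by header
name. The `tracks` structure, the serial numbers, the program names and
the function name are all hypothetical; nothing here is an existing
Skeleton API.

```python
from collections import defaultdict


def group_by_program(tracks):
    """Group track serial numbers by the value of their Program header.

    `tracks` maps a serial number to a dict of message headers.
    A track with no Program header falls into the default program,
    represented here (by assumption) as the empty string.
    """
    programs = defaultdict(list)
    for serial, headers in tracks.items():
        programs[headers.get("Program", "")].append(serial)
    return dict(programs)


# Hypothetical stream: two named programs plus one track in the default one.
tracks = {
    101: {"Program": "news"},
    102: {"Program": "news"},
    201: {"Program": "sports"},
    301: {},
}
groups = group_by_program(tracks)
```

A player would then only combine audio and video tracks that end up in
the same group.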
= Relations =
The self-description allows us to prioritize tracks implicitly, based
on user preferences for showing audio, video, text, or some
combination, preferred languages and roles. But there remain areas of
ambiguity, so we define a way to mark relationships with other
streams. The value of each of these is an Ogg stream serial number.
This track doesn't (necessarily) stand on its own but is meant to be
laid on top of another track. This distinguishes, for example, an MNG
video (no Overlays) from MNG subtitles (Overlays: corresponding
Theora video). Another example might be a vocal audio track that can
be mixed with a music-only karaoke track.
Indicates that a track is an alternate or substitute for another.
Indicates that a track is an alternate language or media version of
Indicates that a track should be played together with another,
instead of being treated as alternates.
Of these, Overlays is the only one I'm really clear on the use case
for. I think the others could be handled just as well by specifying
heuristics: tracks of the same media type and role with different
lang Translate each other. Tracks with the same media type, program,
role that don't overlay another are Parallels.
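
The two heuristics above could be sketched roughly as follows. The dict
keys ('media', 'role', 'lang', 'program', 'overlays') and the returned
labels are hypothetical names for illustration. Checking the language
difference first, and not requiring the same program for translations,
are my assumptions; the text above does not settle either point.

```python
def infer_relation(a, b):
    """Guess how two tracks relate, using only their self-descriptions.

    Each track is a dict with the hypothetical keys 'media', 'role',
    'lang', 'program' and 'overlays' (a serial number or None).
    Returns 'translates', 'parallel', or None if neither heuristic fits.
    """
    # Both heuristics require the same media type and role.
    if a["media"] != b["media"] or a["role"] != b["role"]:
        return None
    # Same media type and role, different language: translations.
    if a["lang"] != b["lang"]:
        return "translates"
    # Same media type, program and role, neither overlaying anything.
    if (a["program"] == b["program"]
            and a["overlays"] is None and b["overlays"] is None):
        return "parallel"
    return None


# Hypothetical tracks for illustration.
en = {"media": "audio", "role": "commentary", "lang": "en",
      "program": "", "overlays": None}
fr = dict(en, lang="fr")
relation = infer_relation(en, fr)
```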
Question: Is it better to specify multiple relations with a list of
serial numbers, or with multiple message headers?
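
To make the two options concrete, here is a rough sketch of a reader
that accepts either form for one relation header. The comma separator,
the decimal serial numbers and the function name are assumptions made
for illustration; neither encoding is specified anywhere yet.

```python
def parse_relations(lines, name="Overlays"):
    """Collect serial numbers for one relation header from raw lines.

    Accepts both candidate encodings: the header repeated once per
    serial number, or a single header carrying a comma-separated list.
    The comma separator and decimal notation are assumptions.
    """
    serials = []
    for line in lines:
        key, sep, value = line.partition(":")
        if not sep or key.strip().lower() != name.lower():
            continue  # not a header line, or a different header
        serials.extend(int(v) for v in value.split(",") if v.strip())
    return serials


# Hypothetical header lines showing each option.
as_list = ["Overlays: 1234, 5678"]
as_repeated = ["Overlays: 1234", "Overlays: 5678"]
```

Both inputs yield the same list of serial numbers, so the choice mostly
affects how simple the parser and the written-down syntax need to be.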