[ogg-dev] Ogg/Kate preliminary documentation

Fri Feb 8 16:03:15 PST 2008

On 2/8/08, ogg.k.ogg.k at googlemail.com <ogg.k.ogg.k at googlemail.com> wrote:

> While this is possible to do it this way (and probably a good idea for the
> examples like a clock in a corner), it implies that all the placements and
> logically different "items" are known at the start of the stream (since the
> Ogg spec says a stream can't start midway through another stream, an
> interesting restriction, but which is there nonetheless). While this is fine
> for a file based stream, it is not if the stream is generated in realtime.

Right. This was, in fact, one of the roles of "chaining", where you'd
mark such changed components with a chain boundary, at which such
things are explicitly allowed to change. The drawbacks are the
overhead of resending all the setup data for configurable codecs like
Vorbis and Theora, and the semantic conflict between 'chain boundary
flags an edit point' and 'chain boundary flags a program change', which
has confused people implementing playlist-style representations of
chained streams for some time. CMML has a similar confusion, as it can
be used for chapter markers as easily as for dialog markup within a
single scene.

There are certainly arguments for doing it both ways, but from the
Annodex point of view it is nice to push as much of that onto the
mux/skeleton level as possible, for all the reasons Silvia described.
Do you have a counter-example where adding a new category suddenly,
on the fly, is compelling?

> While it is not used at the moment, I do have a "category" field in the ID
> header, meant to be a tag used by a player to know what is supplied by
> a particular stream (eg, the user may want to select a number of categories,
> such as "transcript" and "commentary", and a language, and two streams
> would be displayed by the player.

CMML 3.1 had a 'track' attribute that could be supplied to clips to
make a similar distinction. We discussed this quite a bit at LCA last
week and the general feeling was that we should remove that from CMML
itself, focussing on its role as a text track codec for the 4.0
revision, and push the multiple-stream effects up to the authoring
level, with either a new XML format for describing stream contents, or
in the stream itself.

We need something like a category (we were using "role", "lang", and a
couple of other labels) for the Ogg Skeleton message headers for use
by other media types in the stream in any case. For example, to
distinguish the main audio tracks from commentary, music-only or
effects-only tracks, and so on. Or to say that a particular visual
overlay should be applied to a particular video stream, and whether to
do so by default or optionally. That sort of thing. So this mechanism
must exist at the per-stream level regardless of what a particular
codec supports.
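
To make the per-stream labelling idea concrete, here is a rough sketch of
how a player might pick streams once each one carries labels like "role"
and "lang". Everything in it is an assumption for illustration only: the
StreamLabel type, the "kind" field, the select_streams helper and the
sample values are hypothetical, and none of this is the actual Skeleton
header syntax.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StreamLabel:
    """Hypothetical per-stream labels, as a mux-level header might carry them."""
    serialno: int  # logical stream serial number
    kind: str      # illustrative media kind: "video", "audio" or "text"
    role: str      # e.g. "main", "commentary", "transcript"
    lang: str      # language tag, e.g. "en"


def select_streams(labels, roles, lang, kind=None):
    """Return serial numbers of streams matching the wanted roles and language.

    If kind is given, only streams of that media kind are considered.
    """
    return [
        s.serialno
        for s in labels
        if s.role in roles
        and s.lang == lang
        and (kind is None or s.kind == kind)
    ]


# Made-up stream set: one video, two audio tracks, three text tracks.
streams = [
    StreamLabel(1, "video", "main", "en"),
    StreamLabel(2, "audio", "main", "en"),
    StreamLabel(3, "audio", "commentary", "en"),
    StreamLabel(4, "text", "transcript", "en"),
    StreamLabel(5, "text", "transcript", "fr"),
    StreamLabel(6, "text", "commentary", "en"),
]

# The user asks for English "transcript" and "commentary" text overlays.
chosen = select_streams(streams, {"transcript", "commentary"}, "en", kind="text")
print(chosen)
```

The point of the sketch is only that the selection logic needs nothing
from the codecs themselves: the same labels work for audio, video and
text streams alike.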

 -r