[ogg-dev] Skeletal relations

Silvia Pfeiffer silviapfeiffer1 at gmail.com
Sat Mar 22 07:25:46 PDT 2008


Ralph,

I was thinking about this proposal quite a bit today.

I agree: with ROE as a description format on the server for the tracks
that are available for a media resource, say
http://example.org/video.ogx with the potential tracks {A,B,C,D}, we
now have a means for server and client to communicate about what data
the server should actually put into the stream.

I am not sure I entirely agree with the method of communication
between client and server.

At one point at FOMS/LCA we discussed using URLs as a communication mechanism:

  http://example.org/video.ogx?tracks=A,B,D&t=15

  would return the resource video.ogx with only tracks A,B, and D and
only from time offset 15sec onwards.

Such a scenario is possible as soon as the client has received the ROE
file that describes video.ogx, since it can then determine for itself
which tracks to include in the request.

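To make that concrete, here is a rough sketch in Python of how a server
might take such a URL apart. The function name parse_track_request and
the defaults (no "tracks" parameter meaning no restriction, no "t"
meaning offset 0) are assumptions made for illustration, not something
that was settled at FOMS/LCA:

```python
from urllib.parse import urlsplit, parse_qs

def parse_track_request(url):
    """Split a request URL into (path, track ids, start offset in seconds).

    Hypothetical helper: the 'tracks' and 't' parameter names follow the
    example URL; returning None for "no restriction" is an assumption.
    """
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    # 'tracks' is a comma-separated list; absent means no restriction.
    tracks = None
    if "tracks" in query:
        tracks = [t.strip() for t in query["tracks"][0].split(",") if t.strip()]
    # 't' is a time offset in seconds; default to the start of the resource.
    start = float(query["t"][0]) if "t" in query else 0.0
    return parts.path, tracks, start

print(parse_track_request("http://example.org/video.ogx?tracks=A,B,D&t=15"))
```

The server would then still have to check the requested ids against the
tracks listed in the ROE file for video.ogx.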

Your example below is rather different: the client asks the server for
a particular language and the server itself selects the tracks that go
with it. I believe this sort of selection should be restricted to
Content-Language: <locale>, and more generically to the values of the
"distinction" attribute of the switch statement in ROE (see
http://wiki.xiph.org/index.php/ROE).

In contrast, the "provides" attributes should be specified in the URL
with the "tracks" query parameter as described above, e.g.
http://example.org/video.ogx?tracks=video:v1, audio:a1b2,
text_overlay:t1, logo: 1. This is rather explicit, while the HTTP
header message fields do it implicitly.
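
A client that has read the ROE file could assemble such a request
roughly as follows. This is only a sketch: build_tracks_url is a
hypothetical helper, and the "provides:id" pair syntax is simply the
one from the example, not an agreed format:

```python
from urllib.parse import urlencode

def build_tracks_url(base, provides):
    """Build a request URL from a {provides value: track id} mapping.

    Hypothetical helper; the 'provides:id' pair syntax is an assumption.
    """
    value = ",".join(f"{role}:{track}" for role, track in provides.items())
    # Keep ':' and ',' readable instead of percent-encoding them.
    return base + "?" + urlencode({"tracks": value}, safe=":,")

url = build_tracks_url(
    "http://example.org/video.ogx",
    {"video": "v1", "audio": "a1b2", "text_overlay": "t1"},
)
print(url)
```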

I believe we need both methods and it might be time to start adding
them to the wiki at http://wiki.xiph.org/index.php/ROE.

Cheers,
Silvia.



On Sat, Feb 16, 2008 at 10:56 AM, Ralph Giles <giles at xiph.org> wrote:
> We have new drafts of CMML 4.0 as a text codec and ROE as an xml
>  stream abstract, subsuming the authoring support in CMML 3.1 and
>  earlier.
>
>  Another thing we talked about at LCA is how to specify
>  relationships between the various streams in Ogg so that a server,
>  muxer or player can make intelligent decisions about the contained
>  tracks. The general idea is to use the (http-style) Message Headers
>  in the Skeleton track to describe each logical bitstream, but no one
>  has ever written anything down. This is a proposal to get the ball
>  rolling.
>
>  Requirements:
>
>  * Distinguish alternates based on language
>  * Distinguish among subtitles for translation,
>    for hearing impaired
>  * Distinguish commentary tracks
>  * Distinguish overlays from alternates from primaries
>
>  = Self description =
>
>  The following message headers describe the corresponding track,
>  metadata essentially.
>
>  Lang: <locale>
>
>  Machine parsable locale string describing the language the track is
>  in. Used for example to choose the default audio track based on user
>  preferences.
>
>  Role: <role-type>
>
>  Free form qualifier used to mark the category of the track content.
>  We will define a basic set of role-types with standard meaning for
>  machine interpretation. Example role-types:
>
>    commentary (e.g. Director's commentary on a film)
>    transcription (e.g. detailed record)
>    interpretation (contains additional information)
>    slides (visual aids accompanying another track)
>
>  Of these, commentary is the only one I'd really like to have. Some
>  other ideas: logo, ticker, credits, translation. The last is
>  effectively the default though. Logo or ticker might be useful to
>  have a different default for whether they are overlaid or not.
>
>  Description: <string>
>
>  Human readable description of the track, intended for display in a
>  user interface. This can be localized by appending '.<locale>' to
>  Description.
>
>  We could also copy general metadata here, e.g. title, creator, date,
>  location, license. That's perhaps more interesting in the fishead
>  packet which describes the stream as a whole rather than the
>  individual tracks.
>
>  Program: <string>
>
>  Arbitrary tag for distinguishing a group of tracks from an unrelated
>  group they happen to be multiplexed with. For example, three separate
>  programs might be sent over the same link multiplexed together, but
>  only audio and video tracks with the same value for the Program
>  message header should be played together. The default 'empty' program
>  is a valid program: every fisbone without this message header marks
>  the corresponding track as being in the default program.
>
>
>  = Relations =
>
>  The self-description allows us to prioritize tracks implicitly, based
>  on user preferences for showing audio, video, text, or some
>  combination, preferred languages and roles. But there remain areas of
>  ambiguity, so we define a way to mark relationships with other
>  streams. The value of each of these is an Ogg stream serial number.
>
>  Overlays: <serialno>
>
>  This track doesn't (necessarily) stand on its own but is meant to be
>  laid on top of another track. This distinguishes, for example, an MNG
>  video (no Overlays) from MNG subtitles (Overlays: corresponding
>  Theora video). Another example might be a vocal audio track that can
>  be mixed with a music-only karaoke track.
>
>  Substitutes: <serialno>
>
>  Indicates that a track is an alternate or substitute for another.
>
>  Translates: <serialno>
>
>  Indicates that a track is an alternate language or media version of
>  another track.
>
>  Parallels: <serialno>
>
>  Indicates that a track should be played together with another,
>  instead of being treated as alternates.
>
>  Of these, Overlays is the only one I'm really clear on the use case
>  for. I think the others could be handled just as well by specifying
>  heuristics: tracks of the same media type and role with different
>  lang Translate each other. Tracks with the same media type, program,
>  role that don't overlay another are Parallels.
>
>  Question: Is it better to specify multiple relations with a list of
>  serial numbers, or with multiple message headers?
>
>  Thoughts?
>   -r
>

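For reference, a rough Python sketch of how the self-description
headers from the quoted proposal could be read and how tracks could be
grouped by Program. The serial numbers and header values are made up,
and the function names are hypothetical; only the header names (Lang,
Role, Description, Program) come from the proposal:

```python
from collections import defaultdict

def parse_message_headers(text):
    """Parse 'Name: value' lines into a dict (last occurrence wins)."""
    headers = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip blank or malformed lines
        name, value = line.split(":", 1)
        headers[name.strip()] = value.strip()
    return headers

def group_by_program(tracks):
    """Map program tag -> serial numbers; '' stands for the default program."""
    programs = defaultdict(list)
    for serialno, headers in tracks.items():
        programs[headers.get("Program", "")].append(serialno)
    return dict(programs)

# Hypothetical serial numbers and header values.
tracks = {
    1001: parse_message_headers("Lang: en\nRole: commentary\nProgram: news"),
    1002: parse_message_headers("Lang: de\nProgram: news"),
    1003: parse_message_headers("Lang: en\nDescription: Main audio"),
}
print(group_by_program(tracks))
```

Track 1003 carries no Program header, so it ends up in the default
program, as the proposal describes.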

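The heuristics suggested at the end of the quoted proposal (same media
type and role with different lang means Translates; same media type,
program and role without an overlay means Parallels) could be sketched
like this. The dict keys, the function name and the order in which the
two rules are tried are assumptions; the proposal does not say which
rule wins when both could apply:

```python
from itertools import combinations

def infer_relations(tracks):
    """Return (relation, serialno_a, serialno_b) tuples for track pairs.

    tracks maps serial number -> dict with the hypothetical keys
    'type', 'role', 'lang', 'program' and 'overlays'.
    """
    relations = []
    for a, b in combinations(sorted(tracks), 2):
        ta, tb = tracks[a], tracks[b]
        # Both heuristics require the same media type and role.
        if ta["type"] != tb["type"] or ta.get("role") != tb.get("role"):
            continue
        if ta.get("lang") != tb.get("lang"):
            relations.append(("Translates", a, b))
        elif (ta.get("program", "") == tb.get("program", "")
              and "overlays" not in ta and "overlays" not in tb):
            relations.append(("Parallels", a, b))
    return relations

# Hypothetical tracks: two audio languages, two videos, one text overlay.
example = {
    1: {"type": "audio", "lang": "en"},
    2: {"type": "audio", "lang": "fr"},
    3: {"type": "video"},
    4: {"type": "video"},
    5: {"type": "text", "lang": "en", "overlays": 3},
}
print(infer_relations(example))
```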