[vorbis-dev] ogg stream-id options

Ralph Giles giles at ashlu.bc.ca
Thu Nov 16 20:24:14 PST 2000



In http://advogato.net/person/rakholh/diary.html?start=165 Ali wrote:

>  For some reason I am arguing with people on vorbis-dev - but I don't
> understand what the argument is about (considering that the vorbis
> developers proposed a solution which mjs and I thought was reasonable,
> and then some developers decided to criticize us again for no reason).

Goodness, get dropped from the list and miss the return of the son of the
favorite flamewar. :^)

I don't know what you're arguing about either. But monty's talking about
adding a toc substream sooner rather than later, so here's how I see our
options:

Right now, the only things we produce are 'degenerate' ogg files. They
*only* contain a single vorbis audio stream. We've been telling people the
file extension is .ogg, to do magic detection on the initial OggS, and
that the mime-type is application/x-ogg.

We reevaluated the extension issue (the answer was no) and the mime-type
issue (we were swayed) and decided we'd recommend multiple mimetypes when
we have the video codec working, and add efficient discrimination to the
requirements for the toc/metadata substream we'd always planned on.

None of this is a pressing issue; the usability arguments are moot until
we actually have both audio and video data.

That's the story so far.

Now, what are our options for implementation?

I'd proposed we combine the toc header with the kitchen-sink metadata
people have requested, and that we use xml-encoded rdf based on the Dublin
Core element set to do it. I still think this is the best option. XML is
the most obvious way to encode text streams (what subtitles should be) so
we can share part of the code, and conceptually the substream type. It
also offers good interoperability with indexing/catalog systems and plenty
of flexibility for future requirements.

Note that this doesn't really allow mime magic detection of the 'sequence
x at offset n' type. What I meant earlier about substring searching is
that you first look for the initial OggS, then search for '<useage>' in
bytes 15-200 and case on whatever comes immediately after it.
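
To make the substring idea concrete, here is a rough Python sketch. It is
only an illustration: the helper name sniff_usage, the exact tag spelling,
and the assumption that the value runs up to the next '<' are all
hypothetical, and nothing like this exists in the code today.

```python
OGG_MAGIC = b"OggS"      # capture pattern at the very start of the file
USAGE_TAG = b"<useage>"  # hypothetical tag, spelled as in the text above


def sniff_usage(data: bytes, start: int = 15, end: int = 200):
    """Return the token that follows USAGE_TAG, or None if not detected.

    Hypothetical helper: first checks for the initial OggS, then searches
    the byte window data[start:end] for the tag and reads up to the
    next '<' (assumed here to begin the closing tag).
    """
    if not data.startswith(OGG_MAGIC):
        return None
    window = data[start:end]
    pos = window.find(USAGE_TAG)
    if pos < 0:
        return None
    rest = window[pos + len(USAGE_TAG):]
    token = rest.split(b"<", 1)[0].strip()
    return token.decode("ascii", "replace") or None
```

A caller would then switch on the returned string ("audio", "video", and
so on) to pick a handler. Note that this is a search, not a fixed-offset
comparison, which is exactly why ordinary magic files can't express it.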

But the time for that isn't now. Aside from not having the resources to
implement it, the standards for this sort of thing are very much still in
flux. Rob (of musicbrainz.org) and I couldn't even reach an agreement on
the encoding, and to support the general case is both unwieldy and
expensive, and likely to be obsolete next year. If we can wait 6-12
months, there should be much more of an external standard we can
incorporate. The librarians think this is a hard problem too.

Better, our solution will be a much closer fit if we give ourselves a
chance to evolve the format through usage while we're developing the video
codec.

If we had to do it NOW, I'd suggest something based on the vorbis
comment header, with (possibly hierarchic) text vectors in a set order.
The substream would consist of a head page and an empty tail page.

The first element would be something like "STREAMCLASS=audio". This would
allow mime magic filetype detection if we require that the toc always be
first. Others would follow like so:

general bitstream headers:
        STREAMCLASS=video
        <misc metadata a la vorbis/kitchensink?>
substream 8347929:
        STREAMTYPE=toc	(this example)
substream 2361643:
        STREAMTYPE=tarkin
        LAYER=0		(means this is a primary stream, not an overlay)
        USAGE=default	(not an alternate track)
substream 8293298:
        STREAMTYPE=vorbis
        SUBTYPE=surround	(could be a mapping number instead)
        LAYER=0
        USAGE=default
        LANG=en
        LABEL=English surround audio
substream 0923470:
        STREAMTYPE=vorbis
        SUBTYPE=stereo
        LAYER=0
        USAGE=alternate
        LANG=es
        LABEL=director's commentary
substream 7829372:
        STREAMTYPE=mng		(these would be pre-rendered subtitles)
        LAYER=1
        LANG=jp
..and so on. The substream numbers refer to the logical substream ids, for
easy correlation. They could be encoded either as separate section
headers, or a part of the tags in a linear arrangement.
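
As a sketch of the 'linear arrangement' option, the Python below reads an
ordered list of KEY=value text vectors and groups them by substream. The
SUBSTREAM=<id> marker tag and the parse_toc name are assumptions made up
for this example; the proposal above doesn't fix how section boundaries
would actually be encoded.

```python
def parse_toc(vectors):
    """Split ordered 'KEY=value' strings into (general, substreams).

    Assumes a hypothetical SUBSTREAM=<id> tag opens each per-substream
    section, and that STREAMCLASS is always the first vector so a
    detector only needs to look at the start of the toc.
    """
    if not vectors or not vectors[0].startswith("STREAMCLASS="):
        raise ValueError("toc must begin with STREAMCLASS")
    general, substreams = {}, {}
    current = general
    for vec in vectors:
        key, sep, value = vec.partition("=")
        if not sep:
            raise ValueError(f"not a KEY=value vector: {vec!r}")
        key = key.upper()
        if key == "SUBSTREAM":
            # Start collecting tags for this logical substream id.
            current = substreams.setdefault(int(value), {})
        else:
            current[key] = value
    return general, substreams


toc = [
    "STREAMCLASS=video",
    "SUBSTREAM=2361643", "STREAMTYPE=tarkin", "LAYER=0", "USAGE=default",
    "SUBSTREAM=8293298", "STREAMTYPE=vorbis", "LAYER=0", "LANG=en",
]
general, substreams = parse_toc(toc)
```

Keying the result on the logical substream id keeps the 'easy
correlation' property: a player can look up the tags for whichever
substream it has just found in the physical bitstream.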

That's about as general as I can make it right now, and I think something
(particularly a forwards-compatible vorbis-only implementation) could be
written in time for 1.0.

IMHO,
 -ralph


--
giles at ashlu.bc.ca

--- >8 ----
List archives:  http://www.xiph.org/archives/
Ogg project homepage: http://www.xiph.org/ogg/
To unsubscribe from this list, send a message to 'vorbis-dev-request at xiph.org'
containing only the word 'unsubscribe' in the body.  No subject is needed.
Unsubscribe messages sent to the list will be ignored/filtered.



