[vorbis-dev] Re: ogg stream-id options

Michael Smith msmith at labyrinth.net.au
Sat Nov 18 04:45:39 PST 2000



>That's interesting, there is a project I am related to that is working on the
>Dublin Core stuff (Dublin Core == Object Metadata Framework right?)

Yes. I think bits of Dublin Core have been considered, but much of it is
inappropriate. That isn't a problem: given the way DC is designed, using
only the bits which are appropriate is fine, I think.

>
>XML could be too "heavyweight" to parse (the tags "waste" space :P) this has
>consequences for file-size, streaming, embedded devices ;) i.e:
>compare the number of chars:
><title>Lala</title> (19 chars)
>Title=Lala (10 chars)
>
>It's not a /huge/ difference I know - but every little bit counts doesn't it?
>;)

The size difference is basically irrelevant. Compared to the size of the
actual data, even 'bloated' metadata will be tiny.

XML is being considered (though I don't particularly think it's a good idea,
for unrelated reasons - see list archives if you're interested) because
it's human-readable (more or less), and easily parsable with existing
tools. Your suggestion keeps the former, but loses the latter. And it is
still massively inefficient - if we actually cared about size, we'd do some
form of entropy coding on the textual data, at least. The reality is that
size isn't an issue; complexity and readability are.
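To put rough numbers on the size argument, here is a minimal Python sketch.
The 3 MiB audio payload is a made-up figure for illustration only; the two
tag strings are the ones quoted above. It also shows the "existing tools"
point: the XML form parses with a stock library call.

```python
import xml.etree.ElementTree as ET

xml_tag = "<title>Lala</title>"
kv_tag = "Title=Lala"

# Extra bytes the XML form costs over the key=value form.
overhead = len(xml_tag.encode("ascii")) - len(kv_tag.encode("ascii"))

# Hypothetical audio payload size (an assumption, not from the thread).
audio_bytes = 3 * 1024 * 1024
overhead_fraction = overhead / audio_bytes

# The XML form needs no hand-written parser.
title = ET.fromstring(xml_tag).text

print(overhead, title, f"{overhead_fraction:.7%}")
```

Under that assumption the extra bytes are a few millionths of the file.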

>
>> Note that this doesn't really allow mime magic detection of the 'sequence
>> x at offset n' type. What I meant earlier about substring searching is
>> that you first look for the initial OggS, then search for '<usage>' in
>> bytes 15-200 and case on whatever comes immediately after it.
>
>This provides no advantages over the current method, I believe. Basically we
>would be still stuck with an algorithmic approach to determining the
>file-type's contents.
>
>It is better than the current solution (which doesn't exist) - but is not the
>ideal solution (in my humble opinion).
>
>Perhaps the '<usage>' should be the first tag within toc header or something?
>(giving you what you want, and giving us what we want)

I think that's what Monty has been suggesting - for files which are not
degenerate (where degenerate means that there is only a single stream, and
no metadata stream), the first 'thing' would be the metadata. That would
start at a fixed offset, probably. It may have to start at an offset of x +
value-of-byte-y, depending on how things work out - but that's almost as
trivial as a constant offset, so it's simple to deal with without actual
data scanning (which I agree is a terrible way to be forced to identify
things).
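A minimal sketch of the "x + value-of-byte-y" lookup, in Python. It assumes
the Ogg page header layout as later documented (a 27-byte fixed part whose
last byte, index 26, gives the length of the segment table that follows);
the constants, the function name and the synthetic page are illustrative
assumptions, not something settled in this thread.

```python
OGG_CAPTURE = b"OggS"
FIXED_PART = 27   # x: size of the fixed part of the page header (assumed)
COUNT_BYTE = 26   # y: index of the byte giving the segment-table length (assumed)

def payload_offset(buf: bytes) -> int:
    """Return the offset of the first payload byte: x + value-of-byte-y."""
    if buf[:4] != OGG_CAPTURE:
        raise ValueError("not an Ogg page")
    if len(buf) <= COUNT_BYTE:
        raise ValueError("header truncated")
    return FIXED_PART + buf[COUNT_BYTE]

# Synthetic page: capture pattern, 22 zeroed header bytes, a segment
# count of 1, one segment-length byte, then a made-up 30-byte payload.
payload = b"\x01vorbis" + bytes(23)
page = OGG_CAPTURE + bytes(22) + bytes([1, len(payload)]) + payload

start = payload_offset(page)
ident = page[start:start + 7]
print(start, ident)
```

One byte read and one addition locate the data; nothing is scanned.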

It doesn't really help you much with getting details out for complex stream
types. Since there can be an arbitrary number of streams, identifiers
obviously CAN'T be at a fixed offset for all of them. The format could use
a simple table (at a fixed offset) of 'pointers' into the data - you'd
always need to check each of them, that's unavoidable, but this would make
it possible to do it quickly and simply.
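The pointer-table idea could look roughly like the Python sketch below.
Everything about the layout is hypothetical: the table position
(TABLE_OFFSET), the one-byte count, the 4-byte little-endian offsets and
the 4-byte identifiers are invented here purely to show the shape of the
lookup.

```python
import struct

TABLE_OFFSET = 8  # hypothetical fixed position of the pointer table

def read_stream_ids(buf: bytes, id_len: int = 4) -> list:
    """Follow each table entry and return the identifier it points at."""
    (count,) = struct.unpack_from("<B", buf, TABLE_OFFSET)
    offsets = struct.unpack_from(f"<{count}I", buf, TABLE_OFFSET + 1)
    return [bytes(buf[o:o + id_len]) for o in offsets]

def build_sample(ids: list) -> bytes:
    """Build a toy buffer: header padding, pointer table, then stream data."""
    data_start = TABLE_OFFSET + 1 + 4 * len(ids)
    offsets = []
    body = bytearray()
    for ident in ids:
        offsets.append(data_start + len(body))
        body += ident + b"\x00" * 3  # padding standing in for stream data
    table = struct.pack(f"<B{len(ids)}I", len(ids), *offsets)
    return bytes(TABLE_OFFSET) + table + bytes(body)

sample = build_sample([b"vorb", b"midi"])
print(read_stream_ids(sample))
```

Each stream still has to be checked, but every check is a direct jump
rather than a search.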

The common case will probably always be what we have now - degenerate
streams with only a single logical stream contained within. Those you can
already identify with a fixed-offset lookup - only if that fails would it
be necessary to go to the full algorithm.
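A sketch of that fast-path-then-fallback order, in Python. The fixed offset
(28) and the magic bytes are assumptions based on how single-stream Vorbis
files are commonly laid out, and full_identify is only a placeholder: the
"full algorithm" is not specified here, so a plain substring test stands in
for it.

```python
FIXED_OFFSET = 28          # assumed position of the identifier
MAGIC = b"\x01vorbis"      # assumed identifier bytes

def full_identify(buf: bytes) -> str:
    """Placeholder for the full algorithm (not specified in this thread)."""
    return "vorbis (full scan)" if MAGIC in buf else "unknown"

def identify(buf: bytes) -> str:
    if buf[:4] != b"OggS":
        return "not-ogg"
    # Common case: a single logical stream, identifier at a fixed offset.
    if buf[FIXED_OFFSET:FIXED_OFFSET + len(MAGIC)] == MAGIC:
        return "vorbis (fast path)"
    # Only when the cheap lookup fails do we pay for the general method.
    return full_identify(buf)

# Synthetic pages: one with a single-entry segment table, one with two
# entries, which pushes the identifier one byte past FIXED_OFFSET.
simple = b"OggS" + bytes(22) + b"\x01\x1e" + MAGIC + bytes(23)
shifted = b"OggS" + bytes(22) + b"\x02\x07\x10" + MAGIC + bytes(16)

print(identify(simple), "|", identify(shifted))
```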

Michael

