[theora-dev] First steps towards a simple text stream format.

Silvia.Pfeiffer at csiro.au Silvia.Pfeiffer at csiro.au
Thu Aug 21 23:53:44 PDT 2003

Hi Philip,

Oops, time flies!

Philip Jägenstedt wrote:
> Yes, that does look very attractive :) My objection is not with the
> format of this, but with the headers that wrap around the entire ogg
> stream with the purpose of allowing search engine spiders to "penetrate"
> the file. This is somehow mixing two different things.

Indeed, our objective was not to solve the meta-data problem just for 
theora. We wanted to solve it for any time-continuous data stream that 
could be stored in an Ogg file. That includes things like multitrack 
audio with different codecs for each audio stream or even mpeg-1 
combined with vorbis.

Originally, we wanted the files to be compatible with the existing
players and tools, such that e.g. an annodexed vorbis file would still
play on a vorbis player. However, it turned out that we needed to solve
the general problem of synchronising media streams within ogg and the
problem of identifying which media streams are encapsulated in an ogg
file. This led to the necessity of putting one additional header page
in front of each logical bitstream, containing essentially the mime type
of that bitstream and its temporal resolution (a rational number
given in Hz, which provides a way to relate the "granulepos" of an ogg
page to a time offset).
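
To illustrate how a rational rate in Hz relates a granulepos to a time
offset, here is a minimal sketch. It is not the libannodex API: the
StreamHeader class, its field names, and the mime type strings are
hypothetical placeholders, and it assumes the granulepos is a plain
sample or frame count.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class StreamHeader:
    """Hypothetical stand-in for the extra per-bitstream header page."""
    mime_type: str   # placeholder value, e.g. "audio/x-vorbis"
    rate_num: int    # granule rate numerator, in Hz
    rate_den: int    # granule rate denominator


def granulepos_to_seconds(header: StreamHeader, granulepos: int) -> Fraction:
    """Map a granulepos to a time offset in seconds, kept as an exact rational."""
    if header.rate_num <= 0 or header.rate_den <= 0:
        raise ValueError("granule rate must be a positive rational")
    if granulepos < 0:
        raise ValueError("granulepos must be non-negative")
    rate = Fraction(header.rate_num, header.rate_den)
    # time = count / rate, e.g. samples / (samples per second)
    return Fraction(granulepos) / rate


# Assumed example streams: 44.1 kHz audio and 30000/1001 fps video.
vorbis = StreamHeader("audio/x-vorbis", 44100, 1)
video = StreamHeader("video/x-theora", 30000, 1001)
print(float(granulepos_to_seconds(vorbis, 88200)))
print(float(granulepos_to_seconds(video, 30)))
```

Keeping the rate as a numerator/denominator pair avoids rounding drift
for rates such as 30000/1001. Codecs that pack more than a plain count
into the granulepos would need a codec-specific step first.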

>>Then you use the related Theora video file and run it through "anxenc" 
>>and you've got the synchronised file. With DVD you'd automate the 
>>process of creating the CMML file by transcoding from the DVD directly.
> Except that on DVDs, the subtitle is stored as images of course, so
> there's a whole lot of work involved in this process.

Oops, that means a bit of OCR will be needed to transcribe the
subtitles into text. In my view that makes Annodex more attractive than
DVD, but that's a personal opinion :) . And I was hoping for a simple
dvd2anx tool :(

>>Philip Jägenstedt wrote:
> Yes, indeed there would be libfoosub (no, I don't have a name for my
> format), but nothing beyond that. Since I haven't looked too closely at
> the internals of annodex, tell me -- does an application which supports
> annodex use only libannodex, or do you also access libcmml?

An application that only decodes annodex files should need only
libannodex, as libcmml is only needed for "authoring", i.e. when you
need to parse a cmml file. I can see that the build environment for
libannodex is not set up that way though, sorry. Taking out the
requirement for libcmml in configure.ac will make it compile without
libcmml. Then you only require libogg and libtheora to run e.g. anxrip
(in contrast, anxenc requires libcmml).

> Actually, not having external subtitles is one of the motivations for my
> little endeavour. I do appreciate your help and suggestions. I realize
> that I might be making a "no-no-no, I want my OWN format"-impression,
> but that really isn't what I'm doing. I just want something that works
> well and is clean.

Understood. I also don't just want to push our solution onto everyone - 
only where it makes sense. I'm glad you've checked it out!

> Indeed, the use of CMML is overhead in my view -- overhead for a
> subtitle format, but not necessarily overhead for the other things
> annodex does. As I've indicated, the fact that annodex wraps around all
> other streams is what I have the most difficulties accepting. Do you
> mean that annodex doesn't enforce this, and that I could use only the
> annotation stream of annodex, or is this violating the spec?

When you use only the annotation stream, it's not an annodex file any 
longer. Feel free to try that out though.


--- >8 ----
List archives:  http://www.xiph.org/archives/
Ogg project homepage: http://www.xiph.org/ogg/
To unsubscribe from this list, send a message to 'theora-dev-request at xiph.org'
containing only the word 'unsubscribe' in the body.  No subject is needed.
Unsubscribe messages sent to the list will be ignored/filtered.
