[ogg-dev] Skeletal relations

Wed Feb 20 02:01:35 PST 2008

> 1) Font data, as in the actual font itself, doesn't really belong in an ogg
> stream.

People wanting more control over the appearance of an overlay
might want to control the font. Since font naming is largely non-standard
(e.g. the foundry etc. system (you know, the *-*-*-* system) is X only, I
think, and I think Windows just has filenames), one can't specify a font
in a way that is known to work everywhere, and that assumes you have the
font in the first place. That's one of the reasons various document types
can have embedded fonts. Yes, it's not ideal, but there's no real other
way to do it that I know of.

> 2) We have been working on a specification and mechanism for indicating to
> clients that there are multiple tracks of the same "kind" (e.g.
> translation), and allowing clients to request individual tracks out of sets
> of like tracks.  In fact with HTTP headers like Content-Language we can also
> allow the server to default to a particular translation selection in the
> absence of guidance from the client.  At the moment I think a preliminary
> name for this specification is ROE - Silvia is in the process of nailing the
> spec down so you should ask her any questions you have about it :)
> Obviously this doesn't "solve" the duplication issue (if there is one) but
> it does prevent duplicated data eating bandwidth.

In this case, it's realtime muxing. That's a special case. While it
probably does help in a lot of situations, it doesn't apply in all cases
where one could use an Ogg stream. It's a great help though.
Besides, when I coded the xine Kate plugin, I made it so you can switch
languages on the fly. All streams are decoded, but only the selected one
(if any) is actually displayed. This is not possible with such a scheme,
since only the requested track reaches the client (not saying it's
deficient, just that it also adds constraints).
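
To illustrate the "decode everything, display one" approach, here is a
minimal sketch. It is not the xine plugin code: the class and method names
(SubtitleTrack, TrackSelector, feed, select, visible_text) are made up for
the example, and events are assumed to be simple (start, end, text) tuples.

```python
from dataclasses import dataclass, field


@dataclass
class SubtitleTrack:
    """Hypothetical container for the decoded events of one text stream."""
    language: str
    events: list = field(default_factory=list)  # (start, end, text) tuples


class TrackSelector:
    """Keeps every track decoded, but only exposes the selected one."""

    def __init__(self):
        self.tracks = {}      # language -> SubtitleTrack
        self.selected = None  # None means "display nothing"

    def feed(self, language, start, end, text):
        # Every stream is decoded and stored, selected or not.
        track = self.tracks.setdefault(language, SubtitleTrack(language))
        track.events.append((start, end, text))

    def select(self, language):
        # Switching is instant because the data is already decoded.
        self.selected = language

    def visible_text(self, now):
        # Only the selected track (if any) contributes to the display.
        track = self.tracks.get(self.selected)
        if track is None:
            return []
        return [text for (start, end, text) in track.events
                if start <= now < end]
```

Switching languages is then just a call to select(); nothing has to be
requested again, which is exactly what a scheme delivering a single
requested track cannot offer.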

> 3) Text is cheap!  Really cheap :)  Seriously - compare the amount of space
> in your file taken up by text to that taken up even by audio, let alone
> video.

Yes, text is cheap, but fonts are not, especially if they have to be
burst-transmitted in headers before playback actually begins. It's also a
corner case (custom font + lots of multiplexed streams), but I'd ideally
like to have something that scales if possible.
Speaking of scaling, one of the issues that I've seen is the large amount
of framing data relative to codec data. Since I have one packet per page
(for timing reasons), Ogg adds a lot of bytes to mine. But that is another
story...
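
To put rough numbers on that framing overhead, here is a small sketch. It
assumes the Ogg page layout from RFC 3533 (a 27-byte fixed header plus one
lacing byte per segment of up to 255 bytes) and exactly one complete packet
per page. The function names and the 20-byte packet size are illustrative
only, not measurements from a real Kate stream.

```python
OGG_FIXED_HEADER = 27    # bytes in the fixed part of an Ogg page header
MAX_SEGMENT = 255        # largest value a single lacing byte can hold
MAX_SEGMENTS_PER_PAGE = 255


def page_overhead(packet_size):
    """Framing bytes Ogg adds for one page holding one complete packet."""
    # A packet needs floor(n / 255) + 1 lacing values: the final value must
    # be below 255 to mark the end of the packet, even when it is 0.
    segments = packet_size // MAX_SEGMENT + 1
    if segments > MAX_SEGMENTS_PER_PAGE:
        raise ValueError("packet does not fit in a single page")
    return OGG_FIXED_HEADER + segments


def overhead_ratio(packet_size):
    """Fraction of the page's bytes that is framing rather than codec data."""
    overhead = page_overhead(packet_size)
    return overhead / (overhead + packet_size)


if __name__ == "__main__":
    for size in (20, 100, 1000):
        print(size, page_overhead(size), round(overhead_ratio(size), 3))
```

For a short text packet the header outweighs the payload, which is why
putting one packet per page is costly for sparse text streams compared
with audio or video, where packets are larger.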