Re: [theora-dev] First steps towards a simple text stream format.

Philip Jägenstedt philipj at
Sat Aug 9 16:59:45 PDT 2003


If we're both heading in the same direction, perhaps we should
coordinate our efforts.

On Sun, 10 Aug 2003 01:15:08 +0200
<GODA-XEN at> wrote:

> I like this, and I am working on a subtitle format (I don't have anything
> yet, it's only a draft of desirable specifications). But I decided not to
> use UTF-8; instead, I am working on a kind of compressed UTF-8. In other
> words, the format is similar to UTF-8 in some ways.
> My idea:
> 00000000-01111111 -> English characters, as in UTF-8
> 1x... -> index into a table of UTF characters
> This saves space in a subtitle in any language and isn't difficult to
> parse; in a double-byte language we can save space this way. I don't
> compress all characters, for one small reason: subtitle control. For
> control we use the English character set, which is always present.
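
For concreteness, the scheme quoted above might be sketched roughly as
follows. This is only one possible reading: it assumes a single byte per
character, with the low 7 bits of a high-bit byte indexing a per-stream table
of at most 128 code points. The names build_table, encode and decode are
hypothetical, and the table itself would still have to be stored in the
stream somewhere.

```python
def build_table(text):
    """Collect the distinct non-ASCII characters of `text`, in order of first use."""
    table = []
    for ch in text:
        if ord(ch) > 0x7F and ch not in table:
            table.append(ch)
    if len(table) > 128:
        # Assumption: only 7 index bits are available per byte.
        raise ValueError("more than 128 distinct non-ASCII characters")
    return table


def encode(text, table):
    """One byte per character: ASCII as-is, anything else as 0x80 | table index."""
    out = bytearray()
    for ch in text:
        code = ord(ch)
        out.append(code if code <= 0x7F else 0x80 | table.index(ch))
    return bytes(out)


def decode(data, table):
    """Inverse of encode(): the high bit selects between ASCII and the table."""
    return "".join(chr(b) if b < 0x80 else table[b & 0x7F] for b in data)


text = "字幕 test 字幕"
table = build_table(text)
packed = encode(text, table)
```

Here packed is 10 bytes where the UTF-8 encoding of the same text is 18, not
counting the bytes needed to carry the table.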

The thought struck me to compress the text stream in some way, but I've
come to the conclusion that the relative overhead imposed by Ogg is so
large that it's not really worth the effort -- you'd have your
compressed data using say 10 bytes and the Ogg frames 20 bytes, that
would just be silly. UTF-8 is an absolute requirement in my view, and it
would seem that the work of compressing the text just doesn't pay off.
However, if you have any compelling reasons to do so (apart from
coolness) please say so.
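
The 10 and 20 byte figures above are only illustrative. A rough sketch of the
arithmetic, assuming each subtitle packet travels alone in its own Ogg page:
the page header has a fixed 27-byte part plus one lacing byte per segment of
up to 255 bytes. The function names are made up for this example.

```python
OGG_PAGE_HEADER_BYTES = 27  # fixed part of an Ogg page header


def ogg_page_overhead(packet_len):
    """Header bytes for a page carrying exactly one packet of packet_len bytes."""
    # A packet needs one lacing value per full 255-byte segment, plus a
    # final value below 255 (possibly 0) that terminates the packet.
    lacing_values = packet_len // 255 + 1
    if lacing_values > 255:
        raise ValueError("packet does not fit in a single page")
    return OGG_PAGE_HEADER_BYTES + lacing_values


def overhead_fraction(packet_len):
    """Share of the page's total bytes that is header rather than payload."""
    overhead = ogg_page_overhead(packet_len)
    return overhead / (overhead + packet_len)


compressed = ogg_page_overhead(10)    # a hypothetical 10-byte compressed line
uncompressed = ogg_page_overhead(30)  # the same line at 30 bytes of UTF-8
```

Under that assumption both packets cost the same 28 bytes of page header, so
shrinking a 30-byte line to 10 bytes only takes the page from 58 bytes to 38.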

> The other part is the use of an embedded font: only the glyphs actually
> used are encoded, as PNG or in a vector format, but PNG is easier to parse
> on low-power machines.

This I don't quite understand. Well I understand it, but it's an idea
that I rejected while thinking this over. It's inevitable that people
will want this sort of functionality at some point, and then more and
more and more until you have something big and clumsy like USF. In my
Master Plan(tm) SVG could be used for fancy subtitles in the future, but
then it would also be generally useful for things other than subtitles,
and it's a _real_ W3C-approved standard for which there has already been
some work done.

Please tell me more about what you're thinking, and if our goals are
compatible, maybe we can work together.

// Philip Jägenstedt