[theora] Choosing subtitle language

Arc arc at indymedia.org
Sat Jun 28 08:03:51 PDT 2003

On Sat, Jun 28, 2003 at 01:14:26PM +0100, Ralph Giles wrote:
> Since everything's in the source ogg file, this is the easy thing to do 
> with http streaming and probably what icecast will do, at least for the 
> first release. It does use extra bandwidth, but we need to start 
> somewhere.
> With RTP streams, the practice is to send every substream over a 
> parallel connection, so flexibility is much easier to implement, and 
> avoiding wasted bandwidth is one of the primary design goals. So 
> everything will be different there. I've no idea how to choose among the 
> available streams in that case. Does RTSP have an established mechanism 
> for this?

OK, here's my idea on how to handle this...

1) Subtitles should be an Ogg codec, even if adopted from some currently
standard format, so that they can coexist within the same physical
bitstream.  Each concurrent language is available as its own logical
bitstream, making it easy to strip away if one should want to do so.

2) Icecast (etc.) will stream the entire source Ogg if no preferred
language is specified (this may be preferable in some situations, such
as archiving the stream).

3) If a preferred language is specified (via standard HTTP headers) and
multiple subtitle or audio tracks coexist with different language=
comment headers (vorbiscomment with Vorbis, etc.), Icecast will try to
provide only the language the browser asked for.  If the stream is not
available in any of the listed preferred languages, all will be sent.
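
To make point 3 a bit more concrete, here is a minimal Python sketch of
the selection step.  It is not Icecast code: the function names and the
dict layout for a logical bitstream ('serial', 'kind', 'language') are
made up for illustration, and the Accept-Language parsing is simplified
(only the primary subtag is compared, so "en-GB" is treated as "en").
Streams without a language tag, such as the Theora video, are always
kept.

```python
def parse_accept_language(header):
    """Return primary language subtags from an Accept-Language header,
    most preferred first.  Simplified: only q-values are understood."""
    prefs = []
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        q = 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0
        if q > 0:
            prefs.append((q, tag.strip().lower().split("-")[0]))
    # list.sort is stable, so equal q-values keep their header order
    prefs.sort(key=lambda p: -p[0])
    return [tag for _, tag in prefs]


def stream_language(stream):
    """Primary subtag of a stream's language comment, or None if untagged."""
    lang = stream.get("language")
    return lang.lower().split("-")[0] if lang else None


def select_streams(streams, accept_language=None):
    """Pick the logical bitstreams to send (hypothetical data layout).

    No preference -> send everything (point 2).
    First preferred language that exists -> untagged streams plus that
    language only.  No preferred language available -> send everything.
    """
    if not accept_language:
        return list(streams)
    for lang in parse_accept_language(accept_language):
        if any(stream_language(s) == lang for s in streams):
            return [s for s in streams
                    if stream_language(s) in (None, lang)]
    return list(streams)
```

With a source carrying Theora video, Spanish Vorbis audio, and English
and German subtitles, a request with "Accept-Language: de, en;q=0.8"
would get the video plus the German subtitles only, while a request for
"fr" would fall back to the whole source stream.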

I'd also like to point out that "subtitles" use a very small amount of
bandwidth; even if concurrently available for 10+ languages, they do
not come close to the bandwidth of a Vorbis stream.  They are also
useful for non-Theora works, such as encoding the lyrics for Vorbis
music or a Speex newscast, and perhaps could eventually be spidered to
make an Ogg "search engine" ;-)

I'm looking at this from the Indymedia perspective, for instance, the
European Newsreal.  Every edition of this program has to be translated
into 4+ languages, each one a separate distribution, and this is a pain
in the ass.  The 5 minute segments of the 30 minute program are often in
different languages, i.e., first segment in Spanish, second in English,
third in German, etc.

If each segment were to have subtitles for its original language
provided, and the translation team then hammered out the different
translations, each 5 minute segment could be encoded as Theora/Vorbis
(or in some cases Theora/Speex) with the many subtitle layers, and the
Vorbis/Speex layer indicating its language with the "language" field in
its header (just another comment, so no change in the standards).
These segments are then basically cat'ed together along with the program
title and credits, and you'd have a complete program right there...
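
The "cat'ed together" step is plain byte concatenation, since a chained
Ogg stream is just complete physical bitstreams placed one after
another.  A rough Python equivalent of cat, assuming each input file is
already a complete Ogg bitstream (the function name and file names are
only examples):

```python
import shutil


def chain_segments(segment_paths, output_path):
    """Concatenate complete Ogg files, in order, into one chained file.

    No validation is done here: each input is assumed to be a complete
    physical bitstream (title, segments, credits, ...).
    """
    with open(output_path, "wb") as out:
        for path in segment_paths:
            with open(path, "rb") as seg:
                shutil.copyfileobj(seg, out)
```

For example, chain_segments(["title.ogg", "seg1.ogg", "seg2.ogg",
"credits.ogg"], "program.ogg") would produce the complete program.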

The player (or perhaps Icecast if streamed) would be configured for the
language it's supposed to output.  If the Vorbis/Speex stream is not in
that language, it can check whether its preferred language is available
as a subtitle and automatically enable that stream.  This is redone
every time it jumps to a new chained set of bitstreams (when it changes
segments), so subtitles would automatically switch on/off depending on
the language preferences and the language of each segment.
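
The per-segment decision described above could look roughly like this
in Python.  This is only a sketch: the segment representation (a dict
with 'audio' and 'subtitles' keys) and the function names are invented
for illustration and do not correspond to any existing player API.

```python
def pick_subtitle(audio_language, subtitle_languages, preferred):
    """Return the subtitle language to enable for one segment, or None.

    Subtitles stay off when the audio is already in the preferred
    language, or when no subtitle stream exists in that language.
    """
    if audio_language == preferred:
        return None
    if preferred in subtitle_languages:
        return preferred
    return None


def plan_program(segments, preferred):
    """Re-run the decision at every chain boundary (one entry per segment)."""
    return [pick_subtitle(seg["audio"], seg["subtitles"], preferred)
            for seg in segments]
```

For a program whose segments have Spanish, English and German audio in
turn, each with "es", "en" and "de" subtitles, a player set to "en"
would enable English subtitles for the first and third segments and
switch them off for the second.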

I can't think of many uses that this scheme would not work for...
