[theora] Closed Captioning

David Siebert dsiebert at eclipsecat.com
Wed Dec 4 12:23:16 PST 2002



> Thanks for the detailed description. I've never seen any documentation 
> on the various cc formats. Is there a formal spec available somewhere?

Yes, I will see if I can grab a copy of it for you. Where you are, they 
use teletext with some additions for captioning.

>
> Does the steno-english conversion explain the common homophone 
> substitution? I noticed that particularly in the coverage of the 
> Sydney Olympics. It made me think they were using speech recognition.

Yes, it does. Odds are they were using a program called Captivator, or if 
it was the BBC, they were using a custom system that had been written for 
them. One of the really nice features of our system is that we use AI to 
fix what are called conflicts. Words like there, their, and they're tend 
to be written with the same steno stroke. Our system uses AI to figure 
out the correct word. We do the same with numbers: our program converts 
them to text or digits using rules and a little AI. Yes, I think our 
program rocks. I help write parts of it, and it is over 100,000 lines of 
code at this point.
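
To make the idea of a steno conflict concrete, here is a minimal sketch of 
rule-based resolution. It is not how our program or any real captioning 
system works. The stroke label "THR", the hint word lists, and the function 
names are all made up for illustration, and a real system would use far 
richer context than the single following word.

```python
# Toy conflict resolver. "THR" is a hypothetical stroke label, not a
# stroke from any real steno theory.
CONFLICTS = {"THR": ("there", "their", "they're")}

# Hypothetical hint lists: words assumed to commonly follow each candidate.
FOLLOWS_THEYRE = {"going", "coming", "not", "all", "still"}
FOLLOWS_THEIR = {"house", "car", "team", "own", "names"}


def resolve(token, next_word):
    """Return the word for a token, picking among conflict candidates
    by looking only at the word that follows."""
    if token not in CONFLICTS:
        return token  # ordinary word, nothing to resolve
    nxt = (next_word or "").lower()
    if nxt in FOLLOWS_THEYRE:
        return "they're"
    if nxt in FOLLOWS_THEIR:
        return "their"
    return "there"  # fallback when no rule matches


def translate(tokens):
    """Resolve every conflict token in a sequence of tokens."""
    out = []
    for i, tok in enumerate(tokens):
        nxt = tokens[i + 1] if i + 1 < len(tokens) else None
        out.append(resolve(tok, nxt))
    return out
```

For example, translate(["THR", "going", "home"]) picks "they're" because 
"going" is in the first hint list, while a trailing "THR" with nothing 
after it falls back to "there".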

Our website is www.eclipsecat.com. Yes, I know it needs to be torn down 
and rewritten, and I promise to do it in all my free time :)

>
> Mostly I've been thinking about the popup style. I'll have to think 
> about ways to do live transcription without abusing the format so 
> much. Another argument against xml, basically. Of course, you can 
> always accumulate lines before sending.

The real-time captioning is what I feel is really lacking. I would love 
to see more "LIVE" webcasts that are captioned.

Accumulating lines is not a great option; a delay could be a killer. 
Suppose that you are doing a newscast and the delay somehow has 
captions talking about a child molester being arrested while there is a 
picture of the Pope on the screen? Or the captions for a tampon ad 
showing up with the video of a cookie ad? It has to sync with the video.
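
A rough way to see how much delay line accumulation adds is sketched below. 
It assumes each word has a timestamp for when it was written and that a 
line is sent as soon as its last word arrives. The function name and the 
fixed words-per-line grouping are assumptions for illustration only, not 
part of any caption format.

```python
def line_buffer_delays(word_times, words_per_line):
    """For each word, return the seconds between when it was written and
    when its accumulated line is sent (the time of the line's last word)."""
    delays = []
    for start in range(0, len(word_times), words_per_line):
        line = word_times[start:start + words_per_line]
        sent_at = line[-1]  # line goes out when its last word arrives
        delays.extend(sent_at - t for t in line)
    return delays


# Six words written half a second apart, grouped three per line.
times = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
delays = line_buffer_delays(times, 3)
worst = max(delays)
```

The first word of every line waits the longest, so the worst-case delay 
grows with line length and with slower writing. Sending word by word 
(words_per_line of 1) keeps the added delay at zero in this model.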
