[theora] Closed Captioning

Joe Straitiff josephs at agames.com
Wed Dec 4 13:16:56 PST 2002



There are formal EIA specs for these: the line 21 CC data is covered by
EIA-608, and the newer DTVCC (closed captioning for digital TV) is covered by
EIA-708.  The digital spec is a mess of conflicts; they have released at
least one clarification document, EIA-CEB-10 (and there should've been
another by now).
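
For a feel of what the line 21 data looks like, here is a minimal Python
sketch of handling one EIA-608 byte pair. Each byte carries 7 data bits plus
an odd parity bit in the top bit. The function names are made up for this
example and it is not taken from any real decoder. Mapping payload values
straight to ASCII is a simplification (the 608 character set replaces a few
ASCII code points with accented letters), and control codes are just skipped
rather than interpreted.

```python
from typing import Optional


def strip_odd_parity(byte: int) -> Optional[int]:
    """Return the 7-bit payload if the byte has odd parity, else None."""
    byte &= 0xFF
    if bin(byte).count("1") % 2 != 1:
        return None  # parity error
    return byte & 0x7F


def decode_pair(b1: int, b2: int) -> str:
    """Decode one byte pair into text, handling printable characters only."""
    v1, v2 = strip_odd_parity(b1), strip_odd_parity(b2)
    # A first byte in 0x01-0x1F marks a control or non-caption pair.
    # Assumption for this sketch: drop such pairs instead of interpreting them.
    if v1 is not None and 0x01 <= v1 <= 0x1F:
        return ""
    out = []
    for value in (v1, v2):
        if value is None:
            out.append("?")  # placeholder for a parity error (our choice)
        elif value >= 0x20:
            out.append(chr(value))  # approximate: treats payload as ASCII
        # value == 0x00 is null padding and produces nothing
    return "".join(out)
```

For example, decode_pair(0xC8, 0xE9) yields "Hi": 0xC8 is 'H' (0x48) and
0xE9 is 'i' (0x69), each with the parity bit set to make the bit count odd.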

There are also formal DVB (European) subtitling specifications.

The main difference between the US and European versions is that the
European versions are all graphical (they do have sections on character-based
encoding, but it's so vague and implementation-dependent that I've never seen
anyone use it), while the US versions are fully character based.  So DVB is
similar to DVD subtitling, i.e. it uses a graphics rendering model and pushes
graphics through on the stream.
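
To make the character-based versus graphical distinction concrete, here is a
rough Python sketch of what each kind of stream hands the receiver. The class
and field names are hypothetical and not taken from either spec; real DVB
subtitling also carries regions, palettes, and timing that are left out here.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class CharacterCue:
    """US-style caption: the stream carries text, the receiver owns the font."""
    row: int
    column: int
    text: str


@dataclass
class BitmapCue:
    """DVB/DVD-style subtitle: the stream carries pixels, already rendered."""
    x: int
    y: int
    pixels: List[List[int]]  # rows of palette indices


def payload_units(cue: Union[CharacterCue, BitmapCue]) -> int:
    """Count the characters or pixels that one cue puts on the stream."""
    if isinstance(cue, CharacterCue):
        return len(cue.text)
    return sum(len(row) for row in cue.pixels)
```

The point of the comparison is that a character cue stays tiny but leaves
rendering to the decoder, while a bitmap cue costs more data but looks the
same on every receiver.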

Here's where you can buy the standards:
http://global.ihs.com/

Look for documents like:
EIA-CEB-8 "Consideration of EIA-608-B data within the DTV Closed Captioning
(EIA-708-B) Construct"
EIA-SP-4968 "Line 21 Data Service Proposed Upgrade of EIA/CEA-608B"
EIA/CEMA-608 Series Set "EIA/CEMA 608 Series Set"
EIA-SP-3688 "Recommended practice for line 21 Data Service"
EIA-708 "Digital Television (DTV) Closed Captioning"
EIA-CEB-10 "EIA-708-B Implementation Guidance"

Here's an ancient simulator for EIA-708 DTVCC:
http://www.avio-systems.com/dtvcc/simulate.html

WGBH did a lot of the work on EIA-708; they also have free test streams:
http://main.wgbh.org/wgbh/pages/mag/resources/archive/techfacts/cctechfacts5.html
http://ncam.wgbh.org/dtv/
http://ncam.wgbh.org/dtv/techfacts/testmaterialv1.html
http://ncam.wgbh.org/dtv/dtvccsummit/field.ppt

And there is one place that sells VERY expensive EIA-708 test streams:
http://www.sarnoff.com/digital_video_informatics/compliance_bitstreams/bitstream_sets.asp

Here's the ATSC (i.e. HDTV) info on DTVCC:
http://www.atsc.org/faq_closed.html

Some stuff on the DVB subtitling:
http://www.softel.co.uk/downloads/DVB_Subtitling_FAQ.pdf
http://www.bbc.co.uk/rd/pubs/papers/pdffiles/ibc99net.pdf

The final draft of the DVB subtitling standard:
http://webapp.etsi.org/action%5COP/OP20021004/en_300743v010201o.pdf

http://www.etsi.org/

The ETSI download area for the DVB subtitling standards:
http://pda.etsi.org/pda/home.asp?wki_id=3653
http://pda.etsi.org/pda/home.asp?wki_id=15155

In case you haven't guessed, I did have the joy of implementing a 608 and
708 decoder about a year ago for a set-top box company...

Hope this info helps ;)

-Joe

>
> > Thanks for the detailed description. I've never seen any documentation
> > on the various cc formats. Is there a formal spec available somewhere?
>
>
> Yes, I will see if I can grab a copy of it for you. Where you are, they
> use teletext with some additions for captioning.
>
> >
> > Does the steno-English conversion explain the common homophone
> > substitution? I noticed that particularly in the coverage of the
> > Sydney Olympics. It made me think they were using speech recognition.
>
> Yes, it does. Odds are they were using a program called Captivator, or if
> it was the BBC, they were using a custom system that they had written for
> them. One of the really nice features of our system is that we use AI to
> fix what are called conflicts. Words like there, their, and they're tend
> to be written with the same steno stroke. Our system uses AI to figure
> out the correct word. We do the same with numbers: our program converts
> them to text or digits using rules and a little AI. Yes, I think our
> program rocks; I helped write parts of it. It is over 100,000 lines of
> code at this point.
>
> Our website is www.eclipsecat.com. Yes, I know it needs to be torn down
> and rewritten, and I promise to do it in all my free time :)
>
>
> >
> > Mostly I've been thinking about the popup style. I'll have to think
> > about ways to do live transcription without abusing the format so
> > much. Another argument against xml, basically. Of course, you can
> > always accumulate lines before sending.
>
> The real-time captioning is what I feel is really lacking. I would love
> to see more "LIVE" webcasts that are captioned.
>
> Accumulating lines is not a great option; a delay could be a killer.
> Suppose that you are doing a newscast and the delay somehow has
> captions talking about a child molester being arrested when there is a
> picture of the Pope on the screen? Or the captions for a tampon ad
> showing up with the video of a cookie ad? It has to sync with the video.
>
>
>
>
>
> --- >8 ----
> List archives:  http://www.xiph.org/archives/
> Ogg project homepage: http://www.xiph.org/ogg/
> To unsubscribe from this list, send a message to 'theora-request at xiph.org'
> containing only the word 'unsubscribe' in the body.  No subject is needed.
> Unsubscribe messages sent to the list will be ignored/filtered.
>



