[theora] Http adaptive streaming for html5

Michael Dale mdale at wikimedia.org
Sun Dec 20 18:43:43 PST 2009


Chris Double wrote:
> I'm not sold on the approach of listing the different bitrate versions 
> in source elements. This will cause issues if the source elements have 
> different codecs, sizes, or other characteristics. It also makes it 
> hard to know which is a 'fallback' source and which is a bitrate source.
>
> I much prefer the approach of treating bitrate switching as a new 
> codec. Have a file on the server that lists the different bitrate 
> versions. Much like Apple's live streaming draft RFC.
>
> In this way browsers can support it by implementing the 'codec' and 
> the WHATWG doesn't need to be involved at all as far as I can tell.
>
> Chris.
Hmm... if we are describing the resource set externally, there is the 
Xiph/Annodex "ROE" XML format: http://wiki.xiph.org/ROE which is 
probably slightly less arcane than writing a bunch of

#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=96000
http://192.168.132.15/ipv/stream-96k.m3u8

lines and potentially having to segment all media files. But it may be 
hard to get Apple to adopt anything other than what they already 
support. The codec issue is already a big barrier to compatibility. 
Adding more Xiph / Mozilla specific stuff might be yet another 
elephant to push up the water slide.
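
For reference, here is a rough sketch of what a client has to do with a 
variant playlist like the one above: read the BANDWIDTH attribute of 
each #EXT-X-STREAM-INF line, pair it with the URI on the following line, 
and pick an entry that fits the measured throughput. The 96k entry is 
the one quoted above; the 256k entry and the function names are made up 
for illustration, and this is not a complete playlist parser.

```python
import re

# The first variant is the one quoted in the message; the second (256k)
# is a hypothetical addition so there is something to switch between.
PLAYLIST = """#EXTM3U
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=96000
http://192.168.132.15/ipv/stream-96k.m3u8
#EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=256000
http://192.168.132.15/ipv/stream-256k.m3u8
"""

# Match BANDWIDTH only as a whole attribute name, right after ':' or ','.
BANDWIDTH_RE = re.compile(r"[:,]BANDWIDTH=(\d+)")


def parse_variants(text):
    """Return a list of (bandwidth_bps, uri) pairs from a variant playlist."""
    variants = []
    pending = None  # bandwidth waiting for the URI line that follows it
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("#EXT-X-STREAM-INF:"):
            match = BANDWIDTH_RE.search(line)
            pending = int(match.group(1)) if match else None
        elif line and not line.startswith("#") and pending is not None:
            variants.append((pending, line))
            pending = None
    return variants


def pick_variant(variants, measured_bps):
    """Pick the highest-bandwidth variant that fits, else the lowest one."""
    fitting = [v for v in variants if v[0] <= measured_bps]
    return max(fitting) if fitting else min(variants)


variants = parse_variants(PLAYLIST)
choice = pick_variant(variants, 200000)
```

With a measured throughput of 200000 bits per second, the sketch settles 
on the 96k stream, since the 256k one would not fit.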

One option to help this situation is to push for an API for adding 
"sources" to the video element for bit-rate switching via JavaScript, so 
that as new formats come about they can be negotiated by a JavaScript 
player interface.

Back to ROE... it lets you set "switching distinctions" for a set of 
time-aligned media streams, i.e. you could support switching on 
dubbed languages, video camera angles, or really arbitrary distinction 
strings. One such distinction key could be a reserved keyword like 
"adaptive_streaming" that lists the clip's bit-rate streams, similar to 
the source tag example I gave earlier.
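
To make the distinction-key idea a bit more concrete, here is a minimal 
sketch using plain Python structures. It is NOT the ROE XML schema: the 
Stream class, the file names, and the "lang" key are all hypothetical, 
and only the "adaptive_streaming" keyword comes from the suggestion 
above. It just shows how a player could list the alternatives it may 
switch between for any one distinction key.

```python
from dataclasses import dataclass, field


@dataclass
class Stream:
    """One time-aligned stream plus its distinction key/value pairs."""
    url: str
    distinctions: dict = field(default_factory=dict)


# Hypothetical resource set: two bit-rate versions and one dubbed track.
STREAMS = [
    Stream("clip-256k.ogv", {"adaptive_streaming": 256000}),
    Stream("clip-96k.ogv", {"adaptive_streaming": 96000}),
    Stream("clip-fr.oga", {"lang": "fr"}),
]


def alternatives(streams, key):
    """Return sorted (value, url) pairs for streams that carry `key`."""
    return sorted(
        (s.distinctions[key], s.url)
        for s in streams
        if key in s.distinctions
    )


bitrates = alternatives(STREAMS, "adaptive_streaming")
languages = alternatives(STREAMS, "lang")
```

The same lookup serves bit-rate switching and language switching, which 
is the appeal of treating "adaptive_streaming" as just another 
distinction key.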

I don't know if I like ROE's "sequence" support though, since it 
goes in the SMIL direction, which I think ROE should avoid if possible; 
it should only be a format to describe a set of temporally aligned 
streams. If it is difficult to implement, perhaps it should be removed. 
I don't doubt it's valuable, but I worry about spec complexity. Another 
complexity in ROE's format is that it lets you select "audio tracks", 
which could also be difficult to support across all the browsers.

--michael

