[foms] Proposal: adaptive streaming using open codecs

Jeroen Wijering jeroen at longtailvideo.com
Tue Nov 16 01:15:38 PST 2010


Hello all,

>>>>>> The earlier consensus from most of the content providers was that
>>>>>> non-interleaved was easier to manage, particularly at large scale when you
>>>>>> have a number of different bitrate streams, and a number of different
>>>>>> audio tracks.
>>>>> We have to be careful where we take that statement. Just because the
>>>>> large content owners don't want to do physical chunks and want to keep
>>>>> audio and video tracks separate doesn't mean we have to do that over
>>>>> the network or use chunks in the manifest file.
>>>> 
>>>> There's not much point in designing a technology which the content
>>>> providers won't want to use.
>> 
>> Maybe then I misunderstood an earlier discussion. I was under the
>> impression that it is not possible for a large content provider to
>> prepare millions of small files on the server to do Apple's HTTP live
>> streaming - so instead there is a server extension that provides the
>> chunking functionality on the fly. Is that not correct?
> 
> That might be what some advocate, but what I would advocate is having just one file for each bitrate of video and a separate one for each bitrate or language of audio etc. and then provide the clients with an index into each file so they can make byte range requests for the pieces they need from each.
> 
> There does exist, in several CDNs, a simple server extension which enables a byte range to be embedded in a URL instead of in the Range header, and we do use this with Apple clients for our service to avoid the "millions of files" problem. But this is just a different way of communicating the byte range to the server; it happened to already exist, is useful as a workaround, and is very much an application-independent capability. What I would suggest we avoid is any video-specific server extensions, where servers are expected to understand the format of the video and audio files, re-multiplex them, etc.
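
To make sure I understand the byte-range approach: a rough sketch of what the client side could look like, assuming one plain file per bitrate plus an index of byte offsets. All names and the index layout below are made up for illustration; nothing in this thread defines such a format.

interface SegmentIndexEntry {
  startByte: number;   // first byte of the segment in the file
  endByte: number;     // last byte of the segment (inclusive)
  startTime: number;   // presentation time of the segment, in seconds
}

async function fetchSegment(
  fileUrl: string,
  entry: SegmentIndexEntry
): Promise<ArrayBuffer> {
  // Ordinary HTTP byte-range request against the single, un-chunked file.
  const response = await fetch(fileUrl, {
    headers: { Range: `bytes=${entry.startByte}-${entry.endByte}` },
  });
  if (response.status !== 206) {
    throw new Error(`Expected 206 Partial Content, got ${response.status}`);
  }
  return response.arrayBuffer();
}

// The CDN workaround mentioned above would only change how the range is
// communicated (embedded in the URL instead of the Range header); the
// exact URL syntax is CDN-specific and not shown here.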

So it seems there's a general consensus on splitting up audio and video into separate streams? Who's really against it, and for what reasons? This has big implications for the dummy Stream.appendChunk() call we were brainstorming about. Just appending chunks wouldn't work anymore; we'd basically have to create tracks and append chunks to tracks...
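
Something along these lines, purely as a brainstorm sketch: a toy, in-memory stand-in for the dummy Stream, extended with tracks. None of the names are a proposal; they just illustrate the extra step that de-multiplexed streams would force on scripts.

class MediaTrack {
  private chunks: ArrayBuffer[] = [];
  constructor(readonly kind: "audio" | "video", readonly codec: string) {}

  // Append the next piece of this one elementary stream.
  appendChunk(data: ArrayBuffer): void {
    this.chunks.push(data);
  }
}

class AdaptiveStream {
  private tracks: MediaTrack[] = [];

  // Instead of a single stream.appendChunk(muxedData), the script first
  // creates one track per elementary stream...
  addTrack(kind: "audio" | "video", codec: string): MediaTrack {
    const track = new MediaTrack(kind, codec);
    this.tracks.push(track);
    return track;
  }
}

// ...and then feeds each track separately, so bitrate (or language)
// switches happen per track rather than on one interleaved stream.
const stream = new AdaptiveStream();
const video = stream.addTrack("video", "vp8");
const audio = stream.addTrack("audio", "vorbis");
video.appendChunk(new ArrayBuffer(0)); // would be a WebM video cluster
audio.appendChunk(new ArrayBuffer(0)); // would be a Vorbis audio chunk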

I'm also a little lost on how the files on the server would be structured. Would there be audio-only and video-only "plain" WebM files, or do we need to go to a "chained" format (range requests) or a "chunked" format (separate files)? In both of the latter cases, we'd lose adaptive streaming support for current WebM files...
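
For my own understanding, here is roughly how I picture two of those layouts being described to a client; this is purely hypothetical, not an agreed or existing format, and all field names and numbers are invented.

// (a) a single audio-only or video-only WebM file per bitrate, plus a
//     byte-range index so clients can make range requests into it
interface RangeIndexedTrack {
  url: string;                              // e.g. "video_1mbps.webm"
  segments: { start: number; end: number; time: number }[];
}

// (b) "chunked": one small file per segment per track
interface ChunkedTrack {
  segmentUrls: string[];                    // e.g. "video_1mbps_0001.webm", ...
}

type TrackDescription = RangeIndexedTrack | ChunkedTrack;

// Example of (a); the offsets and times are made up.
const exampleVideoTrack: RangeIndexedTrack = {
  url: "video_1mbps.webm",
  segments: [
    { start: 0, end: 524287, time: 0 },
    { start: 524288, end: 1048575, time: 5 },
  ],
};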

- Jeroen



