[foms] WebM Manifest

Mark Watson watsonm at netflix.com
Thu Mar 17 14:24:43 PDT 2011

On Mar 17, 2011, at 2:10 PM, Steve Lhomme wrote:

> On Thu, Mar 17, 2011 at 4:59 PM, Mark Watson <watsonm at netflix.com> wrote:
>> The problem is that adaptive streaming is more complex than simply
>> concatenating a bunch of resources found at URLs advertised in a manifest.
>> Firstly, you need reasonably small granularity in terms of switch points. 2s
>> is good. 10s is too long.
>> Next if you had a separate file for each 2s chunk, then you have an
>> unmanageably large number of files (it would be ~25 billion for our content
>> library).
>> The solution in DASH (required in the "Basic On Demand" profile) is to store
>> the content as a single file for each bitrate. At the start of a file is an
>> index giving the time-to-byte-range mapping for the 2s fragments (2s is an
>> example - the spec doesn't constrain you). This is for on-demand, not live,
>> btw. The index is in the file, binary coded, to keep it compact and thereby
>> keep startup time low. If it was in the XML Manifest it would be huge.
>> To construct byte range requests you need to read and parse this index. I'm
>> not sure Javascript has good tools for efficiently handling & parsing binary
>> data yet.
> Would it be possible in DASH to have no pre-defined fragment
> duration?

Absolutely. When I say 2s this is just an example. DASH doesn't require a specific duration and doesn't even require that all the fragments in a file have the same duration.
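
To make the index idea concrete, here is a rough sketch of a time-to-byte-range lookup over fragments of differing durations. It is illustrative only: the real index is binary coded inside the file, and the names `Fragment`, `build_index`, `find_fragment` and `range_header` are made up for this example rather than taken from the DASH spec.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class Fragment:
    start_time: float  # seconds from the start of the presentation
    duration: float    # seconds; need not be the same for every fragment
    offset: int        # byte offset of the fragment within the file
    size: int          # fragment length in bytes


def build_index(durations_and_sizes, first_offset):
    """Turn (duration, size) pairs into fragments with cumulative times and offsets.

    first_offset is the (assumed) byte position where the first fragment
    starts, i.e. just after the index itself.
    """
    fragments = []
    t, off = 0.0, first_offset
    for duration, size in durations_and_sizes:
        fragments.append(Fragment(t, duration, off, size))
        t += duration
        off += size
    return fragments


def find_fragment(fragments, t):
    """Return the fragment whose time span contains t (in seconds)."""
    starts = [f.start_time for f in fragments]
    i = bisect_right(starts, t) - 1
    if i < 0 or t >= fragments[i].start_time + fragments[i].duration:
        raise ValueError("time outside the indexed range")
    return fragments[i]


def range_header(fragment):
    """Build the value of an HTTP Range header; byte ranges are inclusive."""
    return f"bytes={fragment.offset}-{fragment.offset + fragment.size - 1}"
```

For example, with three fragments of 2s, 3.5s and 1.5s, `range_header(find_fragment(index, 4.0))` selects the second fragment, since nothing in the lookup assumes equal durations.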

> If you load the index of all the available streams on startup, you
> have your seek points with time. Given it's the same source and
> probably the same codec/encoder, there's a very good chance the
> keyframes, and thus the switch points are exactly in the same place.

It's a good idea to encode the content so that the keyframes are in the same place in each version, but you don't get this unless you explicitly prepare the content this way. In the simplest case, if you use a fixed GoP size (e.g. 60 frames), you'll have this alignment. You can flag a set of streams as having this kind of alignment in a DASH manifest, so clients that require it know whether they will be able to switch seamlessly.
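
As a small sketch of what alignment means in practice, the following compares keyframe positions (as frame numbers) across versions. It assumes all versions share the same frame rate and that at least one version is supplied; the function names are hypothetical and are not part of DASH or of any encoder.

```python
def keyframes_fixed_gop(total_frames, gop_size):
    """Keyframe frame numbers for a fixed GoP size, e.g. 0, 60, 120, ..."""
    return list(range(0, total_frames, gop_size))


def is_aligned(keyframes_per_version):
    """True if every version has exactly the same keyframe positions."""
    first = list(keyframes_per_version[0])
    return all(list(kf) == first for kf in keyframes_per_version[1:])


def common_switch_points(keyframes_per_version):
    """Frame numbers that are keyframes in every version."""
    common = set(keyframes_per_version[0])
    for kf in keyframes_per_version[1:]:
        common &= set(kf)
    return sorted(common)
```

Two versions generated with `keyframes_fixed_gop(300, 60)` are aligned by construction. If one encode instead places a keyframe at frame 130 rather than 120, `is_aligned` returns False and only the positions shared by both versions remain usable as seamless switch points.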

> That would avoid forcing the extra fragmentation constraint on the
> encoder and use the available bandwidth more wisely (better end
> user quality).

Not sure I understand the bit about quality: do you mean that by allowing the encoder to choose keyframe positions you get better quality? If so, yes. But then you do have to take care to ensure the different-rate encodes pick the same keyframe positions. Without modification they won't do that. What I've seen in the past is running the highest-bitrate encode first, capturing a trace of its keyframe positions, and then feeding that into the other encodes. You need a modified encoder that supports that.

> The pre-defined fragment duration may not be avoidable in the live
> streaming context.
> I think there is a JavaScript library somewhere that can parse Matroska streams.
> -- 
> Steve Lhomme
> Matroska association Chairman
