[foms] WebM Manifest

Mark Watson watsonm at netflix.com
Thu May 5 08:34:06 PDT 2011


On May 5, 2011, at 6:01 AM, Pierre-Yves KEREMBELLEC wrote:

> Le 5 mai 2011 à 11:18, Thomas Vander Stichele a écrit :
> 
> [Thomas]
>> (Pierre-Yves, you summarized my point way better than I did, thanks)
>>>> - no byte-ranging (or at least normalized byte-ranges between all vendors, which
>>>> probably is NOT the case with remote probing)
>>> What is the problem with byte-ranging?
>> caching servers deployed today simply don't do byte-range caching well
>> or at all.  We all know that it *should* be possible to create a large
>> file of zeroes, fill it in with received byte ranges, and track which
>> ranges you've already seen.  But very few caching servers do.  I think
>> squid only does it starting from 0.  I think varnish can do it.
>> Varnish is definitely not widely deployed in CDNs today, however.  The
>> reality today is that byte range requests are not properly cached, and
>> there has been no pressing need for them to be.  Requiring it for WebM
>> adaptive is going to hurt WebM more than CDNs.
> 
> Exactly. I don't know of any HTTP cache that deals properly with byte ranges and
> partial caching (using, for instance, hollow files + bitmaps, like Thomas described).
> (This problem is not new; see http://bit.ly/ixdQwo for instance.) As pointed out by
> Thomas, Varnish may be able to achieve partial caching through the http_range_support
> directive (since 2.1.3), but it has yet to be proven stable.
> 
> Unfortunately, this type of cache is more the exception than the norm today.

At Netflix we make extensive use of byte ranges (allegedly 20% of US Internet traffic at peak times). This is well supported by the major CDNs who all support byte ranges and partial caching of large files.
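As a concrete illustration of the single-large-file approach (the segment index and byte offsets below are made up for illustration, not Netflix's actual format), all the client needs is a standard HTTP/1.1 Range header:

```python
# Sketch of the single-file-plus-byte-ranges approach: the client holds a
# segment index (byte offsets per media segment, e.g. taken from a manifest)
# and turns a segment number into a plain HTTP/1.1 Range request header.
# The index values here are invented for illustration.

def range_header(index, segment):
    """Build the Range header for one segment of a single large file.

    `index` maps segment number -> (start_byte, end_byte_inclusive).
    Any HTTP/1.1 origin or cache can serve this; no media-aware logic
    is needed on the server side.
    """
    start, end = index[segment]
    return {"Range": "bytes=%d-%d" % (start, end)}

# Hypothetical index for the first three segments of one bitrate.
index = {0: (0, 524287), 1: (524288, 1048575), 2: (1048576, 1572863)}

print(range_header(index, 1))  # {'Range': 'bytes=524288-1048575'}
```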

Lack of byte range support is not the reason chunking is used for live (more on that below).

I absolutely agree that solutions need to work with "dumb" HTTP infrastructure, and for me this excludes special media-format-specific capabilities on the origin servers more than it excludes byte ranges, which are part of HTTP/1.1.

For us, chunking is completely infeasible due to the excessive number of files required (note that this is not so much to do with having separate audio/video files - that causes only a ~10x multiplier - but with splitting by time, which is a ~3600x multiplier). Storing the content on the origin as a large file and chunking it on demand is more of a restriction (in terms of the CDN capabilities required) than byte ranges. Byte range support is a generic capability which can find applications in many services.
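To make those multipliers concrete, here is a back-of-envelope sketch; the catalog size, stream count, title duration and chunk duration are assumptions picked for illustration, only the ~10x and ~3600x multipliers come from the argument above:

```python
# Back-of-envelope file counts behind the "excessive number of files" point.
# All inputs are illustrative assumptions, not Netflix figures.

titles = 10_000              # assumed catalog size
streams_per_title = 10       # separate audio/video/subtitle streams: ~10x
duration_s = 2 * 3600        # an assumed ~2-hour title
chunk_s = 2                  # 2-second chunks -> 3600 chunks: ~3600x

single_file_objects = titles * streams_per_title
chunked_objects = single_file_objects * (duration_s // chunk_s)

print(single_file_objects)   # 100000 origin objects with byte ranges
print(chunked_objects)       # 360000000 objects when chunked by time
```

Even with these modest assumptions, chunking by time turns a hundred thousand origin objects into hundreds of millions.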

Furthermore, chunking like this forces all clients to make the same requests: if one client requests a 10s chunk and another a 5s chunk, then you cache 15s of data even though the 5s piece is contained within the 10s piece. This restriction reduces either cache efficiency or client performance.
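A toy model of that duplication (the 1 MB-per-second figure and the cache behavior are illustrative assumptions, not any particular CDN's implementation):

```python
# Toy model of the cache-duplication point: two clients cover the same
# media window, one with a single 10s chunk and one with a 5s chunk that
# is contained in it. With chunk URLs the cache keys differ, so the
# overlap is stored twice; a range-aware cache can share it.
# 1 MB per second of media is an invented figure for illustration.

BYTES_PER_S = 1_000_000

def cached_bytes_chunk_urls(requests):
    # Each distinct (start, length) chunk URL is a separate cache object.
    return sum(length * BYTES_PER_S for (_, length) in set(requests))

def cached_bytes_byte_ranges(requests):
    # A range-aware cache stores each media second at most once.
    seconds = set()
    for start, length in requests:
        seconds.update(range(start, start + length))
    return len(seconds) * BYTES_PER_S

reqs = [(0, 10), (0, 5)]               # a 10s chunk and a contained 5s chunk
print(cached_bytes_chunk_urls(reqs))   # 15000000: 15s of data cached
print(cached_bytes_byte_ranges(reqs))  # 10000000: overlap stored once
```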

We've discussed this at length with many of the major players (especially Microsoft, but also Apple) in MPEG. The "Basic on-demand" profile in MPEG DASH is based on the single-file-with-byte-ranges approach.

The reason chunking was chosen by Apple, Microsoft, etc. for live is that HTTP caching procedures do not play well with files that change size over time. A change in size can cause the whole file to be ejected from the cache: there is nothing in HTTP to indicate that the file merely grew and that all the earlier data in the file is still the same. There are certainly things you could do with a growing file, but you need to be careful that all your caches support it - and there's no spec for what "it" is.
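A minimal sketch of that failure mode, assuming a naive cache that treats any size change as a brand-new object (a toy model, not any specific cache's documented behavior):

```python
# Toy cache illustrating why a growing live file defeats standard HTTP
# caching: the cache validates a stored object by its length, and any
# change invalidates the whole entry, even though all the earlier bytes
# are unchanged at the origin.

class NaiveCache:
    def __init__(self):
        self.store = {}   # url -> (content_length, body)

    def fetch(self, url, origin):
        body = origin[url]
        cached = self.store.get(url)
        if cached and cached[0] == len(body):
            return cached[1], "HIT"
        # Size changed (or never seen): the whole object is refetched,
        # although only new bytes were appended at the origin.
        self.store[url] = (len(body), body)
        return body, "MISS"

origin = {"/live.webm": b"A" * 1000}
cache = NaiveCache()
print(cache.fetch("/live.webm", origin)[1])   # MISS (first request)
print(cache.fetch("/live.webm", origin)[1])   # HIT  (unchanged)
origin["/live.webm"] += b"B" * 100            # the live file grows
print(cache.fetch("/live.webm", origin)[1])   # MISS: whole entry refetched
```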

Another factor is that the big disadvantages of chunking don't really apply to live, where by definition there is a real-time, media-aware process generating the files and feeding them into the HTTP infrastructure.

...Mark

> 
>>>>> I'm sure you've seen the recent Netflix vs Comcast, Orange vs Cogent/MegaUpload,
>>>>> or Google vs French ISPs bs in the news recently: this is exactly what I'm talking
>>>>> about. Let's try to define a scheme that would maximize "natural caching" within
>>>>> a "dumb" HTTP-caching-aware network, with "streaming intelligence" happening on
>>>>> the end-user player and origin server sides only.
>>>> 
>>>> I agree that we should not rely on any intelligence in the network.
>>>> 
>>>> However, we also cannot expect intelligent servers. We have to deal
>>>> with what standard HTTP servers allow - at most we can assume byte
>>>> range request support. So, in essence, all intelligence needs to be in
>>>> the player. And for players to do "chunking", I cannot see a way
>>>> around byte range requests. If you do, please share.
>>> 
>>> Because if I follow correctly you are not considering actually having
>>> chunked files on the server, which is exactly how Microsoft/Adobe/Apple
>>> do it for adaptive bandwidth.  For some reason this group is seeing
>>> chunked files at the source as a huge problem.  Store chunked files
>>> upstream and you don't need byte ranges at all.
> 
> [Sylvia]
>> That's how I understood it to be about 6 months ago when we started
>> this discussion. However, many of the larger content providers,
>> including Netflix, have stated that they cannot do chunking simply
>> because it would explode their number of files. They provide
>> many tracks for a single video - including dubs in different languages
>> - and thus claim it is infeasible to have it work this way.
> 
> "Chunked file" != "multiple files". A chunked file may be a single physical file,
> internally re-organized (re-muxed) to expose chunks more easily (whether it's
> through byte-range requests or a special server extension/plugin). As of today,
> only Apple has the "multiple files" approach (multiple MPEG2-TS physical files +
> associated M3U8 manifests), but this is a "limitation" of their toolset, and I
> know server-side implementations that will take a regular MP4/H264/AAC file and
> chunk it "on the fly" from the iOS client perspective (Wowza, Dailymotion, ...).
> 
> Netflix has a slightly different issue: due to their activity (mass-market VOD,
> multiple subtitle tracks, audio tracks and video qualities per piece of content),
> they cannot actually produce all the different combinations in advance: it would
> be impractical, due to the number of files and the storage cost. But Netflix went
> a step further: they demuxed their videos and stored the audio, video and subtitle
> tracks separately on their origin servers; everything is downloaded and
> re-synchronized on the client side.
> 
> The same applies to Dailymotion: due to the large number of videos (tens of millions),
> it's not practical to remux into any chunked form or even split into multiple files
> (plus we would have to keep the original file anyway, for byte-ranging "naive" clients
> like smartphones or HTML5 video tags in browsers).
> 
> [Sylvia]
>> Maybe the answer is that for such use cases they will need an
>> intelligent piece of server software that pretends to the player that
>> it has chunks and makes them available as dynamically created from the
>> server.
> 
> Agreed. I don't think there is anything wrong with this approach, since we are talking
> about "enhancing" the first origin server, and nothing else along the path between this
> origin server and the player. It doesn't break the current internet architecture at all.
> If HTTP caching is leveraged every step of the way down to the player, these origins
> don't even have to be powerful (i.e. "on-the-fly" software remuxing is totally feasible,
> and we are the living proof of that ^_^).
> 
> [Thomas]
>>> I'm not saying it's the best way of doing things; I'm saying the market
>>> has already decided. The problems you see with storing lots of small
>>> chunks are already there, and what WebM chooses isn't going to solve that
>>> problem for CDNs.  In fact, it's what CDNs do best, and what they've
>>> asked for from the big vendors.
>>> We can choose to go the 'this is the best solution according to us' way,
>>> which will take us a few years and probably won't see any market uptake
>>> at all, or we can go the pragmatic 'this approach is guaranteed to work
>>> on the internet because it aligns with all the other technologies' way.
> 
> Agreed. We see it every day, and not only at CDNs, but also (and more and more) at ISPs!
> (And don't get me started on the net neutrality rant that got completely screwed up with
> the advent of unicast video ^_^.)
> 
> Regards,
> Pierre-Yves
> 
> _______________________________________________
> foms mailing list
> foms at lists.annodex.net
> http://lists.annodex.net/cgi-bin/mailman/listinfo/foms
> 
