[foms] WebM Manifest

Mark Watson watsonm at netflix.com
Mon May 9 09:34:57 PDT 2011


Pierre-Yves,

Interesting discussion. To be clear, I agree that a chunked mode is necessary for live, and clearly clients should not see much difference between live and on-demand, except that they should not request chunks "from the future" in the live case.

My point is that this is not sufficient for efficient large scale on-demand services. See comments below...

On May 7, 2011, at 1:40 AM, Pierre-Yves KEREMBELLEC wrote:

>>> Exactly. I don't know of any HTTP cache that deals properly with byte-ranges and
>>> partial caching (using for instance hollow files + bitmaps, like Thomas described).
>>> (This problem is not new, see http://bit.ly/ixdQwo for instance.) As pointed out by Thomas,
>>> Varnish may be able to achieve partial caching through the http_range_support directive
>>> (since 2.1.3), but it has to be proven stable.
>>> Unfortunately, this type of cache is more the exception than the norm today.
> 
>> At Netflix we make extensive use of byte ranges (allegedly 20% of US Internet traffic at peak times). This is well supported by the major CDNs who all support byte ranges and partial caching of large files.
> 
> Well, maybe major CDNs support byte-range caching properly (and even that seems to be handled specifically
> by some CDNs, see http://www.akamai.com/dl/feature_sheets/fs_lfdo.pdf for instance). Anyway, this is definitely
> not the case for most ISPs (transparent proxies) or enterprises today (we are reminded of that fact every day,
> unfortunately). Again, efficient byte-range caching is more the exception than the norm globally (Microsoft
> even recently filed a patent for that: http://www.faqs.org/patents/app/20100318632 ^_^).
> 
>> Lack of byte range support is not the reason chunking is used for live (more on that below). I absolutely agree
>> that solutions need to work with "dumb" HTTP infrastructure and for me this excludes special media-format-specific
>> capabilities on the origin servers more than it excludes byte ranges which are part of HTTP1.1.
> 
> I agree to disagree here: the first origin server may implement some dynamic chunking/fragmentation intelligence because
> it's under the content provider's control, and generally backed up by a first level of CDN proxies. It doesn't break the
> "dumb public internet network" rule (from any perspective but the origin's, the chunks are just simple separate documents
> with unique URLs).

CDNs often provide the origin servers. For us it is nice to be able to purchase all the real-time streaming services from a 3rd party, rather than having our own HTTP servers involved on a real-time basis.

There is an architectural issue here. Today, only a small fraction of media consumption takes place over the Internet. That will become a big fraction in time. To me this means that asynchronous one-to-many content delivery needs to become a first-class service of the "dumb public internet" - and this is happening through the increasing embedding of HTTP proxies in the form of CDNs and also deep into ISP networks.

Secondly, it's valuable to decouple the unit of storage from the unit of request - we seem to agree on this.

Thirdly, byte ranges naturally provide caches with a hint about what requests might come next (the next bytes in the same file), which allows pre-filling of caches to improve cache hit ratios.
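
To make that concrete, here's a minimal sketch (Python, with a hypothetical URL and an illustrative request size) of a client walking one file with consecutive byte-range requests; a cache that has served one range can speculatively pre-fill the next:

    import urllib.request

    URL = "http://example.com/movie.mp4"    # hypothetical URL
    CHUNK = 2 * 1024 * 1024                 # illustrative 2 MiB request size

    def fetch_range(url, start, length):
        # A plain HTTP/1.1 byte-range request (Range header); any
        # compliant server or cache answers with 206 Partial Content.
        req = urllib.request.Request(url)
        req.add_header("Range", "bytes=%d-%d" % (start, start + length - 1))
        with urllib.request.urlopen(req) as resp:
            return resp.read()

    # Consecutive ranges over a single URL: a cache that has just served
    # bytes=0-2097151 of movie.mp4 can reasonably pre-fetch
    # bytes=2097152-4194303 before the client asks for it.
    first = fetch_range(URL, 0 * CHUNK, CHUNK)
    second = fetch_range(URL, 1 * CHUNK, CHUNK)

Nothing media-specific is needed anywhere in that path - the origin, the caches and the client are all speaking generic HTTP1.1.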

It seems clear to me that these things should be done in an application-independent way and indeed the HTTP specification already supports this using byte ranges. Using chunking immediately introduces application-specific requirements at the origin, but these requirements quickly leak down to the caches because of the third point above.

ISPs and enterprises will have an increasing financial incentive to cache content. And indeed we are talking to some of them about this. I think it far more likely they will address this with support for the (now quite mature) HTTP1.1 specification than by embarking on an uncharted path where request/storage decoupling is done in an application-specific way.

If we are talking about standardizing adaptive streaming solutions we should focus on something which really scales in this sense.

> 
>> For us, chunking is completely infeasible due to the excessive number of files required (note that this is not so much
>> to do with having separate audio/video files - that causes just a 10x multiplier, but splitting by time, which is a ~3600x
>> multiplier). Storing on the origin as a large file and chunking based on requests is more of a restriction (in terms of CDN
>> capabilities required) than byte ranges. Byte range support is a generic capability which can find applications for many
>> services.
> 
> I totally understand your point, but dynamic chunking also works in this case: you may have a single MP4 (or MKV for
> that matter) with 4 video angles, 10 audio tracks and 10 subtitle tracks, and still be able to dynamically remux and deliver
> independent short segments for each individual track if needed (no byte-ranges involved). In an ideal world, neither MP4 nor
> MKV would be used for the wire container format anyway, because even with static or dynamic chunking, these containers are
> quite complicated to handle and do not represent a pure "stream" (contrary to MPEG2-TS or, to a certain degree, FLV, which are
> pure streams).

Maybe we have to agree to disagree here too: I find mp4 vastly easier to understand and parse than MPEG2-TS. A Transport Stream is not a "pure stream" - it's a multiplexing layer and extracting the Elementary Streams is not straightforward. There are also timing and conformance rules associated with Transport Streams which I don't think anyone would describe as simple.
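
For what it's worth, the top level of an mp4 file is just a sequence of length-prefixed boxes, so it can be indexed in a handful of lines. A rough sketch (the file name is hypothetical):

    import struct

    def list_boxes(path):
        # Walk the top-level boxes of an ISO BMFF (mp4) file: each box
        # starts with a 32-bit big-endian size and a 4-byte type, so the
        # file can be indexed without decoding any media.
        with open(path, "rb") as f:
            while True:
                header = f.read(8)
                if len(header) < 8:
                    break
                size, box_type = struct.unpack(">I4s", header)
                if size == 1:
                    # 64-bit "largesize" variant: size follows the type.
                    size = struct.unpack(">Q", f.read(8))[0]
                    skip = size - 16
                elif size == 0:
                    break                   # box runs to the end of file
                else:
                    skip = size - 8
                print(box_type.decode("ascii", "replace"), size)
                f.seek(skip, 1)             # jump to the next box

    list_boxes("movie.mp4")                 # hypothetical file name

Extracting an Elementary Stream from a Transport Stream, by contrast, means reassembling PES packets from interleaved 188-byte TS packets - there is no comparable shortcut.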

> 
>> Furthermore, chunking like this restricts all clients to make the same requests: if one client requests a 10s chunk and another
>> a 5s chunk then you cache 15s of data even if the 5s piece is contained within the 10s piece. This restriction reduces either
>> cache efficiency or client performance.
> 
> Absolutely, this is the whole point: encouraging all clients to make the exact same requests, to increase cacheability for
> all types of caches, even those not playing nice with byte-ranges (enterprises, ISPs, browsers, ...).

But this comes with a cost! Good adaptivity argues for small requests, but if all clients make small requests all the time you get high server and uplink load. So this argues for making larger requests when fast adaptivity is not required (e.g. when a client has plenty of buffered data). Forcing all clients to make the same requests means you cannot engineer this trade-off.
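
As a rough sketch of how a byte-range client could engineer that trade-off (every number below is an illustrative assumption, not a production value):

    # Small requests while the buffer is low (fast adaptation), larger
    # requests as the buffer fills (fewer requests, lower server load).
    MIN_REQUEST_S = 2       # seconds of media per request, low buffer
    MAX_REQUEST_S = 30      # seconds of media per request, full buffer
    BUFFER_TARGET_S = 60    # how much buffered media the client aims for

    def request_duration(buffered_s):
        fill = min(buffered_s / BUFFER_TARGET_S, 1.0)
        return MIN_REQUEST_S + fill * (MAX_REQUEST_S - MIN_REQUEST_S)

    print(request_duration(5))      # ~4.3s: adapt quickly
    print(request_duration(60))     # 30.0s: amortise request overhead

With byte ranges the client simply asks for more or fewer bytes of the same file; with fixed chunk URLs every client is stuck with the one duration the packager chose.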

> 
>> We've discussed this at length with many of the major players (especially Microsoft but also Apple) in MPEG. The "Basic on-demand"
>> profile in MPEG DASH is based on the single-file-with-byte-ranges approach. The reason chunking was chosen by Apple, Microsoft etc.
>> for live is that HTTP caching procedures do not play well with files that change size over time. A change in size is a change which
>> can cause the whole file to be ejected from cache. There is nothing in HTTP to indicate that the file just grew and so all the earlier
>> data in the file is still the same. There are certainly things you could do with a growing file, but you need to be careful that all
>> your caches support it - and there's no spec for what "it" is.
> 
> It seems all the major vendors (Microsoft, Apple, Adobe) are using fixed resource URLs for chunks (whether those chunks are
> pre-prepared or extracted dynamically), for live and on-demand. For instance, Microsoft's chunk URL format is something like
> (no byte-ranges involved):
> 
> http://video.foo.com/NBA.ism/QualityLevels(400000)/Fragments(video=610275114)
> http://video.foo.com/NBA.ism/QualityLevels(64000)/Fragments(audio=610275114)
> 
> The reasons for that are described in this document:
> 
> http://download.microsoft.com/download/4/2/4/4247C3AA-7105-4764-A8F9-321CB6C765EB/IIS_Smooth_Streaming_Technical_Overview.pdf

I always thought it convenient that you needed IIS servers to support the Micro$oft solution ;-) Anyway, MS are deeply involved in the DASH discussions and seem quite keen to support and migrate to that standard (though I don't speak for them of course).

> 
> Same for Adobe, and Apple is physically pre-splitting MPEG2-TS files (but this is a limitation of their tools; Wowza servers
> are doing this on the fly, for instance).

Yes, well, Move Networks were doing that many years ago too - it's the first thing that works - but that doesn't mean it's the most scalable solution.

> 
>> Also a factor is the fact that the big disadvantages of chunking don't really apply to live where by definition there is a real-time
>> media-aware process generating the files and feeding them into the HTTP infrastructure.
> 
> As Sylvia pointed out, I think both systems should be allowed to co-exist from a manifest/ABR technique point of view.

This I agree with.

> 
> Regards,
> Pierre-Yves