[foms] Proposal: adaptive streaming using open codecs

Tue Oct 26 11:41:39 PDT 2010

On Oct 26, 2010, at 3:41 AM, Jeroen Wijering wrote:

> 
> On Oct 21, 2010, at 9:22 PM, Christopher Blizzard wrote:
> 
>>>>>>> Again, the proposal from Christopher on providing a "Manifest API" (basically a playlist of chunks) plus having some QOS metrics (bandwidth, framedrops) would already allow developers to build adaptive streaming on the javascript level. Far easier for a first implementation. I guess we swiftly need a proposal for the "Manifest API".
>>>>>> Note that one of Philip's suggestions (maybe not on the list? I can't remember.) was that we do the API before we do the manifest work.  This would allow us to iterate, test and figure out what worked before figuring out what we needed in the manifest.
>>>>> Yes, that was Philip's proposal as well. Makes a lot of sense.
>>>>> 
>>>>> - Jeroen
>>>> 
>>>> Also would allow us to test out switching algorithms that we might want
>>>> to include in browsers by default.  And (*gasp*!) specify them.
>>>> 
>>>> --Chris
>>> 
>>> I support this message :)
>>> 
>>> In some way or another, we need to achieve gapless playback. These are the options I know of so far:
>>> 
>>> 1. A concatenation API (maybe Stream) to form a single stream from multiple URLs. This would basically be a byte concatenation API, and assumes that we either have the chunks be plain slices or that we support chained Ogg/WebM gaplessly. It has some similarity to a Manifest API in that it lists several URLs. The difference may be that the video element isn't aware of the multiple resources; that's all hidden in the URL, effectively made part of the network layer of the browser.
>>> 
>> 
>> Basically an API that says "Play this chunk of video next"?  I think that's what I've pushed for, but it's a decent amount of work.  I'm not sure what the rules are for that esp. wrt sound sync.  Also I don't think it has to be byte-concatenation if we have decent support for moving from one video to the next on a frame-by-frame basis.
> 
> I have added a small section on this to the proposal I drafted. I also posted it up on the WhatWG wiki:
> 
> http://wiki.whatwg.org/wiki/Adaptive_Streaming#API_adaptive_streaming
> 
> Please feel free to add/edit/remove as you see fit. There are still a lot of wrong statements in there, as well as omissions of feedback or alternatives. I added in a bunch based on the emails over the last week, but some sections (particularly around chaining/chunking and to-rangerequest-or-not) are still very weak. 
> 
> On the audio concatenation: can the suggestion that Monty put forward in the workshop (making up additional sound data in Vorbis e.g. for a crossfade) also be used for other codecs? Or is this something that can only be done in Vorbis? 
> 
> Chris' idea on the video concatenation sounds good - this can be on a frame-by-frame basis. I presume one can then still use only one decoding pipe? Or is that an issue then?

I think this kind of API is a great idea - the adaptation decisions are a key area for experimentation and innovation and if you can do that in Javascript it would be great.

But I think you need to drive it based on what is happening on the network. Otherwise, how do I know how many chunks to "append"? If I append too many and network conditions change, then I could stall. If I append too few, then again I could stall.

Instead, the Javascript code could get called back each time a chunk has been downloaded. This is the point at which you want to decide whether and what chunk to request next. In the simplest case the information you need for this decision is
(a) current buffer level (i.e. amount of received but not played media, in playout time)
(b) recently observed incoming bandwidth
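
As a rough sketch of what such a callback could decide with only (a) and (b), assuming a purely hypothetical interface (none of the names, parameters or thresholds below exist in any browser API; they are made up for illustration):

```typescript
// Hypothetical data handed to the script each time a chunk has been downloaded.
interface ChunkDownloadedInfo {
  bufferLevelSec: number;     // (a) received but not yet played media, in playout time
  recentBandwidthBps: number; // (b) recently observed incoming bandwidth, in bits/s
}

// Either hold off, or request the next chunk at a given bitrate.
type Decision =
  | { action: "wait" }
  | { action: "request"; bitrateBps: number };

// Assumed policy, for illustration only: stop requesting once the buffer is
// full enough; otherwise pick the highest bitrate that fits within a fraction
// (safetyFactor) of the measured bandwidth, falling back to the lowest one.
function onChunkDownloaded(
  info: ChunkDownloadedInfo,
  bitratesBps: number[],
  targetBufferSec: number,
  safetyFactor: number
): Decision {
  if (bitratesBps.length === 0) {
    throw new Error("no bitrates available");
  }
  if (info.bufferLevelSec >= targetBufferSec) {
    return { action: "wait" };
  }
  const budgetBps = info.recentBandwidthBps * safetyFactor;
  let chosen = Math.min(...bitratesBps); // fallback: lowest available bitrate
  for (const b of bitratesBps) {
    if (b <= budgetBps && b > chosen) {
      chosen = b;
    }
  }
  return { action: "request", bitrateBps: chosen };
}
```

With a 30 s buffer target and a safety factor of 0.8, for example, a script seeing 5 s of buffer and 2 Mbit/s of bandwidth would request the highest encoding at or below 1.6 Mbit/s.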

However, you might find that with this limited information there are not many adaptation algorithms you can actually build and so not much scope for experimentation. If you want to do more, what you need is more information about the observed network conditions. For example, a trace of the number of bytes received in each 1s (or 100ms) interval since the last callback. The Javascript can then choose its own bandwidth measures/filters/heuristics etc.
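
To illustrate what a script might do with such a trace, here is a minimal sketch of two possible bandwidth measures. The `ByteTrace` shape and the function names are assumptions made up for this example, and the two filters (a plain mean and an exponentially weighted moving average) are just common choices, not a recommendation:

```typescript
// Hypothetical trace: bytes received in each fixed-length interval
// since the last callback, oldest interval first.
interface ByteTrace {
  intervalSec: number;        // e.g. 1 or 0.1
  bytesPerInterval: number[];
}

function toBitsPerSecond(bytes: number, seconds: number): number {
  return (bytes * 8) / seconds;
}

// Plain average bandwidth over the whole trace (0 for an empty trace).
function meanBandwidthBps(trace: ByteTrace): number {
  const n = trace.bytesPerInterval.length;
  if (n === 0) {
    return 0;
  }
  let totalBytes = 0;
  for (const b of trace.bytesPerInterval) {
    totalBytes += b;
  }
  return toBitsPerSecond(totalBytes, trace.intervalSec * n);
}

// Exponentially weighted moving average: alpha in (0, 1], where a larger
// alpha gives more weight to the most recent intervals.
function ewmaBandwidthBps(trace: ByteTrace, alpha: number): number {
  let estimate: number | undefined = undefined;
  for (const b of trace.bytesPerInterval) {
    const sample = toBitsPerSecond(b, trace.intervalSec);
    estimate = estimate === undefined
      ? sample
      : alpha * sample + (1 - alpha) * estimate;
  }
  return estimate === undefined ? 0 : estimate;
}
```

The point of exposing the raw trace is that the choice between these (or any other filter) stays in the script rather than being fixed by the player.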

There could also be an "intermediate" version of this API in which the player *does* know about manifests etc. and is just asking the Javascript to choose one of the available bitrates for the next request. This way the player manages everything related to determining supported codecs, file formats, scheduling of requests onto TCP connections etc. In this case the Javascript needs to be told the choices including some notion of the available bitrates (which needs to be some kind of peak measure - average is not very useful).
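
A sketch of how small the script's job becomes in this "intermediate" version, again with entirely hypothetical names: the player is assumed to hand over a list of choices, each labelled with a peak bitrate, and the script just returns one of them.

```typescript
// Hypothetical description of one available encoding, as supplied by the
// player. The id is an opaque label; the bitrate is a peak measure.
interface BitrateChoice {
  id: string;
  peakBitrateBps: number;
}

// Pick the choice with the highest peak bitrate that does not exceed the
// script's own bandwidth estimate. If none fits, fall back to the choice
// with the lowest peak bitrate. Returns undefined only for an empty list.
function chooseBitrate(
  choices: BitrateChoice[],
  estimatedBandwidthBps: number
): BitrateChoice | undefined {
  let best: BitrateChoice | undefined = undefined;
  let lowest: BitrateChoice | undefined = undefined;
  for (const c of choices) {
    if (lowest === undefined || c.peakBitrateBps < lowest.peakBitrateBps) {
      lowest = c;
    }
    if (
      c.peakBitrateBps <= estimatedBandwidthBps &&
      (best === undefined || c.peakBitrateBps > best.peakBitrateBps)
    ) {
      best = c;
    }
  }
  return best !== undefined ? best : lowest;
}
```

Everything else (codecs, file formats, scheduling of requests) would stay inside the player under this assumption.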

It would be really great if the whole thing could run independently for audio and video. They can be completely decoupled for streaming and synchronized at the renderer.

Best,

Mark 

> 
> Kind regards,
> 
> Jeroen
> _______________________________________________
> foms mailing list
> foms at lists.annodex.net
> http://lists.annodex.net/cgi-bin/mailman/listinfo/foms
>