[foms] Proposal: adaptive streaming using open codecs

Sun Oct 31 14:07:15 PDT 2010

Hello all,

On Oct 27, 2010, at 6:13 PM, Mark Watson wrote:

>>> Instead, the Javascript code could get called back each time a chunk has been downloaded. This is the point at which you want to decide whether and what chunk to request next. In the simplest case the information you need for this decision is
>>> (a) current buffer level (i.e. amount of received but not played media, in playout time)
>> 
>> Yes indeed, thanks! That's a total miss. There needs to be a getter for the amount of video in the buffer, otherwise you don't know what's going on. I'll add that (when the WhatWG database is back up ;).
>> 
>>> (b) recently observed incoming bandwidth
>> 
>> That is part of the QOS metrics section. The "append" API does indeed depend on that to be useful.
>> 
>>> However, you might find that with this limited information there are not many adaptation algorithms you can actually build and so not much scope for experimentation. If you want to do more what you need is more information about the observed network conditions. For example a trace of number of bytes received in each 1s (or 100ms) interval since the last callback. The Javascript can then choose its own bandwidth measures/filters/heuristics etc.
>> 
>> The QOS section defines a getter for bandwidth (bytes received per second), frames dropped, height and width. A typical Javascript-based heuristic would poll these metrics and store them in an array. With that data, you could do all kinds of things, e.g. averaging values out, building threshold filters or periodically blacklisting levels.
>> 
>> Do you have ideas for additional metrics? Is Netflix considering other data for its heuristics besides these?
> 
> We have some fairly advanced heuristics which have evolved over the years based on large scale testing and which are still evolving. We are still experimenting ;-) 
> 
> We proposed a work item on QoS metrics into DASH - depending on your timeframe we could align. In DASH it's just about defining what information to collect, not how to report it (IMO that would be done by the Javascript app).
> 
> One point is that "bandwidth" is not well defined unless you state the period over which it was measured, or define some kind of exponentially weighted moving average with some time period. If you provide the actual number of bytes received since the last call, then it's precisely clear what this means. My only concern then would be the efficiency of calling this function very frequently. Perhaps the app should be able to specify a measurement period and then query less frequently, getting back an array of the bytes received in all the measurement periods since the last call?

Hmm, I don't think that's too worrisome. We check this value once per second with our players, storing the _bytesReceived_ value in an array and using an average over the last 5 seconds to determine the "bandwidth". Pretty simple, but it works fine. Anyway, we talked about this topic in another thread, and I agree with you and Andy that a more raw _bytesReceived_ value is preferred over something like "dataRate" or "bandwidth" or "bufferToPlaybackRate".
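
To make that polling approach concrete, here is a minimal sketch in TypeScript. It assumes a cumulative byte counter that the script reads about once per second. The class name BandwidthEstimator, its methods and the 5000 ms default are made up for illustration and are not part of any proposed API.

```typescript
// Hypothetical helper; none of these names come from a spec or a browser.
interface ByteSample {
  timeMs: number;        // when the counter was read
  bytesReceived: number; // cumulative bytes received so far
}

class BandwidthEstimator {
  private samples: ByteSample[] = [];
  private windowMs: number;

  constructor(windowMs: number = 5000) {
    this.windowMs = windowMs;
  }

  // Record the cumulative counter and drop samples older than the window.
  addSample(timeMs: number, bytesReceived: number): void {
    this.samples.push({ timeMs, bytesReceived });
    const cutoff = timeMs - this.windowMs;
    this.samples = this.samples.filter((s) => s.timeMs >= cutoff);
  }

  // Average bytes per second over the retained window,
  // or null when there is not enough data yet.
  bytesPerSecond(): number | null {
    if (this.samples.length < 2) return null;
    const first = this.samples[0];
    const last = this.samples[this.samples.length - 1];
    if (first === undefined || last === undefined) return null;
    const elapsedMs = last.timeMs - first.timeMs;
    if (elapsedMs <= 0) return null;
    return ((last.bytesReceived - first.bytesReceived) * 1000) / elapsedMs;
  }
}
```

A player script would call addSample() from a one-second timer and read bytesPerSecond() whenever it has to pick the next chunk.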

> A technical point is that you need to be careful if the player has some limited buffer size. If it does then it may stop reading from the TCP socket, causing the receive buffer to fill and the data transfer to stop. Now you are no longer measuring what the network can give you. You should be careful only to use bandwidth/bytes received information for time periods where the player was really trying to read from the socket.

Yes, that's indeed the bigger issue. You want to know when the user-agent is trying to fetch the bytes (to calculate BW) and when it is done fetching. 

Perhaps events are the solution here. If the "videoElement.appendVideo()" function also sent out the regular videoElement events (loadstart, progress, loadeddata), the bandwidth could be calculated easily.
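
As a rough sketch of that idea, the TypeScript below counts bytes and time only between the start and the end of a fetch, so idle periods (for example while the buffer is full) do not drag the estimate down. It assumes loadstart marks the start of a fetch, loadeddata marks its end, and each progress record carries the bytes received since the previous event. Those are assumptions for illustration only, not how browsers fire these events today, and the FetchEvent type and activeBandwidth function are hypothetical.

```typescript
// Hypothetical event records. Using these events as fetch-start and
// fetch-end markers is an assumption made for this sketch.
type FetchEvent =
  | { type: "loadstart"; timeMs: number }
  | { type: "progress"; timeMs: number; bytes: number }
  | { type: "loadeddata"; timeMs: number };

// Bytes per second measured only while a fetch was in flight,
// or null if no fetch has completed.
function activeBandwidth(events: FetchEvent[]): number | null {
  let activeMs = 0;
  let totalBytes = 0;
  let pendingBytes = 0;
  let startedAt: number | null = null;

  for (const ev of events) {
    if (ev.type === "loadstart") {
      startedAt = ev.timeMs;
      pendingBytes = 0;
    } else if (ev.type === "progress") {
      // Ignore progress records that arrive outside a fetch.
      if (startedAt !== null) pendingBytes += ev.bytes;
    } else if (startedAt !== null) {
      // loadeddata: commit the bytes and the time of the finished fetch.
      activeMs += ev.timeMs - startedAt;
      totalBytes += pendingBytes;
      startedAt = null;
    }
  }
  return activeMs > 0 ? (totalBytes * 1000) / activeMs : null;
}
```

Dividing the same bytes by wall-clock time instead would understate what the network can deliver whenever the player paused between fetches.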

I don't know the exact status of which events are sent by which browser though. It's a little hard to follow in this area. Here's a rough list we made 6 months back (that excludes Opera). Perhaps it's time to update it (including Opera) and see whether all the events we need are already in there:

http://developer.longtailvideo.com/trac/wiki/HTML5BrowserCompatability

> The other thing that affects rate change heuristics is how you define the data rate of each version of the content. It's not straightforward for VBR content. What might be interesting would be for the API to allow the player to report for each version of the content a data rate sufficient to play smoothly for the next X minutes, for some X specified by the app. A simple implementation always reports the same pre-calculated value from the manifest (some kind of max over the content), but you could do something more advanced in future.

Yups. Something simple like 1-min averages usually works fine.
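
One possible reading of "1-min averages" combined with "some kind of max over the content" is the peak of a sliding one-minute average, sketched below in TypeScript. The Segment shape and the peakWindowedRate function are hypothetical, and the sketch assumes the per-segment sizes and durations are known up front (for instance from a manifest).

```typescript
// Hypothetical description of one media segment of a given stream version.
interface Segment {
  bytes: number;
  durationSec: number;
}

// Highest average rate (bits per second) over any window of at least
// `windowSec` seconds that starts on a segment boundary. If the content
// is shorter than the window, the average over the whole content is used.
function peakWindowedRate(segments: Segment[], windowSec: number = 60): number {
  let peak = 0;
  for (let i = 0; i < segments.length; i++) {
    let bytes = 0;
    let duration = 0;
    for (const seg of segments.slice(i)) {
      if (duration >= windowSec) break;
      bytes += seg.bytes;
      duration += seg.durationSec;
    }
    // Skip trailing windows that are shorter than windowSec,
    // except the window starting at the first segment.
    const fullWindow = duration >= windowSec;
    if (duration > 0 && (fullWindow || i === 0)) {
      peak = Math.max(peak, (bytes * 8) / duration);
    }
  }
  return peak;
}
```

For VBR content this value sits above the overall average, which is the point: it is closer to the rate needed to play the busiest minute smoothly.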

> Frames dropped, height and width are different kinds of information. Useful, but different.

Yes, very useful. Especially in a country like NL (where everybody is on 5 Mbps+), screen size and decoding power usually determine which quality level to choose.

> What is important for "Quality of Experience" reporting is to know exactly what was displayed when. For example a set of tuples ( start real time, start media time, duration, stream version ), one for each period of continuous uninterrupted playout of a single version of the stream. If you have this and the timestamps of the user interactions (play, pause, seek, stop) then you can reconstruct exactly what the user saw: number of stalls, frequency of rate switches, average quality, startup time etc. (this is what we're proposing to DASH). You may effectively have this if you have accurately timed events for playout start and stop and if the start event fires on rate switches too (I mean when the new rate starts rendering, not when it's requested).

You're talking about reporting this back to the service/server, correct? So the service provider can check, tweak and fix delivery issues. Interesting - is DASH aiming to standardize this as well?
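
To check my understanding of the tuples, here is a small TypeScript sketch that derives startup time, stalls and rate switches from such a list. The field names, the summarizePlayout function and the 50 ms tolerance are all made up for illustration. It also assumes the user never pauses or seeks, so every gap in wall-clock time between two periods counts as a stall. A real report would have to subtract the user interactions first.

```typescript
// One period of continuous playout of a single stream version
// (hypothetical field names for the tuple described above).
interface PlayoutPeriod {
  startRealMs: number;  // wall-clock time at which rendering started
  startMediaMs: number; // media time at which rendering started
  durationMs: number;
  version: string;
}

interface QoeSummary {
  startupMs: number | null;
  stalls: number;
  stalledMs: number;
  switches: number;
}

// Assumes periods are sorted by startRealMs and no pause/seek occurred.
function summarizePlayout(
  playRequestedMs: number,
  periods: PlayoutPeriod[],
  toleranceMs: number = 50
): QoeSummary {
  let stalls = 0;
  let stalledMs = 0;
  let switches = 0;
  let prev: PlayoutPeriod | null = null;

  for (const p of periods) {
    if (prev !== null) {
      // Wall-clock gap between the end of the previous period and this one.
      const gap = p.startRealMs - (prev.startRealMs + prev.durationMs);
      if (gap > toleranceMs) {
        stalls += 1;
        stalledMs += gap;
      }
      if (p.version !== prev.version) switches += 1;
    }
    prev = p;
  }

  const first = periods[0];
  const startupMs = first === undefined ? null : first.startRealMs - playRequestedMs;
  return { startupMs, stalls, stalledMs, switches };
}
```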

>>> There could also be an "intermediate" version of this API in which the player *does* know about manifests etc. and is just asking the Javascript to choose one of the available bitrates for the next request. This way the player manages everything related to determining supported codecs, file formats, scheduling of requests onto TCP connections etc. In this case the Javascript needs to be told the choices including some notion of the available bitrates (which needs to be some kind of peak measure - average is not very useful).
>> 
>> That would be nice as well indeed, but I'm afraid too much work for browser vendors as a first try (right?). Especially given the uncertainty around which manifest format to use.
>> 
>>> It would be really great if the whole thing could run independently for audio and video. They can be completely decoupled for streaming and synchronized at the renderer.
>> 
>> I'd imagine both audioElement and videoElement have this "append" call. 
>> 
>> I did some quick tests trying to keep a video and an audio element in sync (for closed audio descriptions). You have to pay attention around buffering, but once you have sync it works great.
> 
> By "closed audio descriptions" do you mean additional audio streams for the visually impaired? How do you handle those? As a separate audio stream in addition to/instead of the original one, or embedded somehow?

Separate audio streams, in addition to the original one. Here's the reference of our plugin for this:

http://www.longtailvideo.com/support/addons/audio-description/15136/audio-description-reference-guide

Kind regards,

Jeroen