[foms] Proposal: adaptive streaming using open codecs
watsonm at netflix.com
Wed Oct 27 09:13:23 PDT 2010
On Oct 27, 2010, at 3:35 AM, Jeroen Wijering wrote:
Hey Mark, all,
On Oct 26, 2010, at 8:41 PM, Mark Watson wrote:
(a) current buffer level (i.e. amount of received but not played media, in playout time)
Yes indeed, thanks! That's a total miss. There needs to be a getter for the amount of video in the buffer, otherwise you don't know what's going on. I'll add that (when the WhatWG database is back up ;).
(b) recently observed incoming bandwidth
That is part of the QOS metrics section. The "append" API is indeed depending upon that to be useful.
You have ideas for additional metrics? Is Netflix considering other data for its heuristics beside these?
We have some fairly advanced heuristics which have evolved over the years based on large scale testing and which are still evolving. We are still experimenting ;-)
One point is that "bandwidth" is not well defined unless you say the period over which is was measured, or define some kind of exponentially weighted moving average with some time period. If you provide the actual number of bytes received since the last call, then its precisely clear what this means. My only concern then would be the efficiency of calling this function very frequently. Perhaps the app should be able to specify a measurement period and then query less frequently, getting back an array of the bytes received in all the measurement periods since the last call ?
A technical point is that you need to be careful if the player has some limited buffer size. If it does then it may stop reading from the TCP socket, causing the receive buffer to fill and the data transfer to stop. Now you are no longer measuring what the network can give you. You should be careful only to use bandwidth/bytes received information for time periods where the player was really trying to read from the socket.
The other thing that affects rate change heuristics is how you define the data rate of each version of the content. It's not straightforward for VBR content. What might be interesting would be for the API to allow the player to report for each version of the content a data rate sufficient to play smoothly for the next X minutes, for some X specified by the app. A simple implementation always reports the same pre-calculated value from the manifest (some kind of max over the content), but you could do something more advanced in future.
Frames dropped, height and width are different kinds of information. Useful, but different.
What is important for "Quality of Experience" reporting is to know exactly what was displayed when. For example a set of tuples ( start real time, start media time, duration, stream version ), one for each period of continuous uninterrupted playout of a single version of the stream. If you have this and the timestamps of the user interactions (play, pause, seek, stop) then you can reconstruct exactly what the user saw: number of stalls, frequency of rate switches, average quality, startup time etc. (this is what we're proposing to DASH). You may effectively have this if you have accurately timed events for playout start and stop and if the start event fires on rate switches too (I mean when the new rate starts rendering, not when it's requested).
That would be nice as well indeed, but I'm afraid too much work for browser vendors as a first try (right?). Especially given the uncertainty around which manifest format to use.
It would be really great if the whole thing could run independently for audio and video. They can be completely decoupled for streaming and synchronized at the renderer.
I'd imagine both audioElement and videoElement have this "append" call.
I did some quick tests with trying to keep a video and an audio in sync (for closed audiodescriptions). You have to pay attention around buffering, but once you have sync it works great.
By "closed audio descriptions" do you mean additional audio streams for the visually impaired ? How do you handle those ? As a separate audio stream in addition to/instead of the original one, or embedded somehow ?
Or did you mean captions/subtitles ?
In time, it'd be awesome if one could feed a videoElement e.g. a full DASH manifest with separate video, audio and text tracks. But that'll probably be a couple of iterations away...
foms mailing list
foms at lists.annodex.net<mailto:foms at lists.annodex.net>
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the foms