[foms] Proposal: adaptive streaming using open codecs
jeroen at longtailvideo.com
Sun Oct 31 14:07:15 PDT 2010
On Oct 27, 2010, at 6:13 PM, Mark Watson wrote:
>>> (a) current buffer level (i.e. amount of received but not played media, in playout time)
>> Yes indeed, thanks! That's a total miss. There needs to be a getter for the amount of video in the buffer, otherwise you don't know what's going on. I'll add that (when the WhatWG database is back up ;).
>>> (b) recently observed incoming bandwidth
>> That is part of the QoS metrics section. The "append" API does indeed depend on that to be useful.
>> Do you have ideas for additional metrics? Is Netflix considering other data for its heuristics besides these?
> We have some fairly advanced heuristics which have evolved over the years based on large scale testing and which are still evolving. We are still experimenting ;-)
> One point is that "bandwidth" is not well defined unless you say the period over which it was measured, or define some kind of exponentially weighted moving average with some time period. If you provide the actual number of bytes received since the last call, then it's precisely clear what this means. My only concern then would be the efficiency of calling this function very frequently. Perhaps the app should be able to specify a measurement period and then query less frequently, getting back an array of the bytes received in all the measurement periods since the last call?
Hmm, I don't think that's too worrisome. We check this value once per second with our players, storing the _bytesReceived_ value in an array and using an average over the last 5 seconds to determine the "bandwidth". Pretty simple, but it works fine. Anyway, we talked about this topic in another thread, and I agree with you and Andy that a more raw _bytesReceived_ value is preferred over something like "dataRate" or "bandwidth" or "bufferToPlaybackRate".
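To make the polling approach concrete, here is a minimal sketch of the once-per-second, 5-second average. It assumes the player exposes a cumulative byte counter; the class and method names are made up for illustration and are not part of any proposed API.

```python
from collections import deque


class BandwidthEstimator:
    """Average throughput over the last `window` polls of a byte counter.

    Assumption: `bytes_received` is cumulative (it only ever grows).
    """

    def __init__(self, window=5):
        # window + 1 readings are needed to span `window` intervals.
        self.samples = deque(maxlen=window + 1)

    def poll(self, timestamp, bytes_received):
        """Record one reading: time in seconds, cumulative bytes so far."""
        self.samples.append((timestamp, bytes_received))

    def bytes_per_second(self):
        """Return the windowed average, or None with fewer than 2 readings."""
        if len(self.samples) < 2:
            return None
        (t0, b0), (t1, b1) = self.samples[0], self.samples[-1]
        if t1 <= t0:
            return None
        return (b1 - b0) / (t1 - t0)
```

Calling poll() once per second and reading bytes_per_second() gives the same kind of 5-second average described above. Older readings fall out of the deque automatically.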
> A technical point is that you need to be careful if the player has some limited buffer size. If it does then it may stop reading from the TCP socket, causing the receive buffer to fill and the data transfer to stop. Now you are no longer measuring what the network can give you. You should be careful only to use bandwidth/bytes received information for time periods where the player was really trying to read from the socket.
Yes, that's indeed the bigger issue. You want to know when the user-agent is trying to fetch the bytes (to calculate BW) and when it is done fetching.
Perhaps events are the solution here. If the "videoElement.appendVideo()" function also sent out the regular videoElement events (loadstart, progress, loadeddata), the bandwidth could be easily calculated.
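As a rough sketch of that idea: count bytes and time only between the start and the end of each fetch, so idle periods (full buffer, socket not being read) do not drag the number down. The handler names and the assumption that each event carries a timestamp and a per-fetch cumulative byte count are hypothetical; real browser events may not provide both.

```python
class ActiveTransferMeter:
    """Throughput measured only while a fetch is actually in progress."""

    def __init__(self):
        self.active_since = None   # time of the last event in the current fetch
        self.last_loaded = 0       # bytes loaded so far in the current fetch
        self.active_seconds = 0.0  # total time spent fetching
        self.active_bytes = 0      # total bytes received while fetching

    def on_loadstart(self, now):
        """A fetch begins (hypothetical handler for 'loadstart')."""
        self.active_since = now
        self.last_loaded = 0

    def on_progress(self, now, loaded):
        """More bytes arrived; `loaded` is cumulative for this fetch."""
        if self.active_since is None:
            return  # ignore progress outside of a fetch
        self.active_bytes += loaded - self.last_loaded
        self.active_seconds += now - self.active_since
        self.active_since = now
        self.last_loaded = loaded

    def on_fetch_done(self, now, loaded):
        """The fetch finished (hypothetical handler, e.g. for 'loadeddata')."""
        self.on_progress(now, loaded)
        self.active_since = None

    def bytes_per_second(self):
        if self.active_seconds <= 0:
            return None
        return self.active_bytes / self.active_seconds
```

With two fetches of 2 seconds each and a long pause in between, this reports the rate over the 4 active seconds only, not over the whole wall-clock span.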
I don't know the exact status of which events are sent by which browser though. It's a little hard to follow in this area. Here's a rough list we made 6 months back (that excludes Opera). Perhaps it's time to update it (including Opera) and see whether all the events we need are already in there:
> The other thing that affects rate change heuristics is how you define the data rate of each version of the content. It's not straightforward for VBR content. What might be interesting would be for the API to allow the player to report for each version of the content a data rate sufficient to play smoothly for the next X minutes, for some X specified by the app. A simple implementation always reports the same pre-calculated value from the manifest (some kind of max over the content), but you could do something more advanced in future.
Yups. Something simple like 1-min averages usually works fine.
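For illustration, a 1-minute average for a VBR version could be computed from per-segment sizes as below. This is only a sketch: it assumes fixed-duration segments whose byte sizes are known up front (for example from a manifest), and the function name is invented.

```python
def average_rate_ahead(segment_bytes, segment_seconds, position_seconds,
                       window_seconds=60.0):
    """Average bit rate (bits/s) of the segments in the next window.

    Assumptions: all segments last `segment_seconds`, and `segment_bytes`
    lists their sizes in playback order.
    Returns None when the position is past the end of the content.
    """
    first = int(position_seconds // segment_seconds)
    count = max(1, round(window_seconds / segment_seconds))
    upcoming = segment_bytes[first:first + count]
    if not upcoming:
        return None
    # Near the end of the content the window is shorter than requested.
    return sum(upcoming) * 8 / (len(upcoming) * segment_seconds)
```

The player would compare this value for each version against the measured throughput. A plain average is simpler than a true "sufficient to play smoothly" rate, which would also have to account for the buffer level.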
> Frames dropped, height and width are different kinds of information. Useful, but different.
Yes, very useful. Especially in a country like NL (where everybody is on 5 Mbps+), screen size and decoding power usually define which quality level to choose.
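A simple way to combine these inputs, sketched under the assumption that each quality level is described by a bit rate and a frame width (the tuple layout and function name are illustrative only):

```python
def pick_level(levels, bandwidth_bps, display_width):
    """Pick the highest-bit-rate level that fits both bandwidth and display.

    `levels` is a non-empty list of (bitrate_bps, width) tuples.
    Falls back to the lowest-bit-rate level when nothing fits.
    """
    fitting = [
        level for level in levels
        if level[0] <= bandwidth_bps and level[1] <= display_width
    ]
    if not fitting:
        return min(levels, key=lambda level: level[0])
    return max(fitting, key=lambda level: level[0])
```

Decoding power is not modelled here. A dropped-frames counter could be used in the same way, by excluding levels that recently dropped too many frames.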
> What is important for "Quality of Experience" reporting is to know exactly what was displayed when. For example a set of tuples ( start real time, start media time, duration, stream version ), one for each period of continuous uninterrupted playout of a single version of the stream. If you have this and the timestamps of the user interactions (play, pause, seek, stop) then you can reconstruct exactly what the user saw: number of stalls, frequency of rate switches, average quality, startup time etc. (this is what we're proposing to DASH). You may effectively have this if you have accurately timed events for playout start and stop and if the start event fires on rate switches too (I mean when the new rate starts rendering, not when it's requested).
You're talking about reporting this back to the service/server, correct? So the service provider can check, tweak and fix delivery issues. Interesting as well - is DASH aiming to standardize this as well?
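To check my understanding of the tuples, here is a small sketch of how a service could summarize them. The field names follow your description. The bit rate table, the gap tolerance, and the assumption that durations are in seconds at normal playback speed are mine.

```python
from collections import namedtuple

# One period of continuous, uninterrupted playout of a single stream version.
Period = namedtuple("Period", "start_real start_media duration version")


def summarize(periods, bitrates, tolerance=0.5):
    """Summarize playout periods.

    `bitrates` maps a version name to its bit rate (an assumed lookup table).
    An "interruption" is a real-time gap between consecutive periods; it may
    be a stall or a user pause, so it still has to be matched against the
    timestamps of the user interactions.
    """
    ordered = sorted(periods, key=lambda p: p.start_real)
    pairs = list(zip(ordered, ordered[1:]))
    switches = sum(1 for a, b in pairs if a.version != b.version)
    interruptions = sum(
        1 for a, b in pairs
        if b.start_real - (a.start_real + a.duration) > tolerance
    )
    played = sum(p.duration for p in ordered)
    average = None
    if played > 0:
        # Time-weighted average bit rate over everything that was shown.
        average = sum(p.duration * bitrates[p.version] for p in ordered) / played
    return {
        "switches": switches,
        "interruptions": interruptions,
        "played_seconds": played,
        "average_bitrate": average,
    }
```

Startup time would additionally need the timestamp of the initial play request, which is not part of the tuples.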
>> That would be nice as well indeed, but I'm afraid too much work for browser vendors as a first try (right?). Especially given the uncertainty around which manifest format to use.
>>> It would be really great if the whole thing could run independently for audio and video. They can be completely decoupled for streaming and synchronized at the renderer.
>> I'd imagine both audioElement and videoElement have this "append" call.
>> I did some quick tests trying to keep a video and an audio track in sync (for closed audio descriptions). You have to pay attention around buffering, but once you have sync it works great.
> By "closed audio descriptions" do you mean additional audio streams for the visually impaired? How do you handle those? As a separate audio stream in addition to/instead of the original one, or embedded somehow?
Separate audio streams, in addition to the original one. Here's the reference of our plugin for this: