[foms] What data is needed for adaptive stream switching?

Mon Nov 22 14:30:46 PST 2010

What data does the browser need to expose to JavaScript to enable 
clients to do adaptive stream switching?

We need to detect the following situations:
(1) The network is not fast enough to play the current stream in real time.
(2) The network is fast enough to play the current stream, but is not 
fast enough to play a higher quality stream.
(3) The network is fast enough to play a higher quality stream in real time.
(4) The machine is not powerful enough to play the current stream smoothly.
(5) The machine is powerful enough to play the current stream smoothly, 
but not powerful enough to play a higher quality stream smoothly.
(6) The machine is powerful enough to play a higher quality stream smoothly.
(7) Any others?

Browsers can expose counters for frames demuxed, decoded and painted to 
enable detection of cases 4-6.

Or we could also expose an attribute which simply reports whether the 
browser is dropping frames, and so you can use this to determine whether 
you need to switch down to a lower quality stream in order to get smooth 
playback. Or maybe have it return the number of dropped frames in the 
last second?
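As a rough sketch of the counter approach: if script could read counters for frames demuxed, decoded and painted, it could work out the loss over a sampling interval itself. The snapshot object and its field names (demuxed, decoded, painted) and the frameLoss helper are hypothetical, made up for illustration; they are not an existing browser API.

```javascript
// Hypothetical counter snapshots: { demuxed, decoded, painted }.
// These field names are assumptions, not an existing browser API.
function frameLoss(prev, curr) {
  var demuxed = curr.demuxed - prev.demuxed;
  var decoded = curr.decoded - prev.decoded;
  var painted = curr.painted - prev.painted;
  return {
    // Frames that left the demuxer but were never decoded.
    notDecoded: Math.max(0, demuxed - decoded),
    // Frames that were decoded but never painted.
    notPainted: Math.max(0, decoded - painted),
    // Fraction of demuxed frames that never reached the screen.
    fraction: demuxed > 0 ? Math.max(0, demuxed - painted) / demuxed : 0
  };
}

// Two snapshots taken one second apart.
var before = { demuxed: 300, decoded: 300, painted: 298 };
var after = { demuxed: 330, decoded: 328, painted: 322 };
console.log(frameLoss(before, after));
// { notDecoded: 2, notPainted: 4, fraction: 0.2 }
```

Sampling once a second would give the "dropped frames in the last second" figure directly. Because the demuxer can run ahead of painting, a single short interval may be noisy, so averaging over a few samples is probably wise.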

Given the painting architecture in Firefox, it's very inconvenient to 
separately count the number of frames which were not painted due to the 
video being hidden/off screen (this was suggested by Andy Berkheimer). 
If you wanted to detect this, you could detect when the "painted frames" 
counter completely stops incrementing. Even when we're CPU starved, we 
should be able to paint the occasional frame...
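The "counter stops completely" check could be sketched like this, assuming script keeps a short history of a hypothetical "painted frames" counter sampled at a fixed interval during playback. The function name and the minSamples parameter are made up for illustration.

```javascript
// samples: readings of a hypothetical "painted frames" counter, oldest first,
// taken at a fixed interval while the video is playing.
// Returns true if the counter has not moved across the last minSamples readings.
function paintingStopped(samples, minSamples) {
  if (samples.length < minSamples) {
    return false; // not enough history to decide
  }
  var recent = samples.slice(-minSamples);
  // The counter only ever increases, so equal endpoints mean no paints at all.
  return recent[recent.length - 1] === recent[0];
}

// Hidden or off-screen video: the counter is flat.
console.log(paintingStopped([100, 130, 130, 130, 130], 4)); // true
// CPU starved but still painting the occasional frame.
console.log(paintingStopped([100, 130, 131, 131, 132], 4)); // false
```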

How do we detect cases 1-3?

In HTML5 we already have the @buffered attribute, and the @currentTime 
attribute. With this you can see the current playback position, and you 
can see how many seconds you could play for if the network connection 
stopped dead. This isn't a bitrate approximation: in Firefox at least, 
@buffered is accurate to frame boundaries, though I'm not sure about other 
browsers' accuracy. During playback you can check periodically, and if 
bufferedEndTime - currentTime is decreasing, you know that you're 
consuming more video than you're receiving, and you'll run out of data 
eventually unless something changes.

I imagine @buffered would be particularly useful in this case for VBR 
content.

That would look something like this:

<script>

// Set an interval timer to call this function a few times a second.
function checkRate(video) {
   var time = video.currentTime;
   var end = findBufferedEnd(video.buffered, time);
   // Seconds of playable video after the current playback position.
   var playable = end - time;

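   // _playableSeconds is a script property, not a content attribute, so
   // hasAttribute() would never see it; test the property directly.
   if (typeof video._playableSeconds !== "number" || video.paused) {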
     video._playableSeconds = playable;
     return;
   }

   // The change in amount of buffered data
   var deltaPlayable = playable - video._playableSeconds;
   // If deltaPlayable is > 0 we're downloading at a rate faster than
   // we're playing.
   if (isTrendingDownwards(deltaPlayable, video)) {
     // We're not downloading playable video at a rate greater than
     // we're consuming it.
     decreaseQuality(video);
   } else if (isTrendingUpwards(deltaPlayable, video)) {
     increaseQuality(video);
   }
   video._playableSeconds = playable;
}

function findBufferedEnd(b, t) {
   for (var i=0; i<b.length; i++) {
     if (b.start(i) <= t && t <= b.end(i))
       return b.end(i);
   }
   return t;
}

function isTrendingDownwards(deltaPlayable, video) {
   // Return true if average deltaPlayable is decreasing over the last N
   // seconds?
}

function isTrendingUpwards(deltaPlayable, video) {
   // Return true if average deltaPlayable is increasing over the last N
   // seconds?
}

</script>
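One possible way to fill in the two trend checks is sketched below, using a fixed-size window of recent deltaPlayable samples and treating the sign of their mean as the trend. This is only one interpretation of "trending"; TrendWindow, bufferShrinking, bufferGrowing and the threshold value are all hypothetical names and numbers chosen for illustration.

```javascript
// Keeps the most recent `size` deltaPlayable samples.
function TrendWindow(size) {
  this.size = size;
  this.samples = [];
}

TrendWindow.prototype.push = function (delta) {
  this.samples.push(delta);
  if (this.samples.length > this.size) {
    this.samples.shift(); // discard the oldest sample
  }
};

TrendWindow.prototype.isFull = function () {
  return this.samples.length === this.size;
};

TrendWindow.prototype.mean = function () {
  var sum = 0;
  for (var i = 0; i < this.samples.length; i++) {
    sum += this.samples[i];
  }
  return this.samples.length ? sum / this.samples.length : 0;
};

// threshold is in seconds of buffer gained or lost per sample; it stops
// small fluctuations around zero from triggering a switch.
function bufferShrinking(win, threshold) {
  return win.isFull() && win.mean() < -threshold;
}

function bufferGrowing(win, threshold) {
  return win.isFull() && win.mean() > threshold;
}

var win = new TrendWindow(4);
[-0.2, -0.2, -0.2, -0.2].forEach(function (d) { win.push(d); });
console.log(bufferShrinking(win, 0.05)); // true
console.log(bufferGrowing(win, 0.05));   // false
```

Note that a growing buffer only shows the network is faster than the current stream (case 2 or 3); it doesn't by itself say the next quality level up is sustainable, so a larger threshold for switching up than for switching down may be sensible.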

How does that sound? Do you guys think that approach will be sufficient, 
or do you need more data exposed by the browser? If so what data?

We used to have download position and playback position counters on the 
"progress" events, but these were removed recently (we also used to have 
bufferedBytes which Phillip suggested would be a better interface to 
reinstate if we wanted this functionality). Are these useful? On VBR 
content, _can_ they be useful? In Firefox we try to use these to 
estimate whether we can play through to the end of media without needing 
to stop to buffer (to determine whether we can fire the canplaythrough 
event, and whether we should switch to readyState HAVE_ENOUGH_DATA), but 
for VBR content this can be _very_ inaccurate.
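To illustrate the kind of byte-based estimate I mean (this is a simplified sketch, not Firefox's actual code), assume script had the total and downloaded byte counts plus a measured download rate. All the field names below are hypothetical.

```javascript
// s: { totalBytes, downloadedBytes, downloadBytesPerSec, duration, currentTime }
// All fields are hypothetical inputs, not an existing browser API.
// Returns true if the download is expected to finish before playback
// reaches the end of the media.
function canPlayThroughEstimate(s) {
  var bytesRemaining = s.totalBytes - s.downloadedBytes;
  if (bytesRemaining <= 0) {
    return true; // everything is already downloaded
  }
  if (s.downloadBytesPerSec <= 0) {
    return false; // download has stalled
  }
  var secondsToDownload = bytesRemaining / s.downloadBytesPerSec;
  var secondsToPlay = s.duration - s.currentTime;
  return secondsToDownload <= secondsToPlay;
}

console.log(canPlayThroughEstimate({
  totalBytes: 10000000,
  downloadedBytes: 4000000,
  downloadBytesPerSec: 100000, // 60 seconds of download left
  duration: 120,
  currentTime: 30              // 90 seconds of playback left
})); // true
```

This only holds if bytes map evenly onto playback time, as with constant bitrate content. With VBR, a high-bitrate stretch can drain the buffer even when the overall estimate says the download will finish in time.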

Do we need to expose any new data in the browser to do adaptive stream 
switching?

Chris P.