[foms] What data is needed for adaptive stream switching?
Benjamin M. Schwartz
bmschwar at fas.harvard.edu
Mon Nov 22 20:41:12 PST 2010
On 11/22/2010 07:35 PM, Chris Pearce wrote:
> To reduce memory footprint we only keep about 10 decoded video frames in
> advance of the playback position (but about 2s of audio).
True, 10 frames is pretty tight (although in a way I'm impressed FF is
buffering that much).
>> or (2) exposes the browser's estimate of the time remaining
>> before each queue underruns (possibly +inf). Option 1 is just like your
>> design (using @buffered), but extended to distinguish network and CPU limits.
> There are two separate problems here, detecting slow network, and
> detecting slow CPU. My @buffered proposal can only detect slow network.
Right. An additional value (e.g. @buffered_decoded) would allow an
analogous design to detect slow CPU.
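To make that concrete, here is a minimal sketch of how JS control code might distinguish the two limits, assuming a hypothetical decoded-ahead value (in the spirit of the proposed @buffered_decoded) alongside the demuxed-ahead value from @buffered. Both attribute names and both thresholds are illustrative, not from any spec:

```javascript
// Sketch only: classify an impending stall as network- or CPU-limited.
// demuxedAheadSec: seconds of demuxed-but-undecoded data buffered ahead
// decodedAheadSec: seconds of decoded frames queued ahead (hypothetical)
function diagnoseStallRisk(demuxedAheadSec, decodedAheadSec) {
  if (demuxedAheadSec < 2.0) {
    return "network"; // the demux buffer is starving: the network is slow
  }
  if (decodedAheadSec < 0.2) {
    return "cpu"; // data is present but the decoder can't keep up
  }
  return "ok";
}
```

A stream switcher would react to "network" by dropping bitrate and to "cpu" by dropping decode complexity (e.g. resolution).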
> So you want us to expose an attribute which simply reports whether
> we expect to drop frames in future?
> I think predicting frame dropping due to slow CPU in a timely manner is
> hard. Given that we only keep 10 decoded frames queued, even if we post
> an event notification when things go bad, you wouldn't have enough time
> to setup a new connection to switch to a lower quality stream before you
> start to drop frames anyway.
The more CPU you are willing to leave unused, the easier it is to avoid
frame dropping. For example, FF could fire an alert event whenever the
video decode CPU usage exceeded X% over the last 5 seconds, triggering a
downgrade to lower resolution. Choosing X is a tradeoff between
utilization and reliability.
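The X%-over-5-seconds heuristic could be sketched like this. The sampling interface is hypothetical (a real version would need the browser to expose per-element decode CPU usage); the constants are illustrative:

```javascript
// Sketch: track decode CPU usage over a sliding window and flag when
// the window average exceeds a threshold, triggering a downgrade.
class DecodeLoadMonitor {
  constructor(thresholdPct, windowSec) {
    this.thresholdPct = thresholdPct;
    this.windowSec = windowSec;
    this.samples = []; // array of [timeSec, cpuPct]
  }
  record(timeSec, cpuPct) {
    this.samples.push([timeSec, cpuPct]);
    // Drop samples that have fallen out of the window.
    const cutoff = timeSec - this.windowSec;
    while (this.samples.length && this.samples[0][0] < cutoff) {
      this.samples.shift();
    }
  }
  shouldDowngrade() {
    if (!this.samples.length) return false;
    const avg =
      this.samples.reduce((sum, [, pct]) => sum + pct, 0) /
      this.samples.length;
    return avg > this.thresholdPct;
  }
}
```

Here X maps to thresholdPct: lowering it wastes CPU headroom but gives earlier warning; raising it risks reacting too late.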
Clearly if available CPU abruptly goes to zero and stays there, there is
nothing you can do, but on a properly functioning operating system this
should never happen. :-)
I admit that the decode buffer fullness is not a perfect measure of CPU
usage. Maybe the JS needs access to CPU usage history statistics.
However, buffer fullness can provide an approximate measure of CPU usage.
In normal operation, the decode buffer should always be full (or full-1).
If it's ever not full, then the decoder has missed a deadline and is in
danger of falling behind (although it may catch up on a subsequent frame).
As long as (CPU needed)/(CPU available) changes gradually, the buffer
statistics should be enough to judge whether the available CPU power is
sufficient... thanks to the frame-to-frame variability in decode CPU
requirements. If the standard deviation of decode times is very small,
but there is a large sudden drop in available CPU, clearly this strategy
will not succeed in avoiding dropped frames.
> So you mean guess in advance what resolution and bitrate the user's
> hardware will be able to handle playing?
That's actually a great idea, since the browser will presumably have many
opportunities to build up this sort of wisdom about the hardware.
However, you can also (1) start at low quality and raise it when it seems
safe to do so, or (2) accept dropped frames for the few seconds after
playback starts, as the decoder calibrates the streams. It's the client's
> How do you tell the bitrate of
> a VBR video in advance?
1. You can't stream VBR video.
2. If you have an index (or are playing separate chunks) then you know the
bitrate of each segment. Most of the adaptive stream switching proposals
I've seen here seem to mandate both an index and a constrained bitrate.
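Given such an index, per-segment bitrate falls out directly, even for VBR content. The index shape below is made up for illustration; real formats carry equivalent byte-size and duration fields:

```javascript
// Sketch: compute per-segment and peak bitrates from an index.
// index: array of { bytes, durationSec } entries (illustrative format).
function segmentBitrates(index) {
  return index.map(seg => (seg.bytes * 8) / seg.durationSec); // bits/sec
}

function peakBitrate(index) {
  return Math.max(...segmentBitrates(index));
}
```

A client planning a switch can then check the next few segments' bitrates against its measured throughput, rather than trusting a single average for the whole stream.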
> How do you factor in the effect of variable
The browser should schedule threads as needed to maintain the desired user
experience. Maybe that means not decoding video in invisible tabs. Maybe
> People do crazy things with web browsers these days. Decoding is not the
> only heavy user of the CPU!
Yes. The browser is now an operating system, and as such requires a
scheduler. Any good operating system for multimedia needs a scheduler
that can give multimedia the CPU it needs to decode in real time.
>> Due to variance in decode times, it's likely that at the edge of
>> CPU-starvation you will see some frames miss their display deadline even
>> though the stream can be decoded in real-time on average.
> What would the deadline-miss statistics look like?
With a 10-frame buffer, just below CPU starvation, it would look like you
occasionally get down to 7 or 8 (or, well, N) frames in the buffer, but
then it gets filled back up again. If the decode complexity is slowly
rising, then this is an indication that you should back off to a
lower-complexity stream before you experience an underrun.
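One way to turn that pattern into a trigger is to watch the per-window minimum of the decode-queue level and back off when it trends downward. This is a sketch of that heuristic only; the three-window rule is an arbitrary illustrative choice:

```javascript
// Sketch: given the minimum decode-queue level observed in each recent
// window (e.g. per second), report whether the minima are strictly
// falling over the last three windows -- an early sign that decode
// complexity is outpacing available CPU.
function queueMinimaTrendingDown(perWindowMinima) {
  const n = perWindowMinima.length;
  if (n < 3) return false;
  return (
    perWindowMinima[n - 3] > perWindowMinima[n - 2] &&
    perWindowMinima[n - 2] > perWindowMinima[n - 1]
  );
}
```

So a sequence of minima like 10, 10, 9, 8, 7 would fire the back-off before the queue ever reaches zero, while a stable dip-and-recover pattern would not.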
> We drop frames for two reasons:
> 1. It looks like we're going to audio underrun, so we need to shed
> work to keep the audio playing smoothly.
The audio should only underrun when the demux buffer is full, preventing
further demuxing. The audio should never be CPU-starved, as the browser
can raise the audio decode's priority as needed. The JS control code,
given knowledge of decode CPU usage, can see that it is too high and
switch to a lower-complexity stream. In short, this shouldn't happen,
which isn't to say that it won't.
> 2. The frame arrived late for painting (possibly due to slow decode,
> or delays in the pipeline).
I assume this means that the decode buffer has underrun, at which point
there isn't much you can do.
> In case 1 what would you report for the deadline-miss?
The browser should somehow inform JS that frames have been dropped due to
CPU starvation, I guess. My overall claim is just that reporting dropped
frames should be the last resort, not the first.