[foms] Proposal: adaptive streaming using open codecs
watsonm at netflix.com
Mon Nov 8 13:29:33 PST 2010
On Nov 8, 2010, at 6:39 AM, Jeroen Wijering wrote:
> Hello all,
>>> I don't like the idea of an appendVideo() function which specifies a byte range because:
>>> • I think that browsers wouldn't want to initiate a new byte range request for every appendVideo() call anyway, they'd want to request the entire media (probably with a startOffset to EOF BRR), in order to reduce delays and overhead caused by setting up many smallish BRR.
>>> • If appendVideo() could specify a byte range, it could specify an invalid range, or a range which doesn't chronologically follow the previously appended range. Everyone who wants to do manual stream switching has to get this right, but if it's implemented in-browser, it only needs to be gotten right once.
>>> • It's easier for a browser to get the transition between two streams consistently seamless than it is for user JS to, since we have more information and more control over the network and decode.
>>> I'd much rather that the browser fetched the indexes of all streams on startup (probably asynchronously) and played a default stream as per normal. If manual switching was enabled, then upon a call to setLevel(), the video would switch to the requested stream at an appropriate time. We'd probably want to start rendering the new stream as soon as possible, as we could be switching down due to dropping too many frames due to the decode not keeping up with rendering.
>> I agree there's a certain amount of ambiguity between lower level range-requests and the un-availability of seekpoints and decoder settings. What about a slightly modified version of appendVideo()?
>> *) Both the startPosition and endPosition are in seconds and optional. If they are both omitted, this function is basically a way to build seamless playlist playback (as Silvia requested).
>> *) Like with today's playback, the browser decides to probe the header data if it doesn't have enough info (yet) on the file. In other words, when building a bitrate switching API with this, the browser fetches the index the first time (part of) this videoURL is requested.
>> *) As the appended video metadata is fetched and the chunk duration is known, the ''duration'' and the "seekable" properties of the stream change. A "durationchange" event is fired. The ''readyState'' is re-evaluated (it may, for example, jump from HAVE_FUTURE_DATA to HAVE_CURRENT_DATA).
>> *) As playback rolls into the new videoURL, the ''videoWidth'' and ''videoHeight'' of the stream are updated (though they may remain the same).
>> *) If the metadata of the appended video cannot be probed, the browser throws an error (MEDIA_ERR_SRC_NOT_SUPPORTED) and does not append the video. This means that, during append-metadata-fetching, the "readyState" of a video does not change.
>> *) Appended videos should be in the same container format and A/V codec as the currently playing video. If not, the browser will throw an error and not append the video (MEDIA_ERR_APPEND_NOT_ALIGNED). This restriction is added to ensure appended content can be decoded within a single pipeline.
>> *) Videostream bitrate, dimensions and framerate may vary. Audiostream bitrate, channels and samplefrequency may also vary. A single decoding pipeline can handle these variations.
>> *) Buffering and metadata availability will still happen inside the browser. When an appendVideo() is fired, the browser will typically rush to load the metadata and the media time range (as it does today).
>> This seems to be a good tradeoff and starting point to experiment with adaptive streaming?
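The append rules in the quoted list can be modelled in a few lines. This is only a sketch of the proposal as described above, not a browser API: the `Timeline`, `ProbedMetadata` and `AppendRequest` types are hypothetical, the function returns an error code instead of throwing to keep the example short, and `metadata: null` stands in for "the metadata could not be probed".

```typescript
// Hypothetical model of the proposed appendVideo() rules; not a real browser API.
type AppendError = "MEDIA_ERR_SRC_NOT_SUPPORTED" | "MEDIA_ERR_APPEND_NOT_ALIGNED";

interface ProbedMetadata {
  container: string;  // e.g. "webm"
  videoCodec: string; // e.g. "vp8"
  audioCodec: string; // e.g. "vorbis"
  duration: number;   // seconds
}

interface AppendRequest {
  metadata: ProbedMetadata | null; // null models "metadata could not be probed"
  startPosition?: number;          // seconds, optional
  endPosition?: number;            // seconds, optional
}

interface Timeline {
  current: ProbedMetadata; // what is playing now
  duration: number;        // total seconds across everything appended so far
}

// Returns null on success, or the error code named in the proposal.
// On error the timeline is left untouched, matching "does not append the video".
function appendVideo(timeline: Timeline, req: AppendRequest): AppendError | null {
  const meta = req.metadata;
  if (meta === null) {
    return "MEDIA_ERR_SRC_NOT_SUPPORTED";
  }
  const cur = timeline.current;
  const sameFormat =
    meta.container === cur.container &&
    meta.videoCodec === cur.videoCodec &&
    meta.audioCodec === cur.audioCodec;
  if (!sameFormat) {
    return "MEDIA_ERR_APPEND_NOT_ALIGNED";
  }
  // Both positions are optional; omitting them appends the whole file.
  const start = req.startPosition ?? 0;
  const end = req.endPosition ?? meta.duration;
  // In the proposal, this is the point where "durationchange" would fire.
  timeline.duration += Math.max(0, end - start);
  return null;
}
```

Bitrate, dimensions and frame rate are deliberately not compared, since the proposal allows them to vary between appended chunks.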
I also agree that the end goal is to have the browser handle the manifest and rate adaptation transparently below the video tag - perhaps with some way to give the JS some control of the adaptation. So we are considering what could be done as an interim step where the manifest is handled by the JS layer and the browser just understands the file format (including indexing etc.).
If the JS layer is going to receive detailed "bytes received" values, from which it can estimate bandwidth and make rate adaptation decisions, then it had better also know the data rate of the stream to a similar level of detail. This was an attractive aspect of exposing the detailed time <-> byte range mapping to the JS.
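To make the point concrete, here is a rough sketch of a JS-layer decision that uses per-segment data rates rather than one average bitrate per stream. All names (`Segment`, `Level`, `chooseLevel`) and the 0.8 safety factor are assumptions for illustration; the per-segment sizes are what a time <-> byte range index would provide.

```typescript
// Hypothetical types: one entry per indexed segment of a stream.
interface Segment { durationSec: number; bytes: number; }
interface Level { name: string; segments: Segment[]; }

// Data rate of a single segment in bits per second.
function segmentRate(seg: Segment): number {
  return (seg.bytes * 8) / seg.durationSec;
}

// Bandwidth estimate from two cumulative "bytes received" readings.
function estimateBandwidth(bytesBefore: number, bytesAfter: number, elapsedSec: number): number {
  if (elapsedSec <= 0) return 0;
  return ((bytesAfter - bytesBefore) * 8) / elapsedSec;
}

// Pick the highest-rate level whose NEXT segment fits within the budget.
// Falls back to the level with the cheapest next segment if nothing fits.
// Assumes every level has a segment at segmentIndex.
function chooseLevel(levels: Level[], segmentIndex: number, bandwidthBps: number, safety = 0.8): Level {
  if (levels.length === 0) throw new Error("no levels");
  const budget = bandwidthBps * safety;
  const rateOf = (l: Level): number => segmentRate(l.segments[segmentIndex]);
  let lowest = levels[0];
  let best: Level | null = null;
  for (const level of levels) {
    const rate = rateOf(level);
    if (rate < rateOf(lowest)) lowest = level;
    if (rate <= budget && (best === null || rate > rateOf(best))) best = level;
  }
  return best ?? lowest;
}
```

With a variable-bitrate stream, the next segment may be well above or below the stream average, which is why the decision looks at the segment itself.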
How you specify what you want in the appendVideo call, and how the browser loads it are somewhat independent. The browser could use small byte range requests or a single open-ended request: the instructions from the JS layer are about what to play, not exactly how to fetch it.
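As an illustration of that independence, the sketch below maps a requested time range to a byte range using an index, then shows two fetch plans for the same bytes. The `IndexEntry` shape and both function names are hypothetical; only the `bytes=first-last` and `bytes=first-` forms of the HTTP Range header are standard. It assumes the index is sorted by time and that startSec < endSec.

```typescript
// Hypothetical index entry: a seek point and the byte offset where it starts.
interface IndexEntry { timeSec: number; byteOffset: number; }

// Map [startSec, endSec) to an inclusive byte range [first, last].
// Rounds the start down to the previous seek point and the end up to the next one.
function byteRangeFor(index: IndexEntry[], fileSize: number, startSec: number, endSec: number): [number, number] {
  let first = 0;
  let last = fileSize - 1;
  for (const entry of index) {
    if (entry.timeSec <= startSec) first = entry.byteOffset;
    if (entry.timeSec >= endSec) {
      last = entry.byteOffset - 1;
      break;
    }
  }
  return [first, last];
}

// Two ways to fetch the same bytes: one open-ended request (the browser
// simply stops reading once it has enough) or several bounded requests.
function planRequests(first: number, last: number, chunkBytes?: number): string[] {
  if (chunkBytes === undefined) return [`bytes=${first}-`];
  if (chunkBytes <= 0) throw new Error("chunkBytes must be positive");
  const ranges: string[] = [];
  for (let pos = first; pos <= last; pos += chunkBytes) {
    ranges.push(`bytes=${pos}-${Math.min(pos + chunkBytes - 1, last)}`);
  }
  return ranges;
}
```

Either plan satisfies the same "play seconds 2 to 4" instruction, so the JS layer does not need to know which one the browser picked.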
> I'd also like to propose to reinstate the videoElement.PROGRESS event along with its "total" and "loaded" values. What do people from Opera/Firefox think of this? I know it's been removed from the spec because it conflicts with the buffered/played TimeRanges, but it is exactly what's needed to perform all kinds of bandwidth calculations. Perhaps a tradeoff is to report the offset/progress/total of the time range currently being fetched (or last fetched)? That's what we do with JW Player in Flash as well for HTTP pseudo-streaming.
> I'll also wrap this up - together with the appendVideo() call - and send an email to WhatWG. Hopefully one or more browser developers are interested in taking a stab at this.
> Kind regards,
> foms mailing list
> foms at lists.annodex.net