[foms] WebM Manifest

Philip Jägenstedt philipj at opera.com
Thu Mar 17 05:47:26 PDT 2011

On Thu, 17 Mar 2011 12:24:57 +0100, Jeroen Wijering  
<jeroen at longtailvideo.com> wrote:

> On Mar 17, 2011, at 9:35 AM, Philip Jägenstedt wrote:
>> My approximate solution for the low-level API is something like this:
>> * XMLHttpRequest is used to get a data Blob for a resource.
>> * We have some mechanism to concatenate several Blobs into a Stream
>>   object.
>> * Setting video.src to the Stream object just works, similar to how
>>   video conferencing is defined.
>> What's missing here is to make XMLHttpRequest byte range aware, so that
>> one could do the kind of thing you mention above.
> Do note a few additional JavaScript APIs are required in order for such  
> a low level API to work:
> *) Byte-level access to the data fetched using XMLHttpRequests. For  
> example, to read and parse the seek index and to read and store the  
> codec configurations.

This will eventually be possible, see  

> *) A means to detect decoding load (e.g. CPU load, frames  
> painted/dropped, pixel throughput), so processing constraints can be  
> taken into account.

Mozilla is already working on playback statistics, and I assume that  
something will eventually be standardized.

> I'm not up to date with all the functionality of XMLHttpRequest, but I  
> presume there are events like initialization (to calculate latency) and  
> progress (to calculate bandwidth)?

If there are progress events for the request, then both the latency and  
bandwidth could be inferred from that.
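
As a rough sketch of that inference, assuming one records a timestamp and the cumulative byte count each time a progress event fires: latency is approximated as the delay from sending the request until the first event, and bandwidth as the average rate between the first and last events. The names ProgressSample, TransferEstimate and estimateTransfer are made up for this illustration and are not part of XMLHttpRequest.

```typescript
// One observation recorded from a progress event handler (hypothetical shape).
interface ProgressSample {
  timeMs: number;      // when the event fired, in milliseconds
  loadedBytes: number; // cumulative bytes received so far
}

interface TransferEstimate {
  latencyMs: number;            // request start until the first progress event
  bandwidthBytesPerSec: number; // average rate between first and last sample
}

// Returns null when there are too few samples to estimate a rate.
function estimateTransfer(
  requestStartMs: number,
  samples: ProgressSample[]
): TransferEstimate | null {
  if (samples.length < 2) {
    return null;
  }
  const first = samples[0];
  const last = samples[samples.length - 1];
  const elapsedMs = last.timeMs - first.timeMs;
  if (elapsedMs <= 0) {
    return null;
  }
  const receivedBytes = last.loadedBytes - first.loadedBytes;
  return {
    latencyMs: first.timeMs - requestStartMs,
    bandwidthBytesPerSec: (receivedBytes * 1000) / elapsedMs,
  };
}
```

The first event is left out of the bandwidth figure on purpose, since the time before it is mostly round-trip delay and not transfer time. A real implementation would probably smooth the estimate over several requests.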

> Next, the Stream API needs to be very strictly defined in terms of how  
> provided A/V frames should be formatted, and how and when codec  
> initialization data must be (re)sent.
> Basically, javascript handles the demuxing. This would be a great API,  
> allowing for much flexibility. At the same time, the amount of knowledge  
> required for such an API would be so staggering (e.g. full understanding  
> of video containers) that few people would be able to work with it.

I may very well be in need of education, but I don't see why that needs to  
be the case.

Assume a manifest at its simplest is a list of URLs and switchover times.  
If one has a "manifest API" that allows one to add URLs and switchover  
times, then surely anything that can be done with a manifest can be done  
with the API? If a manifest solution doesn't require inspecting the data  
outside of the normal decoding, why would it be necessary when one uses an  
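
To make the simplest case concrete, here is a minimal sketch of a "manifest API" that holds nothing but URLs and switchover times. The class name Manifest and its methods add and urlAt are hypothetical and only illustrate the idea; this is not a proposed interface. Note that nothing in it looks at the media data itself.

```typescript
// One manifest entry: play this URL from switchAtSec onwards.
interface ManifestEntry {
  url: string;
  switchAtSec: number;
}

// Hypothetical manifest object built up from script, not parsed from a file.
class Manifest {
  private entries: ManifestEntry[] = [];

  // Add a URL together with the time at which playback switches to it.
  add(url: string, switchAtSec: number): void {
    this.entries.push({ url, switchAtSec });
    // Keep entries ordered by switchover time so lookups can scan in order.
    this.entries.sort((a, b) => a.switchAtSec - b.switchAtSec);
  }

  // Return the URL that should be playing at timeSec,
  // or null if timeSec is before the first switchover.
  urlAt(timeSec: number): string | null {
    let current: string | null = null;
    for (const entry of this.entries) {
      if (entry.switchAtSec > timeSec) {
        break;
      }
      current = entry.url;
    }
    return current;
  }
}
```

For example, after add("low.webm", 0) and add("high.webm", 10), urlAt(5) gives "low.webm" and urlAt(10) gives "high.webm". The file names are placeholders.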

Philip Jägenstedt
Core Developer
Opera Software
