[foms] Chunks

Steve Lhomme slhomme at matroska.org
Thu Nov 11 00:11:51 PST 2010


I am currently reading this document:
http://wiki.whatwg.org/wiki/Adaptive_Streaming and especially this
part "In its most low-tech form, chunks can be stored as separate
files-on-disc on a webserver. This poses issues around transcoding (no
ecosystem yet) and file management (not everybody loves 100s of
files)."

When you say range-requests could be used, I assume you mean within the
same file? That is, the client knows exactly where in the stream it
wants to seek (via the manifest file). In MPEG TS that should not be
much of a problem, as there is no file-level header. In Matroska/WebM
the server would need some remuxing intelligence to provide the header
before the actual data requested, assuming the file was not encoded in
"chunks" to begin with. But adding that kind of intelligence to a web
server is not a good thing.
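
For illustration, here is a minimal sketch of what the client side of
that could look like, assuming the manifest gave us the byte offsets
of the header and of the chunk we want (the URL, offsets and sizes
below are made up):

    import urllib.request

    URL = "http://example.com/video.webm"  # hypothetical file

    def fetch_range(url, start, end):
        # 'end' is inclusive, per HTTP Range header semantics
        req = urllib.request.Request(url)
        req.add_header("Range", "bytes=%d-%d" % (start, end))
        return urllib.request.urlopen(req).read()

    # These offsets would come from the manifest; values are made up.
    header = fetch_range(URL, 0, 4095)          # EBML header + tracks
    chunk = fetch_range(URL, 1048576, 2097151)  # one media chunk
    # A player would feed header + chunk to its demuxer.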

Matroska can have Segments (chunks) concatenated in one file, and that
should play fine as long as the channel parameters remain the same
between chunks. That has the advantage of having chunks while still
dealing with a single file. VLC and DirectShow based players should be
able to handle this type of file. I think browsers supporting WebM
should also make sure they support this feature. After all, compared
to loading chunks separately, you avoid the latency of a new HTTP
connection for each chunk. In the end, any software that supports this
can support chunked videos, whether they come from a single file or
from various files virtually concatenated (in which case resolution
changes should be allowed and handled as a single output resolution).
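
As a rough illustration, each concatenated Segment starts with its own
EBML header, so a naive scan for the EBML magic can locate the chunk
boundaries. This is only a sketch: a real tool would walk the EBML
element sizes instead of pattern matching, since the magic bytes can
also occur inside media data.

    EBML_MAGIC = b"\x1a\x45\xdf\xa3"

    def segment_offsets(path):
        # Naive scan: report every offset where the EBML magic
        # appears, i.e. where a concatenated Segment seems to start.
        with open(path, "rb") as f:
            data = f.read()
        offsets = []
        pos = data.find(EBML_MAGIC)
        while pos != -1:
            offsets.append(pos)
            pos = data.find(EBML_MAGIC, pos + 1)
        return offsets

    print(segment_offsets("chunked.webm"))  # hypothetical file name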


So the resolution (width/height) should also be known beforehand. When
a video is integrated in a web page, it's important to know exactly
how much space it will take, even if at some point during playback the
actual frames are much smaller. What is the plan here? Should the m3u8
file declare the max/default dimensions expected, or should
width/height be made mandatory in the <video> element?
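
For reference, the HLS drafts attach that information to the variant
playlist via a RESOLUTION attribute on EXT-X-STREAM-INF; something
along these lines (URLs and values made up):

    #EXTM3U
    #EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=1280000,RESOLUTION=640x360
    http://example.com/low/index.m3u8
    #EXT-X-STREAM-INF:PROGRAM-ID=1,BANDWIDTH=2560000,RESOLUTION=1280x720
    http://example.com/high/index.m3u8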


The paragraph also says that there's currently no ecosystem to produce
such files. From what I see, it may be easy enough to encode the same
stream the usual way, once per target bandwidth, and then have a tool
simply make the splits where the keyframes match across all these
files. That means encoders are not modified and you're assured all
chunks are in sync (it's a little trickier on the audio side, as
Vorbis doesn't always carry the same number of samples per packet, but
that can be resolved on the player side). The only trick is to tell
the video encoder the maximum allowed number of frames before a
keyframe is issued; I think most encoders support that. And from the
same source, the keyframes should happen at more or less the same
locations for all bandwidths.
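
A sketch of the split-point selection, assuming we already extracted
the keyframe timestamps of each encoded rendition (the extraction
itself would use the muxer's index, e.g. the Matroska Cues; the
timestamps below are made up):

    # Pick split points common to all renditions. Each input is a
    # sorted list of keyframe timestamps in ms, one per bandwidth.
    # Timestamps within 'tolerance' ms count as the same keyframe.
    def common_split_points(keyframes_per_rendition, tolerance=1):
        reference = keyframes_per_rendition[0]
        others = keyframes_per_rendition[1:]
        splits = []
        for t in reference:
            if all(any(abs(t - u) <= tolerance for u in ks)
                   for ks in others):
                splits.append(t)
        return splits

    low = [0, 2000, 4000, 6000, 8000]
    mid = [0, 2000, 4001, 6000, 8000]
    high = [0, 2000, 3000, 4000, 6000, 8000]
    print(common_split_points([low, mid, high]))
    # -> [0, 2000, 4000, 6000, 8000]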


I've seen it mentioned in a discussion that switching to JSON could be
an option. I think it would be a good move. It's a little more
flexible (arrays) and would use code already present in all browsers
and in many programming languages, especially the ones made for the
web (JSON is easier to handle than XML on Android, for example).
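
For example, a JSON manifest could express the renditions and their
chunks as plain arrays, something like this (all field names and
values are made up, just to illustrate the flexibility):

    {
      "duration": 120.0,
      "width": 1280, "height": 720,
      "streams": [
        { "bandwidth": 1280000, "url": "low.webm",
          "chunks": [ {"start": 0.0, "offset": 0},
                      {"start": 2.0, "offset": 524288} ] },
        { "bandwidth": 2560000, "url": "high.webm",
          "chunks": [ {"start": 0.0, "offset": 0},
                      {"start": 2.0, "offset": 1048576} ] }
      ]
    }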


Some of my 2 cents :)

