[foms] Proposal: adaptive streaming using open codecs
fgalligan at google.com
Mon Nov 15 20:07:45 PST 2010
On Mon, Nov 15, 2010 at 5:22 PM, Silvia Pfeiffer
<silviapfeiffer1 at gmail.com> wrote:
> On Tue, Nov 16, 2010 at 4:49 AM, Steve Lhomme <slhomme at matroska.org>
> > On Mon, Nov 15, 2010 at 6:48 PM, Steve Lhomme <slhomme at matroska.org>
> >> Doesn't it lead to more sync issues when the files you receive are
> >> not interleaved? The 2 streams may not load at the same speed (one
> >> better cached than the other for example). It also makes it harder to
> >> estimate the current download speed... That's an edge case, but
> >> precisely the kind of odd network behaviour that "adaptive"
> >> streaming is meant to handle.
> >> One big pro for non-interleaved is that switching between languages
> >> (or regular/commentary track) is a lot easier and the only reasonable
> >> way to handle it server side.
> > PS: And also allows something not possible now: listen to music from
> > video sites without having to load the video part. It's possible with
> > RTP but the quality (on YouTube for ex) is just not there.
> I believe we are optimizing for the wrong use cases by trying to
> provide data to the Web browser in a non-interleaved manner. I would
> not put that functionality into the adaptive HTTP streaming layer, but
> into other technologies.
> Firstly, providing different language audio tracks to the Web browser
> for a video can be handled at the markup level. There is work in
> progress on this anyway because we will see video descriptions and
> sign language video that will need to be delivered on demand in
> parallel to the main video. I would prefer we do not try to solve this
> problem through adaptive HTTP streaming - it seems the wrong layer to
> get this sorted.
I think this is fine. For this to work, clients will have to take
separate synchronized streams and render them at the correct times. I don't
see how this is different from rendering one video and one audio track that
came from separate streams. It shouldn't matter if the streams are referenced
from a manifest or from the markup.
> Secondly, the use case of picking up only an audio track from a video
> is also one that can be solved differently. It requires a process on
> the server anyway to extract the audio data from the video and then it
> would be a user request. So, it would probably come through a media
> fragment URI such as http://example.com/video.ogv?track=audio which
> would be processed by the server and an audio resource would be
> delivered, if the service provider decides to offer such
I don't see the difference between an audio language track and N different
audio bitrate streams. We could make it a lot easier on the
adaptive streaming clients if we said we need server-side software to
re-mux everything on the fly. This might be feasible for larger companies,
but I don't think it will be feasible for smaller companies. Having them
build or even manage anything other than unmodified HTTP servers is probably
> As I have understood adaptive HTTP streaming, it is supposed to be a
> simple implementation where the player's only additional functionality
> is in interpreting a manifest file and switching between resource
> chunks rather than byte ranges of a single resource.
I don't think physical chunked files would necessarily be harder for a
client to deal with than byte ranges. Actually, having the client keep A/V
sync across chunk boundaries might be harder. The creation tools will also be
a lot harder to build than what we have now.
> All the decoding
> pipeline continues to stay intact and work as previously.
The decoding pipeline shouldn't have to change that much for streams in
separate files. At a high level, you would be removing the demux step for
streams in separate files. In the worst case, if the client was designed in
such a way that the demux step is really hard to remove, you could add a
simple muxer right before the demux step.
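
A minimal sketch of the "simple muxer" idea, assuming each separate stream
already arrives as packets in timestamp order. The Packet type, its field
names, and the sample data are hypothetical and only there for illustration;
a real client would write the merged packets into whatever container its
existing demuxer expects.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple


class Packet(NamedTuple):
    timestamp_ms: int  # presentation time of the packet
    track: str         # which stream the packet came from
    payload: bytes     # encoded data, opaque to the muxer


def interleave(*streams: Iterable[Packet]) -> Iterator[Packet]:
    """Merge per-track packet streams into one stream ordered by timestamp.

    Each input must already be sorted by timestamp_ms. heapq.merge is lazy,
    so packets can be pulled from the inputs as they are downloaded.
    """
    return heapq.merge(*streams, key=lambda p: p.timestamp_ms)


# Hypothetical sample data: 25 fps video and ~23 ms audio frames.
video = [Packet(0, "video", b"v0"), Packet(40, "video", b"v1"), Packet(80, "video", b"v2")]
audio = [Packet(0, "audio", b"a0"), Packet(23, "audio", b"a1"), Packet(46, "audio", b"a2")]

merged = list(interleave(video, audio))
for packet in merged:
    print(packet.timestamp_ms, packet.track)
```

The merged list can then be fed into the unchanged demux step as if it had
come from a single interleaved file.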
I think for chunked files the decoding pipeline will have to change a good
amount, because another step will have to be added that removes any
extraneous data from each chunk so that one interleaved stream can be sent
down into the decoding pipeline as before.
> I think we
> should not touch the interleaved delivery functionality at this level.
As I said above, the decoding pipeline of a client is going to have to change
no matter which way we go.
> It would cause the player to do too much synchronisation
Actually, with separate streams in files you would have to do just about the
same synchronization as a client does now. The only difference is that if one
of the streams gets too far ahead of the other, the client slows down the
download of that stream until the gap is back to an acceptable level.
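
A rough sketch of that throttling rule, assuming the client tracks how many
seconds of media it has buffered per stream. The function name, the stream
names, and the 2-second threshold are made up for illustration and are not
taken from any existing player.

```python
from typing import Dict, List


def streams_to_fetch(buffered_s: Dict[str, float], max_lead_s: float = 2.0) -> List[str]:
    """Return the streams that are allowed to download more data right now.

    A stream is paused while its buffer is more than max_lead_s seconds
    ahead of the least-buffered stream, so the slower stream can catch up.
    """
    floor = min(buffered_s.values())
    return [name for name, secs in buffered_s.items() if secs - floor <= max_lead_s]


# Audio is 5.5 s ahead of video, so only video keeps downloading.
print(streams_to_fetch({"video": 4.0, "audio": 9.5}))

# Audio is only 1.5 s ahead, so both streams keep downloading.
print(streams_to_fetch({"video": 8.0, "audio": 9.5}))
```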
As I said above I think the client could have a tougher time doing
the synchronization across physical boundaries of chunks.
> delivery handling overhead work that should really not be created from
> a single resource.
I think there will be more network overhead with physical chunks than with
separate streams, because the file and stream overhead needs to be
replicated in each chunk.
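
To make the overhead argument concrete, here is a back-of-the-envelope
calculation. All of the numbers (clip length, chunk length, header size) are
illustrative assumptions, not measurements of any real container format, and
per-request HTTP overhead is ignored.

```python
def replicated_overhead_bytes(per_file_overhead: int, file_count: int) -> int:
    """Total bytes spent on file/stream headers across all files."""
    return per_file_overhead * file_count


# Illustrative assumptions only, not measurements:
CLIP_SECONDS = 600        # a 10-minute clip
CHUNK_SECONDS = 10        # one physical file per 10 seconds of media
PER_FILE_OVERHEAD = 4096  # header bytes repeated in every file

chunk_count = CLIP_SECONDS // CHUNK_SECONDS

# Physical chunks: the headers are repeated once per chunk.
chunked = replicated_overhead_bytes(PER_FILE_OVERHEAD, chunk_count)

# Separate streams: one video file plus one audio file.
separate = replicated_overhead_bytes(PER_FILE_OVERHEAD, 2)

print(chunked, separate)
```

Under these assumptions the chunked layout spends 245760 bytes on headers
versus 8192 bytes for two separate stream files, per bitrate variant.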