[foms] WebM Manifest

Steve Lhomme slhomme at matroska.org
Sat Mar 19 12:13:52 PDT 2011

On Sat, Mar 19, 2011 at 6:19 PM, Mark Watson <watsonm at netflix.com> wrote:
> I'm not sure why you concluded that #2 was not an issue because the frames arrive in decode order. I did not mention anything about the order of frame arrival. The issue is duplicate decoding of a frame, which is an issue both from a decoder capability and computational load point of view.

Yes, decoding frame n and not displaying it is a waste of resources.
But that would happen whether the switch is to an aligned or a
non-aligned fragment. By the time frame n-1 is displayed, the
original frame n has already been loaded and decoded anyway.

>> Now that's a good point in favor of using the TCP window more. It
>> could be reduced while the decision is being made or start loading
>> from another stream while the main one is still loading. With a window
>> of (almost) 0 the TCP connection would then be established/ready to be
>> used as soon as the bandwidth is available. When you know you are
>> going to switch to a new stream, you can reduce the window gradually,
>> with 0 happening at the exact byte end position of the fragment (or
>> n-1 frame). That would minimize the bandwidth waste and latency time
>> between reading 2 fragments.
> Closing the receive window just pauses the transmission. The data you originally requested will still come later unless you close the connection.

Yes. The idea is to reduce the window gradually to 0 so the server
stops sending more data before we tell it to close the connection (in
fact both pieces of information could be sent in the same packet, but
I'm not sure the usual socket APIs allow that).
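
A rough sketch of how this looks from the application side, assuming
plain blocking sockets: user code normally cannot set the advertised
window directly, but it can stop calling recv() at a known byte
position. Whatever the server sends after that accumulates in the
kernel receive buffer, and the window shrinks as that buffer fills
(the exact behaviour depends on the OS). The helper name read_up_to
and the chunk size are made up for illustration.

```python
import socket


def read_up_to(sock, remaining, chunk_size=4096):
    """Read at most `remaining` bytes from `sock`, then stop reading.

    Never asks recv() for more than what is left before the fragment
    boundary, so no byte past that boundary is pulled out of the
    kernel buffer by this function.
    """
    data = bytearray()
    while remaining > 0:
        chunk = sock.recv(min(chunk_size, remaining))
        if not chunk:
            # Peer closed the connection before the boundary was reached.
            break
        data += chunk
        remaining -= len(chunk)
    return bytes(data)
```

Bytes past the boundary stay unread in the socket, so the caller can
either close the connection or keep it idle while the decision about
the next variant is made.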

> Receiving the overlap data is not really the issue (though it would be nice to avoid). The point is that you cannot detect where to stop in the old stream without parsing down to the frame level. Which ties together the media player and the adaptive streamer in a way which is both unnecessary and not aligned with existing architectures.

By existing architectures, do you mean player architectures or
existing adaptive streaming systems?

Allowing non-aligned variants is not unnecessary IMO. I think it
should be considered carefully before being ruled out. So far I see
drawbacks that offset the advantages, but no deal breaker. That's
only my opinion, though, and I hope we can all reach a consensus on
this.

>> That's true and using a range request is surely nicer than playing
>> with the TCP window. But as shown above, playing with the TCP window
>> can still be useful when switching streams. (server + DNS + TCP
>> latency). Also using a "range" request has some drawbacks. It forces
>> you to open a new connection for each fragment, even if the new
>> fragment directly follows the previous one.
> No, you re-use the same connection for the next request.

I did not know that. Is handling many requests on the same TCP
connection supported by all HTTP servers? In that case it might also
be possible to stop a range request in the middle of the stream,
which would help with switching to another variant in the middle of
a fragment.
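
For illustration, a minimal sketch of several range requests sent
over one connection, assuming the server keeps the connection open
between requests (persistent connections are the default in HTTP/1.1,
but a server may still choose to close). It uses Python's standard
http.client; the function name fetch_ranges and the host and path in
the usage comment are hypothetical.

```python
import http.client


def fetch_ranges(conn, path, ranges):
    """Fetch several byte ranges of `path` over one HTTP connection.

    `conn` is an http.client.HTTPConnection (or compatible object).
    `ranges` is a list of (start, end) pairs; end=None means
    "from start to the end of the resource".
    """
    chunks = []
    for start, end in ranges:
        value = f"bytes={start}-" if end is None else f"bytes={start}-{end}"
        conn.request("GET", path, headers={"Range": value})
        resp = conn.getresponse()
        if resp.status != 206:
            # 206 Partial Content means the server honoured the range.
            raise RuntimeError(f"range not honoured, status {resp.status}")
        # The body must be read completely before the same connection
        # can carry the next request.
        chunks.append(resp.read())
    return chunks


# Hypothetical usage (placeholder host and path):
#   conn = http.client.HTTPConnection("media.example.com")
#   first, second = fetch_ranges(conn, "/video.webm",
#                                [(0, 65535), (65536, 131071)])
#   conn.close()
```

Note that with this approach a response has to be read to its end
before the next request goes out; abandoning a response halfway would
in practice mean closing the connection.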

>> That results in
>> time and resources wasted establishing the TCP connection and the
>> server side "session". If you use an "offset" request (a start offset
>> but no end) you avoid that issue. And you just need to adjust your TCP
>> window if you really don't want to waste a byte in the transmission.
>> It has already been established that having fragments of one stream in
>> many files is not practical and will likely not be used (for on demand
>> at least). So maybe the next step should be to NOT use range requests
>> at all.
> What would you use then?

A request specifying the offset to start reading from in the remote
stream, without specifying an end offset. I don't know whether this
is also called a range request.
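
For reference, HTTP/1.1 byte ranges do have an open-ended form:
"Range: bytes=N-" asks for everything from offset N to the end of the
resource, so such a request is still a range request. A minimal
sketch of building the header value; the function name range_header
is made up for illustration.

```python
def range_header(start, end=None):
    """Build the value of an HTTP Range header for a byte range.

    With end=None the range is open-ended ("bytes=START-"), meaning
    "from this offset to the end of the resource". Both offsets are
    inclusive, as in HTTP/1.1 byte ranges.
    """
    if start < 0:
        raise ValueError("start offset must be non-negative")
    if end is None:
        return f"bytes={start}-"
    if end < start:
        raise ValueError("end offset must not be smaller than start")
    return f"bytes={start}-{end}"
```

For example, range_header(1000) gives "bytes=1000-", while
range_header(1000, 1999) gives "bytes=1000-1999".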

Steve Lhomme
Matroska association Chairman
