[xiph-rtp] about theora-over-rtp draft

Mon Jul 31 03:12:48 PDT 2006

>
> I think you've pointed out another difference between the schemes here:
> latency. With the 2 fragmentation bits you know immediately if you've
> lost a frame, while the marker bit scheme you don't know until you see
> the next packet with a marker bit. In a naive implementation you lag
> either way, but if you're going to do something to interpolate the
> missing frame, you have more time to do that if you don't have to wait
> until the next marker bit falls out of the jitter buffer.
>
> Or...can you infer the same thing from the timestamp?
The correct way to detect missing packets is to use the sequence number.
The fragmentation bits let you know, after losing a sequence of packets,
whether you can start decoding immediately (in the case of a start fragment).
Without this indication you would wait for the packet after the next marker
bit... or choose to pass the possibly truncated data to the decoder and see
what happens.
So this does not make a big difference.
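
To illustrate the sequence-number check mentioned above, here is a minimal
sketch. It assumes packets come out of the jitter buffer in order; the
function name seq_gap is only illustrative and is not taken from the draft or
from any RTP library. RTP sequence numbers are 16 bits wide, so the comparison
has to be done modulo 2**16.

```python
def seq_gap(prev_seq, seq):
    """Return how many packets are missing between two received packets.

    prev_seq and seq are 16-bit RTP sequence numbers (0..65535).
    """
    # Sequence numbers wrap around at 2**16, so compute the distance modulo 2**16.
    delta = (seq - prev_seq) & 0xFFFF
    if delta == 0 or delta >= 0x8000:
        # Duplicate or late (reordered) packet: nothing newly missing.
        return 0
    return delta - 1


if __name__ == "__main__":
    print(seq_gap(10, 13))      # packets 11 and 12 are missing
    print(seq_gap(65534, 1))    # wraparound: packets 65535 and 0 are missing
```

A non-zero result tells the receiver immediately that something was lost,
independently of the marker bit or of the fragmentation bits.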

>
> Our other main design goal was to support multicast. Generally speaking,
> non-interactive applications are the opposite case, where bandwidth use
> is more important than latency. Luca pointed out a "youtube"-like
> unicast streaming application, but IP multicast is the case where RTP
> transport is absolutely essential.
I understand this. My feeling is that the bandwidth gain is small compared to
the additional complexity needed on the receiver side to be compliant with the
draft.
I have a small question: when receiving multiple frames in a single RTP
packet, should we assume that it is always Full-Fragment?
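
As a rough way to put numbers on the bandwidth point above, the sketch below
computes the share of each packet spent on headers. The header sizes are the
standard ones (20 bytes IPv4 without options, 8 bytes UDP, 12 bytes fixed RTP
header without CSRC entries). The frame sizes are assumptions chosen for
illustration, and the Theora payload header and per-frame length fields of the
draft are deliberately left out.

```python
IPV4_HEADER = 20  # bytes, no IP options
UDP_HEADER = 8    # bytes
RTP_HEADER = 12   # bytes, fixed header without CSRC entries or extension


def header_overhead(frame_bytes, frames_per_packet=1):
    """Fraction of the bytes on the wire that are IP/UDP/RTP headers."""
    headers = IPV4_HEADER + UDP_HEADER + RTP_HEADER
    payload = frame_bytes * frames_per_packet
    return headers / (headers + payload)


if __name__ == "__main__":
    # Assumed frame size of 500 bytes, one frame versus two frames per packet.
    for n in (1, 2):
        print(n, round(100 * header_overhead(500, n), 1), "%")
```

For an assumed 500-byte frame this gives about 7.4% with one frame per packet
and about 3.8% with two. Whether that difference matters depends on the frame
sizes of the actual stream.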

>
> > To sum up all this discussion, I'd like this draft to:
> > - clarify the packed-conf message (limit between header and tables)
>
> Agreed.
>
> > - explain a SDP operation that let each side to configure asymetrically
> > in a simple offer-answer (only 2 messages) scheme (for me it implies to
> > NOT transmit inline encoder configuration in SDP, which prevents the
> > offerer to adapt to the other end).
>
> Sounds reasonable.
>
> > - use less bits to indicate fragmentation (for me 1 bit is enough, 2 if
> > you wish to indicate begin of frame and end of frame). Whether this bit
> > is RTP marker or not is not important.
> > - assume each rtp contains at most one frame.
> > 	The last two points in the goal of having much simpler unpacketisation
> > code.
>
> I don't think you've made a convincing case here.
I think you'll get more and more feedback along these lines from telephony
developers... Perhaps they'll be more convincing than me!

>
> > Currently the unpacketisation code necessary for implementing this draft
> > makes it more complex than the RFC2429bis or RFC3016 packetisation, which
> > I think is not good if we want more and more people like me or companies
> > to prefer open-source technology instead of heavy patented ones.
>
> Doesn't the (switchable) codebook transmission requirement completely
> overshadow this? Is your rtp library somehow written in such a way that
> it needs significant changes to do packing this way? Is Luca's source
> code not helpful here?

The problem is not a programming one. I could have implemented the draft
already (with probably twice as many lines as my "lightweight" version).

My problem is that I think complexity is appropriate only when there is a
need to solve a particular problem.

* To solve the problem of frame fragmentation, I think one bit is enough. The
four-item enum proposed in this draft is full of redundancy and makes the
implementation a bit more complex for no added value.

* You tried to solve the problem of RTP header overhead by proposing a
structure with multiple frames per RTP packet. That's OK, but my feeling is
that the RTP header overhead is not a problem at all. So the complexity of
handling multiple frames per packet is not justified, for me.

* One big problem for me is setting up an RTP session through SDP with 2
messages: for me the Theora draft is "too simple", as it does not clearly
explain how to achieve this.
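
The one-bit scheme from the first two points can be sketched as follows. This
is only an illustration under stated assumptions, not the draft's format: each
packet carries all or part of at most one frame, a single hypothetical "last"
bit marks the final packet of a frame, and the stream is assumed to start on a
frame boundary. The class and parameter names are made up for the example.

```python
class OneBitDepacketizer:
    """Reassemble frames when each packet holds at most one frame (or part of one)."""

    def __init__(self):
        self.parts = []       # payloads of the frame being assembled
        self.prev_seq = None  # last sequence number seen
        self.synced = True    # False after a loss, until the next frame boundary

    def push(self, seq, last, payload):
        """Feed one packet; return a complete frame as bytes, or None."""
        if self.prev_seq is not None and seq != (self.prev_seq + 1) & 0xFFFF:
            # Gap in the sequence numbers: drop the partial frame and resync.
            self.parts = []
            self.synced = False
        self.prev_seq = seq
        if not self.synced:
            # With one bit we cannot tell whether this packet starts a frame,
            # so skip everything up to and including the next 'last' packet.
            if last:
                self.synced = True
            return None
        self.parts.append(payload)
        if not last:
            return None
        frame = b"".join(self.parts)
        self.parts = []
        return frame


if __name__ == "__main__":
    d = OneBitDepacketizer()
    print(d.push(1, False, b"ab"))  # first half of a frame
    print(d.push(2, True, b"cd"))   # second half: frame is complete
```

The cost of using a single bit shows up in the resync branch: after a loss
the receiver waits for the packet following the next "last" bit, which is the
trade-off discussed at the top of this mail.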

The reason why I think I might not be totally wrong is that this draft has
chosen different design principles than MPEG-4 and H.263, and I don't see any
reason for that.

I'll try to implement the draft fully in a future release, with a workaround
for SDP until the draft clarifies that part.

Simon