[xiph-rtp] header caching and chaining

Mon Apr 11 16:45:46 PDT 2005

Ok, here is an updated SDP that is more compact. Basically the chain-IDs,
ident-IDs, and codebook-IDs are implied by the order of the information in the
SDP lines.

-- Case where all chains are known at SDP generation time --

v=0
o=- 1105605563 1105605563 IN IP4 207.188.30.165
s=<No title>
i=<No author> .2000
c=IN IP4 0.0.0.0
t=0 0
a=control:*
a=range:npt=0-202.297000
m=audio 0 RTP/AVP 101
b=AS:8
a=control:TrackID=0
a=rtpmap:101 VORBIS/44100/2
a=fmtp:101 baseURL="http://foo.com/ogg-info/"
a=chain-info: 0:0; 1:0; 2:1;
a=ident-info: url="ident-441k", MD5=987234BC8D92DFE2987234BC8D92DFE2;url="ident-8k", MD5=2186461716517578792145688D92DFE2; url="ident-11k", MD5=218646687642f4AEFD2145688D92DFE2;
a=codebook-info: url="codebook-lowBW", MD5=309573098520975ABEFC34768D92DFE2;url="/codebook-speech", MD5=4567319735186778271C34768D92DFE2;

Basically this says:
- Chain ID 0 uses Ident ID 0 and CodebookID 0.
- Chain ID 1 uses Ident ID 1 and CodebookID 0.
- Chain ID 2 uses Ident ID 2 and CodebookID 1.

- Ident 0 can be retrieved at http://foo.com/ogg-info/ident-441k and has an
  MD5 hash of 987234BC8D92DFE2987234BC8D92DFE2
- Ident 1 can be retrieved at http://foo.com/ogg-info/ident-8k and has an 
  MD5 hash of 2186461716517578792145688D92DFE2
- Ident 2 can be retrieved at http://foo.com/ogg-info/ident-11k and has an 
  MD5 hash of 218646687642f4AEFD2145688D92DFE2

- Codebook 0 can be retrieved at http://foo.com/ogg-info/codebook-lowBW and
  has an MD5 hash of 309573098520975ABEFC34768D92DFE2
- Codebook 1 can be retrieved at http://foo.com/codebook-speech and has an
  MD5 hash of 4567319735186778271C34768D92DFE2
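
As a rough illustration of the "IDs implied by order" idea, here is a minimal
Python sketch that parses the three attribute values from the SDP above. The
function names are hypothetical, and it assumes URLs never contain ',' or ';',
which the proposal does not actually promise.

```python
def parse_chain_info(value):
    """Return {chain_id: (ident_id, codebook_id)}.

    The chain ID is not written in the SDP; it is the position of the
    entry in the list.
    """
    entries = [e.strip() for e in value.split(";") if e.strip()]
    chains = {}
    for chain_id, entry in enumerate(entries):
        ident_id, codebook_id = (int(x) for x in entry.split(":"))
        chains[chain_id] = (ident_id, codebook_id)
    return chains


def parse_info_list(value):
    """Return [(url, md5), ...] for an ident-info or codebook-info value.

    The list index is the implied ident ID or codebook ID.
    ASSUMPTION: URLs contain no ',' or ';' characters.
    """
    items = []
    for entry in value.split(";"):
        entry = entry.strip()
        if not entry:
            continue
        fields = {}
        for part in entry.split(","):
            key, _, val = part.strip().partition("=")
            fields[key.strip().lower()] = val.strip().strip('"')
        # Hex digests compare case-insensitively, so normalise to upper case.
        items.append((fields["url"], fields["md5"].upper()))
    return items


chains = parse_chain_info(" 0:0; 1:0; 2:1;")
codebooks = parse_info_list(
    'url="codebook-lowBW", MD5=309573098520975ABEFC34768D92DFE2;'
    'url="/codebook-speech", MD5=4567319735186778271C34768D92DFE2;'
)
ident_id, codebook_id = chains[2]
print(ident_id, codebooks[codebook_id][0])
```

The last lines look up chain 2 and find that it uses ident 2 and the
"/codebook-speech" codebook, matching the list above.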

Note that the url values in ident-info and codebook-info are relative URLs.
Each one should be resolved against the base URL using the normal relative URL
resolution rules specified in the URI RFC. A url value can also be an absolute
URL if you wish, but that just takes more SDP space.
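
To show how the resolution works out for the URLs in the example, here is a
small Python sketch using the standard library's urljoin. The resolve() helper
and the absolute example.com URL are made up for illustration.

```python
from urllib.parse import urljoin


def resolve(base_url, url):
    """Resolve a url value from ident-info/codebook-info against baseURL."""
    return urljoin(base_url, url)


base = "http://foo.com/ogg-info/"
print(resolve(base, "ident-441k"))        # relative to the base directory
print(resolve(base, "/codebook-speech"))  # leading '/' replaces the whole path
print(resolve(base, "http://example.com/ident-441k"))  # hypothetical absolute URL
```

This is why Codebook 1 above ends up at http://foo.com/codebook-speech rather
than under /ogg-info/.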

For now I'm still sticking by my original proposal for dealing with the case
where the chains are not known at SDP generation time. I'd slightly change the
format of the chain info to address Ralph's idea of using different hash 
functions.

-- Case where chains are NOT known at SDP generation time --
v=0
o=- 1105605563 1105605563 IN IP4 207.188.30.165
s=<No title>
i=<No author> .2000
c=IN IP4 0.0.0.0
t=0 0
a=control:*
a=range:npt=0-202.297000
m=audio 0 RTP/AVP 101
b=AS:8
a=control:TrackID=0
a=rtpmap:101 VORBIS/44100/2
a=fmtp:101 chainIDBaseURL="http://foo.com/chainIDs/"

http://foo.com/chainIDs/0 would contain something like

+----------------------+
|    Ident hash 4cc    |
+----------------------+
|   Ident hash length  |
+----------------------+
|   Codebook hash 4cc  |
+----------------------+
| Codebook hash length |
+----------------------+
|       Ident Hash     |
+----------------------+
|     Codebook Hash    |
+----------------------+
|      Ident URL       | <- Null terminated string
+----------------------+
|     Codebook URL     | <- Null terminated string
+----------------------+

The 4cc's indicate which hash is being used. The lengths allow a client to
skip over hash codes it doesn't understand to get to the URLs. This allows
an older client to still be able to get the ident and codebook even if it 
doesn't understand a newer hash code scheme that is being used.
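
A client-side reader for this record might look roughly like the Python sketch
below. The diagram does not say how wide the length fields are or what byte
order they use, so the sketch ASSUMES 4-byte big-endian unsigned lengths and
ASCII URLs. The function name is also hypothetical.

```python
import struct

# ASSUMPTION: 4cc = 4 bytes, each length = 4-byte big-endian unsigned int.
HEADER = ">4sI4sI"


def parse_chain_record(data):
    """Parse one chain info record into a dict of its fields."""
    ident_4cc, ident_len, cb_4cc, cb_len = struct.unpack_from(HEADER, data, 0)
    pos = struct.calcsize(HEADER)

    # The lengths let us step over the hashes even if the 4cc is unknown.
    ident_hash = data[pos:pos + ident_len]
    pos += ident_len
    cb_hash = data[pos:pos + cb_len]
    pos += cb_len

    # Two null terminated strings follow the hashes.
    ident_end = data.index(b"\0", pos)
    ident_url = data[pos:ident_end].decode("ascii")
    cb_start = ident_end + 1
    cb_end = data.index(b"\0", cb_start)
    cb_url = data[cb_start:cb_end].decode("ascii")

    return {
        "ident_4cc": ident_4cc,
        "ident_hash": ident_hash,
        "ident_url": ident_url,
        "codebook_4cc": cb_4cc,
        "codebook_hash": cb_hash,
        "codebook_url": cb_url,
    }
```

A client that does not recognise a 4cc can ignore the corresponding hash bytes
and still use ident_url and codebook_url, which is the point of carrying the
lengths.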

You can also do a hybrid of the two SDPs if you happen to know some of the
idents and codebooks at SDP generation time.

More comments inline.

On Mon, Apr 11, 2005 at 10:05:54AM -0700, Ralph Giles wrote:
> On Mon, Apr 11, 2005 at 09:06:47AM -0700, Aaron Colwell wrote:
> 
> > I don't really care that much about the size. My only concern is that 
> > theoretically an ogg file can have 2^32-1 chains in it. I know that no one is
> > likely to do this, but they could. I think a 32 bit field in every packet would
> > be a waste of bits as would any number of bits if chaining wasn't even used.
> 
> True. I think supporting 32 bits of chain segments is outside our 
> requirements list though. The arguments I find reasonable for RTP
> chaining support are framerate changes in theora and your suggestion 
> of realserver-style bandwidth adjustment. Being able to support
> more general chained Ogg files is more of a nice side effect; and
> therefore I'm not worried about technically covering the same domain.
> 
> > Ideally I'd like to rearrange the flag bits a little bit so that there could be
> > a bit that indicates whether a chainID is present in the packet. The chainID
> > field could then be variable length ala UTF-8 style or perhaps a simple 
> > encoding like MSB being set means that there is another byte for the chainID.
> > The reason I like this option is that it is kind of a "pay as you go" strategy.
> > As you add more chains to the stream, you pay more and more for the chainID
> > field. Yes it is a little more complex, but it allows you to accommodate any
> > valid ogg file, prevents you from wasting bits when chaining isn't even used,
> > and provides incremental overhead when chaining is used.
> 
> Ok. I'd be happy with this to select between 1,2,3,4 byte signatures, 
> but I still prefer my 16+8 proposal to maintain alignment. We can either 
> steal a bit from the packet count field, or (my preference) have a 
> minimum 1 byte ident field, with the high bit indicating continuation, 
> as you suggest.

ok. The MSb method is my preference too.
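
For concreteness, here is one way the MSb continuation scheme could be coded
up in Python. Nothing above fixes the order of the 7-bit groups, so this
sketch ASSUMES most-significant group first. The function names are
hypothetical.

```python
def encode_chain_id(chain_id):
    """Encode a chain ID as 7-bit groups, MSb set on all but the last byte.

    ASSUMPTION: groups are sent most-significant first.
    """
    if chain_id < 0:
        raise ValueError("chain ID must be non-negative")
    groups = [chain_id & 0x7F]
    chain_id >>= 7
    while chain_id:
        groups.append(chain_id & 0x7F)
        chain_id >>= 7
    groups.reverse()
    # Set the continuation bit on every byte except the final one.
    return bytes([g | 0x80 for g in groups[:-1]] + [groups[-1]])


def decode_chain_id(data, offset=0):
    """Return (chain_id, offset of the first byte after the field)."""
    value = 0
    while True:
        byte = data[offset]
        offset += 1
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:  # MSb clear: this was the last byte
            return value, offset


for n in (0, 127, 128, 2**32 - 1):
    print(n, encode_chain_id(n).hex())
```

Under this assumption chain IDs below 128 cost one byte, and the full 2^32-1
range needs five bytes rather than four, since each byte only carries 7 bits.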

> 
> > I think absolute URLs should be used for the base URL and relative URLs
> > should be used for the ident-info. Then you can just use the relative URL
> > resolution rules to figure out what the URL for each ident info. That's 
> > basically how SETUP urls are generated for RTSP. That is where I got the idea
> > from.
> 
> Ok.
> 
> > I thought about just pointing to an ogg stream, but that prevents the client
> > from just grabbing the ident or codebook. It is possible for the client to
> > already have one of these 2 pieces from an earlier chain or from an earlier 
> > file. I just wanted to have a scheme where the client is able to only get what
> > it needs.
> > 
> > > [multiplexed header urls]
> > 
> > It would, but if the client already has everything but the Theora codebook then
> > it has to waste bits pulling down the Vorbis ident, Vorbis codebook, and Theora
> > ident header. Having to pull down all the pieces reduces the savings of the
> > client's cache because if you have a cache miss on a codebook or ident you
> > still have to download stuff you might already have.
> 
> In both cases, the latency of additional queries will cost as much as 
> the extra data transfer for broadband users. Perhaps we should do some
> measurements and see what the actual usage patterns would be with 
> current streams?

True. I do think some measurements would probably be useful. The main use of
chaining I can think of right now is the Virgin radio case. In this case there
are multiple chains, but the codebooks and idents don't change, so you use
fewer bits by not retrieving the codebooks every time. You would just grab the
chain info and notice that the ident and codebook haven't changed, so the
client knows it doesn't have to download them again. That is the main case
that I know is out in the wild right now.
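
The client-side check could be as simple as the Python sketch below: a cache
keyed by hash, consulted before any header is fetched. The dict layout and
function name are assumptions for illustration, not part of the proposal.

```python
def headers_to_fetch(chain, cache):
    """Return [(kind, url), ...] for the headers the client still lacks.

    chain: dict with ident_hash, ident_url, codebook_hash, codebook_url
           (hypothetical layout, e.g. built from the chain info).
    cache: dict mapping hash -> previously downloaded header bytes.
    """
    missing = []
    for kind in ("ident", "codebook"):
        if chain[kind + "_hash"] not in cache:
            missing.append((kind, chain[kind + "_url"]))
    return missing


cache = {}
chain0 = {
    "ident_hash": "987234BC8D92DFE2987234BC8D92DFE2",
    "ident_url": "http://foo.com/ogg-info/ident-441k",
    "codebook_hash": "309573098520975ABEFC34768D92DFE2",
    "codebook_url": "http://foo.com/ogg-info/codebook-lowBW",
}
print(headers_to_fetch(chain0, cache))  # first chain: both are missing

# Pretend both headers were downloaded (payloads elided here).
cache[chain0["ident_hash"]] = b"..."
cache[chain0["codebook_hash"]] = b"..."

chain1 = dict(chain0)  # next chain reuses the same ident and codebook
print(headers_to_fetch(chain1, cache))  # nothing left to download
```

In the Virgin radio style case every chain after the first hits the cache, so
the only per-chain cost is the chain info itself.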

The less trivial scenario that I can think of is ad-insertion into a stream.
In that case the ident and/or codebook could change. We'd have to create a
stream like this to figure out the latencies involved. It seems like this
is somewhat similar to a .m3u file retrieved via HTTP. You do a GET to retrieve
the .m3u and then you immediately do another GET to get the actual stream.
That usually seems to happen pretty quickly.

I'd be happy to run this experiment if I could find some time to implement all
this stuff in the Helix Server. I'm pretty busy these days, but I'll try to
carve out some time for this.

Aaron

> 
>  -r
>