[vorbis-dev] Update on Vorbis RTP I-D
Phil Kerr
philkerr at elec.gla.ac.uk
Tue Feb 11 03:47:10 PST 2003
Hi all,
I'm in the final stages of putting a new Vorbis RTP draft together; the
new sections are below.
There are a number of open questions:
1.) At present there is a 16 bit length field for the codebooks. As
they are spec'd as being unbound in length, but typically around 15K,
are there situations where they may be greater than 64KB? The size
limit can be extended to 2MB as there are 5 bits unused.
2.) We discussed using the RTP seq ids to keep the config RTCP messages
in sync with the RTP data stream. I've changed this slightly to use
timestamp values.
3.) There are a number of different ways to deliver the codebooks. I'm
currently looking into TCP over RTP.
4.) There is now an overflow flag. This is used if the size of the
codebooks and/or the comment headers is larger than the max RTCP packet
(64k). If this flag is set clients must obtain the headers from the URI
specified in the overflow URI field or SDP value. This will add a small
amount of complexity to the client.
5.) Port numbers. Should we reserve fixed ports for servers with the
IANA? 1190/91 are unused.
6.) The MIME type is audio/vorbis
I'll have a final draft ready in the next day or so and if possible I
want to submit the update to the IETF by next Monday so there is
discussion time before their next meeting.
Feedback and comments welcomed.
Regards
Phil
--------------8<--------------------8<---------------
4 IANA Considerations
Media MIME type name: audio
Media MIME subtype name: vorbis
Required Parameters: none
Optional Parameters: none
5 Vorbis configuration headers
To decode a Vorbis stream three configuration header information
blocks are needed. This data is sent out-of-band and is defined
below as an APP defined RTCP message with the 4 octet name field
set to VORB.
On joining a session the first packet sent back to the client
MUST be a Vorbis message containing the codec setup and codebook
data.
VORB RTCP packets MUST set the padding (P) flag and add the
appropriate padding octets needed to conform with section 6.6
of [3]. Synchronising the configuration headers to the RTP stream
is critical. A 32 bit timestamp field, in sample rate units, is
carried in the packet for this purpose:
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|V=2|P| subtype |   PT=APP=204  |             Length            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           SSRC/CSRC                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                             VORB                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|               Timestamp (in sample rate units)                |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Vorbis Version                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       Audio Sample Rate                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Bitrate Maximum                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Bitrate Nominal                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Bitrate Minimum                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| bsz 0 | bsz 1 |      Num Audio Channels       |c|m|o|x|x|x|x|x|
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
|        Codebook length        |       Codebook checksum       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
..                           Codebook                           |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
|                     Vendor string length                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Vendor string                         ..
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                   User comments list length                   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
..              User comment length / User comment              |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
|                       URI string length                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
..                          URI string                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The first Vorbis config header defines the Vorbis stream
attributes. The Vorbis version MUST be set to zero to comply with
this document. The fields Sample Rate up to Num Audio Channels
are set in accordance with [6] with the bsz fields above referring
to the blocksize parameters. The framing bit is not used for RTP
transportation and so applications constructing Vorbis files MUST
take care to set this if required.
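As a rough illustration of the fixed part of the header described above, here is a minimal Python sketch. It assumes network byte order, that the three bitrate fields are signed 32 bit values (as they are in the Vorbis identification header), and that the last word packs bsz 0 and bsz 1 as 4 bits each, the channel count as 16 bits and the flags as 8 bits, as drawn in the diagram. The names pack_fixed and unpack_fixed are made up for this example.

```python
import struct

# Assumed layout of the words that follow the 4 octet "VORB" name:
# timestamp, version, sample rate (unsigned), three bitrates (signed),
# then one packed word: bsz0(4) | bsz1(4) | channels(16) | flags(8).
_FIXED = struct.Struct("!IIIiiiI")


def pack_fixed(timestamp, version, sample_rate, br_max, br_nom, br_min,
               bsz0, bsz1, channels, flags):
    """Build the fixed part of the VORB payload (hypothetical helper)."""
    packed = ((bsz0 & 0xF) << 28 | (bsz1 & 0xF) << 24
              | (channels & 0xFFFF) << 8 | (flags & 0xFF))
    return _FIXED.pack(timestamp, version, sample_rate,
                       br_max, br_nom, br_min, packed)


def unpack_fixed(data):
    """Parse the fixed part of the VORB payload back into a dict."""
    ts, ver, rate, bmax, bnom, bmin, packed = _FIXED.unpack_from(data)
    return {
        "timestamp": ts,
        "version": ver,
        "sample_rate": rate,
        "bitrate_max": bmax,
        "bitrate_nominal": bnom,
        "bitrate_min": bmin,
        "bsz0": packed >> 28,
        "bsz1": (packed >> 24) & 0xF,
        "channels": (packed >> 8) & 0xFFFF,
        "flags": packed & 0xFF,
    }
```

A sender would prepend the RTCP APP header, SSRC and the VORB name to the result; that part is left out here.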
The next 8 bits are used to indicate the presence of the two
other Vorbis stream config headers and the size overflow header.
The c flag indicates the presence of a Codebook header block, the
m flag indicates the presence of a comment metadata block. The o
flag indicates if the size of either of the c and m headers would
make the VORB packet greater than that allowed for a RTCP message.
The remaining five bits, indicated with an x, are reserved/unused
and MUST be set to 0.
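The flag octet can be decoded along these lines. This is only a sketch: it assumes c is the most significant bit of the octet, as in the diagram, and it chooses to reject packets with reserved bits set, which is stricter than the text requires of a receiver. decode_flags is a made-up name.

```python
C_FLAG = 0x80         # codebook header block present
M_FLAG = 0x40         # comment metadata block present
O_FLAG = 0x20         # headers too large, fetch them from a URI
RESERVED_MASK = 0x1F  # the five x bits, MUST be 0


def decode_flags(octet):
    """Split the flag octet into booleans (hypothetical helper)."""
    if octet & RESERVED_MASK:
        # Strict choice made for this sketch; a lenient client
        # could ignore the reserved bits instead.
        raise ValueError("reserved flag bits must be 0")
    return {
        "codebook": bool(octet & C_FLAG),
        "comments": bool(octet & M_FLAG),
        "overflow": bool(octet & O_FLAG),
    }
```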
If the c flag is set then the next header block will contain the
codebook configuration data.
Unlike other mainstream audio codecs, Vorbis has no statically
configured probability model. Instead it packs all entropy decoding
configuration, VQ and Huffman models into a self-contained codebook.
This codebook block also requires additional identification
information detailing the number of audio channels, bit rates and
other information used to initialise the Vorbis stream.
This setup information MUST be completely intact: a client cannot
decode a stream with an incomplete or corrupted codebook set.
A 16 bit codebook length field and a 16 bit 1's complement checksum
of the codebook precede the codebook datablock. The length field
allows for codebooks to be up to 64k in size. The checksum is used to
detect a corrupted codebook. If a checksum failure is detected then
a new config header file SHOULD be obtained from SDP. If no SDP
value is set and no other method for obtaining the config headers
exists then this is considered to be a failure and should be
reported to the client application.
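The text does not say exactly how the 1's complement checksum is computed. A reasonable guess is the usual Internet checksum (16 bit words summed with end-around carry, then complemented, odd-length data padded with a zero octet), and the sketch below assumes that. Both function names are made up for illustration.

```python
def ones_complement_checksum(data):
    """16 bit one's complement checksum, Internet checksum style.

    Assumption: big-endian 16 bit words, zero padding for odd lengths.
    """
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        # Fold any carry out of bit 15 back into the low bits.
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def codebook_ok(codebook, length_field, checksum_field):
    """True if the codebook matches its length and checksum fields."""
    return (len(codebook) == length_field
            and ones_complement_checksum(codebook) == checksum_field)
```

If codebook_ok returns False, the client would fall back to fetching the header file via SDP as described above.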
If the m flag is set then the next header block will contain the
comment metadata, such as artist name, track title and so on. These
metadata messages are not intended to be fully descriptive but to
offer basic track/song information. This message MUST be sent at the
start of the stream, together with the setup and codebook headers,
even if it contains no information. During a session the metadata
associated with the stream may change from that specified at the
start, e.g. a live concert broadcast changing acts/scenes, so clients
MUST have the ability to receive m header blocks. Details on the
format of the comments can be found in the Vorbis documentation [7].
The format for the data takes the form of a 32 bit codec vendor
name length field followed by the name encoded in UTF-8. The next
field denotes the number of user comments, followed by the user
comment length and text field pairs, up to the number indicated by
the user comments list length.
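A possible reading of this layout, sketched in Python. The byte order of the length fields is not stated here; the sketch assumes network byte order and exposes it as a parameter, since the Vorbis comment header itself uses little-endian lengths. It also assumes there is no padding between fields. parse_comments is a made-up name.

```python
import struct


def parse_comments(buf, offset=0, byteorder="big"):
    """Return (vendor, comments, next_offset) from a comment block.

    Assumptions: 32 bit lengths in the given byte order, UTF-8 text,
    no padding between fields.
    """
    fmt = ">I" if byteorder == "big" else "<I"

    def read_u32(pos):
        if pos + 4 > len(buf):
            raise ValueError("truncated length field")
        return struct.unpack_from(fmt, buf, pos)[0], pos + 4

    def read_text(pos):
        length, pos = read_u32(pos)
        if pos + length > len(buf):
            raise ValueError("truncated text field")
        return buf[pos:pos + length].decode("utf-8"), pos + length

    vendor, pos = read_text(offset)
    count, pos = read_u32(pos)
    comments = []
    for _ in range(count):
        text, pos = read_text(pos)
        comments.append(text)
    return vendor, comments, pos
```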
The framing bit is not used for RTP transportation and so
applications constructing Vorbis files MUST take care to set
this if required.
If the o, overflow, bit is set then the URI of a whole header
block is specified in an overflow URI field. The header file
specified at the URI MUST NOT have the overflow flag set, otherwise
a loop condition will occur. If SDP information is available then
the URI value set there MUST take precedence.
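The client-side logic for the overflow case might look roughly like this. It is a sketch only: the header is represented as a plain dict with "overflow" and "uri" keys, and fetch stands for whatever hypothetical function the client uses to retrieve and parse a header file from a URI. None of these names come from the draft.

```python
def resolve_headers(header, fetch, sdp_uri=None):
    """Return usable config headers, following the overflow URI once.

    header:  dict with "overflow" (bool) and optional "uri" (str)
    fetch:   hypothetical callable mapping a URI to a header dict
    sdp_uri: header URI from SDP, which takes precedence if present
    """
    if not header["overflow"]:
        return header
    uri = sdp_uri if sdp_uri is not None else header.get("uri")
    if not uri:
        raise ValueError("overflow flag set but no URI available")
    fetched = fetch(uri)
    if fetched["overflow"]:
        # Following it again would loop, so treat it as an error.
        raise ValueError("header file must not set the overflow flag")
    return fetched
```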
5 Session Descriptions for Vorbis RTP Streams
Session description information concerning the Vorbis stream
SHOULD be provided if possible and must be in accordance with
[8]. The contents of the Vorbis Header file referred to in the
u attribute must contain all three of the config header blocks
as specified above. Also, the overflow bit must not be set.
u=<URI of Vorbis header file>
m=audio <port> RTP/AVP 98
c=IN IP4 <URI of Vorbis stream>
a=rtpmap:98 vorbis/<sample rate>
The port value is specified by the server application bound to
the URI specified in the c attribute. The sample rate value specified
in the a attribute MUST match the Vorbis sample rate value.
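To make the sample rate rule concrete, here is a small sketch that builds the four SDP lines shown above and reads the rate back from the rtpmap line so it can be compared with the Audio Sample Rate field of the config header. The function names and the example values in the usage note are made up; payload type 98 is taken from the example above.

```python
def build_sdp(header_uri, port, address, sample_rate, payload_type=98):
    """Build the SDP fragment for a Vorbis RTP stream (sketch)."""
    lines = [
        "u=%s" % header_uri,
        "m=audio %d RTP/AVP %d" % (port, payload_type),
        "c=IN IP4 %s" % address,
        "a=rtpmap:%d vorbis/%d" % (payload_type, sample_rate),
    ]
    return "\r\n".join(lines) + "\r\n"


def rtpmap_rate(sdp):
    """Return (encoding name, rate) from the first a=rtpmap line."""
    for line in sdp.splitlines():
        if line.startswith("a=rtpmap:"):
            encoding = line.split(" ", 1)[1]
            name, rate = encoding.split("/")[:2]
            return name, int(rate)
    raise ValueError("no a=rtpmap line found")
```

A client could call rtpmap_rate on the received description and refuse the stream if the rate differs from the sample rate in the VORB header.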
6. Ogg Vorbis I spec: Codec setup and packet decode.
http://www.xiph.org/ogg/vorbis/doc/vorbis-spec-ref.html
7. Ogg Vorbis I spec: Comment field and header specification.
http://www.xiph.org/ogg/vorbis/doc/v-comment.html
8. SDP: Session Description Protocol (RFC 2327), Handley, M. and
V. Jacobson.
--- >8 ----
List archives: http://www.xiph.org/archives/
Ogg project homepage: http://www.xiph.org/ogg/
To unsubscribe from this list, send a message to 'vorbis-dev-request at xiph.org'
containing only the word 'unsubscribe' in the body. No subject is needed.
Unsubscribe messages sent to the list will be ignored/filtered.