[Speex-dev] Packing multiple frames in a RTP packet

Jean-Marc Valin jean-marc.valin at usherbrooke.ca
Thu Dec 10 03:52:52 PST 2009


You cannot concatenate bytes because Speex frames don't necessarily end
on octet boundaries. You need to call the encoder multiple times on the
same SpeexBits bitpacket.
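
To illustrate the difference, here is a minimal sketch in Python. It is a toy model, not the libspeex API: frames are represented as strings of '0'/'1' characters, and the helper names (pad_to_octet, to_bytes, pack_bitwise, pack_bytewise) are made up for this example. The 43-bit frame size is taken from the post below. In libspeex terms, the bit-level variant would roughly correspond to calling speex_encode_int() several times on one SpeexBits before a single speex_bits_write().

```python
FRAME_BITS = 43  # bits per 20 ms frame at 2150 bps (figure from the post)


def pad_to_octet(bits: str) -> str:
    """Append a 0 followed by 1s until the length is a multiple of 8."""
    remainder = len(bits) % 8
    if remainder == 0:
        return bits  # already aligned, no padding needed
    pad_len = 8 - remainder
    return bits + "0" + "1" * (pad_len - 1)


def to_bytes(bits: str) -> bytes:
    """Convert an octet-aligned bit string (MSB first) to bytes."""
    assert len(bits) % 8 == 0
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


def pack_bitwise(frames) -> bytes:
    """Concatenate frames at the bit level, then pad once at the end."""
    return to_bytes(pad_to_octet("".join(frames)))


def pack_bytewise(frames) -> bytes:
    """Pad every frame to whole octets first, then concatenate the bytes."""
    return b"".join(to_bytes(pad_to_octet(f)) for f in frames)


# Ten dummy 43-bit frames (all zeros; the content does not matter here).
frames = ["0" * FRAME_BITS for _ in range(10)]
packed = pack_bitwise(frames)
naive = pack_bytewise(frames)
print(len(packed), len(naive))  # 54 60
```

With ten frames, bit-level packing yields 430 bits plus a 2-bit pad (54 bytes), while byte-wise concatenation yields 60 bytes with a 5-bit pad embedded after every frame.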

	Jean-Marc

Manish Jalan wrote:
> Hello,
> 
> _*Background:*_
> RFC 5574 specifies the RTP payload format for the Speex codec. The
> payload formation is straightforward; the encoded frames are to be
> concatenated one after another. Once we have appended the desired number
> of frames, we have to pad the stream with a 0 followed by 1s (e.g. 01111)
> to ensure that the payload ends on an octet boundary.
> 
> _*Observation:*_
> I am using the Speex encoder at 2150 bps (by setting the quality to 0).
> For a frame of 20 ms ~ 160 samples (considering 8000 samples per second
> as the sampling rate), the encoder is giving me encoded output of 6 bytes.
> As a test case, I encoded some 10 frames one after another each time
> getting 6 bytes of encoded output. I concatenated each of the 6 byte
> encoded outputs.
> 
> As suggested in a couple of posts, I tried to decode this stream of
> encoded voice by calling the decoder repeatedly until the bits-remaining
> API returned a value less than 1.
> 
> What I observed was this sequence: First time the decoder returned
> successful decode; second time it returned end of stream; third time it
> returned successful decode; fourth time it returned end of stream; ...
> 
> That is: decode success, EoS, decode success, EoS, decode success, EoS, ....
> 
> _*Hypothesis:*_
> Based on the above observation, what might be happening is:
> For a frame of 20 ms (=> 50 frames in a second), the encoder (running at
> 2150 bps) computes 43 bits of encoded stream. Since it has to return
> whole bytes, it pads with the sequence 01111 to give a 48-bit output.
> Now while decoding, 43 bits are decoded first; then the 01111 sequence is
> interpreted as end of stream; then the next 43 bits are decoded and 01111
> is interpreted as end of stream, and so on.
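
The hypothesis above can be checked against a toy model. The sketch below is not the Speex decoder. It assumes that each 43-bit frame begins with a 1-bit wideband flag and a 4-bit mode ID, that mode ID 15 (1111) marks the terminator, and that the remaining 38 bits are opaque payload; all function names are hypothetical. Under those assumptions, byte-wise concatenation reproduces the alternating pattern, while bit-level packing gives one end-of-stream at the very end.

```python
HEADER_BITS = 5    # assumed: 1 wideband bit + 4-bit mode ID
PAYLOAD_BITS = 38  # assumed: 43-bit frame minus the 5 header bits
TERMINATOR = 15    # assumed: mode ID 1111 signals end of stream


def make_frame() -> str:
    """Build a dummy frame: wideband bit 0, mode ID 0001, zero payload."""
    return "0" + "0001" + "0" * PAYLOAD_BITS


def pad_to_octet(bits: str) -> str:
    """Append a 0 followed by 1s until the length is a multiple of 8."""
    remainder = len(bits) % 8
    if remainder == 0:
        return bits
    return bits + "0" + "1" * (8 - remainder - 1)


def decode_all(stream: str):
    """Toy decode loop: read a header, then skip the payload or report EOS."""
    pos, results = 0, []
    while len(stream) - pos >= HEADER_BITS:
        # Skip the wideband bit, read the 4-bit mode ID.
        mode = int(stream[pos + 1:pos + HEADER_BITS], 2)
        pos += HEADER_BITS
        if mode == TERMINATOR:
            results.append("EOS")
        else:
            pos += PAYLOAD_BITS  # payload is not interpreted in this model
            results.append("OK")
    return results


frames = [make_frame() for _ in range(3)]

# Each frame padded to 48 bits, then concatenated (what the post did).
bytewise = "".join(pad_to_octet(f) for f in frames)
# Frames concatenated at the bit level, padded once at the end.
bitwise = pad_to_octet("".join(frames))

print(decode_all(bytewise))  # ['OK', 'EOS', 'OK', 'EOS', 'OK', 'EOS']
print(decode_all(bitwise))   # ['OK', 'OK', 'OK', 'EOS']
```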
> 
> _*Query:*_
> For Speex, when packing multiple encoded frames into one RTP packet,
> should we
> a. pack each encoded frame in full bytes as received from the encoder
> (i.e. 48 bits),
>    or
> b. strip the end-of-stream marker (0 followed by 1s) from each frame
> (i.e. keep strictly 43 bits) and use the 0-followed-by-1s sequence
> only to pad the payload to an octet boundary?
> 
> _*Reason for the query:*_
> I want to implement an RTP packetization that is interoperable. If the
> receiver is not in my control, it should still be able to decode the
> stream that I am sending.
> 
> 
> Regards,
> Manish S. Jalan
> 
> 

