Hello,<br><br><u><b>Background:</b></u><br>RFC 5574 specifies the RTP payload format for the Speex codec. The payload formation is straightforward: the encoded frames are concatenated one after another. Once the desired number of frames has been appended, the payload is padded with a 0 followed by 1s (e.g. 01111) so that it ends on an octet boundary.<br>
<br><u><b>Observation:</b></u><br>I am using the Speex encoder at 2150 bps (by setting the quality to 0).<br>For a 20 ms frame (160 samples at a sampling rate of 8000 samples per second), the encoder gives me 6 bytes of encoded output.<br>
As a test case, I encoded 10 frames one after another, getting 6 bytes of encoded output each time, and concatenated the 6-byte outputs.<br><br>As suggested in a couple of posts, I tried to decode this concatenated stream by calling the decoder repeatedly until the bits-remaining API returned a value less than 1.<br>
<br>What I observed was this sequence: the first call returned a successful decode; the second returned end of stream; the third returned a successful decode; the fourth returned end of stream; ...<br><br>
That is: decode success, EoS, decode success, EoS, decode success, EoS, ...<br><br><u><b>Hypothesis:</b></u><br>Based on the above observation, what might be happening is:<br>For a 20 ms frame (i.e. 50 frames per second), the encoder (running at 2150 bps) produces 2150 / 50 = 43 bits of encoded data. Since it has to return whole bytes, it appends the sequence 01111 to give a 48-bit output.<br>
While decoding, the first 43 bits are decoded as a frame; then the 01111 sequence is interpreted as end of stream; then the next 43 bits are decoded and the following 01111 is again interpreted as end of stream, and so on.<br><br><u><b>Query:</b></u><br><span style="color: rgb(255, 0, 0);">For Speex, when packing multiple encoded frames into one RTP packet, should we</span><br style="color: rgb(255, 0, 0);">
<span style="color: rgb(255, 0, 0);">a. pack the encoded frame in full bytes as received from the encoder (i.e. 48 bits)</span><br style="color: rgb(255, 0, 0);"><span style="color: rgb(255, 0, 0);"> or</span><br style="color: rgb(255, 0, 0);">
<span style="color: rgb(255, 0, 0);">b. strip the end-of-stream marker (0 followed by 1s) from each frame, pack strictly 43 bits per frame, and use the 0-followed-by-1s sequence only to pad the payload to an octet boundary?</span><br>
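<br>To make option (b) concrete, here is a minimal sketch of what I have in mind. It is not libspeex code: <code>BitWriter</code>, <code>bw_put_bits</code>, <code>bw_pad_to_octet</code> and <code>frame_bits</code> are hypothetical helper names of my own, and I am assuming that bits are packed MSB-first and that only the first 43 bits of each 6-byte encoder output carry frame data.<br>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of libspeex): an MSB-first bit writer. */
typedef struct {
    uint8_t *buf;   /* output buffer, zeroed by bw_init */
    size_t   cap;   /* capacity in bytes */
    size_t   nbits; /* number of bits written so far */
} BitWriter;

static void bw_init(BitWriter *w, uint8_t *buf, size_t cap) {
    memset(buf, 0, cap);
    w->buf = buf;
    w->cap = cap;
    w->nbits = 0;
}

/* Append a single bit (0 or 1). */
static void bw_put_bit(BitWriter *w, int bit) {
    assert(w->nbits < w->cap * 8);
    if (bit)
        w->buf[w->nbits / 8] |= (uint8_t)(0x80u >> (w->nbits % 8));
    w->nbits++;
}

/* Append the first nbits bits of src (MSB-first); trailing bits are ignored. */
static void bw_put_bits(BitWriter *w, const uint8_t *src, size_t nbits) {
    for (size_t i = 0; i < nbits; i++)
        bw_put_bit(w, (src[i / 8] >> (7 - i % 8)) & 1);
}

/* Pad to an octet boundary with a 0 followed by 1s.
 * Returns the payload length in bytes. */
static size_t bw_pad_to_octet(BitWriter *w) {
    if (w->nbits % 8 != 0) {
        bw_put_bit(w, 0);
        while (w->nbits % 8 != 0)
            bw_put_bit(w, 1);
    }
    return w->nbits / 8;
}

/* Bits per frame for a given bitrate and frame duration. */
static size_t frame_bits(size_t bitrate_bps, size_t frame_ms) {
    return bitrate_bps * frame_ms / 1000;
}
```

With this scheme, 10 frames at 2150 bps would occupy 430 bits, padded with "01" to 432 bits (54 bytes), whereas option (a) gives 10 x 6 = 60 bytes.<br>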
<br><u><b>Reason for the query:</b></u><br>I want to implement RTP packetization that is interoperable. Even if the receiver is not under my control, it should still be able to decode the stream that I am sending.<br><br><br>
Regards,<br>Manish S. Jalan<br>