[Speex-dev] CELT 0.5.0 is out

Jean-Marc Valin jean-marc.valin at usherbrooke.ca
Fri Oct 17 04:30:46 PDT 2008

>> The main reason CELT can't do like Speex (I wish it could) is that in
>> Speex, the overhead of transmitting the mode info is 5 bits for
>> narrowband and 9 bits for wideband. With 20 ms frames, that's just 250
>> and 450 bps. With CELT, there would be a bit more data needed and the
>> frame size can be as small as 2 ms, so we could end up with several kbps
>> of mode signalling. In the current code, there's no signalling at all.
>> The good thing is that after a few frames, the decoder should at least
>> realise it's decoding garbage.
> I understand, but CELT would be useless for SIP if one can't correctly
> read or guess the decoder configuration from the RTP data.

Why is that? Isn't it the whole point of SDP that you first negotiate
before sending data?
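
For reference, the signalling overhead quoted at the top of this message is
simple arithmetic: bits per frame times frames per second. A minimal sketch,
assuming Speex's 20 ms frame duration (which is what the 250 and 450 bps
figures correspond to) and a purely hypothetical 10-bit mode field for CELT,
since no actual figure is given here:

```python
def signalling_bps(mode_bits, frame_ms):
    """Bits per second spent on per-frame mode signalling."""
    return mode_bits * 1000 / frame_ms

# Speex figures from the quoted text (20 ms frames assumed).
speex_narrowband = signalling_bps(5, 20)
speex_wideband = signalling_bps(9, 20)

# Hypothetical CELT case: the 10-bit mode field is an assumption made
# only for illustration; 2 ms is the smallest frame size mentioned.
celt_small_frames = signalling_bps(10, 2)

print(speex_narrowband, speex_wideband, celt_small_frames)
```

With those assumptions the CELT case comes out at 5 kbps, which is the
"several kbps" order of magnitude.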

> One possible way to cope with this would be to define several CELT payload
> types for use in SIP signalling. This is usually not well accepted, as
> it would remove flexibility and increase both the size of the SIP
> negotiation and the risk of errors in it.

There are too many possible parameter combinations to make that viable.

> I don't think this requirement is only for SIP: any device receiving
> data would reasonably want to be sure it decodes it correctly. No
> matter the overhead.

That's a totally different issue. What CELT does is that there's always
about 1 bit that's unused when the encoder is done. CELT encodes a known
value there, but because it's encoded with the range coder, a decoder
not using the exact same mode will get something random and the check
for that known value will fail. In practice, it only takes a couple of
frames before the decoder realises that there's an error (it can't tell
whether it's an error in the transmission or a mode/version mismatch).
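
A rough way to see why "a couple of frames" is enough: if a mismatched decoder
reads an essentially random value where the known value should be, a check of
about 1 bit passes by accident half the time. The sketch below assumes exactly
1 check bit per frame and independent frames; both are simplifications, not a
description of the actual range coder behaviour.

```python
def undetected_probability(frames, check_bits=1):
    """Chance that a mismatched decoder passes the check on every frame."""
    return 0.5 ** (check_bits * frames)

def expected_frames_to_detect(check_bits=1):
    """Mean number of frames until the first failed check (geometric law)."""
    p_fail = 1.0 - 0.5 ** check_bits
    return 1.0 / p_fail

for n in (1, 2, 4, 8):
    print(n, undetected_probability(n))
print(expected_frames_to_detect())
```

Under these assumptions the first failed check arrives after 2 frames on
average, and a mismatch survives 8 frames with probability below 0.4%.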

> One approach which is very acceptable to me would be to have something
> like SPS/PPS for H.264: a flag (one bit?) marks each packet as either
> audio data or decoding information.
> The first packet sent carries the "decoding information" and the
> other packets are "real data". With RTP, you would retransmit this packet
> regularly to cope with packet loss or delayed initiation (initial packets
> are often lost at the beginning on one side of the conversation).

That's something that *could* be done. What I'm not sure about is how
complicated it would make it for clients to implement. I think there was
a similar issue with Vorbis (which *requires* a large header before
you're able to decode anything) and it got messy. But again, I don't
know the details.
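
To make the proposal concrete, here is a minimal sketch of what clients would
have to implement. Everything in it is hypothetical: the one-byte prefix (the
proposal suggests a single bit), the CONFIG_FLAG value, the config_interval
parameter and the Receiver class are illustration only and do not correspond
to any CELT or RTP payload format.

```python
CONFIG_FLAG = 0x80  # hypothetical marker: top bit of a one-byte prefix

def packetize(frames, config, config_interval=50):
    """Yield payloads, repeating the config packet every config_interval frames."""
    for i, frame in enumerate(frames):
        if i % config_interval == 0:
            yield bytes([CONFIG_FLAG]) + config
        yield bytes([0x00]) + frame

class Receiver:
    """Holds back audio until a config packet has been seen."""

    def __init__(self):
        self.config = None

    def on_packet(self, payload):
        if payload[0] & CONFIG_FLAG:
            self.config = payload[1:]
            return None
        if self.config is None:
            return None  # cannot decode yet, frame is dropped
        return (self.config, payload[1:])

packets = list(packetize([b"a", b"b", b"c"], b"CFG", config_interval=2))
rx = Receiver()
# Simulate losing the very first (config) packet.
results = [rx.on_packet(p) for p in packets[1:]]
print(results)
```

The cost is visible even in this toy version: the receiver needs state, frames
that arrive before the first config packet are lost, and the retransmission
interval trades bandwidth against start-up delay.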

> I think this approach will fit your need for keeping CELT as
> low as possible but mandatory for VoIP.

What do you mean here?


> tks,
> amsip - http://www.antisip.com
> osip2 - http://www.osip.org
> eXosip2 - http://savannah.nongnu.org/projects/exosip/
