[Speex-dev] Questions relating RTP packetisation
shreddingwork at gmail.com
Sun Sep 30 03:26:54 PDT 2012
I am working on implementing RFC 5574 (RTP Payload Format for the Speex
Codec) in FFmpeg and have a question about it.
It would be nice if somebody could answer it.
* Section 4.1.1, "Registration of Media Type Audio/Speex", under
"Optional parameters", lists these optional SDP parameters:
vbr: variable bit-rate - either 'on', 'off', or 'vad' (defaults
to 'off'). If 'on', variable bit-rate is enabled. If 'off',
disabled. If set to 'vad', then constant bit-rate is used, but
silence will be encoded with special short frames to indicate a
lack of voice for that period. This parameter is a preference
to the encoder.
cng: comfort noise generation - either 'on' or 'off' (defaults to
'off'). If 'off', then silence frames will be silent; if 'on',
then those frames will be filled with comfort noise. This
parameter is a preference to the encoder.
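As a rough illustration of the two parameters quoted above, here is a minimal sketch in C that reads them from an SDP format-parameter string such as "vbr=vad;cng=on". The type and function names (speex_sdp_prefs, parse_speex_fmtp) are made up for this example and are not FFmpeg or libspeex APIs. The only behaviour taken from the RFC text is the set of allowed values and the default of 'off' for both parameters.

```c
#include <string.h>

/* The three values the RFC text allows for "vbr". */
enum vbr_pref { PREF_VBR_OFF, PREF_VBR_ON, PREF_VBR_VAD };

/* Hypothetical container for the two preferences (not an FFmpeg type). */
struct speex_sdp_prefs {
    enum vbr_pref vbr;
    int cng; /* 0 = off, 1 = on */
};

/* Parse a string such as "vbr=vad;cng=on".
 * Unknown keys or values are ignored, so anything that is not
 * mentioned keeps the default of 'off'. Input longer than the
 * local buffer is truncated. */
static struct speex_sdp_prefs parse_speex_fmtp(const char *fmtp)
{
    struct speex_sdp_prefs p = { PREF_VBR_OFF, 0 };
    char buf[256];
    char *tok;

    strncpy(buf, fmtp, sizeof(buf) - 1);
    buf[sizeof(buf) - 1] = '\0';

    /* Parameters are separated by ';', optionally followed by spaces. */
    for (tok = strtok(buf, "; "); tok; tok = strtok(NULL, "; ")) {
        if (!strcmp(tok, "vbr=on"))
            p.vbr = PREF_VBR_ON;
        else if (!strcmp(tok, "vbr=vad"))
            p.vbr = PREF_VBR_VAD;
        else if (!strcmp(tok, "vbr=off"))
            p.vbr = PREF_VBR_OFF;
        else if (!strcmp(tok, "cng=on"))
            p.cng = 1;
        else if (!strcmp(tok, "cng=off"))
            p.cng = 0;
    }
    return p;
}
```

With this sketch, parse_speex_fmtp("vbr=vad;cng=on") yields vbr = PREF_VBR_VAD and cng = 1, while an empty string yields both defaults. It only records the preferences; what the encoder should do with each combination is a separate matter.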
And the Speex documentation says:
When enabled, voice activity detection detects whether the audio being
encoded is speech or silence/background noise. VAD is always implicitly
activated when encoding in VBR, so the option is only useful in non-VBR
operation. In this case, Speex detects non-speech periods and encode
them with just enough bits to reproduce the background noise. This is
called ``comfort noise generation'' (CNG).
So I am a little lost: the Speex documentation says that VAD in CBR mode
uses CNG during silence periods, but the RFC treats the two as separate parameters.
My question is: what behaviour is expected in the following cases?
* vbr=vad + cng=off
* vbr=off + cng=on
P.S. I'm aware of DTX, and I think it would make sense if the "cng" parameter actually
controlled the DTX function (the description in the RFC looks quite like it).
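To make the reading in the P.S. concrete, the sketch below maps the two SDP values onto three encoder switches. This is only one possible interpretation and not something the RFC or the Speex documentation confirms. The struct and function names are hypothetical. In libspeex the switches would presumably be applied with speex_encoder_ctl() and the SPEEX_SET_VBR, SPEEX_SET_VAD and SPEEX_SET_DTX requests, but that call is left out so the example stays self-contained.

```c
#include <string.h>

/* Hypothetical set of encoder switches derived from the SDP preferences. */
struct enc_switches {
    int vbr; /* variable bit-rate */
    int vad; /* voice activity detection in non-VBR operation */
    int dtx; /* discontinuous transmission */
};

/* Map the "vbr" and "cng" values under the ASSUMPTION that "cng"
 * really selects DTX. Unknown values are treated as 'off'. */
static struct enc_switches map_prefs(const char *vbr, const char *cng)
{
    struct enc_switches s = { 0, 0, 0 };

    if (!strcmp(vbr, "on"))
        s.vbr = 1; /* VAD is implicit in VBR per the Speex documentation */
    else if (!strcmp(vbr, "vad"))
        s.vad = 1; /* constant bit-rate with short silence frames */

    if (!strcmp(cng, "on"))
        s.dtx = 1; /* assumption taken from the P.S., not from the RFC */

    return s;
}
```

Under this mapping, vbr=vad + cng=off gives VAD without DTX, and vbr=off + cng=on gives DTX with neither VBR nor VAD. Whether the second combination is meaningful to the encoder is exactly the open question.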