[opus] Reg an issue with smoothing factor in VAD implementation
Jean-Marc Valin
jmvalin at jmvalin.ca
Fri Feb 16 07:42:55 UTC 2018
Hi Chandrakala, Logan,
Can you confirm that the attached patch fixes the overflow problem?
Koen, can you confirm the fix makes sense?
Cheers,
Jean-Marc
On 11/27/2017 12:10 PM, Logan Stromberg wrote:
> Sorry, long holiday weekend in America.
> I can say with pretty high certainty that there is an overflow occurring
> and it is flipping smooth_coef_Q16 to be negative when it probably
> shouldn't be. I had originally thought it was only an issue where it was
> overflowing the 15th bit but not the 16th, which might still preserve
> the intended value for operations that ignore the sign bit (in cases
> where the 16-bit value is stored in a 32-bit int without sign-extension
> e.g. 0x00008b3b), but that doesn't seem to be the case here.
> The problem is even worse when opus_int is defined as 16-bit in the
> platform - SA_Q15 overflows to negative right here with a similar effect
> speech_nrg = silk_SQRT_APPROX(speech_nrg);
> SA_Q15 = silk_SMULWB(32768 + speech_nrg, SA_Q15);
>
> I can't speak for the logic where the speech energy gets doubled. It
> obviously seems intentional but I don't know why. Maybe so that
> smoothing is performed at a constant rate regardless of whether
> framesize is 10 or 20ms?
>
> On Sun, Nov 26, 2017 at 8:07 PM, Chandrakala Madhira
> <chandrakala.madhira at soctronics.com
> <mailto:chandrakala.madhira at soctronics.com>> wrote:
>
> Hi,
>
> Can anyone let me know if this is a bug?
>
> Thank you,
> Chandrakala
>
> ------------------------------------------------------------------------
> *From: *"Logan Stromberg" <loganstromberg at gmail.com
> <mailto:loganstromberg at gmail.com>>
> *To: *"Chandrakala Madhira" <chandrakala.madhira at soctronics.com
> <mailto:chandrakala.madhira at soctronics.com>>
> *Cc: *opus at xiph.org <mailto:opus at xiph.org>
> *Sent: *Wednesday, November 22, 2017 12:12:39 PM
>
> *Subject: *Re: [opus] Reg an issue with smoothing factor in VAD
> implementation
>
> Yes, yes, I can reproduce it now, but only on platforms that define
> a 16-bit int by default (SA_Q15 is an opus_int rather than
> opus_int32). What system are you compiling this for?
>
> On Tue, Nov 21, 2017 at 8:34 PM, Chandrakala Madhira
> <chandrakala.madhira at soctronics.com
> <mailto:chandrakala.madhira at soctronics.com>> wrote:
>
> Hi Logan,
>
> Please find attached the input stream we are using testing.
>
> Thank you,
> Chandrakala
>
> ------------------------------------------------------------------------
> *From: *"Chandrakala Madhira"
> <chandrakala.madhira at soctronics.com
> <mailto:chandrakala.madhira at soctronics.com>>
> *To: *"Logan Stromberg" <loganstromberg at gmail.com
> <mailto:loganstromberg at gmail.com>>
> *Cc: *opus at xiph.org <mailto:opus at xiph.org>
> *Sent: *Wednesday, November 22, 2017 9:53:55 AM
>
> *Subject: *Re: [opus] Reg an issue with smoothing factor in VAD
> implementation
>
> Hi Logan,
>
> Below is the configuration I am using. The remaining parameters
> have defaults. The stream is a 16KHz stream. The bitrate
> selected is 16KHz. As the thread have size limitations, I will
> try to attach the stream in next mail. At 125th frame of this
> stream, the probability (SA_Q15) value is 0x0000859c. This value
> is treated as a negative number in step-3.
>
> -e voip 16000 1 16000 -framesize 10 -cvbr Input/dg105_16k.wav
> Output/dg105_16k_16000Fs_mono_16000bps_10ms_vbr_voip.bit
>
> Thank you,
> Chandrakala
>
> Thank you,
> Chandrakala
>
> ------------------------------------------------------------------------
> *From: *"Logan Stromberg" <loganstromberg at gmail.com
> <mailto:loganstromberg at gmail.com>>
> *To: *opus at xiph.org <mailto:opus at xiph.org>
> *Sent: *Tuesday, November 21, 2017 2:38:16 AM
> *Subject: *Re: [opus] Reg an issue with smoothing factor in VAD
> implementation
>
> Just for fun, I tried to reproduce such an overflow. I turned on
> all debug macros, assertions, and checked arithmetic and then
> encoded 2 hours of mixed speech/audio with these parameters:
>
> Sample rate = 48000
> Channels = 1
> Application = OPUS_APPLICATION_AUDIO
> Bitrate = 24 KB/s
> Force Mode = MODE_SILK_ONLY
> Signal Type = OPUS_SIGNAL_AUTO
> Complexity = 10
> Frame size = 480 samples (10ms)
>
> No errors came up in encoding. Chandrakala, are these the
> encoding parameters that you believe should trigger the error?
>
> - Logan
>
> Hi,
>
> We are looking at the VAD implementation used in opus. We
> are looking at the code where speech probability is
> calculated based on which SNR is estimated. Below is the
> part of the code I am talking about.
>
> /*********************************/
> /* Speech Probability Estimation */
> /*********************************/
> SA_Q15 = silk_sigm_Q15( silk_SMULWB( VAD_SNR_FACTOR_Q16,
> pSNR_dB_Q7 ) - VAD_NEGATIVE_OFFSET_Q5 ); // step1: Calculate
> speech probability : comment by me
>
> /* Power scaling */
> if( speech_nrg <= 0 ) { // step2: update speech probability
> based on speech energy : comment by me
> SA_Q15 = silk_RSHIFT( SA_Q15, 1 );
> } else if( speech_nrg < 32768 ) {
> if( psEncC->frame_length == 10 * psEncC->fs_kHz ) {
> speech_nrg = silk_LSHIFT_SAT32( speech_nrg, 16 ); // Energy
> is doubled here : comment by me
> } else {
> speech_nrg = silk_LSHIFT_SAT32( speech_nrg, 15 );
> }
>
> /* square-root */
> speech_nrg = silk_SQRT_APPROX( speech_nrg );
> SA_Q15 = silk_SMULWB( 32768 + speech_nrg, SA_Q15 );
> }
>
> /* Smoothing coefficient */
> smooth_coef_Q16 = silk_SMULWB( VAD_SNR_SMOOTH_COEF_Q18,
> silk_SMULWB( (opus_int32)SA_Q15, SA_Q15 ) ); // step3:
> Update the smoothing factor based on speech probability :
> comment by me
>
> if( psEncC->frame_length == 10 * psEncC->fs_kHz ) {
> smooth_coef_Q16 >>= 1;
> }
>
> Here, in step1, Speech probability is calculated whose value
> is expected to be within [0, 1) in Q15 format. Then based on
> the speech energy levels, in Step2, the probability is
> updated whose value shall also lie between [0, 1). Later in
> Step3, the smooth coeff is calculated. This code do not have
> any issue when the frame size is more than or equal to
> 20msec. But, if the frame size is 10ms, then in step2, the
> energy is doubled (this may be done because the original
> Silk code is for 20ms. To convert the energy for 20ms, it
> could have been doubled). When this is done the probability
> which is updated in step2 becomes more than 1. When this is
> used in multiplication in Step3, the value is treated as a
> negative number because its a 32x16 multiplication. This is
> will result in a negative smooth coefficient. Please let me
> know if this is a bug.
>
>
> Thank you,
> Chandrakala
>
>
> _______________________________________________
> opus mailing list
> opus at xiph.org <mailto:opus at xiph.org>
> http://lists.xiph.org/mailman/listinfo/opus
> <http://lists.xiph.org/mailman/listinfo/opus>
>
>
>
>
>
>
>
> _______________________________________________
> opus mailing list
> opus at xiph.org
> http://lists.xiph.org/mailman/listinfo/opus
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: vad_overflow_fix.patch
Type: text/x-patch
Size: 953 bytes
Desc: not available
URL: <http://lists.xiph.org/pipermail/opus/attachments/20180216/29299d0d/attachment.bin>
More information about the opus
mailing list