[opus] Reg an issue with smoothing factor in VAD implementation

Jean-Marc Valin jmvalin at jmvalin.ca
Fri Feb 16 07:42:55 UTC 2018


Hi Chandrakala, Logan,

Can you confirm that the attached patch fixes the overflow problem?

Koen, can you confirm the fix makes sense?

Cheers,

	Jean-Marc

On 11/27/2017 12:10 PM, Logan Stromberg wrote:
> Sorry, long holiday weekend in America.
> I can say with pretty high certainty that there is an overflow occurring
> and it is flipping smooth_coef_Q16 to be negative when it probably
> shouldn't be. I had originally thought it was only an issue where it was
> overflowing the 15th bit but not the 16th, which might still preserve
> the intended value for operations that ignore the sign bit (in cases
> where the 16-bit value is stored in a 32-bit int without sign-extension
> e.g. 0x00008b3b), but that doesn't seem to be the case here.
> The problem is even worse when opus_int is defined as 16-bit in the
> platform - SA_Q15 overflows to negative right here with a similar effect
> speech_nrg = silk_SQRT_APPROX(speech_nrg);
> SA_Q15 = silk_SMULWB(32768 + speech_nrg, SA_Q15);
> 
> I can't speak for the logic where the speech energy gets doubled. It
> obviously seems intentional but I don't know why. Maybe so that
> smoothing is performed at a constant rate regardless of whether
> framesize is 10 or 20ms?
> 
> On Sun, Nov 26, 2017 at 8:07 PM, Chandrakala Madhira
> <chandrakala.madhira at soctronics.com
> <mailto:chandrakala.madhira at soctronics.com>> wrote:
> 
>     Hi,
> 
>     Can anyone let me know if this is a bug?
> 
>     Thank you,
>     Chandrakala
> 
>     ------------------------------------------------------------------------
>     *From: *"Logan Stromberg" <loganstromberg at gmail.com
>     <mailto:loganstromberg at gmail.com>>
>     *To: *"Chandrakala Madhira" <chandrakala.madhira at soctronics.com
>     <mailto:chandrakala.madhira at soctronics.com>>
>     *Cc: *opus at xiph.org <mailto:opus at xiph.org>
>     *Sent: *Wednesday, November 22, 2017 12:12:39 PM
> 
>     *Subject: *Re: [opus] Reg an issue with smoothing factor in VAD
>     implementation
> 
>     Yes, yes, I can reproduce it now, but only on platforms that define
>     a 16-bit int by default (SA_Q15 is an opus_int rather than
>     opus_int32). What system are you compiling this for?
> 
>     On Tue, Nov 21, 2017 at 8:34 PM, Chandrakala Madhira
>     <chandrakala.madhira at soctronics.com
>     <mailto:chandrakala.madhira at soctronics.com>> wrote:
> 
>         Hi Logan,
> 
>         Please find attached the input stream we are using testing.
> 
>         Thank you,
>         Chandrakala
> 
>         ------------------------------------------------------------------------
>         *From: *"Chandrakala Madhira"
>         <chandrakala.madhira at soctronics.com
>         <mailto:chandrakala.madhira at soctronics.com>>
>         *To: *"Logan Stromberg" <loganstromberg at gmail.com
>         <mailto:loganstromberg at gmail.com>>
>         *Cc: *opus at xiph.org <mailto:opus at xiph.org>
>         *Sent: *Wednesday, November 22, 2017 9:53:55 AM
> 
>         *Subject: *Re: [opus] Reg an issue with smoothing factor in VAD
>         implementation
> 
>         Hi Logan,
> 
>         Below is the configuration I am using. The remaining parameters
>         have defaults. The stream is a 16KHz stream. The bitrate
>         selected is 16KHz. As the thread have size limitations, I will
>         try to attach the stream in next mail. At 125th frame of this
>         stream, the probability (SA_Q15) value is 0x0000859c. This value
>         is treated as a negative number in step-3.
> 
>         -e voip 16000 1 16000  -framesize 10  -cvbr Input/dg105_16k.wav
>         Output/dg105_16k_16000Fs_mono_16000bps_10ms_vbr_voip.bit
> 
>         Thank you,
>         Chandrakala
> 
>         Thank you,
>         Chandrakala
> 
>         ------------------------------------------------------------------------
>         *From: *"Logan Stromberg" <loganstromberg at gmail.com
>         <mailto:loganstromberg at gmail.com>>
>         *To: *opus at xiph.org <mailto:opus at xiph.org>
>         *Sent: *Tuesday, November 21, 2017 2:38:16 AM
>         *Subject: *Re: [opus] Reg an issue with smoothing factor in VAD
>         implementation
> 
>         Just for fun, I tried to reproduce such an overflow. I turned on
>         all debug macros, assertions, and checked arithmetic and then
>         encoded 2 hours of mixed speech/audio with these parameters:
> 
>         Sample rate = 48000
>         Channels = 1
>         Application = OPUS_APPLICATION_AUDIO
>         Bitrate = 24 KB/s
>         Force Mode = MODE_SILK_ONLY
>         Signal Type = OPUS_SIGNAL_AUTO
>         Complexity = 10
>         Frame size = 480 samples (10ms)
> 
>         No errors came up in encoding. Chandrakala, are these the
>         encoding parameters that you believe should trigger the error?
> 
>         - Logan
> 
>             Hi,
> 
>             We are looking at the VAD implementation used in opus. We
>             are looking at the code where speech probability is
>             calculated based on which SNR is estimated. Below is the
>             part of the code I am talking about.
> 
>             /*********************************/
>             /* Speech Probability Estimation */
>             /*********************************/
>             SA_Q15 = silk_sigm_Q15( silk_SMULWB( VAD_SNR_FACTOR_Q16,
>             pSNR_dB_Q7 ) - VAD_NEGATIVE_OFFSET_Q5 ); // step1: Calculate
>             speech probability : comment by me
> 
>             /* Power scaling */
>             if( speech_nrg <= 0 ) { // step2: update speech probability
>             based on speech energy : comment by me
>             SA_Q15 = silk_RSHIFT( SA_Q15, 1 );
>             } else if( speech_nrg < 32768 ) {
>             if( psEncC->frame_length == 10 * psEncC->fs_kHz ) {
>             speech_nrg = silk_LSHIFT_SAT32( speech_nrg, 16 ); // Energy
>             is doubled here : comment by me
>             } else {
>             speech_nrg = silk_LSHIFT_SAT32( speech_nrg, 15 );
>             }
> 
>             /* square-root */
>             speech_nrg = silk_SQRT_APPROX( speech_nrg );
>             SA_Q15 = silk_SMULWB( 32768 + speech_nrg, SA_Q15 );
>             }
> 
>             /* Smoothing coefficient */
>             smooth_coef_Q16 = silk_SMULWB( VAD_SNR_SMOOTH_COEF_Q18,
>             silk_SMULWB( (opus_int32)SA_Q15, SA_Q15 ) ); // step3:
>             Update the smoothing factor based on speech probability :
>             comment by me
> 
>             if( psEncC->frame_length == 10 * psEncC->fs_kHz ) {
>             smooth_coef_Q16 >>= 1;
>             }
> 
>             Here, in step1, Speech probability is calculated whose value
>             is expected to be within [0, 1) in Q15 format. Then based on
>             the speech energy levels, in Step2, the probability is
>             updated whose value shall also lie between [0, 1). Later in
>             Step3, the smooth coeff is calculated. This code do not have
>             any issue when the frame size is more than or equal to
>             20msec. But, if the frame size is 10ms, then in step2, the
>             energy is doubled (this may be done because the original
>             Silk code is for 20ms. To convert the energy for 20ms, it
>             could have been doubled). When this is done the probability
>             which is updated in step2 becomes more than 1. When this is
>             used in multiplication in Step3, the value is treated as a
>             negative number because its a 32x16 multiplication. This is
>             will result in a negative smooth coefficient. Please let me
>             know if this is a bug.
> 
> 
>             Thank you,
>             Chandrakala
> 
> 
>         _______________________________________________
>         opus mailing list
>         opus at xiph.org <mailto:opus at xiph.org>
>         http://lists.xiph.org/mailman/listinfo/opus
>         <http://lists.xiph.org/mailman/listinfo/opus>
> 
> 
> 
> 
> 
> 
> 
> _______________________________________________
> opus mailing list
> opus at xiph.org
> http://lists.xiph.org/mailman/listinfo/opus
> 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: vad_overflow_fix.patch
Type: text/x-patch
Size: 953 bytes
Desc: not available
URL: <http://lists.xiph.org/pipermail/opus/attachments/20180216/29299d0d/attachment.bin>


More information about the opus mailing list