[opus] Reg an issue with smoothing factor in VAD implementation

Mon Nov 27 17:10:11 UTC 2017

Sorry, long holiday weekend in America.
I can say with pretty high certainty that there is an overflow occurring
and it is flipping smooth_coef_Q16 to be negative when it probably
shouldn't be. I had originally thought it was only an issue where it was
overflowing the 15th bit but not the 16th, which might still preserve the
intended value for operations that ignore the sign bit (in cases where the
16-bit value is stored in a 32-bit int without sign-extension e.g.
0x00008b3b), but that doesn't seem to be the case here.
The problem is even worse when opus_int is defined as 16-bit in the
platform - SA_Q15 overflows to negative right here with a similar effect
speech_nrg = silk_SQRT_APPROX(speech_nrg);
SA_Q15 = silk_SMULWB(32768 + speech_nrg, SA_Q15);

I can't speak for the logic where the speech energy gets doubled. It
obviously seems intentional but I don't know why. Maybe so that smoothing
is performed at a constant rate regardless of whether framesize is 10 or
20ms?

On Sun, Nov 26, 2017 at 8:07 PM, Chandrakala Madhira <
chandrakala.madhira at soctronics.com> wrote:

> Hi,
>
> Can anyone let me know if this is a bug?
>
> Thank you,
> Chandrakala
>
> ------------------------------
> *From: *"Logan Stromberg" <loganstromberg at gmail.com>
> *To: *"Chandrakala Madhira" <chandrakala.madhira at soctronics.com>
> *Cc: *opus at xiph.org
> *Sent: *Wednesday, November 22, 2017 12:12:39 PM
>
> *Subject: *Re: [opus] Reg an issue with smoothing factor in VAD
> implementation
>
> Yes, yes, I can reproduce it now, but only on platforms that define a
> 16-bit int by default (SA_Q15 is an opus_int rather than opus_int32). What
> system are you compiling this for?
>
> On Tue, Nov 21, 2017 at 8:34 PM, Chandrakala Madhira <chandrakala.madhira@
> soctronics.com> wrote:
>
>> Hi Logan,
>>
>> Please find attached the input stream we are using testing.
>>
>> Thank you,
>> Chandrakala
>>
>> ------------------------------
>> *From: *"Chandrakala Madhira" <chandrakala.madhira at soctronics.com>
>> *To: *"Logan Stromberg" <loganstromberg at gmail.com>
>> *Cc: *opus at xiph.org
>> *Sent: *Wednesday, November 22, 2017 9:53:55 AM
>>
>> *Subject: *Re: [opus] Reg an issue with smoothing factor in VAD
>> implementation
>>
>> Hi Logan,
>>
>> Below is the configuration I am using. The remaining parameters have
>> defaults. The stream is a 16KHz stream. The bitrate selected is 16KHz. As
>> the thread have size limitations, I will try to attach the stream in next
>> mail. At 125th frame of this stream, the probability (SA_Q15) value is
>> 0x0000859c. This value is treated as a negative number in step-3.
>>
>> -e voip 16000 1 16000  -framesize 10  -cvbr Input/dg105_16k.wav
>> Output/dg105_16k_16000Fs_mono_16000bps_10ms_vbr_voip.bit
>>
>> Thank you,
>> Chandrakala
>>
>> Thank you,
>> Chandrakala
>>
>> ------------------------------
>> *From: *"Logan Stromberg" <loganstromberg at gmail.com>
>> *To: *opus at xiph.org
>> *Sent: *Tuesday, November 21, 2017 2:38:16 AM
>> *Subject: *Re: [opus] Reg an issue with smoothing factor in VAD
>> implementation
>>
>> Just for fun, I tried to reproduce such an overflow. I turned on all
>> debug macros, assertions, and checked arithmetic and then encoded 2 hours
>> of mixed speech/audio with these parameters:
>>
>> Sample rate = 48000
>> Channels = 1
>> Application = OPUS_APPLICATION_AUDIO
>> Bitrate = 24 KB/s
>> Force Mode = MODE_SILK_ONLY
>> Signal Type = OPUS_SIGNAL_AUTO
>> Complexity = 10
>> Frame size = 480 samples (10ms)
>>
>> No errors came up in encoding. Chandrakala, are these the encoding
>> parameters that you believe should trigger the error?
>>
>> - Logan
>>
>> Hi,
>>>
>>> We are looking at the VAD implementation used in opus. We are looking at
>>> the code where speech probability is calculated based on which SNR is
>>> estimated. Below is the part of the code I am talking about.
>>>
>>> /*********************************/
>>> /* Speech Probability Estimation */
>>> /*********************************/
>>> SA_Q15 = silk_sigm_Q15( silk_SMULWB( VAD_SNR_FACTOR_Q16, pSNR_dB_Q7 ) -
>>> VAD_NEGATIVE_OFFSET_Q5 ); // step1: Calculate speech probability : comment
>>> by me
>>>
>>> /* Power scaling */
>>> if( speech_nrg <= 0 ) { // step2: update speech probability based on
>>> speech energy : comment by me
>>> SA_Q15 = silk_RSHIFT( SA_Q15, 1 );
>>> } else if( speech_nrg < 32768 ) {
>>> if( psEncC->frame_length == 10 * psEncC->fs_kHz ) {
>>> speech_nrg = silk_LSHIFT_SAT32( speech_nrg, 16 ); // Energy is doubled
>>> here : comment by me
>>> } else {
>>> speech_nrg = silk_LSHIFT_SAT32( speech_nrg, 15 );
>>> }
>>>
>>> /* square-root */
>>> speech_nrg = silk_SQRT_APPROX( speech_nrg );
>>> SA_Q15 = silk_SMULWB( 32768 + speech_nrg, SA_Q15 );
>>> }
>>>
>>> /* Smoothing coefficient */
>>> smooth_coef_Q16 = silk_SMULWB( VAD_SNR_SMOOTH_COEF_Q18, silk_SMULWB(
>>> (opus_int32)SA_Q15, SA_Q15 ) ); // step3: Update the smoothing factor based
>>> on speech probability : comment by me
>>>
>>> if( psEncC->frame_length == 10 * psEncC->fs_kHz ) {
>>> smooth_coef_Q16 >>= 1;
>>> }
>>>
>>> Here, in step1, Speech probability is calculated whose value is expected
>>> to be within [0, 1) in Q15 format. Then based on the speech energy levels,
>>> in Step2, the probability is updated whose value shall also lie between [0,
>>> 1). Later in Step3, the smooth coeff is calculated. This code do not have
>>> any issue when the frame size is more than or equal to 20msec. But, if the
>>> frame size is 10ms, then in step2, the energy is doubled (this may be done
>>> because the original Silk code is for 20ms. To convert the energy for 20ms,
>>> it could have been doubled). When this is done the probability which is
>>> updated in step2 becomes more than 1. When this is used in multiplication
>>> in Step3, the value is treated as a negative number because its a 32x16
>>> multiplication. This is will result in a negative smooth coefficient.
>>> Please let me know if this is a bug.
>>>
>>>
>>> Thank you,
>>> Chandrakala
>>
>>
>> _______________________________________________
>> opus mailing list
>> opus at xiph.org
>> http://lists.xiph.org/mailman/listinfo/opus
>>
>>
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.xiph.org/pipermail/opus/attachments/20171127/cda17f94/attachment.html>