[opus] Alleged bug in Silk codec

Marcello Caramma (mcaramma) mcaramma at cisco.com
Fri Jun 20 03:10:50 PDT 2014


Hi Jean-Marc,

well spotted! The patch provided fixes the issue for me.

Nevertheless, in my code (and I would suggest doing the same in libopus) I
am going to replace the function with one that accumulates on 64 bits and
then calculates the shift, for at least 4 reasons:
 - It is less and simpler code
 - The result is likely to be slightly more accurate in case big numbers
come early in the array.
 - I don’t like to have branching inside loops

 - Even though it seems to work on most compilers, assuming that overflow
in the accumulator will lead to a negative result calls for undefined
behaviour (NOT implementation defined): if left as is the code should at
least use an unsigned accumulator and replace the test (nrg < 0) with (nrg
> MAX_INT32).

Can you see any issue in doing so?
If you also think it might be a good idea to do the same in the libopus
repository, I can send you the suggested patch.

Cheers,

Marcello


On 18/06/2014 08:09, "Jean-Marc Valin" <jmvalin at jmvalin.ca> wrote:

>Hi Marcello,
>
>It turns out that the problem has a much simpler explanation. As far as
>I can tell, it's actually a bug in silk_sum_sqr_shift() and this trivial
>patch appears to fix the problem:
>http://jmvalin.ca/misc_stuff/sum_sqr_shift_fix.patch
>
>It would still require some testing to check that the fix doesn't have
>any bad side effect. Let me know how well the fix works for you. Again,
>thanks for investigating this.
>
>Cheers,
>
>	Jean-Marc
>
>On 13/06/14 12:28 PM, Marcello Caramma (mcaramma) wrote:
>> Hi Jean Marc,
>> 
>> please find attached the audio file (mono 16khz). I shortened it to
>>about
>> 10 seconds. I also add 2 patches that worked for me. Further info that
>> might help:
>> 
>> - The problem seems to be related to silk_burg_modified not reaching the
>> maximum gain, so the actual filter order is 16 rather than 2 (which is
>> what would be expected with a sine wave).
>> - The problem seems to happen when rshifts >= 3
>> - when pre-scaling the signal to be < 16384 the problem goes away (patch
>> scale_burg_in.diff)
>> - When calculating C0 and rshifts based on a 64 bits correlation instead
>> of using silk_sum_sqr_shift the problem also goes away (patch
>> new_C0_calc.diff)
>> 
>> I suspect that for very high prediction gain the fixed point
>> implementation becomes very sensitive to numerical error, but I am not
>>too
>> sure why the new versions work better.
>> I favour the version with the new C0 calculation, as it avoids rescaling
>> the input.
>> I also played around with a version (not attached) that prescales the
>> input by rshifts/2 - this might be considering as it simplifies the
>>code.
>> 
>> Regards,
>> 
>> Marcello
>> 
>> PS: I am using 1.1 but the same issue is present with the latest code
>>well.
>> 
>> 
>> On 13/06/2014 06:05, "Jean-Marc Valin" <jmvalin at jmvalin.ca> wrote:
>> 
>>> Hi Marcello,
>>>
>>> Thanks for the report. It's hard to debug this without the actual file.
>>> Can you please post the sweep_in.raw file you used?
>>>
>>> Cheers,
>>>
>>> 	Jean-Marc
>>>
>>> On 11/06/14 04:46 AM, Marcello Caramma (mcaramma) wrote:
>>>> Hi,
>>>>
>>>> Apologies if this is a known issues, but I have found what I believe
>>>>is
>>>> a bug in the fixed point implementation of the Silk codec and could
>>>>not
>>>> find any mention on this in the archives.
>>>>
>>>> The bug can be easily reproduced with the fixed point demo program
>>>> (./configure ‹enable-fixed-point ‹disable-float-api && make) using the
>>>> following command:
>>>>
>>>> ./opus_demo voip 16000 1 23000 sweep_in.raw sweep_out.raw
>>>>
>>>> Where sweep_in.raw is a 30 seconds full scale frequency sweep from 0
>>>>to
>>>> 8kHz sampled at 16kHz.
>>>>
>>>> The first 6 seconds of audio after transcoding sound Ok. After that
>>>> artefacts are introduced all the way to the end of the file.
>>>> The floating point version does not have the issue (even though the
>>>> quality is subjectively worse roughly from the same point).
>>>>
>>>> I believe I narrowed down the problem to the file burg_modified_FIX.c
>>>>­
>>>> if I make sure the input signal is scaled down to 14 bits before
>>>> processing the coefficients of the predictor are calculated correctly
>>>> and no artefact is introduced.
>>>>
>>>> Is anyone experiencing the same problem or has a proper fix for this?
>>>>(I
>>>> can work around the bug with input scaling for now).
>>>>
>>>> Thanks and best regards,
>>>>
>>>> Marcello Caramma
>>>>
>>>>
>>>> _______________________________________________
>>>> opus mailing list
>>>> opus at xiph.org
>>>> http://lists.xiph.org/mailman/listinfo/opus
>>>>
>> 



More information about the opus mailing list