[opus] Antw: [EXT] Re: [PATCH] Make CELT FFT twiddle complex type aligned

Ulrich Windl Ulrich.Windl at rz.uni-regensburg.de
Mon Oct 24 07:27:36 UTC 2022


>>> Zheng Lv <lvzheng at google.com> wrote on 24.10.2022 at 04:31 in message
<CACNzCb422LVKKUzRSYJnDH09uaVgOABB1w6z48zYUXMPQvQAmg at mail.gmail.com>:
> Hi Team,
> 
> Can anyone take a look? We are already using this optimization in
> production.
> 
> Best regards,
> Zheng
> 
> On Thu, 15 Sept 2022 at 19:04, Zheng Lv <lvzheng at google.com> wrote:
> 
>> This makes kiss_twiddle_cpx 4-byte aligned (instead of 2-byte) for
>> fixed-point builds. Tested with an armv6j+nofp development board, CELT
>> encoding becomes 1.4x as fast, and decoding over 2x.
>>
>> Performance gain is mostly attributed to the proper alignment of the
>> static const array mdct_twiddles960.
>>
>> Co-authored-by: David Gao <davidgao at google.com>
>> ---
>>  celt/kiss_fft.h | 12 +++++++++++-
>>  1 file changed, 11 insertions(+), 1 deletion(-)
>>
>> diff --git a/celt/kiss_fft.h b/celt/kiss_fft.h
>> index bffa2bfa..267f72f9 100644
>> --- a/celt/kiss_fft.h
>> +++ b/celt/kiss_fft.h
>> @@ -49,31 +49,41 @@ extern "C" {
>>  #ifdef FIXED_POINT
>>  #include "arch.h"
>>
>>  #  define kiss_fft_scalar opus_int32
>>  #  define kiss_twiddle_scalar opus_int16
>>
>> +/* Some 32-bit CPUs would load/store a kiss_twiddle_cpx with a single memory
>> + * access, and could benefit from additional alignment.
>> + */
>> +#  define KISS_TWIDDLE_CPX_ALIGNMENT (sizeof(opus_int32))
>>
>>  #else
>>  # ifndef kiss_fft_scalar
>>  /*  default is float */
>>  #   define kiss_fft_scalar float
>>  #   define kiss_twiddle_scalar float
>>  #   define KF_SUFFIX _celt_single
>>  # endif
>>  #endif
>>
>> +#if defined(__GNUC__) && defined(KISS_TWIDDLE_CPX_ALIGNMENT)
>> +#define KISS_TWIDDLE_CPX_ALIGNED __attribute__((aligned(KISS_TWIDDLE_CPX_ALIGNMENT)))
>> +#else
>> +#define KISS_TWIDDLE_CPX_ALIGNED
>> +#endif
>> +
>>  typedef struct {
>>      kiss_fft_scalar r;
>>      kiss_fft_scalar i;
>>  }kiss_fft_cpx;
>>
>>  typedef struct {
>>     kiss_twiddle_scalar r;
>>     kiss_twiddle_scalar i;
>> -}kiss_twiddle_cpx;
>> +} KISS_TWIDDLE_CPX_ALIGNED kiss_twiddle_cpx;

I'm only surprised that you align the typedef rather than the actual data!
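
To make the two options concrete, here is a minimal sketch, not taken from the patch. All demo_* names are hypothetical stand-ins for the Opus types, and it assumes a GCC-compatible compiler. With such compilers, an aligned attribute on a struct type applies to every object of that type, array elements included, whereas an attribute on a variable affects only that one object.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for opus_int16 / kiss_twiddle_cpx; not the Opus headers. */
#if defined(__GNUC__)
#  define DEMO_ALIGNED __attribute__((aligned(sizeof(int32_t))))
#else
#  define DEMO_ALIGNED /* no-op on compilers without the GNU attribute */
#endif

/* Natural alignment: that of int16_t, typically 2 bytes. */
typedef struct { int16_t r; int16_t i; } demo_plain_cpx;

/* Option 1 (the approach in the patch): raise the alignment of the type. */
typedef struct { int16_t r; int16_t i; } DEMO_ALIGNED demo_aligned_cpx;

/* Every object of the aligned type inherits the alignment. */
static const demo_aligned_cpx demo_table_by_type[3] = {{1, 0}, {0, 1}, {-1, 0}};

/* Option 2: leave the type alone and align one particular array. */
static const demo_plain_cpx demo_table_by_var[3] DEMO_ALIGNED =
    {{1, 0}, {0, 1}, {-1, 0}};

/* Returns nonzero if p is a multiple of alignment. */
static int demo_is_aligned(const void *p, size_t alignment) {
    return ((uintptr_t)p % alignment) == 0;
}

/* Checks that both options yield 4-byte-aligned tables (GCC-compatible only). */
static void demo_check_alignment(void) {
#if defined(__GNUC__)
    assert(demo_is_aligned(demo_table_by_type, sizeof(int32_t)));
    assert(demo_is_aligned(demo_table_by_var, sizeof(int32_t)));
#endif
}
```

Aligning the type covers every table declared with it without touching each definition. Aligning individual variables keeps the type's ABI unchanged but has to be repeated for each array.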


>>
>>  #define MAXFACTORS 8
>>  /* e.g. an fft of length 128 has 4 factors
>>   as far as kissfft is concerned
>>   4*4*4*2
>>   */
>> --
>> 2.37.2.789.g6183377224-goog
>>
>>





