[opus] [RFC PATCH v1 0/4] Enable aarch64 intrinsics/Ne10

Viswanath Puttagunta viswanath.puttagunta at linaro.org
Tue Mar 24 14:16:28 PDT 2015


Hi All,

While developing ARMv8 fixed-point intrinsics for celt_pitch_xcorr,
I realized that I have caused some regressions (compile failures) in
the fixed-point build. I will fix these and repost.

To prevent this in the future, I will add compiling and testing the
fixed-point code to my local test plan before posting new patches.

Regards,
Vish

On 18 March 2015 at 14:49, Viswanath Puttagunta
<viswanath.puttagunta at linaro.org> wrote:
>
> Hi All,
>
> Since I continue to base my work on top of Jonathan's patch,
> and my previous Ne10 fft/ifft/mdct_forward/backward patches,
> I thought it would be better to just post all new patches
> as a patch series. Please let me know if anyone disagrees
> with this approach.
>
> You can see the WIP branch with all the latest patches at
> https://git.linaro.org/people/viswanath.puttagunta/opus.git
> Branch: rfcv1_final_armv8
>
> aarch64: Enable intrinsics for aarch64
> - enables intrinsics (includes Ne10) optimizations for
> aarch64.
>
> armv7(float): Optimize encode usecase using NE10 library
> - Changes from RFCv4 [1]: Caught compile issue while
>   enabling armv8 in celt/arm/mdct_arm.h
>
> armv7(float): Optimize decode usecase using NE10 library
> - Changes from RFCv1 [2]: Same compile issue propagated
>   so fixed it here as well
>
> Intrinsics/RTCD related fixes. Mostly x86.
> - Changes from RFCv2 [3]: Whitespace removal in celt/x86/x86cpu.h
>
> [1]: http://lists.xiph.org/pipermail/opus/2015-March/002905.html
> [2]: http://lists.xiph.org/pipermail/opus/2015-March/002919.html
> [3]: http://lists.xiph.org/pipermail/opus/2015-March/002928.html
>
> Jonathan Lennox (1):
>   Intrinsics/RTCD related fixes. Mostly x86.
>
> Viswanath Puttagunta (3):
>   armv7(float): Optimize encode usecase using NE10 library
>   armv7(float): Optimize decode usecase using NE10 library
>   aarch64: Enable intrinsics for aarch64
>
>  Makefile.am                              |  72 ++++--
>  celt/arm/arm_celt_map.c                  |  71 +++++-
>  celt/arm/armcpu.c                        |   6 +-
>  celt/arm/celt_ne10_fft.c                 | 148 +++++++++++
>  celt/arm/celt_ne10_mdct.c                | 263 ++++++++++++++++++++
>  celt/arm/celt_neon_intr.c                |   2 +
>  celt/arm/fft_arm.h                       |  74 ++++++
>  celt/arm/mdct_arm.h                      |  60 +++++
>  celt/arm/pitch_arm.h                     |   4 +-
>  celt/bands.c                             |   6 +-
>  celt/celt.c                              |  16 +-
>  celt/celt.h                              |  12 +-
>  celt/celt_decoder.c                      |  24 +-
>  celt/celt_encoder.c                      |  20 +-
>  celt/celt_lpc.h                          |   2 +-
>  celt/cpu_support.h                       |  15 +-
>  celt/dump_modes/Makefile                 |  23 +-
>  celt/dump_modes/dump_modes.c             |  21 ++
>  celt/dump_modes/dump_modes_arch.h        |  41 ++++
>  celt/dump_modes/dump_modes_arm_ne10.c    | 125 ++++++++++
>  celt/kiss_fft.c                          |  31 ++-
>  celt/kiss_fft.h                          |  67 ++++-
>  celt/mdct.c                              |  20 +-
>  celt/mdct.h                              |  61 ++++-
>  celt/mips/celt_mipsr1.h                  |   2 +-
>  celt/modes.c                             |   8 +-
>  celt/pitch.c                             |   4 +-
>  celt/pitch.h                             |  22 +-
>  celt/static_modes_float.h                |  25 ++
>  celt/static_modes_float_arm_ne10.h       | 404 +++++++++++++++++++++++++++++++
>  celt/tests/test_unit_dft.c               |  55 +++--
>  celt/tests/test_unit_mathops.c           |  21 +-
>  celt/tests/test_unit_mdct.c              |  85 ++++---
>  celt/tests/test_unit_rotation.c          |  21 +-
>  celt/x86/celt_lpc_sse.c                  |   4 +
>  celt/x86/celt_lpc_sse.h                  |  12 +-
>  celt/x86/pitch_sse.c                     | 334 ++++++++++---------------
>  celt/x86/pitch_sse.h                     | 256 ++++++++------------
>  celt/x86/pitch_sse2.c                    |  95 ++++++++
>  celt/x86/pitch_sse4_1.c                  | 195 +++++++++++++++
>  celt/x86/x86_celt_map.c                  |  76 +++++-
>  celt/x86/x86cpu.c                        |  47 +++-
>  celt/x86/x86cpu.h                        |  26 +-
>  celt_headers.mk                          |   3 +
>  celt_sources.mk                          |   9 +-
>  configure.ac                             | 385 ++++++++++++++++++++---------
>  m4/opus-intrinsics.m4                    |  29 +++
>  silk/x86/SigProc_FIX_sse.h               |  17 ++
>  silk/x86/main_sse.h                      |  48 ++++
>  silk/x86/x86_silk_map.c                  |  25 +-
>  src/analysis.c                           |   8 +-
>  src/analysis.h                           |   2 +-
>  src/opus_encoder.c                       |   2 +-
>  src/opus_multistream_encoder.c           |   9 +-
>  win32/VS2010/celt.vcxproj                |  17 +-
>  win32/VS2010/celt.vcxproj.filters        |  27 +++
>  win32/VS2010/silk_common.vcxproj         |  17 +-
>  win32/VS2010/silk_common.vcxproj.filters |  23 +-
>  win32/VS2010/silk_fixed.vcxproj          |  13 +-
>  win32/VS2010/silk_fixed.vcxproj.filters  |  17 +-
>  win32/config.h                           |  25 +-
>  61 files changed, 2851 insertions(+), 701 deletions(-)
>  create mode 100644 celt/arm/celt_ne10_fft.c
>  create mode 100644 celt/arm/celt_ne10_mdct.c
>  create mode 100644 celt/arm/fft_arm.h
>  create mode 100644 celt/arm/mdct_arm.h
>  create mode 100644 celt/dump_modes/dump_modes_arch.h
>  create mode 100644 celt/dump_modes/dump_modes_arm_ne10.c
>  create mode 100644 celt/static_modes_float_arm_ne10.h
>  create mode 100644 celt/x86/pitch_sse2.c
>  create mode 100644 celt/x86/pitch_sse4_1.c
>  create mode 100644 m4/opus-intrinsics.m4
>
> --
> 1.9.1
>

