[opus] [RFC PATCH v1 0/4] Enable aarch64 intrinsics/Ne10
Viswanath Puttagunta
viswanath.puttagunta at linaro.org
Wed Mar 18 12:49:28 PDT 2015
Hi All,
Since I continue to base my work on top of Jonathan's patch,
and my previous Ne10 fft/ifft/mdct_forward/backward patches,
I thought it would be better to just post all new patches
as a patch series. Please let me know if anyone disagrees
with this approach.
You can see the WIP branch with all the latest patches at
https://git.linaro.org/people/viswanath.puttagunta/opus.git
Branch: rfcv1_final_armv8
aarch64: Enable intrinsics for aarch64
- enables intrinsics (includes Ne10) optimizations for
aarch64.
armv7(float): Optimize encode usecase using NE10 library
- Changes from RFCv4 [1]: Fixed a compile issue caught while
enabling armv8 in celt/arm/mdct_arm.h
armv7(float): Optimize decode usecase using NE10 library
- Changes from RFCv1 [2]: The same compile issue propagated
to this patch, so it is fixed here as well
Intrinsics/RTCD related fixes. Mostly x86.
- Changes from RFCv2 [3]: Whitespace removal in celt/x86/x86cpu.h
[1]: http://lists.xiph.org/pipermail/opus/2015-March/002905.html
[2]: http://lists.xiph.org/pipermail/opus/2015-March/002919.html
[3]: http://lists.xiph.org/pipermail/opus/2015-March/002928.html
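For readers new to the intrinsics/RTCD (run-time CPU detection) idea these patches build on, here is a minimal sketch of the general pattern: a table of function pointers indexed by a detected capability level, with a portable C fallback. It is not the code from this series. All names here (ARCH_C, ARCH_NEON, DOT_IMPL, dot, dot_c, dot_neon) are hypothetical, and the NEON path is only compiled when the compiler advertises NEON support.

```c
#include <stddef.h>

/* Hypothetical capability levels, ordered from least to most capable. */
enum { ARCH_C = 0, ARCH_NEON = 1, ARCH_COUNT = 2 };
#define ARCH_MASK (ARCH_COUNT - 1)

typedef float (*dot_fn)(const float *x, const float *y, size_t n);

/* Portable reference implementation, always available. */
static float dot_c(const float *x, const float *y, size_t n)
{
    float sum = 0.0f;
    size_t i;
    for (i = 0; i < n; i++)
        sum += x[i] * y[i];
    return sum;
}

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>

/* NEON intrinsics version; the same source builds for armv7 and aarch64. */
static float dot_neon(const float *x, const float *y, size_t n)
{
    float32x4_t acc = vdupq_n_f32(0.0f);
    float32x2_t half;
    float sum;
    size_t i;
    /* Multiply-accumulate four lanes at a time. */
    for (i = 0; i + 4 <= n; i += 4)
        acc = vmlaq_f32(acc, vld1q_f32(x + i), vld1q_f32(y + i));
    /* Reduce the four lanes to a single scalar. */
    half = vadd_f32(vget_low_f32(acc), vget_high_f32(acc));
    half = vpadd_f32(half, half);
    sum = vget_lane_f32(half, 0);
    /* Handle the remaining 0..3 elements in scalar code. */
    for (; i < n; i++)
        sum += x[i] * y[i];
    return sum;
}
#else
/* Without NEON at build time, the NEON slot falls back to the C version. */
#define dot_neon dot_c
#endif

/* Dispatch table: one entry per capability level. */
static dot_fn const DOT_IMPL[ARCH_COUNT] = { dot_c, dot_neon };

/* 'arch' would come from a one-time CPU detection step (assumed here). */
float dot(const float *x, const float *y, size_t n, int arch)
{
    return DOT_IMPL[arch & ARCH_MASK](x, y, n);
}
```

A caller would obtain arch once at startup and pass it down, e.g. dot(x, y, n, arch). Masking the index keeps an unexpected arch value from reading outside the table.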
Jonathan Lennox (1):
Intrinsics/RTCD related fixes. Mostly x86.
Viswanath Puttagunta (3):
armv7(float): Optimize encode usecase using NE10 library
armv7(float): Optimize decode usecase using NE10 library
aarch64: Enable intrinsics for aarch64
Makefile.am | 72 ++++--
celt/arm/arm_celt_map.c | 71 +++++-
celt/arm/armcpu.c | 6 +-
celt/arm/celt_ne10_fft.c | 148 +++++++++++
celt/arm/celt_ne10_mdct.c | 263 ++++++++++++++++++++
celt/arm/celt_neon_intr.c | 2 +
celt/arm/fft_arm.h | 74 ++++++
celt/arm/mdct_arm.h | 60 +++++
celt/arm/pitch_arm.h | 4 +-
celt/bands.c | 6 +-
celt/celt.c | 16 +-
celt/celt.h | 12 +-
celt/celt_decoder.c | 24 +-
celt/celt_encoder.c | 20 +-
celt/celt_lpc.h | 2 +-
celt/cpu_support.h | 15 +-
celt/dump_modes/Makefile | 23 +-
celt/dump_modes/dump_modes.c | 21 ++
celt/dump_modes/dump_modes_arch.h | 41 ++++
celt/dump_modes/dump_modes_arm_ne10.c | 125 ++++++++++
celt/kiss_fft.c | 31 ++-
celt/kiss_fft.h | 67 ++++-
celt/mdct.c | 20 +-
celt/mdct.h | 61 ++++-
celt/mips/celt_mipsr1.h | 2 +-
celt/modes.c | 8 +-
celt/pitch.c | 4 +-
celt/pitch.h | 22 +-
celt/static_modes_float.h | 25 ++
celt/static_modes_float_arm_ne10.h | 404 +++++++++++++++++++++++++++++++
celt/tests/test_unit_dft.c | 55 +++--
celt/tests/test_unit_mathops.c | 21 +-
celt/tests/test_unit_mdct.c | 85 ++++---
celt/tests/test_unit_rotation.c | 21 +-
celt/x86/celt_lpc_sse.c | 4 +
celt/x86/celt_lpc_sse.h | 12 +-
celt/x86/pitch_sse.c | 334 ++++++++++---------------
celt/x86/pitch_sse.h | 256 ++++++++------------
celt/x86/pitch_sse2.c | 95 ++++++++
celt/x86/pitch_sse4_1.c | 195 +++++++++++++++
celt/x86/x86_celt_map.c | 76 +++++-
celt/x86/x86cpu.c | 47 +++-
celt/x86/x86cpu.h | 26 +-
celt_headers.mk | 3 +
celt_sources.mk | 9 +-
configure.ac | 385 ++++++++++++++++++++---------
m4/opus-intrinsics.m4 | 29 +++
silk/x86/SigProc_FIX_sse.h | 17 ++
silk/x86/main_sse.h | 48 ++++
silk/x86/x86_silk_map.c | 25 +-
src/analysis.c | 8 +-
src/analysis.h | 2 +-
src/opus_encoder.c | 2 +-
src/opus_multistream_encoder.c | 9 +-
win32/VS2010/celt.vcxproj | 17 +-
win32/VS2010/celt.vcxproj.filters | 27 +++
win32/VS2010/silk_common.vcxproj | 17 +-
win32/VS2010/silk_common.vcxproj.filters | 23 +-
win32/VS2010/silk_fixed.vcxproj | 13 +-
win32/VS2010/silk_fixed.vcxproj.filters | 17 +-
win32/config.h | 25 +-
61 files changed, 2851 insertions(+), 701 deletions(-)
create mode 100644 celt/arm/celt_ne10_fft.c
create mode 100644 celt/arm/celt_ne10_mdct.c
create mode 100644 celt/arm/fft_arm.h
create mode 100644 celt/arm/mdct_arm.h
create mode 100644 celt/dump_modes/dump_modes_arch.h
create mode 100644 celt/dump_modes/dump_modes_arm_ne10.c
create mode 100644 celt/static_modes_float_arm_ne10.h
create mode 100644 celt/x86/pitch_sse2.c
create mode 100644 celt/x86/pitch_sse4_1.c
create mode 100644 m4/opus-intrinsics.m4
--
1.9.1