[Speex-dev] run time assembler patch for altivec, sse + bug fixes
Tom Harper
tharper at sightspeed.com
Fri Dec 2 10:53:18 PST 2005
Hi Folks,
Attached is a patch against the latest svn, plus new source files.
This patch lets you select the C or assembler versions of various
functions at run time, provided _USE_SSE or _USE_ALTIVEC is defined
at compile time.
The basic concept is to use function pointers and preprocessor trickery
to allow run-time selection without changing how the other platforms work,
especially the platform function overrides. I also included two small fixes
to svn, as well as the project file changes needed to get things working.
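
To illustrate the idea for readers without the patch at hand, here is a minimal sketch of the function-pointer-plus-macro approach. It is not taken from the attached patch: the names inner_prod_fn, inner_prod_ptr, select_inner_prod, inner_prod_sse and the ASM_FLAG_* values are all hypothetical, and only the _USE_SSE guard comes from the description above.

```c
#include <assert.h>
#include <stddef.h>

/* Signature shared by every implementation of the routine (hypothetical). */
typedef float (*inner_prod_fn)(const float *a, const float *b, int len);

/* Portable C fallback, always compiled in. */
static float inner_prod_c(const float *a, const float *b, int len)
{
   float sum = 0.0f;
   int i;
   for (i = 0; i < len; i++)
      sum += a[i] * b[i];
   return sum;
}

#ifdef _USE_SSE
/* Assumed to be provided by an SSE source file when built with _USE_SSE. */
float inner_prod_sse(const float *a, const float *b, int len);
#endif

/* Starts at the C version, so callers work before any selection is made. */
static inner_prod_fn inner_prod_ptr = inner_prod_c;

/* Hypothetical flag values chosen by the application at run time. */
enum { ASM_FLAG_NONE = 0, ASM_FLAG_SSE = 1 };

/* Select an implementation; returns the flag actually in effect. */
int select_inner_prod(int flag)
{
   int in_effect = ASM_FLAG_NONE;
   inner_prod_ptr = inner_prod_c;
#ifdef _USE_SSE
   if (flag == ASM_FLAG_SSE)
   {
      inner_prod_ptr = inner_prod_sse;
      in_effect = ASM_FLAG_SSE;
   }
#endif
   (void)flag;                       /* unused when _USE_SSE is not defined */
   assert(inner_prod_ptr != NULL);
   return in_effect;
}

/* Callers keep using the old name; the macro routes through the pointer. */
#define inner_prod(a, b, len) (inner_prod_ptr((a), (b), (len)))
```

In a build without _USE_SSE the pointer can only ever hold the C version, so platforms that rely on compile-time overrides would be unaffected by a scheme like this.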
If anyone has any feedback, that would be great. I have tested this on
Windows using VC2003 and on Darwin using Xcode. I mainly want to make
sure I didn't break any of the ARM/Blackfin stuff, as I don't have the setup(s)
to test that.
Thanks!
Tom
-------------- next part --------------
A non-text attachment was scrubbed...
Name: filters_altivec.c
Type: application/octet-stream
Size: 4970 bytes
Desc: not available
Url : http://lists.xiph.org/pipermail/speex-dev/attachments/20051202/46c05f55/filters_altivec-0002.obj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cb_search_sse.c
Type: application/octet-stream
Size: 3214 bytes
Desc: not available
Url : http://lists.xiph.org/pipermail/speex-dev/attachments/20051202/46c05f55/cb_search_sse-0001.obj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: vq_sse.c
Type: application/octet-stream
Size: 4006 bytes
Desc: not available
Url : http://lists.xiph.org/pipermail/speex-dev/attachments/20051202/46c05f55/vq_sse-0001.obj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: filters_sse.c
Type: application/octet-stream
Size: 9943 bytes
Desc: not available
Url : http://lists.xiph.org/pipermail/speex-dev/attachments/20051202/46c05f55/filters_sse-0001.obj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ltp_sse.c
Type: application/octet-stream
Size: 3211 bytes
Desc: not available
Url : http://lists.xiph.org/pipermail/speex-dev/attachments/20051202/46c05f55/ltp_sse-0001.obj
-------------- next part --------------
/* Copyright (C) 2002 Jean-Marc Valin */
/**
@file ltp_altivec.c
@brief Long-Term Prediction functions (altivec version)
*/
/*
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:
- Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
- Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.
- Neither the name of the Xiph.org Foundation nor the names of its
contributors may be used to endorse or promote products derived from
this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE FOUNDATION OR
CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/
#ifdef _USE_ALTIVEC
#include "ltp_altivec.h"
spx_word32_t inner_prod_altivec(const float *a, const float *b, int len)
{
   int i;
   float sum;
   /* vec_ld ignores the low 4 address bits, so 16-byte alignment
      decides which load strategy each input needs. */
   int a_aligned = (((unsigned long)a) & 15) ? 0 : 1;
   int b_aligned = (((unsigned long)b) & 15) ? 0 : 1;
   __vector float MSQa, LSQa, MSQb, LSQb;
   __vector unsigned char maska, maskb;
   __vector float vec_a, vec_b;
   __vector float vec_result;

   /* Zero the four-lane accumulator. */
   vec_result = (__vector float)vec_splat_u8(0);

   /* Every loop below consumes 8 floats per iteration, so the result
      is only correct when len is a multiple of 8. */
   if ((!a_aligned) && (!b_aligned))
   {
      /* This (unfortunately) is the common case. */
      /* Unaligned load: two aligned loads merged by a permute mask. */
      maska = vec_lvsl(0, a);
      maskb = vec_lvsl(0, b);
      MSQa = vec_ld(0, a);
      MSQb = vec_ld(0, b);
      for (i = 0; i < len; i+=8)
      {
         a += 4;
         LSQa = vec_ld(0, a);
         vec_a = vec_perm(MSQa, LSQa, maska);
         b += 4;
         LSQb = vec_ld(0, b);
         vec_b = vec_perm(MSQb, LSQb, maskb);
         vec_result = vec_madd(vec_a, vec_b, vec_result);
         /* Second half: reuse the last load as the new leading quadword. */
         a += 4;
         MSQa = vec_ld(0, a);
         vec_a = vec_perm(LSQa, MSQa, maska);
         b += 4;
         MSQb = vec_ld(0, b);
         vec_b = vec_perm(LSQb, MSQb, maskb);
         vec_result = vec_madd(vec_a, vec_b, vec_result);
      }
   }
   else if (a_aligned && b_aligned)
   {
      /* Both inputs aligned: plain vector loads. */
      for (i = 0; i < len; i+=8)
      {
         vec_a = vec_ld(0, a);
         vec_b = vec_ld(0, b);
         vec_result = vec_madd(vec_a, vec_b, vec_result);
         a += 4;
         b += 4;
         vec_a = vec_ld(0, a);
         vec_b = vec_ld(0, b);
         vec_result = vec_madd(vec_a, vec_b, vec_result);
         a += 4;
         b += 4;
      }
   }
   else if (a_aligned)
   {
      /* Only b needs the load-and-permute treatment. */
      maskb = vec_lvsl(0, b);
      MSQb = vec_ld(0, b);
      for (i = 0; i < len; i+=8)
      {
         vec_a = vec_ld(0, a);
         a += 4;
         b += 4;
         LSQb = vec_ld(0, b);
         vec_b = vec_perm(MSQb, LSQb, maskb);
         vec_result = vec_madd(vec_a, vec_b, vec_result);
         vec_a = vec_ld(0, a);
         a += 4;
         b += 4;
         MSQb = vec_ld(0, b);
         vec_b = vec_perm(LSQb, MSQb, maskb);
         vec_result = vec_madd(vec_a, vec_b, vec_result);
      }
   }
   else if (b_aligned)
   {
      /* Only a needs the load-and-permute treatment. */
      maska = vec_lvsl(0, a);
      MSQa = vec_ld(0, a);
      for (i = 0; i < len; i+=8)
      {
         a += 4;
         LSQa = vec_ld(0, a);
         vec_a = vec_perm(MSQa, LSQa, maska);
         vec_b = vec_ld(0, b);
         b += 4;
         vec_result = vec_madd(vec_a, vec_b, vec_result);
         a += 4;
         MSQa = vec_ld(0, a);
         vec_a = vec_perm(LSQa, MSQa, maska);
         vec_b = vec_ld(0, b);
         b += 4;
         vec_result = vec_madd(vec_a, vec_b, vec_result);
      }
   }

   /* Horizontal add: after these two steps every lane holds the total. */
   vec_result = vec_add(vec_result, vec_sld(vec_result, vec_result, 8));
   vec_result = vec_add(vec_result, vec_sld(vec_result, vec_result, 4));
   /* vec_ste stores a single lane; any lane will do since all are equal. */
   vec_ste(vec_result, 0, &sum);
   return sum;
}
#endif
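
For checking the routine above on a machine without AltiVec, a scalar model can be handy. The following is a sketch only, not part of the posted sources: the name inner_prod_ref is hypothetical. It keeps four running lane sums and combines them in the same order as the two vec_sld/vec_add steps. Because vec_madd is a fused multiply-add, the vector result may still differ from this model in the last bits.

```c
#include <assert.h>

/* Scalar model of a four-lane inner product (hypothetical helper). */
float inner_prod_ref(const float *a, const float *b, int len)
{
   float lane[4] = {0.0f, 0.0f, 0.0f, 0.0f};
   int i, j;

   /* The vector loops consume 8 floats per iteration. */
   assert(len >= 0 && len % 8 == 0);

   for (i = 0; i < len; i += 4)
      for (j = 0; j < 4; j++)
         lane[j] += a[i + j] * b[i + j];   /* models one vec_madd per 4 floats */

   /* Shift by 8 bytes and add pairs lanes 0+2 and 1+3;
      shift by 4 bytes and add then sums the two pairs. */
   return (lane[0] + lane[2]) + (lane[1] + lane[3]);
}
```

Comparing inner_prod_altivec against this model with a small tolerance, for each of the four alignment combinations of a and b, would exercise every branch of the vector code.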
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ltp_altivec.h
Type: application/octet-stream
Size: 1786 bytes
Desc: not available
Url : http://lists.xiph.org/pipermail/speex-dev/attachments/20051202/46c05f55/ltp_altivec-0001.obj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: filters_altivec.h
Type: application/octet-stream
Size: 1865 bytes
Desc: not available
Url : http://lists.xiph.org/pipermail/speex-dev/attachments/20051202/46c05f55/filters_altivec-0003.obj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: asm_flag_patch_12_2_05.patch
Type: application/octet-stream
Size: 58437 bytes
Desc: not available
Url : http://lists.xiph.org/pipermail/speex-dev/attachments/20051202/46c05f55/asm_flag_patch_12_2_05-0001.obj
-------------- next part --------------
______________________________________________
Tom Harper
Lead Software Engineer
SightSpeed - http://www.sightspeed.com/
918 Parker St, Suite A14
Berkeley, CA 94710
Email: tharper at sightspeed.com
Phone: 510-665-2920
Fax: 510-649-9569
My SightSpeed Video Link: http://tom.sightspeed.com