[speex-dev] [PATCH] Make SSE Run Time option.
Jean-Marc Valin
Jean-Marc.Valin at USherbrooke.ca
Thu Jan 15 01:29:52 PST 2004
> You may wish to save space for PNI.
>
> http://cedar.intel.com/media/pdf/PNI_LEGAL3.pdf
There seem to be some interesting instructions for complex arithmetic
there (which would help FFTs). I'm not sure there's anything useful for
Speex, though. We'll see. What I think is much more promising is the
x86-64 version of SSE, with 16 registers. That could speed up the
filters a lot.
> Please note that dot products of simple vector floats are usually faster
> in the scalar units. The add across and transfer to scalar is just too
> expensive. It's generally only worthwhile if the data starts and ends in
> the vector units, and it is inlined so that latencies can be covered with
> other work. e.g:
Actually, even with a scalar unit, the best code is implicitly
vectorized. If you look at the original code I had, there are 4 partial
sums that prevent some stalling due to dependencies. From there, it's
easy to vectorize by 4 and add across only once, at the end. Note that
for Speex the vectors are either 40 or 160 samples long. The whole
process is also repeated 128 times in a row, so I think a vector unit
will do much better.
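
A minimal sketch of the pattern described above, not the actual Speex
code: the function names inner_prod_scalar and inner_prod_vec4 are
hypothetical, and both assume the length is a multiple of 4 (which holds
for 40 and 160). The first version keeps 4 independent partial sums in
scalar code; the second maps those sums onto the 4 lanes of one SSE
register and adds across only once, after the loop. On builds without
SSE the second function simply falls back to the first.

```c
#include <stddef.h>
#ifdef __SSE__
#include <xmmintrin.h>
#endif

/* Scalar dot product with 4 independent partial sums, so that each
 * addition does not have to wait for the previous one to finish.
 * Assumes len is a multiple of 4. */
static float inner_prod_scalar(const float *a, const float *b, size_t len)
{
    float s0 = 0.f, s1 = 0.f, s2 = 0.f, s3 = 0.f;
    size_t i;
    for (i = 0; i < len; i += 4) {
        s0 += a[i]     * b[i];
        s1 += a[i + 1] * b[i + 1];
        s2 += a[i + 2] * b[i + 2];
        s3 += a[i + 3] * b[i + 3];
    }
    return (s0 + s1) + (s2 + s3);
}

/* Same computation with the 4 partial sums held in one SSE register.
 * Unaligned loads are used so the sketch makes no alignment assumption.
 * Assumes len is a multiple of 4. */
static float inner_prod_vec4(const float *a, const float *b, size_t len)
{
#ifdef __SSE__
    __m128 sum = _mm_setzero_ps();
    float lanes[4];
    size_t i;
    for (i = 0; i < len; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        sum = _mm_add_ps(sum, _mm_mul_ps(va, vb));
    }
    /* The only add across lanes happens here, once per call. */
    _mm_storeu_ps(lanes, sum);
    return (lanes[0] + lanes[1]) + (lanes[2] + lanes[3]);
#else
    return inner_prod_scalar(a, b, len);
#endif
}
```

With 40-sample vectors the loop runs 10 times before the single
horizontal add, so the cost of moving the result back to a scalar is
spread over 40 multiply-adds rather than paid per element.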
Jean-Marc
--
Jean-Marc Valin, M.Sc.A., ing. jr.
LABORIUS (http://www.gel.usherb.ca/laborius)
Université de Sherbrooke, Québec, Canada