[Vorbis-dev] libvorbis 1.2.1 release?
Erik de Castro Lopo
mle+la at mega-nerd.com
Mon Jun 2 17:59:59 PDT 2008
Reply-to munging wherefore art thou? Come back, all is forgiven!
> The main point of the assembly versions there was to get a large
> (~10%) performance gain for decode, for almost no effort. I doubt
> calling lrint is going to be as fast...
I am willing to bet a case of your choice of Australian beer, payable
at LCA2008, that lrint will run at the same speed (+/- 5%) as this:
#if defined(__i386__) && defined(__GNUC__) && !defined(__BEOS__)
#  define VORBIS_FPU_CONTROL
/* both GCC and MSVC are kinda stupid about rounding/casting to int.
   Because of encapsulation constraints (GCC can't see inside the asm
   block and so we end up doing stupid things like a store/load that
   is collectively a noop), we do it this way */

/* we must set up the fpu before this works!! */

typedef ogg_int16_t vorbis_fpu_control;

static inline void vorbis_fpu_setround(vorbis_fpu_control *fpu){
  ogg_int16_t ret;
  ogg_int16_t temp;
  __asm__ __volatile__("fnstcw %0\n\t"
          "movw %0,%%dx\n\t"
          "orw $62463,%%dx\n\t"
          "movw %%dx,%1\n\t"
          "fldcw %1\n\t":"=m"(ret):"m"(temp): "dx");
  *fpu=ret;
}

static inline void vorbis_fpu_restore(vorbis_fpu_control fpu){
  __asm__ __volatile__("fldcw %0":: "m"(fpu));
}

/* assumes the FPU is in round mode! */
static inline int vorbis_ftoi(double f){  /* yes, double!  Otherwise,
                                             we get extra fst/fld to
                                             truncate precision */
  int i;
  __asm__("fistl %0": "=m"(i) : "t"(f));
  return(i);
}
#endif
On x86, x86_64 and PowerPC (the only ones I've really looked at)
lrint compiles to a single instruction.
Erik
--
-----------------------------------------------------------------
Erik de Castro Lopo
-----------------------------------------------------------------
"I'd crawl over an acre of 'Visual This++' and 'Integrated
Development That' to get to gcc, Emacs, and gdb. Thank you."
-- Vance Petree