[vorbis-dev] Optimisations

Wed Nov 15 23:45:10 PST 2000

Jason Hecker wrote:
> 
> >Sure I'm interested :-) For optimizing the current all-C version, I
> >hand-unrolled a lot of the critical loops; this will come in handy for
> >doing a vector-op
> 
> That's a very good start then!  I only recently discovered this OggVorbis

If you want, I can send you my current code. It's based on a (by now) rather
ancient version of Vorbis (from before the ogg/vorbis split-up), and it's not
very clean code (to say the least). But it's working hacker code; you can
probably take the useful parts out (if you don't mind getting your
hands dirty ;-) ).

> thing and have yet to fully look at all the code.  One thing I have been
> wondering about is how  the data and calculations are done.  I have done
> assembly optimisation work with the TMS320C6201 DSP which is integer only
> (but damned fast!)

I thought it topped out at 33 or 40 MHz or something? But all static RAM,
of course :-)

> Does OggVorbis go off and do everything using single
> precision floating point?  How does this affect the truncation of viable

Yep.

> bits after various multiplications (ie accuracy after a very big number is
> multiplied by a very small number)?  I attended a DSP workshop last year by

It's floating point, not fixed point, so no problem.

> the very funky Prof fred harris and he emphasised that there is no need for
> floating point (most of the time) in the DSP world simply because of the
> loss of precision you get with single precision (32 bit word) floats - to

IEEE single precision floats only have 24 bits (23.5, whatever) of precision.
This can sometimes be a limitation, but we are coding audio (a maximum of
20, 16, or 14 bits of precision needed, depending on your preferences/mood/
phase of the moon/etc.; 20 bits is almost always enough).
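
A minimal sketch of what the 24-bit limit means in practice, assuming IEEE 754
single precision floats and the default round-to-nearest mode (the function
name increment_survives is made up for this illustration). Above 2^24 a float
can no longer represent every integer, so adding 1 can have no effect:

```c
#include <float.h>

/* Returns 1 if adding 1.0f to x changes the stored value, 0 if the
 * increment is lost to rounding. The volatile store forces the result
 * into a real 32-bit float, so wider registers cannot hide the effect. */
int increment_survives(float x)
{
    volatile float y = x + 1.0f;
    return y != x;
}
```

Under these assumptions FLT_MANT_DIG is 24, increment_survives(16777215.0f)
returns 1, and increment_survives(16777216.0f) returns 0, because 2^24 + 1 is
not representable and rounds back to 2^24.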

There _were_ some places in the code which suffered heavily from loss of
precision; these unstable algorithms are now long gone; Monty replaced
them with better algorithms.

Note: in floating point, you most probably lose precision at addition
instructions; in fixed point, you lose it at multiplications. This calls
for different algorithms. For example, a good floating point MDCT algorithm
is totally unstable in fixed point (I said "a", not "any"). The reverse
holds as well: a good fixed point algorithm tends to be less precise in
floating point.
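
To make the addition-versus-multiplication point concrete, here is a rough
sketch, assuming IEEE 754 floats and a Q15 fixed point format (16-bit value =
raw / 32768). The helper names q15_mul and float_add are hypothetical, and
this is not code from Vorbis:

```c
#include <stdint.h>

/* Q15 multiply: the 32-bit product has 30 fractional bits, and shifting
 * back to 15 throws the low 15 bits away. This is where fixed point
 * loses precision. Only non-negative inputs are considered here; right
 * shifts of negative values are implementation-defined in C. */
int16_t q15_mul(int16_t a, int16_t b)
{
    return (int16_t)(((int32_t)a * (int32_t)b) >> 15);
}

/* Float add: the smaller operand is aligned to the larger one's exponent,
 * so its low bits (or all of it) can fall off the 24-bit significand.
 * volatile forces rounding to a real 32-bit float. */
float float_add(float a, float b)
{
    volatile float s = a + b;
    return s;
}
```

For example, q15_mul(1, 16384) (that is, 2^-15 times 0.5) returns 0, while
the same product in float is exactly 2^-16. In the other direction,
float_add(1.0f, 1.0f / 33554432.0f) (that is, 1 + 2^-25) returns exactly
1.0f, whereas a fixed point addition is exact as long as it does not overflow.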

> this effect I vaguely recall he showed how floating point IIR filters can
> break (details I forget but I think it had to do with precision
> errors).  With careful consideration to the bounds of your numbers you can

It depends on your algorithm and implementation. If, for example, you need
to add 256 floats, then for maximum precision you first add them two by two,
then add the resulting 128 numbers two by two, etc. (i.e., in a tree shape;
this is actually the same amount of work for the CPU as the straightforward
algorithm: 255 additions either way). Even better is to sort the numbers
first, so that numbers of similar magnitude get added together.
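
The tree-shaped addition described above can be sketched roughly as follows.
This is an illustration only, not code from Vorbis; the names pairwise_sum
and naive_sum are made up here, and the recursive halving is just one way to
get the tree (for a power-of-two count it gives the same pairing as adding
neighbours level by level). The sorting step is left out.

```c
#include <stddef.h>

/* Straightforward left-to-right accumulation: n - 1 additions. */
float naive_sum(const float *x, size_t n)
{
    float acc = 0.0f;
    for (size_t i = 0; i < n; i++)
        acc += x[i];
    return acc;
}

/* Tree-shaped (pairwise) summation: split the range in half, sum each
 * half recursively, then add the two partial sums. Also n - 1 additions,
 * but partial sums of similar size meet each other, so less is rounded
 * away. */
float pairwise_sum(const float *x, size_t n)
{
    if (n == 0)
        return 0.0f;
    if (n == 1)
        return x[0];
    size_t half = n / 2;
    return pairwise_sum(x, half) + pairwise_sum(x + half, n - half);
}
```

As a quick check, assuming IEEE 754 floats evaluated in single precision:
take 256 values where the first is 16777216.0f (2^24) and the other 255 are
1.0f. The exact sum is 16777471. naive_sum loses every single 1.0f and
returns 16777216, while pairwise_sum returns 16777470.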

> do pretty much everything you need with 32 bit (bigger is better) integers
> and not lose precision.  After doing the TI DSP work I have to agree (it's

It is not always an equally fast algorithm, and most of the time it is
certainly not equally readable. We should be careful about hand-optimizing
code that will be replaced by a different algorithm/code structure later,
because Vorbis is still under development.

> 48 bit accumulator was too flamin' small!).  I have to admit I am new to
> the world of audio compression but is much consideration given to the
> precision of floats and the issues involved with truncation - hence
> resulting quality of the compression?

I think Monty has these things covered, yeah :)

> 
> >version. Note that the filters and fast transforms are not the most
> >time-critical, however.
> 
> I guess, but they would be the easiest to implement first.  Apart from the
> x86 vector stuff, what other CPUs that OggVorbis runs on have SIMD
> operations that would need hand coding?

PPC G4. Well, it's freaking fast anyway, but it would really fly with
AltiVec support for Vorbis. And I think Sparc has vector now, maybe Alpha
(really don't know), and various DSPs and DSP-like chips (there is optional
vector math support for the StrongARM, for example). So just about anything
can use this support. So it would be nice if we first had general support
(by restructuring the C code) for strides of 2, 4, 8, maybe 16.
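
As a rough idea of what "restructuring C code" for a stride of 4 could look
like, here is a sketch of an element-wise multiply. It is not taken from the
Vorbis sources; the function name vec_mul_stride4 is hypothetical. The main
loop does four independent operations per iteration, which maps naturally
onto a 4-wide vector unit (by hand or by the compiler), and a scalar loop
handles the leftover elements:

```c
#include <stddef.h>

/* out[i] = a[i] * b[i] for i in [0, n), processed four at a time. */
void vec_mul_stride4(float *out, const float *a, const float *b, size_t n)
{
    size_t i = 0;

    /* Main loop: four independent multiplies per iteration. */
    for (; i + 4 <= n; i += 4) {
        out[i + 0] = a[i + 0] * b[i + 0];
        out[i + 1] = a[i + 1] * b[i + 1];
        out[i + 2] = a[i + 2] * b[i + 2];
        out[i + 3] = a[i + 3] * b[i + 3];
    }

    /* Tail: the remaining 0 to 3 elements. */
    for (; i < n; i++)
        out[i] = a[i] * b[i];
}
```

The same shape works for strides of 2, 8, or 16; only the unroll factor and
the tail handling change.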

I think it's better not to make an all-integer version of Vorbis now
(especially not on the encode side), because the code is still changing.
You would have to redo a lot of the work later.

Hope this answers your questions. Feel free to ask more; they are
interesting ones.

Dagdag,

Segher
