[tremor] Tremor on strongarm CPU

Nicolas Pitre nico at cam.org
Mon Sep 9 09:25:30 PDT 2002



On Sun, 8 Sep 2002, dilb wrote:

> Hi,
> like Nicolas Pitre, I spent some time on a fixed-point version of the 
> Ogg Vorbis decoder. Now that Tremor is free, I will help merge the 
> optimizations from my version of the decoder. But first, I have some 
> questions about the ARM optimisation in the mdct_backward function.
> 
> As far as I can see, this function uses MULT31 and MULT30, 
> defined this way:
> static inline ogg_int32_t MULT32(ogg_int32_t x, ogg_int32_t y) {
>   int lo,hi;
>   asm volatile("smull  %0,%1,%2,%3;\n"
>                : "=&r"(lo),"=&r"(hi)
>                : "%r"(x),"r"(y));
>   return(hi);
> }
> 
> static inline ogg_int32_t MULT31(ogg_int32_t x, ogg_int32_t y) {
>   return MULT32(x,y)<<1;
> }
> 
> static inline ogg_int32_t MULT30(ogg_int32_t x, ogg_int32_t y) {
>   return MULT32(x,y)<<2;
> }
> so, obviously, MULT31 (resp. MULT30) truncates 1 bit (resp. 2 bits). 
> My question is: according to the specification, is this truncation noise 
> amplified by the following stages of the decoder, i.e. can a human being 
> perceive it? (Sorry, I haven't had enough time to read the Vorbis 
> specification :))

That depends on how many bits you're using in the final output samples.  This
is in fact a clever way to eliminate a second shift instruction when scaling
back the multiplication result: only the high word of the 64-bit product is
kept and shifted.  You're losing one (or two) bits of accuracy each time a
particular value is multiplied, but if your final samples are only 16 bits
wide, you still have the remaining 16 bits as room for accuracy loss.
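
To make the lost bit concrete, here is a minimal portable C sketch (not
Tremor's actual code).  The names mult32, mult31_fast, mult31_exact and
mult31_lost_bit are hypothetical, and mult32 is only a plain C stand-in for
the smull-based MULT32 quoted above.  It assumes that right-shifting a
negative 64-bit value is an arithmetic shift, which holds on common compilers
but is implementation-defined in C.

```c
#include <assert.h>
#include <stdint.h>

/* Plain C stand-in for the smull-based MULT32: the high 32 bits of the
   signed 64-bit product. Assumes arithmetic right shift of negatives. */
static int32_t mult32(int32_t x, int32_t y) {
    return (int32_t)(((int64_t)x * y) >> 32);
}

/* Shortcut form, as in the quoted MULT31: shift the high word left by one.
   The low bit of the result is always zero. The shift is done on an
   unsigned value to avoid undefined behaviour for negative inputs. */
static int32_t mult31_fast(int32_t x, int32_t y) {
    return (int32_t)((uint32_t)mult32(x, y) << 1);
}

/* Reference form: scale the full 64-bit product by 2^-31, which keeps the
   bit that the shortcut drops. */
static int32_t mult31_exact(int32_t x, int32_t y) {
    return (int32_t)(((int64_t)x * y) >> 31);
}

/* Difference between the two forms. It is bit 31 of the 64-bit product,
   so it is either 0 or 1 (inputs must not both be INT32_MIN, since the
   exact result would then overflow 32 bits). */
static int mult31_lost_bit(int32_t x, int32_t y) {
    int diff = (int)(mult31_exact(x, y) - mult31_fast(x, y));
    assert(diff == 0 || diff == 1);
    return diff;
}
```

For example, mult31_lost_bit(3, 0x40000000) is 1 while
mult31_lost_bit(0x40000000, 0x40000000) is 0, so each multiply is off by at
most one unit in the last place of the 32-bit result.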

> Now, what I can offer is an asm version of the mdct_backward function from 
> my fixed-point project. I have already merged it into Tremor with a quick 
> adaptation hack, and I get a good 5% improvement when decoding. It 
> currently uses 64-bit operations/accumulation (SMULL/SMLAL 
> instructions), so it can be improved a little more, depending on the 
> answer to my question.

When I apply the same set of compiler flags to Tremor that I used in my own 
version, I get that 5% or more improvement as well.  Would you mind 
posting your assembly version (or making it available somewhere if it is too 
big) so I can compare the performance gain?

> I have another function written in pure asm, but it seems I will need more 
> time to merge it with Tremor, and it will probably not be worth it.

So far, from inspecting the assembly generated by a recent gcc (I'm currently
testing with gcc 3.2), the compiler is doing an increasingly good job with
plain C code.  Writing assembly by hand just looks less worthwhile than it
used to be.

> Fred <Dilb> Boulay.
> 
> P.S.: For people interested in my hack of the ogg vorbis decoder:
> 	http://dilb.blablanux.org/ipaq/fr/software/oggvorbis_SA1110.html , click on 
> [miroir] (the page is currently in French only, but the changelog and the 
> .tgz are commented in English).

Oh perfect, no problem then.

Nicolas



