[tremor] Tremor on x86 vs ARM, performance highlights: CPU use 120x different

dilb dilb at alrj.org
Fri Sep 13 20:24:32 PDT 2002



On Saturday 14 September 2002 04:51, slothead wrote:
> Dilb:
>
> Done.  Same changes made (I build using a cpp compiler; please remove the
> "scale" var or use it! --haha, if only it were just that).  Guess what? 
> Your 0906 setup uses exactly the same CPU for the SA, at 33% (the x86 is
> already smokin' without any help at all).  EXE file size did grow, though,
> by 120 KB.  Oh, right, I didn't use your ASM file, sorry (it was the point,
> mostly, right?), but like you say
Could you give _more_ information about the compiler you are using, and what 
exactly you are doing? In particular, what do you mean when you say "same 
changes made"?
If you change the compiler flags on my version (and, as I understand it, the 
defines too), then you can't expect a boost over what you used before. E.g. 
if you apply the same compiler flags to my version as to the one you were 
using, then you won't see much difference.

>
> > I doubt my assembly version could by itself improve so much the CPU load
> > :),
>
> Mostly compiler generated anyway, right?
I don't see your point here. It isn't worth spending many hours writing 
everything in assembly unless you have very little memory. The point is only 
to improve performance where it is worth it, and for me 5% is already close 
to the limit. When I did it, it was more to play with the instruction set of 
this CPU.

> Now, when you write in your .S that there's a 5% improvement, tops, you
> mean 5% of what's already going on, and not 5% for your particular setup,
> relative 100% CPU use?  I mean, if it take 100 seconds now, you estimate
> with the .S in place it runs in 95 seconds.  Right?   So, given that, the
> best I could see here is about 2%, down to 31%.  You get 18%, huh?
When I say 5% improvement, I mean 5% improvement, i.e. if T1 is the time 
needed to decode with your version of the codec, and T2 the time needed with 
mine, then (T1 - T2)/T1 = n%, where n is the percentage of improvement.
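
To make the arithmetic concrete, here is a minimal sketch in Python. The 
function name is made up for illustration, and it assumes the CPU load scales 
directly with the decode time, which only holds if the decoder accounts for 
essentially all of the measured load:

```python
def projected_load(current_load, improvement):
    """Load after a relative improvement n, from (T1 - T2)/T1 = n.

    Rearranged: T2 = T1 * (1 - n). Hypothetical helper, not part of Tremor.
    """
    return current_load * (1.0 - improvement)


# A 5% relative improvement applied to a 33% CPU load.
new_load = projected_load(33.0, 0.05)
print(f"{new_load:.2f}% CPU")
```

So a 5% improvement on a 33% load is a drop of under 2 percentage points, not 
a drop to 28%.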

> >I still don't know what quality settings you used (-q5 ?) to create your
> >vorbis files.
>
> Varies.  I use -q2 to -q6.   For -q2 it's 33%.  For -q10 it goes up by
> about 10% (say 43% instead of 33%).  This is 451 kbps.  Same .wav encoded
> at -q5 (158kbps) is about 2% more (35% CPU) than -q2 (90 kbps) at 33% CPU. 
> Something like that, arm 200 mhz.  Got -q3 sitting there..it's about 34%. 
> FWIW, the 41*.ogg from xiph.org is  31% CPU (69 kbps).
If you are really compiling everything, including the Tremor libraries, as a 
C++ program, then you should change that. Obviously your compiler, your 
compiler flags, or the way you include the codec in your program is not 
optimal!

>
> What CPU % use do you get on your strongarm WITHOUT your assembly code
> patch?
With several files encoded at quality setting -q4 with Ogg Vorbis version 
1.0, the results are (using gcc 2.95.4):

(1) Tremor default: 19.6% CPU load
(2) Tremor with alternate CFLAGS: 19.3% CPU load
(3) Tremor default + MDCT.S: 18.7% CPU load
(4) Tremor with alternate CFLAGS + MDCT.S: 18.1% CPU load

(1)->(2) gives 1.5% improvement
(1)->(3) gives 4.6%
(2)->(4) gives 6.2%
and
(1)->(4) gives 7.7%
(I did some other timing estimations to confirm the result for the CPU load).
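
For anyone who wants to check the percentages, a small Python sketch that 
recomputes them from the four CPU loads listed above. The variable and 
function names are made up for this example, and it treats CPU load as a 
stand-in for decode time:

```python
# CPU load (%) for each configuration, as listed above.
loads = {
    1: 19.6,  # Tremor default
    2: 19.3,  # Tremor with alternate CFLAGS
    3: 18.7,  # Tremor default + MDCT.S
    4: 18.1,  # Tremor with alternate CFLAGS + MDCT.S
}


def relative_improvement(before, after):
    """Percentage improvement going from `before` to `after`."""
    return 100.0 * (before - after) / before


for a, b in [(1, 2), (1, 3), (2, 4), (1, 4)]:
    gain = relative_improvement(loads[a], loads[b])
    print(f"({a})->({b}) gives {gain:.1f}%")
```

Rounded to one decimal place, this gives the same four figures as above.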

Dilb.

P.S.: What is the point of highlighting the differences between the SA11x0 
and some Athlon? They don't have the same purpose, and they don't have the 
same architecture.