[Tremor] Notes on Implementing Tremor on an ARM7TDMI CPU

Andrew Lentvorski bsder at allcaps.org
Sat Dec 6 02:34:09 PST 2008


Nicholas Vinen wrote:

> So I am now in the position where I need to work out how I can get this
> decoding faster than real time. I chose this CPU because of this post I
> read from the tremor archives:
> 
> http://lists.xiph.org/pipermail/tremor/2003-January/000303.html
> 
> I am at a loss to understand why Segher thinks a 40MHz ARM should be
> fast enough to play back an Ogg Vorbis file. Is an ARM4 faster than an
> ARM7 clock-per-clock? (I wouldn't have thought so).

Note that he said ARMv4.  That refers to the version of the instruction
set architecture, not to a chip.  The ARM7TDMI is itself an ARMv4T core.

See here for the distinction:
http://en.wikipedia.org/wiki/ARM_architecture

Don't ARM9s have separate instruction and data buses?  That means they 
can do an instruction fetch and a data access in the same cycle.  I don't 
believe that is true of the ARM7.

That's going to kill performance if true, especially if fetching from 
flash ties things up for 2 cycles and can't interleave with a RAM access.  
Vorbis, IIRC, assumes pretty good lookup speed for accessing the basis 
vector tables.  You may also find that caching values and calculating 
them is better than looking them up.
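
If the tables do turn out to live in flash, one cheap experiment is to 
copy the hot ones into RAM at startup and do every lookup against the 
copy.  A minimal sketch, not taken from Tremor: the table contents, 
TABLE_LEN, and the function names are all made up for illustration, and 
whether a const array really ends up in flash depends on your linker 
script.

```c
#include <stdint.h>
#include <string.h>

#define TABLE_LEN 8  /* hypothetical size; real decoder tables are far larger */

/* Stand-in for a read-only table that the linker would place in flash. */
static const int32_t table_in_flash[TABLE_LEN] = {
    0, 3, 6, 9, 12, 15, 18, 21
};

/* Non-const storage, which normally ends up in RAM. */
static int32_t table_in_ram[TABLE_LEN];

/* Call once at startup, before decoding begins. */
static void cache_table_in_ram(void)
{
    memcpy(table_in_ram, table_in_flash, sizeof table_in_ram);
}

/* Hot-path lookup that only ever touches the RAM copy. */
static int32_t lookup(unsigned i)
{
    return table_in_ram[i % TABLE_LEN];
}
```

This trades RAM for lookup speed, so it only makes sense for tables that 
are small enough to fit and are hit often enough to matter.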

Also, I'm pretty sure that the ARM9 has bypassing that prevents cycle 
stalls while data is getting readied.

Finally, I think the ARM9 handles multiplies quite a bit faster.
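
The multiplies matter because a fixed-point decoder spends much of its 
time on 32x32->64 products of which it keeps only the upper bits.  A 
rough sketch of that kind of operation in portable C follows; the names 
mul_hi32 and mul_q31 are mine, not Tremor's.  In ARM mode a compiler can 
map this onto the long multiply instruction (SMULL); as far as I know the 
16-bit Thumb set on the ARM7TDMI has no long multiply, which is another 
reason to check what your Thumb build generates here.

```c
#include <stdint.h>

/* High 32 bits of the signed 64-bit product x * y. */
static int32_t mul_hi32(int32_t x, int32_t y)
{
    int64_t wide = (int64_t)x * (int64_t)y;
    /* Right-shifting a negative value is implementation-defined in C;
       this assumes the usual arithmetic shift. */
    return (int32_t)(wide >> 32);
}

/* Q31 multiply: both inputs are fractions in [-1, 1) scaled by 2^31.
   The single case INT32_MIN * INT32_MIN does not fit and is not handled. */
static int32_t mul_q31(int32_t x, int32_t y)
{
    int64_t wide = (int64_t)x * (int64_t)y;
    return (int32_t)(wide >> 31);
}
```

For example, 0x40000000 is 0.5 in Q31, and mul_q31(0x40000000, 0x40000000) 
gives 0x20000000, which is 0.25.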

> 1) Compile more files without thumb. I will try this to see what happens.

Worth a shot.  If Thumb doesn't help your memory bandwidth, you're better 
off in ARM mode with the extra registers.

> 7) Anything else?

It would be useful to actually *know* where it's spending all its time.  
Can you run this under a simulator?

I thought that Skyeye supported the AT91 series.  If you could run it 
under that, you might get some better information.
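
If a simulator isn't an option, a cruder fallback is to bracket the 
suspect routines with a tick counter and see where the totals pile up.  
This is only a sketch: region_t, region_enter, region_leave, and 
read_ticks are hypothetical names, and clock() is a host-side stand-in 
for whatever free-running timer your part provides.

```c
#include <stdint.h>
#include <time.h>

/* Accumulated cost of one instrumented region. */
typedef struct {
    const char *name;    /* label for reporting */
    uint32_t    calls;   /* how many times the region ran */
    uint32_t    ticks;   /* total ticks spent inside it */
    uint32_t    started; /* tick value at the most recent entry */
} region_t;

/* Hypothetical tick source.  On the target, replace this with a read of
   a hardware timer register; clock() only stands in on a host build. */
static uint32_t read_ticks(void)
{
    return (uint32_t)clock();
}

static void region_enter(region_t *r)
{
    r->started = read_ticks();
}

static void region_leave(region_t *r)
{
    /* Unsigned subtraction stays correct across one counter wraparound. */
    r->ticks += read_ticks() - r->started;
    r->calls++;
}
```

Wrap, say, the inverse transform and the codebook decode in separate 
regions, dump the totals after a few seconds of audio, and you at least 
know which routine to look at first.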



As a reference, I ported the Tremor library to the 66MHz ARM9 on the 
Nintendo DS.  Once I enabled the assembly language implementation, I 
didn't have any performance issues.

Now, I don't know how much extra performance room I had left, but it was 
capable of decoding, streaming from Wifi, writing a raster to the 
display, and reading off of a flash device without me having to do 
anything strange to get performance.

-a


