[tremor] [PATCH] 12% global performance gain on a StrongARM

Thu Sep 19 15:45:18 PDT 2002


From: "Nicolas Pitre" <nico at cam.org>
Sent: Thursday, September 19, 2002 7:18 PM

> On Thu, 19 Sep 2002, Chris Gilbert wrote:
>
> > > In some cases, the compiler would have inverted the two assignments.
> > > However, *x really needs to be written to memory first, since the value
> > > of y1 is still being processed in the pipeline by the smlal instruction.
> > > The memory barrier ensures that ordering.
> >
> > Wouldn't it be better to move the result stores into the asm, to be
> > certain that the ordering is always enforced correctly and that the
> > right number of cycles elapses?  Although I'm surprised that the
> > processor doesn't actually stall to wait for the result.
>
> It does, but the idea is to fill the pipeline with useful work instead of
> just stalling and wasting cycles.

Yes, I agree entirely: the fuller the pipeline, the better.  I just find it
strange that the compiler even considers swapping those two stores, but
then the compiler is a weird beast 8)
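
For anyone following along, here is a minimal sketch of the store ordering
being discussed. It is not the code from the patch: the function name
xprod32_sketch is hypothetical, and the 64-bit C arithmetic is only an
assumed, portable stand-in for the smull/smlal sequence so that it builds
on any GCC or Clang target. The MB() macro is a compiler-only barrier: it
emits no instruction and only stops the compiler from moving the store to
*y ahead of the store to *x.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Compiler-only barrier (GCC/Clang syntax). It emits no instruction; the
   "memory" clobber just forbids moving loads or stores across this point. */
#define MB() __asm__ __volatile__("" : : : "memory")

/* Hypothetical stand-in for the asm routine under discussion:
     *x = high 32 bits of (a*t + b*v)
     *y = high 32 bits of (b*t - a*v)
   Assumes the 64-bit sums do not overflow and that >> on a negative
   value is an arithmetic shift (true for GCC and Clang). */
static inline void xprod32_sketch(int32_t a, int32_t b, int32_t t, int32_t v,
                                  int32_t *x, int32_t *y)
{
    assert(x != NULL && y != NULL);

    int64_t x1 = ((int64_t)a * t + (int64_t)b * v) >> 32;
    int64_t y1 = ((int64_t)b * t - (int64_t)a * v) >> 32;

    *x = (int32_t)x1;  /* x1 is ready first, so store it first */
    MB();              /* keep the compiler from swapping the two stores */
    *y = (int32_t)y1;  /* on the real target, y1 comes out of smlal later */
}
```

In the real routine the store to *x gives the core something useful to do
while the last smlal is still producing y1. Without the barrier the
compiler is free to emit the two stores in either order, since as far as
the C language is concerned they are independent.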

Cheers,
Chris

--- >8 ----
List archives:  http://www.xiph.org/archives/
Ogg project homepage: http://www.xiph.org/ogg/
To unsubscribe from this list, send a message to 'tremor-request at xiph.org'
containing only the word 'unsubscribe' in the body.  No subject is needed.
Unsubscribe messages sent to the list will be ignored/filtered.