[Tremor] Tremor lowmem on TI 55x DSP
Roland Wintersteller
rwinters at europe.com
Sat Oct 23 12:10:53 PDT 2004
I recently noticed that a message I sent to the mailing list
a few weeks ago cannot be displayed by the archive because I used
HTML format instead of text format. Since I think this message may be
helpful to somebody trying to port Tremor to a TI C5x DSP, here is a very
similar message, this time in text format...
I ported the Tremor lowmem decoder to a C55x DSP half a year ago. Here
are my experiences and a few numbers concerning the porting results.
A list of the main problems I've had:
- 16 bit char (a char is 16 bits wide on the C55x, so values that rely
on 8 bit wrap-around have to be masked):
function floor1_info_unpack(...)
...
info->class[j].class_subbook[k]=(oggpack_read(opb,8)-1) & 0xff; // <-- add "& 0xff" here
if(info->class[j].class_subbook[k]>=ci->books &&
info->class[j].class_subbook[k]!=0xff)goto err_out;
...
- 16 bit / 32 bit integer - cast operators have to be added in a few
places:
function oggpack_look(...)
...
if(bits>8){
// added cast to (uint32)
ret|=(uint32)b->headptr[1]<<(8-b->headbit);
if(bits>16){
// added cast to (uint32)
ret|=(uint32)b->headptr[2]<<(16-b->headbit);
if(bits>24){
// added cast to (uint32)
ret|=(uint32)b->headptr[3]<<(24-b->headbit);
if(bits>32 && b->headbit) {
// added cast to (uint32)
ret|=(uint32)b->headptr[4]<<(32-b->headbit);
}
}
}
}
...
- 16 bit integers: all int (int16) variables (at least those which need
32 bits) have to be replaced by long (int32)
- memory leakage problem: have a look in the archives...
http://lists.xiph.org/pipermail/tremor/2004-April/000965.html and
http://lists.xiph.org/pipermail/tremor/2004-October/001112.html
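The 16 bit char and the cast problems from the list above can be reproduced on a host compiler. The following is only a sketch: the type dsp_uchar and all function names are hypothetical and not part of Tremor, and uint16_t is used as a stand-in for the 16 bit char and int of the C55x. It also assumes that class_subbook is an unsigned char field.

```c
#include <stdint.h>

/* Hypothetical stand-in: emulates the 16 bit unsigned char of the C55x. */
typedef uint16_t dsp_uchar;

/* Unpatched store: (value - 1) keeps all 16 bits of the "char". */
static dsp_uchar store_unmasked(long value)
{
    return (dsp_uchar)(value - 1);
}

/* Patched store: "& 0xff" restores the 8 bit wrap-around. */
static dsp_uchar store_masked(long value)
{
    return (dsp_uchar)((value - 1) & 0xff);
}

/* Emulates a shift evaluated in 16 bit arithmetic: everything above
   bit 15 is lost. Only meaningful for shift < 16. */
static uint32_t shift_16bit(uint16_t byte, unsigned shift)
{
    return (uint32_t)(uint16_t)((uint32_t)byte << shift);
}

/* Shift with the added cast: the operand is widened to 32 bits first. */
static uint32_t shift_32bit(uint16_t byte, unsigned shift)
{
    return (uint32_t)byte << shift;
}
```

For a read value of 0, store_unmasked() yields 0xFFFF, so a comparison against 0xff fails, while store_masked() yields 0xFF. For the byte 0xAB shifted by 12, shift_16bit() returns 0xB000 and shift_32bit() returns 0xAB000.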
One additional hint:
Try to make your changes also compile and run in a gcc or Windows
environment. I've seen that the Windows compiler reports warnings that
CCS (Code Composer Studio) does not, and vice versa.
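One way to keep a single source tree building on both the DSP and a host compiler is to select the 32 bit types from limits.h instead of writing int or long directly. This is a minimal sketch, not what I necessarily used in the port: the port_* names are hypothetical, and it is written in C89 style on the assumption that not every compiler involved supports stdint.h.

```c
#include <limits.h>

/* Hypothetical names: pick the smallest standard type with >= 32 bits.
   On a 16 bit int target this selects long, on a typical host int. */
#if INT_MAX >= 2147483647L
typedef int           port_int32;
typedef unsigned int  port_uint32;
#else
typedef long          port_int32;
typedef unsigned long port_uint32;
#endif

/* Compile-time check: a negative array size breaks the build if the
   chosen type is narrower than 32 bits. */
typedef char port_int32_width_check[
    (sizeof(port_int32) * CHAR_BIT >= 32) ? 1 : -1];

/* Assemble four bytes into one 32 bit value. Each byte is masked
   (16 bit char) and widened before the shift (16 bit int). */
static port_uint32 port_pack_bytes(unsigned b0, unsigned b1,
                                   unsigned b2, unsigned b3)
{
    return  ((port_uint32)(b0 & 0xff))
          | ((port_uint32)(b1 & 0xff) << 8)
          | ((port_uint32)(b2 & 0xff) << 16)
          | ((port_uint32)(b3 & 0xff) << 24);
}
```

With such types the masks and casts described above behave the same on every compiler, so a warning seen on one toolchain can be checked on the other.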
After these changes I was able to integrate the decoder into a test environment:
- Demo application running on a C5510 DSK, which
- reads an Ogg Vorbis file from the external SDRAM and
- puts out the decoded samples to the headphone jack with DMA/MCBSP.
But the achieved results were pretty poor (stereo with 44.1 kHz
samplerate):
- 20 kW code
- 28 kW heap
- 11 kW constants
- 1 kW static data
- ? kW stack
- 150 MCPS (million cycles per second) for q0
- 164 MCPS for q4
- 190 MCPS for q7
- Accuracy = -410/511 (10 bit)
Since then I have worked on optimizing the decoder (reduced lookup tables,
a modified IMDCT implementation (see also the diploma thesis and patch of
Johannes Sandvall:
http://lists.xiph.org/pipermail/tremor/2004-March/000957.html),
compressed char buffers, adapted memory management, ...). The current
memory consumption is...
+ 17 kW code memory (doesn't include floor 0 decoding)
+ 22 kW heap (up to 128 kbit/s (q4); has to be increased to at least 23
kW for higher bit rates (244 kbit/s corresponds to q7))
+ 0 kW static data (changed to constants)
+ 2 kW constants
+ 0.5 kW stack
+ 2.5 kW application buffers (as a part of the decoder application;
needed for real-time decoding)
(Therefore the decoder is suitable to run on a C5502 DSP; the code has to
be executed from external memory.)
The code is not assembler optimized yet. The CPU load is still high...
+ 100 MCPS for q0
+ 110 MCPS for q4
+ 130 MCPS for q7
With a number of assembler optimizations I expect that the CPU load can be
reduced to 40-50 MCPS. (As there is only C code, a performance test on a
C54X DSP would be interesting.)
The achieved accuracy compared to the vorbis reference decoder is -3/+2
digits (3 bit).
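A figure like -3/+2 can be obtained by comparing the decoder output sample by sample with the output of the reference decoder. The sketch below assumes that the accuracy is the smallest and largest difference between two 16 bit PCM buffers of equal length; the struct and function names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical result type: smallest and largest sample difference. */
typedef struct {
    long min_diff;
    long max_diff;
} diff_range;

/* Compare n samples of the decoder under test with the reference
   output and return the range of (test - ref). */
static diff_range pcm_diff_range(const short *test, const short *ref,
                                 size_t n)
{
    diff_range r = {0, 0};
    size_t i;

    for (i = 0; i < n; i++) {
        /* Widen to long so the subtraction cannot overflow 16 bits. */
        long d = (long)test[i] - (long)ref[i];
        if (d < r.min_diff) r.min_diff = d;
        if (d > r.max_diff) r.max_diff = d;
    }
    return r;
}
```

For the test samples {10, 20, 30, 40} and the reference samples {12, 18, 33, 40} the differences are -2, +2, -3 and 0, so the function reports a range of -3/+2.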
Regards, Roland