<P>I ported the Tremor lowmem decoder to a C55x DSP half a year ago. Here are my experiences and a few numbers from the port.</P>
<P>A list of the main problems I've had:<BR>- 16 bit char:<BR> function floor1_info_unpack(...)<BR> ...<BR> info->class[j].class_subbook[k]=(oggpack_read(opb,8)-1) & 0xff; // <-- add "& 0xff" here<BR> if(info->class[j].class_subbook[k]>=ci->books &&<BR> info->class[j].class_subbook[k]!=0xff)goto err_out;<BR> ...<BR> <BR>- 16 bit / 32 bit integer - cast operators have to be added in a few places:<BR> function oggpack_look(...)<BR> ...<BR> if(bits>8){<BR> // added cast to (uint32)<BR> ret|=(uint32)b->headptr[1]<<(8-b->headbit); <BR> if(bits>16){<BR> // added cast to (uint32)<BR> ret|=(uint32)b->headptr[2]<<(16-b->headbit);
<BR> if(bits>24){<BR> // added cast to (uint32)<BR> ret|=(uint32)b->headptr[3]<<(24-b->headbit); <BR> if(bits>32 && b->headbit) {<BR> // added cast to (uint32)<BR> ret|=(uint32)b->headptr[4]<<(32-b->headbit);<BR> }<BR> }<BR> }<BR> }<BR> ...<BR> <BR>- memory leakage problem:<BR> have a look in the archives (<A href="mailto:x0winter@ti.com">x0winter@ti.com</A>)</P>
<P>Afterwards I was able to integrate the decoder into a test environment: <BR>- C5510 DSK with a demo application, <BR>- reading an Ogg Vorbis file from the external SDRAM and <BR>- sending the decoded samples to the headphone jack via DMA/McBSP. </P>
<P>But the initial results were pretty poor (stereo, 44.1 kHz sample rate; memory in kW, i.e. kilowords):<BR>- 20 kW code<BR>- 28 kW heap<BR>- 11 kW consts<BR>- 1 kW static data<BR>- ? kW stack</P>
<P>- 150 MCPS (million cycles per second) for q0<BR>- 164 MCPS for q4<BR>- 190 MCPS for q7</P>
<P>- accuracy = -410/511 (10 bit)</P>
<P> </P>
<P>Since then I have been working on optimizing the decoder. <BR>The current memory consumption is:<BR>+ 17 kW code memory (does not include floor 0 decoding)<BR>+ 22 kW heap (up to 128 kbit/s (q4); has to be increased for higher bitrates)<BR>+ 2 kW consts</P>
<P>+ 0.5 kW stack <BR>+ 2.5 kW application buffers (part of the decoder application; needed for real-time decoding)<BR>(The decoder is therefore suitable for a C5502 DSP; the code has to be executed from external memory.)</P>
<P>The code has not been optimized in assembler yet, so the CPU load is still high:<BR>+ 100 MCPS for q0<BR>+ 110 MCPS for q4<BR>+ 130 MCPS for q7<BR>With a number of assembler optimizations I expect the CPU load can be reduced to 40-50 MCPS. (As the decoder is plain C code, a performance test on a C54x DSP would be interesting.)</P>
<P>The achieved accuracy compared to the vorbis reference decoder is -3/+2 digits (3 bit).</P>
<P>Regards, Roland</P><BR>