[CELT-dev] CELT grabbing 100KB of memory right off the top

Andrew Lentvorski bsder at allcaps.org
Mon Apr 18 16:23:15 PDT 2011


On 4/18/11 3:32 PM, Timothy B. Terriberry wrote:
>> As for C99, please, think of those of us stuck with embedded toolchains. :)
>
> Sorry, I live in perpetual hope. But that's why I also mentioned alloca.

And, sadly, I live in perpetual despair--also known as "reality".

I suspect this is going to provoke a religious war, but please put 
braces around single-statement "if", "for", etc.  I had two particularly 
hard-to-find bugs because I rearranged stuff in celtenc.c and the 
indentation looked okay.

I can cope with the fact that you eschew the One True Brace Style(tm) 
:), but unbraced single statements are just bugs waiting to happen.  I 
can maybe see that the braces would be visual noise in tight, 
mathematical kernels, but for user code like celtenc.c, the potential 
for bugs far outweighs the "visibility" issue.

>> I had to go all over the code to remove the "restrict" keyword (I
>> understand why you want it, but couldn't it be a macro--please?).  I
>
> What's wrong with passing -Drestrict ? Anything can be a macro in C.

<smacks forehead>

I can't believe I didn't think of that.  Stupid.

>> That having been said.  After turning on all the compiler optimizations
>> for speed, it seems that 28KHz, stereo, frame size of 120 bytes, bitrate
>
> I assume you mean 120 samples.

Um, yeah.  And I also mean 2*4* kHz, not 28 kHz.  Sigh.

>> of 64kbps, and complexity 5 runs at about 50-60 MIPS on a PIC32 (MIPS
>> R4K @ 80MHz).  My wall clock times show that it is encoding a 1 minute
>
> 50-60 MIPS sounds about the right ballpark for the encoder.

Hrm.  I'm a little surprised.  The presentation claims about 60 MIPS for 
the encoder at complexity 10.  And I presume that figure is for 48 kHz 
audio, a 128 kbps bitstream, and a frame size of 240 or bigger.

It feels like I'm off by at least a factor of 2, and possibly by as much 
as a factor of 8.

Is there a good reason why I'm not getting a factor of 2 improvement 
from using 24 kHz instead of 48 kHz?  How about the drop from 128 kbps 
to 64 kbps?  The frame size?

> There are still probably plenty of things that can be done to make the
> encoder faster. It hasn't been a strong focus, because it's already an
> order of magnitude faster than encoders for similar quality high-latency
> codecs. Contributions welcome!

Like what?  When I asked for suggestions, I wasn't joking.  I'd be happy 
to provide a contribution.

I already did a quick profile of the fixed-point encoder and it wasn't 
doing anything obviously stupid.  There were four hotspots at 15%, 8%, 
7%, and 5% respectively.  The 15% hotspot involved ilog.  A couple of 
things already had estimations and Newton iterations (that would have 
been one of my first choices to try).  There's certainly no obvious nail 
sticking up that I can see.  Even if I managed to reduce the ilog stuff 
to 0, I wouldn't get enough improvement to make a difference (Amdahl's 
Law and all that).

While I probably don't need a full 50% reduction, I would need to get 
things under about 40 MIPS (so that the processor has about 50% idle 
time to do other things).  That's probably a 25-30% reduction.  Given 
that I don't see an obvious thing to optimize, it's probably not worth 
continuing to pour time into this unless somebody has some good 
suggestions (like locking the FFT to a specific size or something like 
that).

-a


