[CELT-dev] CELT grabbing 100KB of memory right off the top
bsder at allcaps.org
Mon Apr 18 16:23:15 PDT 2011
On 4/18/11 3:32 PM, Timothy B. Terriberry wrote:
>> As for C99, please, think of those of us stuck with embedded toolchains. :)
> Sorry, I live in perpetual hope. But that's why I also mentioned alloca.
And, sadly, I live in perpetual despair--also known as "reality".
I suspect this is going to provoke a religious war, but please put
braces around single-statement "if", "for", etc. I had two particularly
hard to find bugs because I rearranged stuff in celtenc.c and the indent
I can cope with the fact that you eschew the One True Brace Style(tm)
:), but unbraced single statements are just a bug asking to happen. I
can maybe see that the braces would be visual noise in tight,
mathematical kernels, but for user code like celtenc.c, the possibility
for bugs far outweighs the "visibility" issue.
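
To make the failure mode concrete, here is a minimal sketch. The function names and the clamping logic are hypothetical and are not taken from celtenc.c; they only show how a statement that is moved or added under an unbraced "if" can look conditional while running unconditionally.

```c
/* Hypothetical example, not code from celtenc.c. */

/* Unbraced: a second statement was placed under the "if" and indented
 * to match, but only the first statement is actually conditional. */
int clamp_unbraced(int rate, int max_rate, int *clamped)
{
    *clamped = 0;
    if (rate > max_rate)
        rate = max_rate;
        *clamped = 1;   /* BUG: runs every time, despite the indent */
    return rate;
}

/* Braced: both statements belong to the "if" body, so the flag is
 * only set when the rate really was clamped. */
int clamp_braced(int rate, int max_rate, int *clamped)
{
    *clamped = 0;
    if (rate > max_rate) {
        rate = max_rate;
        *clamped = 1;
    }
    return rate;
}
```

Calling both with a rate below the limit returns the same value, but the unbraced version still reports that clamping happened.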
>> I had to go all over the code to remove the "restrict" keyword (I
>> understand why you want it, but couldn't it be a macro--please?). I
> What's wrong with passing -Drestrict ? Anything can be a macro in C.
I can't believe I didn't think of that. Stupid.
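
For reference, a sketch of the macro approach. One caveat, assuming a GCC-style command line: -Dname with no value defines the macro as 1, so the empty definition is spelled -Drestrict= rather than -Drestrict. The same effect can be had in a header; add_arrays below is a hypothetical function used only to show the keyword in a declaration.

```c
/* Make "restrict" expand to nothing on compilers without C99 support.
 * Roughly equivalent to passing -Drestrict= on the command line. */
#if !defined(__STDC_VERSION__) || __STDC_VERSION__ < 199901L
# define restrict
#endif

/* Hypothetical function: the qualifier stays in the source, and
 * pre-C99 toolchains simply never see it. */
void add_arrays(int *restrict dst, const int *restrict a,
                const int *restrict b, int n)
{
    int i;
    for (i = 0; i < n; i++) {
        dst[i] = a[i] + b[i];
    }
}
```

This keeps the aliasing hint for compilers that understand it and costs nothing on the ones that don't.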
>> That having been said. After turning on all the compiler optimizations
>> for speed, it seems that 28KHz, stereo, frame size of 120 bytes, bitrate
> I assume you mean 120 samples.
Um, yeah. And I also mean 24KHz, not 28KHz. Sigh.
>> of 64kbps, and complexity 5 runs at about 50-60 MIPS on a PIC32 (MIPS
>> R4K @ 80MHz). My wall clock times show that it is encoding a 1 minute
> 50-60 MIPS sounds about the right ballpark for the encoder.
Hrm. I'm a little surprised. The presentation claims about 60 MIPS for
the encoder at complexity 10. And I presume it's using 48KHz samples
and a 128kbps bitstream and a framesize of 240 or bigger.
It feels like I'm off by at least a factor of 2 and possibly the whole
way to a factor of 8.
Is there a good reason why I'm not getting a factor of 2 improvement
from using 24KHz instead of 48KHz? How about the drop from 128kbps to
> There are still probably plenty of things that can be done to make the
> encoder faster. It hasn't been a strong focus, because it's already an
> order of magnitude faster than encoders for similar quality high-latency
> codecs. Contributions welcome!
Like what? When I asked for suggestions, I wasn't joking. I'd be happy
to provide a contribution.
I already did a quick profile of the fixed point encoder and it wasn't
doing anything obviously stupid. There were 4 hotspots at 15%, 8%, 7%,
and 5% respectively. The 15% involved ilog. A couple of things already
had estimations and Newton iterations (that would have been one of my
first choices to try). There's certainly no obvious nail sticking up
that I can see. Even if I managed to reduce the ilog stuff to 0, I
wouldn't get enough improvement to make a difference (Amdahl's Law and all
While I probably don't need a full 50% reduction, I would need to get things
under about 40MIPS (so that the processor has about 50% idle time to do other
things). That's probably a 25-30% reduction. Given that I don't see an obvious
thing to optimize, it's probably not worth continuing to pour time into
this unless somebody has some good suggestions (like locking the FFT to
a specific size or something like that).
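
The arithmetic behind this can be sketched as follows, using only the figures quoted above (50-60 MIPS measured, a 15% hotspot, a 40 MIPS target). The helper names are hypothetical. Removing the 15% hotspot entirely leaves 42.5 MIPS from 50, or 51 MIPS from 60, which is still above the target. Getting to 40 MIPS needs an overall reduction of 20% from 50, or about 33% from 60.

```c
/* Hypothetical helpers for the Amdahl's Law estimate.
 *
 * total_mips: current cost of the encoder
 * fraction:   share of run time spent in the hotspot (0..1)
 * speedup:    factor by which the hotspot itself gets faster (> 0) */
double mips_after(double total_mips, double fraction, double speedup)
{
    return total_mips * ((1.0 - fraction) + fraction / speedup);
}

/* Cost if the hotspot could be removed entirely (infinite speedup). */
double mips_if_removed(double total_mips, double fraction)
{
    return total_mips * (1.0 - fraction);
}

/* Overall fractional reduction needed to reach the target. */
double reduction_needed(double current_mips, double target_mips)
{
    return 1.0 - target_mips / current_mips;
}
```

For example, mips_if_removed(50.0, 0.15) gives 42.5 and reduction_needed(60.0, 40.0) gives about 0.33.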