preprocessor performance (was Re: [speex-dev] Memory leak in denoiser + a few questions)
Steve Kann
stevek at stevek.com
Wed Mar 31 11:49:04 PST 2004
Jean-Marc Valin wrote:
>OK, so the problem doesn't seem to be the VAD specifically. Can you tell
>me how much audio you had in the test? It may be that nothing's wrong
>and the code just isn't so fast that you can do 100 channels. Or maybe
>it just needs a bit of optimization...
>
>
In my test, I have a buffer of 1024x1024 samples (about 1 million, or 65
seconds), which I zero and then fill with 537760 samples (about 500K, or
30 seconds) of sampled audio. The rest of the buffer is empty.
Then I run the preprocessor over it 5 times. This simulates about 5
minutes of preprocessing, consisting of alternating 30-second segments
of speech and silence.
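
The test setup described above can be sketched roughly as follows. This is a minimal sketch, not the actual test program: FRAME_SIZE, frame_fn, count_frame and run_benchmark are hypothetical names, the frame size is an assumed value, and the callback stands in for the real preprocessor call, whose exact signature depends on the Speex version.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define BUF_SAMPLES   (1024L * 1024L) /* total buffer length, in samples */
#define AUDIO_SAMPLES 537760L         /* samples of real audio at the start */
#define FRAME_SIZE    160             /* assumed frame size, not from the post */
#define PASSES        5

/* Hypothetical stand-in for the per-frame preprocessor call. */
typedef int (*frame_fn)(float *frame, int n, void *user);

/* Trivial example callback: only counts how many frames it was given. */
int count_frame(float *frame, int n, void *user)
{
    (void)frame;
    (void)n;
    ++*(long *)user;
    return 0;
}

/* Runs fn over the whole buffer PASSES times. Returns the number of frames
   processed, or -1 if the buffer cannot be allocated. audio must hold at
   least AUDIO_SAMPLES samples. */
long run_benchmark(const float *audio, frame_fn fn, void *user)
{
    float frame[FRAME_SIZE];
    long frames = 0;
    float *buf;

    assert(audio && fn);
    buf = calloc(BUF_SAMPLES, sizeof *buf);  /* zeroed, so the tail is silence */
    if (!buf)
        return -1;
    memcpy(buf, audio, AUDIO_SAMPLES * sizeof *buf);

    for (int pass = 0; pass < PASSES; pass++) {
        for (long off = 0; off + FRAME_SIZE <= BUF_SAMPLES; off += FRAME_SIZE) {
            /* Work on a copy so every pass sees the same input. This is an
               assumption; the original test may have processed in place. */
            memcpy(frame, buf + off, sizeof frame);
            fn(frame, FRAME_SIZE, user);
            frames++;
        }
    }
    free(buf);
    return frames;
}
```

Timing a call to run_benchmark with the real preprocessor plugged in as the callback gives the total cost of the simulated 5 minutes.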
I sent (off-list) some oprofile output, but I'm not sure what to make of
it. Some operations that don't look any more complicated than others
seem to take a long time. I also tried getting samples on
DATA_CACHE_MISSES. Here's an example of the hotspots I found (in
preprocessor.c, code modified a bit to include local pointers to arrays
in the st struct):
The first four columns are the counter hits and percentage of hits for
CPU_CLK_UNHALTED events (Cycles outside of halt state) with a unit mask
of 0x00 (No unit mask) count 10000 and DATA_CACHE_MISSES events (Data
cache misses) with a unit mask of 0x00 (No unit mask) count 1000
respectively. The hits attributed to inc %ebx might really belong to the
previous instruction, but clearly this loop itself is taking
almost 7% of the time, which doesn't make sense.

                          : for (i=1;i<N;i++)
                          : 804a340: mov $0x1,%ebx
    18  0.0012  0 0.0e+00 : 804a345: cmp %edi,%ebx
                          : 804a347: jge 804a377 <speex_preprocess+0x3c7>
                          : 804a349: fldl 0x804d810
    11 7.2e-04  0 0.0e+00 : 804a34f: fldl 0x804d818
                          : zeta[i] = .7*zeta[i] + .3*prior[i];
                          : 804a355: mov 0xffffffb4(%ebp),%ecx
  1494  0.0979  1  0.0695 : 804a358: mov 0xffffffac(%ebp),%eax
    22  0.0014  0 0.0e+00 : 804a35b: fld %st(1)
                          : 804a35d: fld %st(1)
  1546  0.1013  1  0.0695 : 804a35f: fxch %st(1)
  1532  0.1004  0 0.0e+00 : 804a361: fmuls (%ecx,%ebx,4)
     1 6.6e-05  0 0.0e+00 : 804a364: fxch %st(1)
     8 5.2e-04  0 0.0e+00 : 804a366: fmuls (%eax,%ebx,4)
  1416  0.0928  9  0.6254 : 804a369: faddp %st,%st(1)
     1 6.6e-05  0 0.0e+00 : 804a36b: fstps (%ecx,%ebx,4)
102158  6.6924 15  1.0424 : 804a36e: inc %ebx
  5864  0.3842  0 0.0e+00 : 804a36f: cmp %edi,%ebx
  1564  0.1025  0 0.0e+00 : 804a371: jl 804a355 <speex_preprocess+0x3a5>
                          : 804a373: fstp %st(0)
   144  0.0094  0 0.0e+00 : 804a375: fstp %st(0)

Here, this area of the code is taking (in this example) about 13% of the
execution time:

                          : zeta1 = zeta[i];
                          : else
                          : zeta1 = .25*zeta[i-1] + .5*zeta[i] + .25*zeta[i+1];
                          : 804a490: mov 0xffffffb4(%ebp),%edx
  4292  0.2812  0 0.0e+00 : 804a493: fldl 0x804d868
   287  0.0188  0 0.0e+00 : 804a499: flds (%edx,%ebx,4)
146543  9.6001 26  1.8068 : 804a49c: fxch %st(1)
 28942  1.8960  3  0.2085 : 804a49e: fmuls 0xfffffffc(%edx,%ebx,4)
  9996  0.6548  1  0.0695 : 804a4a2: fxch %st(1)
                          : 804a4a4: fmuls 0x804d708
  1655  0.1084  0 0.0e+00 : 804a4aa: faddp %st,%st(1)
  1030  0.0675  1  0.0695 : 804a4ac: fldl 0x804d868
   657  0.0430  0 0.0e+00 : 804a4b2: fmuls 0x4(%edx,%ebx,4)
   553  0.0362  0 0.0e+00 : 804a4b6: faddp %st,%st(1)
  1129  0.0740  0 0.0e+00 : 804a4b8: fstps 0xffffffe4(%ebp)
 53350  3.4950  3  0.2085 : 804a4bb: flds 0xffffffe4(%ebp)
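
One way to cross-check what oprofile attributes to these two statements is to lift them out of speex_preprocess and time them in isolation. The following is a minimal sketch under stated assumptions, not the code from preprocessor.c: the function names and the time_smoothing helper are hypothetical, and the condition guarding the "zeta1 = zeta[i]" branch is not visible in the listing, so treating the first and last bins as the pass-through case is an assumption.

```c
#include <assert.h>
#include <time.h>

/* Recursive smoothing from the first listing:
   zeta[i] = .7*zeta[i] + .3*prior[i], for i = 1 .. N-1. */
void smooth_zeta(float *zeta, const float *prior, int N)
{
    assert(zeta && prior);
    for (int i = 1; i < N; i++)
        zeta[i] = .7 * zeta[i] + .3 * prior[i];
}

/* Three-tap smoothing from the second listing. Passing the edge bins
   through unchanged is an assumption (the guarding condition is not
   shown in the post). */
float zeta_smoothed(const float *zeta, int i, int N)
{
    assert(zeta && i >= 0 && i < N);
    if (i == 0 || i == N - 1)
        return zeta[i];
    return .25 * zeta[i - 1] + .5 * zeta[i] + .25 * zeta[i + 1];
}

/* Hypothetical helper: CPU seconds spent in reps calls of smooth_zeta. */
double time_smoothing(float *zeta, const float *prior, int N, long reps)
{
    clock_t t0 = clock();
    for (long r = 0; r < reps; r++)
        smooth_zeta(zeta, prior, N);
    return (double)(clock() - t0) / CLOCKS_PER_SEC;
}
```

Running time_smoothing once with arrays taken from a speech segment and once with arrays taken from a silent segment would show whether the cost depends on the data or only on the instruction sequence.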

I see that there are probably some optimizations that could be made when
using the preprocessor only for VAD: the reverse FFT, writing back
results, etc. could certainly be skipped, since if only VAD is enabled
there's no point in modifying the samples. But that isn't the
bulk of the consumption, assuming that what oprofile is telling me is
even close to correct.
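
The VAD-only shortcut could look roughly like this. It is only a sketch of the control flow, not the Speex implementation: the pp_state struct, its flag names, and the analyze, compute_vad and synthesize helpers are all hypothetical placeholders for the corresponding stages of the preprocessor.

```c
#include <assert.h>

/* Hypothetical state: which features are enabled, plus a counter that
   records how often the synthesis stage ran. */
typedef struct {
    int denoise_enabled;
    int agc_enabled;
    int vad_enabled;
    long synth_calls;
} pp_state;

/* Placeholder for the forward FFT and statistics update. */
static void analyze(pp_state *st, const float *x, int n)
{
    (void)st; (void)x; (void)n;
}

/* Placeholder for the speech/non-speech decision. */
static int compute_vad(pp_state *st)
{
    (void)st;
    return 1;
}

/* Placeholder for the reverse FFT and writing results back into x. */
static void synthesize(pp_state *st, float *x, int n)
{
    (void)x; (void)n;
    st->synth_calls++;
}

/* Returns the VAD decision (1 = speech); 1 when VAD is disabled. */
int preprocess_frame(pp_state *st, float *x, int n)
{
    int is_speech = 1;

    assert(st && x);
    analyze(st, x, n);               /* needed in every configuration */
    if (st->vad_enabled)
        is_speech = compute_vad(st);
    /* Rebuild and write back the samples only if some enabled feature
       actually modifies them. */
    if (st->denoise_enabled || st->agc_enabled)
        synthesize(st, x, n);
    return is_speech;
}
```

With only vad_enabled set, synth_calls stays at zero, so the cost of the reverse transform and the write-back is avoided for that configuration.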