[speex-dev] XScale realtime encoding possible?

MAL mal at komcept.com
Fri Nov 7 08:08:00 PST 2003



Jean-Marc Valin wrote:
> Hi,
> 
> I have replaced most (but not all) of the float operations by integer
> operations, but it seems like the remaining ones take a long long time
> when emulated in kernel space (hence high system time). The other
> problem is that I don't have access to an ARM-based device (anyone wants
> to send me one?), so I'm doing all this blind... If you'd like to help,
> it can also accelerate things.
> 
> The other thing is that you're probably pushing a bit too much with 44.1
> kHz, which would probably require around 250 MIPS even with the
> fixed-point completed. It could probably run in real-time but it would
> take lots of CPU and require asm optimizations. I suggest you try
> something like 16 kHz. I'm sure the quality will still be enough for you
> (much better than MP3 at 24 kbps anyway :)
> 
> Last thing, can you try the code that's in CVS right now? I removed many
> float ops since 1.1.1, so it may already work better. 

Current CVS, on a 16kHz, 16 bit, mono, 60 second sample:

# time speexenc -w --quality 0 --abr 28800 --comp 0 -V \
test-16kHz-60sec.wav test-16kHz-60sec.spx
Encoding 16000 Hz audio using wideband (sub-band CELP) mode (mono)
Bitrate is use: 35800 bps  (average 28564 bps)
real    3m39.017s
user    1m29.130s
sys     2m0.910s

I got the usage of --abr wrong before (28 means 28 bits per second :)
I'm also assuming quality=0/comp=0 produces reasonable output.

3 minutes 39 seconds is still a long way off realtime for a 60-second
clip (roughly 3.65x too slow), but it's a lot closer than 1.1.1 got.

What still worries me, though, is that even if I ignore the 2 minutes
spent in the kernel emulating floating point, there's 1 min 29 secs of
userspace time, which is still longer than the 60-second clip. Are
compiler optimisations going to shave off those 29 (or more) secs?
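
To make the gap concrete, here is a rough sketch of the arithmetic in Python. It only restates the `time` figures above; the helper names (to_seconds, realtime_factor) are made up for illustration and are not part of Speex.

```python
# Figures copied from the `time` output above; the clip is 60 seconds long.
CLIP_SECONDS = 60.0


def to_seconds(minutes, seconds):
    """Convert a `time`-style XmY.YYYs reading into seconds."""
    return minutes * 60 + seconds


def realtime_factor(elapsed, clip=CLIP_SECONDS):
    """Elapsed time divided by clip length; above 1.0 is slower than realtime."""
    return elapsed / clip


real = to_seconds(3, 39.017)
user = to_seconds(1, 29.130)
sys_ = to_seconds(2, 0.910)

for name, value in (("real", real), ("user", user), ("sys", sys_)):
    print(f"{name}: {value:.3f} s, {realtime_factor(value):.2f}x realtime")

# Fraction of userspace time that would have to disappear for user time
# alone to fit inside the clip length (kernel time ignored entirely).
needed_user_cut = 1 - CLIP_SECONDS / user
print(f"userspace must shrink by {needed_user_cut:.1%}")
```

So even with the kernel emulation removed completely, the userspace portion would need to drop by about a third just to break even, with nothing left over for anything else on the CPU.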

My current configure command is:

CFLAGS="-O3 -funroll-loops -ffast-math" CC=arm-linux-gcc \
CPP=arm-linux-cpp LD=arm-linux-ld RANLIB=arm-linux-ranlib \
STRIP=arm-linux-strip ./autogen.sh --host=arm-linux --prefix=/usr \
--with-ogg-dir=/opt/arcom/arm-linux \
--with-ogg-includes=/opt/arcom/arm-linux/include \
--with-ogg-lib=/opt/arcom/arm-linux/lib --enable-fixed-point

I will of course try without -ffast-math and -funroll-loops, as they can
decrease speed in some circumstances, but I'm open to further suggestions.

The machine it's running on has an XScale-PXA255 processor at 400MHz, 
with a 200MHz bus.
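
As a back-of-the-envelope check, the userspace time can be turned into a cycle count. This is only a sketch: it assumes the core ran at its nominal 400 MHz for the whole run and that all of the userspace time belongs to the encoder, neither of which I have measured.

```python
# Assumptions (not measured): the core runs at its nominal 400 MHz for the
# whole run, and all userspace time is spent in the encoder.
CPU_HZ = 400e6          # XScale PXA255 clock
USER_SECONDS = 89.130   # "user 1m29.130s" from the run above
CLIP_SECONDS = 60.0     # length of the test clip
SAMPLE_RATE = 16000     # samples per second, mono

# CPU cycles of userspace work spent per second of encoded audio.
cycles_per_audio_second = USER_SECONDS * CPU_HZ / CLIP_SECONDS
# The same figure expressed per input sample.
cycles_per_sample = cycles_per_audio_second / SAMPLE_RATE
# Share of the CPU that userspace alone would need to keep up in realtime.
budget_used = cycles_per_audio_second / CPU_HZ

print(f"{cycles_per_audio_second / 1e6:.1f} Mcycles per second of audio")
print(f"{cycles_per_sample:.1f} cycles per sample")
print(f"{budget_used:.0%} of the CPU for userspace alone")
```

Anything above 100% on the last line means the userspace part by itself cannot keep up at this clock speed.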

Regards,
MAL
