[Speex-dev] Speex requirements on a TI Davinci / ARM926EJ-Sid(wb)

Todd Krein Todd.Krein at ooma.com
Mon Mar 3 13:02:17 PST 2008

(Resending since the archive seems to have eaten the text.)


I'm working on getting Speex running on a TI Davinci (6441), on both the
ARM and the DSP. My immediate goal is to understand the processing
requirements on different architectures so that I can properly specify
an engineering project.


I have it up and running on the ARM, and I just wanted to see if anyone
could sanity-check my results before I continue.


Brief version:

WB decode takes ~24MIPS, encode takes ~243MIPS.

NB decode takes ~10MIPS, encode takes ~102MIPS.

(And by MIPS, I mean millions of ARM CPU cycles per second, not Dhrystone MIPS.)


If I manually (because I couldn't figure out how to make configure do
it) add "-mcpu=arm926ej-s" and "-DSHORTCUTS", the wideband numbers drop
to 22MIPS and 219MIPS, respectively.
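
To put that change in perspective, here is a minimal sketch of the relative saving, using only the wideband figures quoted above. The helper name relative_reduction is made up for illustration.

```python
def relative_reduction(before, after):
    """Percentage drop from `before` to `after` (hypothetical helper)."""
    return (before - after) / before * 100.0

# Wideband figures quoted in this message, in millions of cycles per second.
for label, before, after in [("WB decode", 24, 22), ("WB encode", 243, 219)]:
    print(f"{label}: {relative_reduction(before, after):.1f}% fewer cycles")
```

Both cases come out at roughly 8 to 10 percent.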


More Details:


The commands/options I used to config Speex:


# export ARM_INSTALL_TREE=/extern/toddk/workdir/filesys
export CC=arm_v5t_le-gcc
export CXX=arm_v5t_le-g++
export AR=arm_v5t_le-ar
export RANLIB=arm_v5t_le-ranlib
export NM=arm_v5t_le-nm
export ac_cv_path_NM=arm_v5t_le-nm
export ac_cv_func_setpgrp_void=yes
export host=arm_v5t_le


./configure --prefix /extern/toddk/workdir/filesys/usr/local \
    --host=arm-linux --disable-float-api --disable-vbr --enable-fixed-point


The source data is from MP3 audio books that have been resampled to 16 kHz
and written out as .WAV files by NCH's Switch software. (It's hard to
get good wideband audio samples...)


For decode, I run the command "time speexdec sample.spx", then take the
sum of the "user" and "sys" times, multiply it by the processor clock
speed (297MHz), and divide by the "real" (wall-clock) time. That should
give me the average processing power requirement. (And the number matches
the results of "top".)
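
The decode calculation described above can be sketched as follows. This is only an illustration of the arithmetic: the function name decode_mips and the sample timings are hypothetical, not measurements from this test, and the 297 MHz default is the clock speed quoted above.

```python
def decode_mips(user_s, sys_s, real_s, clock_mhz=297.0):
    """Average millions of CPU cycles consumed per second of wall-clock time.

    user_s, sys_s, real_s are the three values printed by `time`, in seconds.
    """
    if real_s <= 0:
        raise ValueError("real_s must be positive")
    cpu_seconds = user_s + sys_s
    return cpu_seconds * clock_mhz / real_s

# Hypothetical readings, for illustration only (not measured values).
print(f"{decode_mips(user_s=3.0, sys_s=1.0, real_s=40.0):.1f} MIPS")
```

This assumes the decoder plays back in real time, so that the "real" time equals the playback duration of the sample.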


For encode, I run the command "time speexenc sample.wav - > /dev/null".
Since it consumes 99% of the CPU and runs faster than real time, the
MIPS requirement is calculated by taking the sum of the "user" and "sys"
times, multiplying it by the processor clock speed (297MHz), and dividing
by the playback time determined by the decode step.
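
The encode calculation differs only in the divisor: playback time replaces wall-clock time, because the encoder finishes faster than real time. A minimal sketch, with hypothetical function names and made-up sample timings:

```python
def encode_mips(user_s, sys_s, playback_s, clock_mhz=297.0):
    """Millions of CPU cycles needed per second of audio encoded.

    playback_s is the duration of the audio in seconds, taken here
    from the wall-clock time of the decode run.
    """
    if playback_s <= 0:
        raise ValueError("playback_s must be positive")
    return (user_s + sys_s) * clock_mhz / playback_s

def realtime_factor(user_s, sys_s, playback_s):
    """How many times faster than real time the encoder ran, by CPU time."""
    return playback_s / (user_s + sys_s)

# Hypothetical readings, for illustration only (not measured values).
print(f"{encode_mips(40.0, 10.0, 100.0):.1f} MIPS")
print(f"{realtime_factor(40.0, 10.0, 100.0):.1f}x real time")
```

A real-time factor below 1.0 would mean the processor cannot keep up with a live stream at this clock speed.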


Does this seem like a reasonable way to test the code?

Are the results in line with what people would expect?




Todd Krein

Director of Architecture


todd.krein at ooma.com

