[Speex-dev] TI 6xxx platform performance
Jerry Trantow
jtrantow at ieee.org
Wed Jan 18 21:16:23 PST 2006
I'm trying to make a design decision between a TI 6416 or DM642 (fixed
point) and a 6713 (floating point) platform. The application is a 32-channel
speech encoder (CBR only, 8 kHz, 8 kbps).
To get a feel for the computational load, I am running 1 second (50 frames)
of voice through the encoder.
My profile of the 6416 shows 27.4M cycles per channel for that one second of
speech. I need to get below 720 MHz / 32 channels = 22.5M cycles per channel.
I did a little work on inner_prod() and normalize16(), and I'm confident I can
get to 32 channels by optimizing 5 or 6 functions. I expect these numbers to
carry over to the DM642.
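As a rough sketch of the budget arithmetic above: because the profile covers exactly one second of speech, the cycles counted for one channel can be compared directly against the clock rate divided by the channel count. The function names below are illustrative only and are not part of Speex or any TI library.

```c
#include <stdint.h>

/* Cycles available to each channel per second of audio,
 * assuming the whole CPU is shared evenly between channels. */
uint64_t cycle_budget_per_channel(uint64_t clock_hz, unsigned channels)
{
    if (channels == 0)
        return 0; /* no channels: nothing to budget */
    return clock_hz / channels;
}

/* Factor by which the measured cost must shrink to fit the budget.
 * A result of 1.0 or less means the encoder already fits. */
double required_speedup(uint64_t measured_cycles, uint64_t budget_cycles)
{
    return (double)measured_cycles / (double)budget_cycles;
}
```

Calling cycle_budget_per_channel(720000000, 32) gives the 22.5M figure, and required_speedup() with the 27412845 cycles measured for nb_encode shows that the 6416 build needs to run roughly 1.2 times faster.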
Symbol Name                    Count  cycle.Total:Incl.  cycle.Total:Excl.
compute_weighted_codebook        200            4511420            4511420
iir_mem2                         599            3338308            3338308
filter_mem2                      799            2323655            2323655
compute_impulse_response         200            1800518            1800518
pitch_gain_search_3tap           199            4726604            1744952
open_loop_nbest_pitch            199            4204121            1641016
vq_nbest                         800            1626252            1626252
lpc_to_lsp                        50            1612650            1558133
nb_encode                         50           27412845            1179551
fir_mem2                          50            1097300            1097300
inner_prod                     27469            1072299            1072299
split_cb_search_shape_sign_N1    200            7310588            1007711
normalize16                      597             303378             303378
A lower-cost option would be a floating-point 6713. I thought a 300 MHz
floating-point part would come out even or ahead in an encoding comparison.
Instead of the 300M / 32 = about 9.4M cycles per channel that I need, I see
71.5M cycles per channel!
Symbol Name                    Count  cycle.Total:Incl.  cycle.Total:Excl.
compute_weighted_codebook        200            8709029            8709029
filter_mem2                      799            8322224            8322224
inner_prod                     27469            5911396            5911396
vq_nbest                         800            5465094            5465094
iir_mem2                         599            5378906            5378906
split_cb_search_shape_sign_N1    200           18106210            3694787
compute_impulse_response         200            3084502            3084502
open_loop_nbest_pitch            199           18400309            2817913
pitch_gain_search_3tap           199            7002859            2696353
_spx_autocorr                     50            2211100            2211100
lsp_to_lpc                       450            2076854            2076854
nb_encode                         50           71523682            1938067
fir_mem2                          50            1777450            1777450
cheb_poly_eva                   9634            1564172            1564172
lsp_weight_quant                 100            1032600            1032600
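To see where the 6713 build loses the most ground, the exclusive cycle counts from the two profiles can be compared function by function. The sketch below does that for five functions that appear in both listings; the struct and function names are made up for illustration, and only the numbers come from the profiles.

```c
#include <stddef.h>
#include <stdio.h>

/* One row per function: exclusive cycle counts copied from the two profiles. */
struct profile_row {
    const char *name;
    unsigned long c6416_excl; /* fixed-point 6416 build */
    unsigned long c6713_excl; /* floating-point 6713 build */
};

const struct profile_row rows[] = {
    {"compute_weighted_codebook", 4511420UL, 8709029UL},
    {"filter_mem2",               2323655UL, 8322224UL},
    {"inner_prod",                1072299UL, 5911396UL},
    {"vq_nbest",                  1626252UL, 5465094UL},
    {"iir_mem2",                  3338308UL, 5378906UL},
};

#define ROW_COUNT (sizeof rows / sizeof rows[0])

/* How many times more cycles the 6713 build spends than the 6416 build. */
double slowdown(const struct profile_row *r)
{
    return (double)r->c6713_excl / (double)r->c6416_excl;
}

/* Returns the row with the largest slowdown, or NULL for an empty table. */
const struct profile_row *worst_slowdown(const struct profile_row *t, size_t n)
{
    const struct profile_row *worst = NULL;
    for (size_t i = 0; i < n; i++) {
        if (worst == NULL || slowdown(&t[i]) > slowdown(worst))
            worst = &t[i];
    }
    return worst;
}

/* Prints one line per function with its cycle ratio. */
void print_slowdowns(void)
{
    for (size_t i = 0; i < ROW_COUNT; i++)
        printf("%-28s %.2fx\n", rows[i].name, slowdown(&rows[i]));
}
```

Among these five, inner_prod shows the largest ratio (about 5.5 times the cycles), while the total for nb_encode is about 2.6 times higher. These are cycle counts, so the difference in clock rate between the two parts comes on top of them.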
Does this make sense?
I'm generating floating point code, using the optimizer, etc...
Has anyone posted DM642, C64xx or C67xx benchmarks?