[Speex-dev] TI 6xxx platform performance

Jerry Trantow jtrantow at ieee.org
Wed Jan 18 21:16:23 PST 2006


I'm trying to make a design decision between a TI 6416 or DM642 (fixed
point) and a 6713 (floating point) platform.  The application is a 32-channel
speech encoder (CBR only, 8 kHz, 8 kbps).

To get a feel for the computational load, I am running 1 second (50 frames)
of voice through the encoder.

My profile of the 6416 indicates I'm at 27.4M cycles/channel.  I need to get
below 720 MHz / 32 channels = 22.5M cycles per channel.  I did a little work
on inner_prod() and normalize16() and I'm confident I can get 32 channels by
optimizing 5 or 6 functions.  I expect these numbers to carry over to the
DM642.

Symbol Name                     Count  cycle.Total:Incl.  cycle.Total:Excl.
compute_weighted_codebook         200            4511420            4511420
iir_mem2                          599            3338308            3338308
filter_mem2                       799            2323655            2323655
compute_impulse_response          200            1800518            1800518
pitch_gain_search_3tap            199            4726604            1744952
open_loop_nbest_pitch             199            4204121            1641016
vq_nbest                          800            1626252            1626252
lpc_to_lsp                         50            1612650            1558133
nb_encode                          50           27412845            1179551
fir_mem2                           50            1097300            1097300
inner_prod                      27469            1072299            1072299
split_cb_search_shape_sign_N1     200            7310588            1007711
normalize16                       597             303378             303378
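
As a rough sanity check on the "5 or 6 functions" estimate, the budget
arithmetic can be sketched in C.  The helper names are made up for
illustration and are not part of Speex, and the sketch assumes that all of
the savings come out of the exclusive cycles of the chosen functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Cycles available to each channel per second of audio
 * (hypothetical helper, not a Speex function). */
static double cycles_per_channel(double clock_hz, unsigned channels)
{
    assert(channels > 0);
    return clock_hz / channels;
}

/* Fraction by which the listed functions' combined exclusive cycles
 * must shrink for the total to meet the budget.  Assumes nothing
 * outside the list gets faster.  Returns 0 if already within budget. */
static double required_hotspot_reduction(double total_cycles,
                                         double budget_cycles,
                                         const uint32_t *excl_cycles,
                                         size_t n)
{
    double hot = 0.0;
    for (size_t i = 0; i < n; i++)
        hot += excl_cycles[i];
    assert(hot > 0.0);

    if (total_cycles <= budget_cycles)
        return 0.0;
    return (total_cycles - budget_cycles) / hot;
}
```

Fed the six largest exclusive counts from the 6416 profile, the total of
27412845 cycles and the 22.5M budget, this comes out at roughly 0.32: those
six functions together would have to shed about a third of their cycles.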

A lower cost option would be to use a floating point 6713.  I thought that a
300 MHz floating point part would come out even or ahead in an encoding
comparison.  Instead of the 300M/32 = 9.4M cycles per channel that I need, I
see 71.5M cycles per channel!!!

Symbol Name                     Count  cycle.Total:Incl.  cycle.Total:Excl.
compute_weighted_codebook         200            8709029            8709029
filter_mem2                       799            8322224            8322224
inner_prod                      27469            5911396            5911396
vq_nbest                          800            5465094            5465094
iir_mem2                          599            5378906            5378906
split_cb_search_shape_sign_N1     200           18106210            3694787
compute_impulse_response          200            3084502            3084502
open_loop_nbest_pitch             199           18400309            2817913
pitch_gain_search_3tap            199            7002859            2696353
_spx_autocorr                      50            2211100            2211100
lsp_to_lpc                        450            2076854            2076854
nb_encode                          50           71523682            1938067
fir_mem2                           50            1777450            1777450
cheb_poly_eva                    9634            1564172            1564172
lsp_weight_quant                  100            1032600            1032600
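
To put the two profiles side by side, here is a small C sketch that computes
the per-function cycle ratio between the platforms and how far the 6713
total is over its budget.  The struct and function names are hypothetical;
the numbers are the exclusive cycle counts quoted in the two profiles, and
only five functions that appear in both are listed.

```c
#include <assert.h>
#include <stdint.h>

/* Exclusive cycle counts for one function on each platform
 * (hypothetical struct, values copied from the two profiles). */
struct profile_pair {
    const char *name;
    uint32_t c6416_excl;
    uint32_t c6713_excl;
};

static const struct profile_pair shared[] = {
    { "compute_weighted_codebook", 4511420, 8709029 },
    { "filter_mem2",               2323655, 8322224 },
    { "inner_prod",                1072299, 5911396 },
    { "vq_nbest",                  1626252, 5465094 },
    { "iir_mem2",                  3338308, 5378906 },
};

/* How many times more cycles the 6713 build spends than the 6416 build. */
static double cycle_ratio(uint32_t c6713, uint32_t c6416)
{
    assert(c6416 > 0);
    return (double)c6713 / (double)c6416;
}

/* Measured cycles per channel divided by the per-channel budget
 * (clock rate split evenly across channels). */
static double overrun_factor(double measured_cycles, double clock_hz,
                             unsigned channels)
{
    assert(channels > 0);
    return measured_cycles / (clock_hz / channels);
}
```

With these inputs, inner_prod costs about 5.5 times as many cycles on the
6713 as on the 6416, and the 71523682-cycle total is about 7.6 times the
per-channel budget at 300 MHz and 32 channels.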


Does this make sense?
I'm generating floating point code, with the optimizer enabled, etc.

Has anyone posted DM642, C64xx or C67xx benchmarks?
 


