[Speex-dev] Programming questions

Ashhar Farhan farhan at phonestack.com
Mon Jan 17 01:07:04 PST 2005


> I have to encode a 44100 Hz, 16-bit stereo stream, and I would like to encode
> it by using packets containing a duration of about 1/10 sec of unencoded
> speech (this value is not fixed and will be an integer multiple of the
> Speex encoder frame size to have a duration of about 1/10 of a second).
> The duration imposed by the Speex encoder frame size is a little too short
> for the application.

You are better off using the Ogg Vorbis codec. Speex is meant 
specifically for telephone-quality voice. It takes a single human voice and 
compresses it well, but it cannot handle multiple voices or music very well. 
For instance, a whistle can fade out completely after a while in Speex.

> My problems are the following:
>      0) Is it possible to manipulate voice data representing more than
>         the frame size of encoder (this seems possible in Speex Java code
>         used until now).

Whatever size your chunks are, for Speex you will have to repack 
them as 160 samples per frame.
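
A minimal sketch of that repacking, assuming 16-bit mono samples and the 
160-sample frame mentioned above. The names frame_packer, packer_init, 
packer_push and frame_cb are hypothetical helpers, not part of the Speex 
API; the callback is where a real program would hand the frame to the 
encoder.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FRAME_SAMPLES 160  /* frame size quoted in this reply */

/* Hypothetical helper: buffers samples until one full frame is ready. */
typedef struct {
    short  buf[FRAME_SAMPLES];
    size_t fill;               /* samples currently held in buf */
} frame_packer;

/* Hypothetical callback: receives exactly FRAME_SAMPLES samples. */
typedef void (*frame_cb)(const short *frame, void *user);

static void packer_init(frame_packer *p)
{
    p->fill = 0;
}

/* Feed a chunk of any length. on_frame (may be NULL) is called once per
   completed frame. Returns the number of frames completed; leftover
   samples stay buffered for the next call. */
static size_t packer_push(frame_packer *p, const short *in, size_t n,
                          frame_cb on_frame, void *user)
{
    size_t frames = 0;

    assert(p != NULL && (in != NULL || n == 0));
    while (n > 0) {
        size_t room = FRAME_SAMPLES - p->fill;
        size_t take = n < room ? n : room;

        memcpy(p->buf + p->fill, in, take * sizeof(short));
        p->fill += take;
        in += take;
        n -= take;

        if (p->fill == FRAME_SAMPLES) {
            if (on_frame != NULL)
                on_frame(p->buf, user);
            p->fill = 0;
            frames++;
        }
    }
    return frames;
}
```

Pushing a 400-sample chunk completes two frames and leaves 80 samples 
buffered; the next 80 samples complete a third frame.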

>      1) How can I specify to the encoder that I use 16 bits stereo data ?
>         The ctl function allows only to set the sampling rate. I have
>         found nowhere some details. In speexenc.c I have seen the use
>         of some related to stereo functions but without understanding
>         well how to use them.

Simply use two instances of the Speex state structure, one for each channel.
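
Before the two instances can be fed, the stereo data has to be split 
into one buffer per channel. A minimal sketch, assuming the input is 
interleaved 16-bit PCM (left, right, left, right, ...); 
deinterleave_stereo is a hypothetical helper, not a Speex function.

```c
#include <assert.h>
#include <stddef.h>

/* Split interleaved 16-bit stereo into two mono buffers.
   samples_per_channel is the number of left/right pairs in stereo;
   left and right must each hold that many samples. */
static void deinterleave_stereo(const short *stereo,
                                size_t samples_per_channel,
                                short *left, short *right)
{
    size_t i;

    assert(stereo != NULL && left != NULL && right != NULL);
    for (i = 0; i < samples_per_channel; i++) {
        left[i]  = stereo[2 * i];      /* even positions: left channel  */
        right[i] = stereo[2 * i + 1];  /* odd positions:  right channel */
    }
}
```

Each mono buffer then goes to its own encoder state, and the decoder 
side mirrors this with two decoder states.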

>      2) What is exactly the frame size ? It is an integer representing
>         a number of sample or a number of bytes or even a number of
>         shorts (in sampleenc.c this seems to be a number of shorts)

For sanity's sake, just accept that Speex will work only with 160 samples 
per frame.
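
To get from a 160-sample frame to a packet of about 1/10 second, count 
frames rather than bytes. The sketch below assumes an 8000 Hz sampling 
rate, which is not stated in this thread; frames_per_packet and 
packet_duration_ms are hypothetical helpers.

```c
#include <assert.h>

/* Number of whole frames closest to target_ms of audio, at least 1. */
static unsigned frames_per_packet(unsigned sample_rate_hz,
                                  unsigned frame_samples,
                                  unsigned target_ms)
{
    unsigned long target_samples;
    unsigned frames;

    assert(sample_rate_hz > 0 && frame_samples > 0);
    target_samples = (unsigned long)sample_rate_hz * target_ms / 1000UL;
    /* Round to the nearest whole number of frames. */
    frames = (unsigned)((target_samples + frame_samples / 2) / frame_samples);
    return frames > 0 ? frames : 1;
}

/* Duration in milliseconds of a packet holding the given frame count. */
static unsigned packet_duration_ms(unsigned sample_rate_hz,
                                   unsigned frame_samples,
                                   unsigned frames)
{
    assert(sample_rate_hz > 0);
    return (unsigned)((unsigned long)frames * frame_samples * 1000UL
                      / sample_rate_hz);
}
```

Under that assumption, one frame lasts 20 ms and a 100 ms packet holds 
five frames (800 samples).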

>      3) How (if this is possible) to decode a certain amount (known) of
>         coded data if these coded data represent more than the frame size
>         of encoder. This situation arises when the coded data for a
>         duration greater than the frame size of decoder are used. Should
>         I call some integer number of times the decoding procedure (like
>         it is done in the java code of Speex).

The decode function decodes the data 160 samples at a time, one frame 
per call. A bit stream can hold multiple frames (each decoding to 160 
samples), so yes, call it once for each frame in the packet.

>      4) What is the difference between "SPEEX_GET_FRAME_SIZE" and
>         "SPEEX_MODE_FRAME_SIZE" operations. This should not be identical
>         since these operations are provided by two different functions
>         but I don't see any explanation of a difference in the
>         documentation.
SPEEX_GET_FRAME_SIZE will return the size of the uncompressed frame (which 
is 160). It is the number of PCM samples that each Speex frame represents.

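Speex supports several bit rates. Hence, the 160 samples are compressed 
into a Speex frame whose size is determined by the selected mode of 
compression. SPEEX_MODE_FRAME_SIZE is a request for speex_mode_query(): 
it also reports the frame size in samples, but for a mode rather than 
for an encoder or decoder state. The size of the compressed frame varies 
with the sub-mode selected and is queried with SPEEX_SUBMODE_BITS_PER_FRAME.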

- farhan

> I would greatly appreciate it if somebody has answers to these questions.
> Thanks and best regards,
> Alain Aubord
> _______________________________________________
> Speex-dev mailing list
> Speex-dev at xiph.org
> http://lists.xiph.org/mailman/listinfo/speex-dev
