[Speex-dev] sanity check

khaynes at kirkgames.com khaynes at kirkgames.com
Tue Dec 12 18:39:40 PST 2006

It's working, and it's marginally intelligible, but not usable, so I thought I'd post a message to make sure I'm still sane.

I have the app capturing from the mic at 16 kHz. I'm using 3200-byte buffers to read the captured data, which is 100 ms of 16-bit sample data at 16 kHz (1600 samples).
When I pass this data unaltered to the playback stream it plays fine with maybe 200 ms of lag, which is what I'd expect.

Encoding: Quality 6, VBR

mResult = speex_encoder_ctl( mEncodeState, SPEEX_GET_FRAME_SIZE, & mFrameSize);

The encoder returns a frame size of 320. Sanity check 1: this means each frame is 320 samples of 16-bit audio, or 640 bytes of data, right?
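The arithmetic behind sanity check 1 can be sketched as below. This is a minimal sketch assuming mono 16-bit PCM; the helper names frame_bytes and frame_ms are made up for illustration and are not part of the Speex API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes occupied by one frame of 16-bit mono PCM.
 * frame_size_samples is the value reported by SPEEX_GET_FRAME_SIZE. */
static size_t frame_bytes(int frame_size_samples)
{
    assert(frame_size_samples > 0);
    return (size_t)frame_size_samples * sizeof(int16_t);
}

/* Duration of one frame in milliseconds at the given sample rate. */
static int frame_ms(int frame_size_samples, int sample_rate_hz)
{
    assert(frame_size_samples > 0 && sample_rate_hz > 0);
    return frame_size_samples * 1000 / sample_rate_hz;
}
```

With the numbers from this post, frame_bytes(320) gives 640 and frame_ms(320, 16000) gives 20, so each frame covers 20 ms of audio.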

So encoding my 100 ms, 3200 byte buffer of mic captured samples compresses 5 frames of source at 640 bytes (320 samples) each.
You were correct in that each frame of talking data compresses to around 70 bytes or less, which is roughly a 10:1 compression ratio, or around 320 to 350 bytes of compressed data per 100 ms (3200 bytes) of capture data.
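A small sketch of these figures, again with hypothetical helper names (frames_per_buffer, bitrate_bps) that do not come from Speex. It only checks the bookkeeping: that the capture buffer splits into a whole number of frames, and what bitrate a given packet size implies.

```c
#include <assert.h>
#include <stddef.h>

/* Number of whole codec frames in a capture buffer,
 * or -1 if the buffer is not an exact multiple of the frame size. */
static int frames_per_buffer(size_t buffer_bytes, size_t frame_bytes)
{
    assert(frame_bytes > 0);
    if (buffer_bytes % frame_bytes != 0)
        return -1;
    return (int)(buffer_bytes / frame_bytes);
}

/* Bitrate in bits per second implied by one packet of packet_bytes
 * being produced for every frame_ms milliseconds of audio. */
static long bitrate_bps(size_t packet_bytes, int frame_ms)
{
    assert(frame_ms > 0);
    return (long)packet_bytes * 8 * 1000 / frame_ms;
}
```

For the values above, frames_per_buffer(3200, 640) is 5, and bitrate_bps(70, 20) is 28000, i.e. 70-byte packets every 20 ms come to about 28 kbit/s. Five such packets total 350 bytes per 100 ms, a ratio of about 9:1 against the 3200 bytes of raw capture.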

I then immediately decode the 5 encoded packets to a 3200 byte buffer.
Then I play the sound in the output stream.
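With VBR each encoded packet can have a different length, so the decode side needs to know where each of the 5 packets starts and ends. The post does not say how the packets are stored between encode and decode; the following is just one possible scheme, a sketch that prefixes each packet with a one-byte length. The names put_packet and get_packet are hypothetical, and nothing here calls into Speex.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Append one packet to out as [1-byte length][payload].
 * Returns the new write offset, or 0 if the packet does not fit. */
static size_t put_packet(uint8_t *out, size_t out_cap, size_t offset,
                         const uint8_t *packet, size_t packet_len)
{
    assert(out != NULL && packet != NULL);
    if (packet_len > 255 || offset + 1 + packet_len > out_cap)
        return 0;
    out[offset] = (uint8_t)packet_len;
    memcpy(out + offset + 1, packet, packet_len);
    return offset + 1 + packet_len;
}

/* Locate the packet stored at offset. On success sets *packet and
 * *packet_len and returns the offset of the next packet; returns 0
 * when there is no complete packet at offset. */
static size_t get_packet(const uint8_t *in, size_t in_len, size_t offset,
                         const uint8_t **packet, size_t *packet_len)
{
    assert(in != NULL && packet != NULL && packet_len != NULL);
    if (offset >= in_len)
        return 0;
    size_t len = in[offset];
    if (offset + 1 + len > in_len)
        return 0;
    *packet = in + offset + 1;
    *packet_len = len;
    return offset + 1 + len;
}
```

The idea is that each packet is handed to the decoder with its own length, one packet per decoded frame, rather than treating the 5 packets as one fixed-size block.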

The sound is quite awful, and mostly unintelligible at best, but I can definitely tell that what's being compressed/decompressed 'feels' similar to what I'm saying into the mic, and on a rare occasion, it actually is marginally intelligible.

I have checked all of the return values for errors; there are none.
I have also tried non-VBR, and different quality settings including 10, but with no real difference.

I'm assuming that I'm mistaken about one or more of these figures and that's where the problem lies. Please send the men in white coats.

