[Speex-dev] Speex encoding/decoding producing garbled audio
Nathaniel Meyer
nath_meyer at hotmail.com
Tue Sep 14 14:40:12 PDT 2004
Whoops, left this message in my outbox. I managed to fix the problem.
Apparently I was only copying 160 bytes (the frame size in samples) back into
the audio stream when I should have been copying 320 bytes (chars <-> shorts
confused me there). That's why I could hear myself but it was distorted. Half
the wav was missing =)
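
For anyone who hits the same thing, here is a minimal sketch of the samples-versus-bytes mix-up, assuming 16-bit samples and the 160-sample narrowband frame mentioned above. The helper name copyFrameToStream and the constants are hypothetical, not part of Speex or DirectSound.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// One narrowband frame is 160 samples, as described in the post.
constexpr std::size_t kFrameSamples = 160;
// Each 16-bit sample occupies two bytes, so one frame is 320 bytes.
constexpr std::size_t kFrameBytes = kFrameSamples * sizeof(std::int16_t);

// Hypothetical helper: copy one decoded frame into a byte-addressed
// playback buffer and return the number of bytes written.
std::size_t copyFrameToStream(const std::int16_t* frame,
                              char* stream,
                              std::size_t streamBytes)
{
    assert(streamBytes >= kFrameBytes);
    // The size argument is a byte count, not a sample count. Passing
    // kFrameSamples here would copy only the first half of the frame.
    std::memcpy(stream, frame, kFrameBytes);
    return kFrameBytes;
}
```

Keeping the byte count derived from the sample count in one place makes it harder to pass the wrong one to memcpy.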
To answer some of the other questions here, for any insight into what I'm
doing:
Colin,
I'm using DirectSoundCapture to get the audio. I set it up so I only have
one capture buffer running and one stream buffer per person. Only active
people have their buffers running while others remain dormant. I noticed
Microsoft just fills the buffer with silence and continues playing it, but I
figure that's going to have an impact on performance. To eliminate garbled
playback, I also set a timer so that streamed audio only starts playing X
milliseconds after the first packet is received, somewhere around 300-500
milliseconds. That is enough time to give the stream a head start, unless
you have a really bad connection. But for testing purposes with this
API I'm feeding the captured audio directly into the encoder / decoder with
no network.
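
The head-start timer described above could be sketched roughly like this. The class name PrerollGate and its methods are hypothetical, and resetting the gate when a speaker goes dormant is an assumption rather than something the setup above spells out.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical pre-roll gate: playback is held back until a fixed delay
// has elapsed since the first packet of a stream arrived.
class PrerollGate {
public:
    explicit PrerollGate(std::uint32_t delayMs) : mDelayMs(delayMs) {}

    // Call for every received packet with the current time in milliseconds.
    void onPacket(std::uint32_t nowMs)
    {
        if (!mStarted) {
            mStarted = true;
            mFirstPacketMs = nowMs;
        }
    }

    // True once the delay has passed since the first packet.
    bool readyToPlay(std::uint32_t nowMs) const
    {
        // Unsigned subtraction stays correct across timer wraparound.
        return mStarted && (nowMs - mFirstPacketMs) >= mDelayMs;
    }

    // Assumption: call this when a speaker goes dormant, so the next
    // burst of packets gets its own head start.
    void reset() { mStarted = false; }

private:
    std::uint32_t mDelayMs;
    std::uint32_t mFirstPacketMs = 0;
    bool mStarted = false;
};
```

With a 300 ms delay, a stream whose first packet arrives at t = 1000 ms would start playing at t = 1300 ms. The time source is left to the caller so the gate can be driven by whatever timer the application already uses.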
Thanks for the data conversion tip too.
Reed,
DirectSound uses void* as a datatype. In general it takes bytes (for
playback), but when in 16-bit mode it expects shorts for capturing packets.
It accepts ranges from -128 -> 127 (8-bit mode) or -32768 -> 32767 (16-bit
mode). Currently I'm recording mono 8000Hz at 16-bits/sample and I'm
expecting the decoder to produce the same. With default settings the codec
reports a 1.875KB/sec bitrate, 8000Hz sample rate, and 16-bits/sample. Seems
appropriate.
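
As a sanity check on those numbers, here is the arithmetic as a small sketch. The 300 bits per encoded frame is an assumption, chosen because it is consistent with both the 38-byte packets and the 1.875KB/sec figure in this thread, and KB is taken as 1000 bytes. The function names are hypothetical.

```cpp
#include <cassert>

// Frames produced per second for a given sample rate and frame length.
constexpr int framesPerSecond(int sampleRateHz, int samplesPerFrame)
{
    return sampleRateHz / samplesPerFrame;
}

// Whole bytes needed to carry one encoded frame (bits rounded up).
constexpr int packetBytes(int bitsPerFrame)
{
    return (bitsPerFrame + 7) / 8;
}

// Codec payload rate in bytes per second, before per-packet rounding.
constexpr int codecBytesPerSecond(int bitsPerFrame, int fps)
{
    return bitsPerFrame * fps / 8;
}

// 8000 Hz mono with 160-sample frames gives 50 frames per second.
static_assert(framesPerSecond(8000, 160) == 50, "20 ms frames");
// Assumed 300 bits per frame rounds up to the 38-byte packets seen.
static_assert(packetBytes(300) == 38, "38-byte packets");
// 300 bits * 50 frames / 8 = 1875 bytes per second, i.e. 1.875 KB/sec.
static_assert(codecBytesPerSecond(300, 50) == 1875, "1.875 KB/sec");
```

For comparison, the raw capture stream at 8000 samples/sec and 2 bytes/sample is 16000 bytes per second.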
Nate
> ----- Original Message -----
> From: "Nathaniel Meyer" <nath_meyer at hotmail.com>
> To: <speex-dev at xiph.org>
> Sent: Sunday, September 12, 2004 7:03 PM
> Subject: [Speex-dev] Speex encoding/decoding producing garbled audio
>
>
>> I'm getting garbled playback with decoded fragments and I'm hoping
>> someone here can point me in the right direction to correcting the
>> problem.
>>
>>
>>
>> Essentially I'm capturing audio from the microphone. I stream it over the
>> net, but for testing purposes with this API I'm just grabbing the whole
>> chunk and encoding / decoding it right away and then updating the sound
>> buffer for playback. The playback sounds very scratchy with a bit of a
>> buzz sound and some skipping; yet I can still make it out somewhat. At
>> first I thought maybe I was doing data conversion between bytes and shorts
>> incorrectly, so I temporarily moved over to a short-based system. Still
>> the problem persisted, so perhaps it could be a setting or two I'm
>> missing? I posted my code below, demonstrating how I'm encoding and
>> decoding the buffers. I can't see anything wrong with it, so I'm guessing
>> my problem lies elsewhere. If anyone experienced a similar problem
>> beforehand, it would be nice to know what I could be doing wrong. As far
>> as the system itself, I can perfectly record audio at any channel
>> setting, sample rate, or bit-rate and play it back fine.
>>
>>
>>
>> - I'm using Speex version 1.1.6. I've also used 1.0.4 beforehand and
>> experienced the same problem with it.
>>
>>
>>
>> 1) I initialize the bits, encoder, and decoder as normal (default
>> settings seemed appropriate):
>>
>>     speex_bits_init(&mBits);
>>     mEncode = speex_encoder_init(&speex_nb_mode);
>>     mDecode = speex_decoder_init(&speex_nb_mode);
>>
>>
>>
>> 2) I record my audio at mono 8000Hz, 16bits per sample.
>>
>>
>>
>> 3) I encode frame-sized (320 bytes) fragments. Since I deal only with
>> char data types, I convert to 2-byte short values first and then set the
>> float buffer.
>>
>> char *CSpeex::encode (char *buffer, int size, int &encodeSize)
>> {
>>     char *encodedBuffer = new char[160];
>>     short speexShort;
>>     float *speexFloat = new float[160];
>>
>>     // Convert the audio to a short then to a float buffer
>>     for (int i = 0; i < 160; i++)
>>     {
>>         memcpy(&speexShort, &buffer[i*2], sizeof(short));
>>         speexFloat[i] = speexShort;
>>     }
>>
>>     // Encode the sound data using the float buffer
>>     speex_bits_reset(&mBits);
>>     speex_encode(mEncode, speexFloat, &mBits);
>>     encodeSize = speex_bits_write(&mBits, encodedBuffer, 160);
>>     delete[] speexFloat;
>>
>>     // Return the encoded buffer
>>     return encodedBuffer;
>> }
>>
>>
>>
>> 4) I immediately decode the encoded buffer. Encoded size is always 38
>> bytes for this sample set and expected decoded size is 320 bytes.
>>
>> char *CSpeex::decode (char *buffer, int encodeSize)
>> {
>>     char *decodedBuffer = new char[320];
>>     short speexShort;
>>     float *speexFloat = new float[160];
>>
>>     // Decode the sound data into a float buffer
>>     speex_bits_reset(&mBits);
>>     speex_bits_read_from(&mBits, buffer, encodeSize);
>>     speex_decode(mDecode, &mBits, speexFloat);
>>
>>     // Convert from float to short to char
>>     for (int i = 0; i < 160; i++)
>>     {
>>         speexShort = speexFloat[i];
>>         memcpy(&decodedBuffer[i*2], &speexShort, sizeof(short));
>>     }
>>     delete[] speexFloat;
>>
>>     // Return the buffer
>>     return decodedBuffer;
>> }
>>
>>
>>
>>
>>
>> Hope no one minds the source post. I'm really stumped on this one, but
>> the benefits of using Speex versus the bloat offered in the competitors
>> are well worth the hassle. I'm looking forward to incorporating this into
>> several games for VoIP support.
>>
>>
>>
>>
>>
>> Thanks.
>
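
The float-to-short step in the quoted decode loop assigns the float straight to a short, which truncates and is not safe if a decoded sample falls outside the 16-bit range. A minimal sketch of a rounding, clamping version follows, assuming float samples scaled to the 16-bit range as in the quoted code and native byte order in the char buffer. The helper names floatToPcm16, storePcm16 and loadPcm16 are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: round a decoded float sample to the nearest
// integer and clamp it to the signed 16-bit range.
std::int16_t floatToPcm16(float sample)
{
    if (std::isnan(sample)) return 0;
    const float rounded = std::round(sample);
    if (rounded > 32767.0f) return 32767;
    if (rounded < -32768.0f) return -32768;
    return static_cast<std::int16_t>(rounded);
}

// Write sample number `index` into a byte buffer (two bytes per sample,
// native byte order), mirroring the memcpy used in the quoted code.
void storePcm16(char* dest, std::size_t index, std::int16_t value)
{
    std::memcpy(dest + index * sizeof(value), &value, sizeof(value));
}

// Read sample number `index` back out of a byte buffer.
std::int16_t loadPcm16(const char* src, std::size_t index)
{
    std::int16_t value = 0;
    std::memcpy(&value, src + index * sizeof(value), sizeof(value));
    return value;
}
```

Inside the decode loop this would replace the direct assignment with something like storePcm16(decodedBuffer, i, floatToPcm16(speexFloat[i])).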