[Speex-dev] Speex encoding/decoding producing garbled audio

Nathaniel Meyer nath_meyer at hotmail.com
Tue Sep 14 14:40:12 PDT 2004


Whoops, left this message in my outbox. I managed to fix the problem.
I was only copying 160 bytes back into the audio stream, but a frame is
160 samples, which is 320 bytes at 16 bits per sample (chars <-> shorts
confused me there). That's why I could hear myself yet it was distorted:
half of the waveform was missing =)
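
For anyone hitting the same bytes-versus-samples mix-up, here is a minimal sketch of the size arithmetic. The helper name copyFrame and the std::vector stream buffer are hypothetical and stand in for whatever the real DirectSound buffer code does; only the 160-sample frame size comes from the message above.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Speex narrowband frame size in samples (20 ms at 8000 Hz).
constexpr std::size_t kFrameSamples = 160;
// Each 16-bit sample occupies two bytes, so one frame is 320 bytes.
constexpr std::size_t kFrameBytes = kFrameSamples * sizeof(std::int16_t);

// Hypothetical helper: copy one decoded frame into a byte-addressed
// stream buffer at the given byte offset. Returns the next write offset,
// or the unchanged offset if the frame does not fit.
std::size_t copyFrame(std::vector<char> &stream, std::size_t offset,
                      const std::int16_t *frame)
{
    if (offset + kFrameBytes > stream.size())
        return offset;  // not enough room; the caller decides what to do
    // Copy kFrameBytes (320), not kFrameSamples (160): copying the
    // sample count as a byte count drops half of every frame.
    std::memcpy(stream.data() + offset, frame, kFrameBytes);
    return offset + kFrameBytes;
}
```

Keeping the byte count in a named constant derived from the sample count makes it harder to pass the wrong one to memcpy.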

To answer some of the other questions here, for any insight into what I'm 
doing:

Colin,

I'm using DirectSoundCapture to get the audio. I set it up so I only have
one capture buffer running and one stream buffer per person. Only active
people have their buffers running while others remain dormant. I noticed
Microsoft just fills the buffer with silence and continues playing it, but I
figure that's going to have an impact on performance. To eliminate garbled
playback, I also set a timer so that streamed audio only starts playing X
milliseconds after the first packet is received. I use somewhere around
300-500 milliseconds, which is enough to give the stream a head start
unless you have a really bad connection. For testing purposes with this
API, though, I'm feeding the captured audio directly into the encoder /
decoder with no network.
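
The head-start timer described above could look roughly like the following. This is only a sketch under assumptions: the class name PlaybackGate and its methods are hypothetical, and the real code presumably ties into DirectSound buffer playback rather than a standalone clock check.

```cpp
#include <chrono>

// Hypothetical playback gate: holds back playback until a fixed delay
// has elapsed since the first packet of a stream arrived.
class PlaybackGate {
public:
    using Clock = std::chrono::steady_clock;

    explicit PlaybackGate(std::chrono::milliseconds delay) : delay_(delay) {}

    // Call for every packet received; only the first call starts the timer.
    void onPacket(Clock::time_point now) {
        if (!started_) {
            first_ = now;
            started_ = true;
        }
    }

    // True once at least the configured delay has passed since the
    // first packet. Always false before any packet has arrived.
    bool readyToPlay(Clock::time_point now) const {
        return started_ && (now - first_) >= delay_;
    }

    // Call when a speaker goes dormant so the next burst is buffered again.
    void reset() { started_ = false; }

private:
    std::chrono::milliseconds delay_;
    Clock::time_point first_{};
    bool started_ = false;
};
```

A gate constructed with std::chrono::milliseconds(300) would match the lower end of the 300-500 millisecond range mentioned above. Passing the current time in as a parameter keeps the class easy to test without sleeping.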

Thanks for the data conversion tip too.


Reed,

DirectSound uses void* as a datatype. In general it takes bytes (for
playback), but in 16-bit mode it expects shorts when capturing.
It accepts ranges from -128 -> 127 (8-bit mode) or -32768 -> 32767 (16-bit
mode). Currently I'm recording mono 8000Hz at 16 bits/sample and I'm
expecting the decoder to produce the same. With default settings it operates
at a 1.875KB/sec bitrate, 8000Hz sample rate, and 16 bits/sample. Seems
appropriate.
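
As a sanity check on those figures, the arithmetic can be sketched as below. It assumes that 1.875KB/sec means 1875 bytes per second and that a frame is 160 samples. All constant names are hypothetical. The result is consistent with the 38-byte encoded frames reported in the original message.

```cpp
// Figures taken from the message: 8000 Hz mono, 160-sample frames.
constexpr int kSampleRate = 8000;
constexpr int kFrameSamples = 160;

// Assumption: "1.875KB/sec" is read as 1875 bytes per second.
constexpr int kEncodedBytesPerSecond = 1875;

// 8000 / 160 = 50 frames per second (20 ms per frame).
constexpr int kFramesPerSecond = kSampleRate / kFrameSamples;

// 1875 * 8 / 50 = 300 bits per encoded frame.
constexpr int kBitsPerFrame = kEncodedBytesPerSecond * 8 / kFramesPerSecond;

// 300 bits is 37.5 bytes, which rounds up to 38 whole bytes per frame.
constexpr int kEncodedBytesPerFrame = (kBitsPerFrame + 7) / 8;

// Uncompressed 16-bit mono audio for comparison: 8000 * 2 = 16000 bytes/sec.
constexpr int kRawBytesPerSecond = kSampleRate * 2;
```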

Nate


> ----- Original Message ----- 
> From: "Nathaniel Meyer" <nath_meyer at hotmail.com>
> To: <speex-dev at xiph.org>
> Sent: Sunday, September 12, 2004 7:03 PM
> Subject: [Speex-dev] Speex encoding/decoding producing garbled audio
>
>
>> I'm getting garbled playback with decoded fragments and I'm hoping 
>> someone here can point me in the right direction to correcting the 
>> problem.
>>
>>
>>
>> Essentially I'm capturing audio from the microphone. I stream it over the 
>> net, but for testing purposes with this API I'm just grabbing the whole 
>> chunk and encoding / decoding it right away and then updating the sound 
>> buffer for playback. The playback sounds very scratchy with a bit of a 
>> buzz sound and some skipping; yet I can still make it out somewhat. At 
>> first I thought maybe I was doing data conversion between bytes and shorts 
>> incorrectly, so I temporarily moved over to a short-based system. Still 
>> the problem persisted, so perhaps it could be a setting or two I'm 
>> missing? I posted my code below, demonstrating how I'm encoding and 
>> decoding the buffers. I can't see anything wrong with it, so I'm guessing 
>> my problem lies elsewhere. If anyone experienced a similar problem 
>> beforehand, it would be nice to know what I could be doing wrong. As far 
>> as the system itself, I can perfectly record audio at any channel 
>> setting, sample rate, or bit-rate and play it back fine.
>>
>>
>>
>> - I'm using Speex version 1.1.6. I've also used 1.0.4 beforehand and 
>> experienced the same problem with it.
>>
>>
>>
>> 1) I initialize the bits, encoder, and decoder as normal (default 
>> settings seemed appropriate):
>>
>> speex_bits_init(&mBits);
>> mEncode = speex_encoder_init(&speex_nb_mode);
>> mDecode = speex_decoder_init(&speex_nb_mode);
>>
>>
>>
>> 2) I record my audio at mono 8000Hz, 16bits per sample.
>>
>>
>>
>> 3) I encode frame-sized (320 bytes) fragments. Since I deal only with 
>> char data types, I convert to 2-byte short values first and then set the 
>> float buffer.
>>
>> char *CSpeex::encode (char *buffer, int size, int &encodeSize)
>> {
>>      char *encodedBuffer = new char[160];
>>      short speexShort;
>>      float *speexFloat = new float[160];
>>
>>      // Convert the audio to a short then to a float buffer
>>      for (int i = 0; i < 160; i++)
>>      {
>>            memcpy(&speexShort, &buffer[i*2], sizeof(short));
>>            speexFloat[i] = speexShort;
>>      }
>>
>>      // Encode the sound data using the float buffer
>>      speex_bits_reset(&mBits);
>>      speex_encode(mEncode, speexFloat, &mBits);
>>      encodeSize = speex_bits_write(&mBits, encodedBuffer, 160);
>>      delete[] speexFloat;
>>
>>      // Return the encoded buffer
>>      return encodedBuffer;
>> }
>>
>>
>>
>> 4) I immediately decode the encoded buffer. Encoded size is always 38 
>> bytes for this sample set and expected decoded size is 320 bytes
>>
>> char *CSpeex::decode (char *buffer, int encodeSize)
>> {
>>      char *decodedBuffer = new char[320];
>>      short speexShort;
>>      float *speexFloat = new float[160];
>>
>>      // Decode the sound data into a float buffer
>>      speex_bits_reset(&mBits);
>>      speex_bits_read_from(&mBits, buffer, encodeSize);
>>      speex_decode(mDecode, &mBits, speexFloat);
>>
>>      // Convert from float to short to char
>>      for (int i = 0; i < 160; i++)
>>      {
>>            speexShort = speexFloat[i];
>>            memcpy(&decodedBuffer[i*2], &speexShort, sizeof(short));
>>      }
>>      delete[] speexFloat;
>>
>>      // Return the buffer
>>      return decodedBuffer;
>> }
>>
>>
>>
>>
>>
>> Hope no one minds the source post. I'm really stumped on this one, but 
>> the benefits of using Speex versus the bloat offered in the competitors 
>> are well worth the hassle. I'm looking forward to incorporating this into 
>> several games for VoIP support.
>>
>>
>>
>>
>>
>> Thanks.
>> _______________________________________________
>> Speex-dev mailing list
>> Speex-dev at xiph.org
>> http://lists.xiph.org/mailman/listinfo/speex-dev
>>
> 

