[Speex-dev] Shoehorning speex is confusing a newbie

Michael Jacobson Michael.Jacobson at ultratec.com
Tue Jul 24 13:59:16 PDT 2007


I thought I should be getting 20 words because in the Windows app the
command line repeats "nbBytes = 20" through the entire decode phase.  I
wonder how the frames got misaligned... Right now I get 10 words of data
followed by 10 words of zeros, which I look at through a memory window in
Code Composer.  I am pretty sure a terminator word also gets in there, so
that would make 11 words in use.  What part of the encoding phase sets
the size of the encoded data, and how do I change that to 20 instead of
10?
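
A rough sketch of the arithmetic behind "nbBytes = 20", for reference.
It assumes the 8 kbps narrowband mode with 20 ms frames (160 samples at
8 kHz), and it assumes the C54x build packs two 8-bit bytes into each
16-bit word.  The helper names are made up for illustration and are not
part of speex.  Under those assumptions, 10 words on the DSP would hold
the same 160 bits as 20 bytes on the PC.

```c
#include <assert.h>

/* Bits in one encoded frame: bitrate (bits/s) times frame length (ms). */
long frame_bits(long bitrate_bps, long frame_ms)
{
    assert(bitrate_bps > 0 && frame_ms > 0);
    return bitrate_bps * frame_ms / 1000;
}

/* Storage units needed to hold `bits`, rounded up to a whole unit. */
long units_for_bits(long bits, long bits_per_unit)
{
    assert(bits >= 0 && bits_per_unit > 0);
    return (bits + bits_per_unit - 1) / bits_per_unit;
}
```

Calling units_for_bits(frame_bits(8000, 20), 8) gives the byte count on
a machine with 8-bit bytes, and passing 16 instead of 8 gives the count
in 16-bit words.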
 
Well, the project doesn't have a "main loop" per se... It uses the DSP
BIOS that comes with Code Composer, which manages all the main looping
parts.  I'm running off a HW timer interrupt that is set for 8 kHz and
is constantly taking sound in from the mic and outputting to the
speaker.  When I push a button, it starts collecting data from the mic
into a 160-word buffer.  When that buffer is full I trigger a SW
interrupt and switch to another buffer (ping-pong).  The software
interrupt used to do G729A, but I have it doing speex instead: speex
encodes from the full buffer and places the result into the recording
buffer.  When the recording buffer is full it stops recording, and when
I hit another button the HW interrupt is told to start outputting to the
speaker.  When decoding is needed I set a SW interrupt again, and data
is taken from the recording buffer, decoded, and placed in a playing
buffer.  When the playing buffer is halfway through, I decode the next
chunk into another buffer (ping-pong), and it does that until I run out
of data to decode.  Our ADCs and DACs are 16-bit, so I don't think I
need to worry about bytes vs. words in the output.
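
The ping-pong capture described above might look roughly like the
following.  This is only a sketch: the type and function names are
hypothetical, the real project posts a DSP/BIOS software interrupt where
this code simply returns a pointer, and it assumes the compiler provides
<stdint.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FRAME_SAMPLES 160  /* 20 ms at 8 kHz */

/* Two frame buffers: the ISR fills one while the codec reads the other. */
typedef struct {
    int16_t buf[2][FRAME_SAMPLES];
    int     fill;   /* index of the buffer being filled (0 or 1) */
    int     count;  /* samples written into that buffer so far   */
} PingPong;

void pingpong_init(PingPong *pp)
{
    assert(pp != NULL);
    pp->fill = 0;
    pp->count = 0;
}

/* Call once per sample from the timer interrupt.  Returns a pointer to a
 * completed frame when the active buffer fills up, otherwise NULL. */
const int16_t *pingpong_push(PingPong *pp, int16_t sample)
{
    assert(pp != NULL);
    pp->buf[pp->fill][pp->count++] = sample;
    if (pp->count < FRAME_SAMPLES)
        return NULL;
    const int16_t *full = pp->buf[pp->fill];
    pp->fill ^= 1;   /* swap to the other buffer */
    pp->count = 0;
    return full;     /* the caller would trigger the encode here */
}
```

The encode has to finish before the other buffer fills, i.e. within one
20 ms frame, or the frame being read gets overwritten.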
 
I don't know if you care about any of that, but maybe it'll help.  It's
not exactly an idle loop, but it's close enough that it should be able
to work the same, right?  The only deviations from testenc are that I
don't use the callbacks, set the complexity, or disable VBR, and I have
my own bit buffer so I don't rewind or reset the bits.
 
-Mike

>>> "Jim Crichton" <jim.crichton at comcast.net> 07/24/07 3:19 PM >>>

Mike,
 
Generally "Invalid mode encountered" == "frames are misaligned"
 
You should be getting 20 bytes from the encoder each time, and passing
20 bytes to the decoder each time.  Is it correct that you have modeled
your  main loop after testenc-TI-c5x.c?
 
If you look at encoded silence with a binary editor, you should be able
to see the 20-byte repetition pattern.  You can also use the sample
simulator build, if you can capture your encoder output to a file. 
There is a DECODE_ONLY switch in that build, that lets you run just that
portion.
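
One way to automate the binary-editor check is sketched below.  It
assumes the encoder output has been captured as a stream of 8-bit bytes
with exactly 20 bytes per frame, and that each narrowband frame starts
with a 1-bit wideband flag (0) followed by a 4-bit submode ID, packed
most significant bit first, so a mode 3 frame would begin with the bits
00011.  The function names are hypothetical, not speex API.

```c
#include <assert.h>
#include <stddef.h>

/* Top five bits of a frame's first byte: wideband flag + 4-bit submode. */
unsigned frame_header(const unsigned char *frame)
{
    return (frame[0] >> 3) & 0x1Fu;
}

/* Scan frames at a fixed byte stride and compare each header against
 * `expected`.  Returns the index of the first frame that does not match,
 * or -1 if every complete frame matches. */
long first_misaligned_frame(const unsigned char *data, size_t len,
                            size_t frame_bytes, unsigned expected)
{
    assert(data != NULL && frame_bytes > 0);
    size_t nframes = len / frame_bytes;
    for (size_t i = 0; i < nframes; i++) {
        if (frame_header(data + i * frame_bytes) != expected)
            return (long)i;
    }
    return -1;
}
```

With a stride of 20 and an expected header of 3, a return value of -1
would suggest the frames line up; any other value points at the first
frame where the stride and the data disagree.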
 
- Jim 
----- Original Message ----- 
From: Michael Jacobson 
To: Jim Crichton ; speex-dev at xiph.org 
Sent: Tuesday, July 24, 2007 3:48 PM
Subject: Re: [Speex-dev] Shoehorning speex is confusing a newbie


Jean-Marc was correct in that the 16-bit value was the culprit for my
encoding woes.  After I changed that to a 32-bit value I believe it
encodes correctly, but I don't really have a way to know this for
certain.
 
I am using 1.2beta2.  I would use the testenc program; I have
looked it over and based a lot of what I am doing on that code, but the
project I am developing this on is a little too integrated to be able to
start from scratch, so I believe I'm going to have to find a way to get
it to work in this-here-doohickey.
 
Jim thank you for the files I'm sure they will come in handy.  I have
currently hacked out enough of the project that is not required for this
proof-of-concept test so that everything seems to fit properly at the
moment and I am able to encode over a second and a half of audio to
test.  I will be looking over the code shortly to see what I can use.
 
But I'm still having problems.
 
I decided to run my test now that encode "works" and I get a hopeful
result: something I encoded and then decoded came back, and I could hear
it and understand it!  Unfortunately the quality is so poor it
sounds like a Cylon, and I know speex is much better than this because I
have a Windows version that works and the quality is quite high (by
quality I mean how it sounds; both systems are using a quality of 4 with
8 kHz and 8 kbps voice).  I think I've narrowed down where the problem is
coming from, but I have no idea why it is happening.
 
My problem is that when I go to decode my data, the decoder doesn't do
anything a lot of the time because it goes here:
 
         if (speex_bits_remaining(bits)<5)
            return -1;
         wideband = speex_bits_unpack_unsigned(bits, 1);
         if (wideband) /* Skip wideband block (for compatibility) */
         {
            //int submode;
            int advance;
            advance = /*submode =*/ speex_bits_unpack_unsigned(bits, SB_SUBMODE_BITS);
            speex_mode_query(&speex_wb_mode, SPEEX_SUBMODE_BITS_PER_FRAME, &advance);
            if (advance < 0)
            {
               speex_notify("Invalid mode encountered. The stream is corrupted.");
               return -2; //<----------------------*this is where I drop out*
            }
and drops out with a return of -2, because advance < 0 after wideband is
read as 1.  I don't know why it does this.  My guess is that the decoder
thinks the frame is wideband, but the data wasn't encoded as wideband, so
it bails out with an error.  But if the encode didn't work, why would it
decode some of the time and not all of the time?  I've also noticed that
the encoder always encodes the 160 words into 11 words, even though I
allocated 21 words for each frame.  The guy who did the Windows app said
that the encoding portion uses a dynamic number of bytes, so I don't know
if he is wrong or if my encoder isn't doing what it should.  If you have
any spare time to reply I would really appreciate it.
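
The header read that fails above can be mimicked with a tiny bit reader.
This is a hypothetical stand-in, not the SpeexBits code, and it assumes
bits are packed most significant bit first into 16-bit words.  It only
illustrates that a read starting at the wrong offset can land on a 1
where the wideband flag is expected, which is enough to send the decoder
down the "skip wideband block" path.

```c
#include <assert.h>
#include <stdint.h>

/* Minimal MSB-first bit reader over 16-bit words (illustration only). */
typedef struct {
    const uint16_t *words;
    long nbits;  /* total bits available */
    long pos;    /* next bit to read     */
} BitReader;

/* Read n bits (1..16), MSB first.  Returns -1 if too few bits remain. */
long read_bits(BitReader *br, int n)
{
    assert(n > 0 && n <= 16);
    if (br->nbits - br->pos < n)
        return -1;
    long v = 0;
    while (n--) {
        unsigned bit = (br->words[br->pos >> 4] >> (15 - (br->pos & 15))) & 1u;
        v = (v << 1) | (long)bit;
        br->pos++;
    }
    return v;
}
```

Reading 1 bit and then 4 bits from the true start of a frame should give
the wideband flag and the submode; doing the same from any other offset
just returns whatever payload bits happen to be there.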
 
-Mike

>>> "Jim Crichton" <jim.crichton at comcast.net> 07/24/07 1:33 PM >>>

Mike,
 
I did this a year ago on C55 (svn build 11463), and have attached my
patches to nb_celp.c, modes.c, and the project file, to remove all modes
but 8kbps.  I have also attached a debug version of stack_alloc.h, which
tracks the maximum depth of the scratch stack, so that you can tweak the
sizes in config.h.  You just have to declare the variable and add this
init before the call to speex_encoder_init, and go back and look at the
variable after you run some data through.
 
#ifdef STACKDBG
    spxGlobalScratchFree = spxGlobalScratchPtr;
#endif
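
The attached stack_alloc.h is not reproduced in this message.  As a
generic illustration of the same idea (recording the deepest point a
scratch stack reaches so the buffer can be sized afterwards), here is a
sketch with made-up names and a made-up buffer size; it is not the speex
allocator.

```c
#include <assert.h>
#include <stddef.h>

#define SCRATCH_WORDS 1024  /* hypothetical size; tune from the measured peak */

static short  scratch[SCRATCH_WORDS];
static size_t scratch_top  = 0;  /* current allocation mark  */
static size_t scratch_peak = 0;  /* deepest mark seen so far */

/* Bump-allocate n words; returns NULL if the scratch area would overflow. */
short *scratch_alloc(size_t n)
{
    if (n > SCRATCH_WORDS - scratch_top)
        return NULL;
    short *p = &scratch[scratch_top];
    scratch_top += n;
    if (scratch_top > scratch_peak)
        scratch_peak = scratch_top;
    return p;
}

/* Current mark, to be saved before a group of allocations. */
size_t scratch_mark(void) { return scratch_top; }

/* Release everything allocated since `mark` (stack discipline). */
void scratch_release(size_t mark)
{
    assert(mark <= scratch_top);
    scratch_top = mark;
}

/* Largest number of words ever in use at once. */
size_t scratch_high_water(void) { return scratch_peak; }
```

After running representative audio through the codec, the high-water
value plus some margin is a reasonable starting point for the scratch
size.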

If you think that the C5416 build is broken, then (as Jean-Marc said)
tell us what version you are using.  There is a C54x project in the
speex source tree that runs directly in the Code Composer simulator.  If
that fails in the latest code, let me know and I will work on tracking
it down.  As Jean-Marc indicated, this build does break from time to
time because of 16/32-bit conversion problems.  I have also had problems
with 16/32 bit mismatch in function parameters leading to really bizarre
behavior.
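
A small host-side illustration of the two failure shapes, under the
assumption that int is 16 bits on the DSP and 32 bits on the PC.  Both
functions are hypothetical and exist only to show the pattern; neither
is a speex function.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Narrowing: a 32-bit value squeezed into 16 bits keeps only the low half. */
uint16_t narrow16(uint32_t v)
{
    return (uint16_t)v;
}

/* ctl-style setter taking an untyped pointer: it copies exactly 32 bits
 * from `ptr`, so the caller must pass the address of a 32-bit object.
 * Passing the address of a 16-bit variable would read past that variable. */
void set_param32(int32_t *dst, const void *ptr)
{
    assert(dst != NULL && ptr != NULL);
    memcpy(dst, ptr, sizeof(int32_t));
}
```

The safe calling pattern is to declare the argument with an explicit
32-bit type (for example int32_t quality = 4;) rather than plain int,
since plain int changes width between the PC and the DSP.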
 
You should be able to use the simulator build as a reference to isolate
problems between your encoder and decoder (as long as your audio samples
are small, the simulator is pretty slow for C54).
 
Good luck.
 
- Jim
 
----- Original Message ----- 
From: Michael Jacobson 
To: speex-dev at xiph.org 
Sent: Monday, July 23, 2007 12:54 PM
Subject: [Speex-dev] Shoehorning speex is confusing a newbie


This is going to take some explaining, and I apologize in advance if any
of this is found in the manual or sample code, but I couldn't find it.  I
just graduated last May and this is my first experience with vocoders
and dissecting a professional's code.
 
I work for a company that is currently using a G729A vocoder from a 3rd
party software company and is looking into speex so they no longer have
to pay royalties.  The product we are trying to force speex into is
based on a TI C5416 DSP that did narrowband 8-bit, 8 kbps.  The product
was fairly full as it is, so some modifications had to be made in order
to fit speex into the project just to allow it to link.  The
modifications are based on assumptions that I made when looking over
the code, so I may have been absolutely wrong.
 
The main assumption was about the exc tables.  In looking through
modes.c it appeared that only one table was required for 8 kbps, so I
commented out the portions of the code that referenced the other tables
and modified the "static const SpeexNBMode nb_mode" structure so that the
pointers to the other structures that referenced the tables were NULL.
We did not have enough data memory to store all the tables.  I thought
this would work with my initialization, but when I stepped through the
encode code in nb_celp it would select mode 6 (instead of mode 3), which
I believe is 18.2 kbps (table 8.2).  My setup code is:
 
  st = speex_encoder_init(&speex_nb_mode);
  speex_bits_set_bit_buffer(&bits, &G729_tx, COMPRESS_LENGTH);
  tmp=TESTENC_QUALITY;  //=4
  speex_encoder_ctl(st, SPEEX_SET_QUALITY, &tmp);
  speex_encode_int(st, (spx_int16_t *)samples_in, &bits);
 
which I thought would put it in 8 kbps narrowband.  I tried to use
SPEEX_SET_MODE in there too, but it just got overwritten by the set mode
in the encode function.  So I thought I'd try to force it into mode 3
and see what happens.  I got a result, but when I try to decode it my
decode stage gets stuck in an infinite loop:
 
while (st->voc_offset<st->subframeSize)
{
   if (st->voc_offset>=0)
      exc[st->voc_offset]=sqrt(1.0*ol_pitch);
   st->voc_offset+=ol_pitch;
}
 
because both voc_offset and ol_pitch are 0, since this code is never
entered:
 
if (SUBMODE(lbr_pitch)!=-1)
   {
      ol_pitch = st->min_pitch+speex_bits_unpack_unsigned(bits, 7);
   } 
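
To make the failure mode concrete: the while loop quoted earlier can
only finish if the offset moves forward on every pass, so a pitch of 0
leaves it spinning.  The sketch below is not a patch to nb_celp.c; the
function name and the placeholder pulse value are invented, and a zero
pitch is more likely a symptom of a bad stream or a trimmed mode table
than something to paper over with a guard.

```c
#include <assert.h>
#include <stddef.h>

/* Place pulses in one subframe, stepping by `pitch` samples each time.
 * Returns the offset after the last step, or -1 if `pitch` is not
 * positive (the case in which an unguarded loop never terminates). */
int place_pulses(float *exc, int subframe_size, int offset, int pitch)
{
    assert(exc != NULL && subframe_size > 0);
    if (pitch <= 0)
        return -1;               /* offset would never advance */
    while (offset < subframe_size) {
        if (offset >= 0)
            exc[offset] = 1.0f;  /* placeholder pulse amplitude */
        offset += pitch;
    }
    return offset;
}
```

A check like this turns a hang into an error code, which at least makes
the bad frame visible in a debugger.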
 
This is how I set up the decoder:
 
   dec = speex_decoder_init(&speex_nb_mode);
   speex_bits_set_bit_buffer(&bits, &Speex_enc_buffer[0 + COMPRESS_LENGTH*Speex_player_frame], COMPRESS_LENGTH);
   tmp=0;
   speex_decoder_ctl(dec, SPEEX_SET_ENH, &tmp);
   speex_decode_int(dec, &bits, (spx_int16_t *)samples_out);
 
There are some things I am sure will be asked.  Yes, I set flags for the
TI_54X part, disable wideband, manual allocation, and fixed point in a
config file, and defined the #define value needed to include that config
file.  I do have a heap for the setup of the state structures for encode
and decode, and yes, I made sure it was big enough to allocate both.
Yes, I destroy the structures after I am done en/decoding.
If there is anything you need to help you help me then I am definitely
willing to share.  I am thoroughly confused and could use some help.
 
Thanks.
 
-Mike Jacobson
michael.jacobson at ultratec.com

