[speex-dev] Speex settings and jitter

John Hayes jhayes at thereinc.com
Tue May 20 14:27:51 PDT 2003

> -----Original Message-----
> From: Tom Grandgent [mailto:tgrand at canvaslink.com]
> Sent: Tuesday, May 20, 2003 1:28 PM
> To: jhayes at thereinc.com
> Cc: speex-dev at xiph.org
> Subject: RE: [speex-dev] Speex settings and jitter
> [Just curious, and seizing the opportunity to communicate with
> other folks who are doing the same kind of thing I am...]
> How are you measuring the latency?  I tried measuring it with my
> program (also Win32-based, also using DirectSound[Capture]) and came
> up with around 130ms.  To measure it, I placed the mic near a
> speaker to get feedback going, had my program connect to itself
> (local loopback), and made a sound into the mic.  The whole while
> I'd have CoolEdit recording from the mic.  So the sound would go:
> Sound -> captured via mic -> transmitted (to itself) -> played via
> speaker -> captured via mic (quieter)
> I measured the start-to-start time between the original sound and the
> "echoed" sound as indicated in CoolEdit and found it to be around 130ms.
> I'm not totally sure if this process results in a correct latency
> measurement, but I think it's ok...  I'm curious to know what you're
> doing.

I did the same sort of thing but in software - I made a mock soundcard that
used the real one as a timebase and, instead of recording from the
microphone, recorded either silence or a square wave (one packet, once a
second). The server could be configured to echo the audio back to the
sender, and the mock soundcard would then listen for any playback at half
the original volume (and estimate the actual play time based on the real
soundcard).

This testing showed that recording and playing back at 32kHz drifted apart,
so the receiver's buffer would eventually overrun. Recording at 44.1kHz and
downsampling to 32kHz had no such issue.
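
To give a feel for how quickly this kind of drift eats a buffer, here is a
minimal sketch. It assumes the capture and playback clocks differ by a
constant rate error expressed in parts per million; the function name and
any figures passed to it are illustrative, not measurements from the test
above.

```c
#include <assert.h>

/* Seconds until two nominally equal sample clocks have drifted apart by
 * slack_ms milliseconds, given a constant relative rate error in ppm.
 * Both arguments are hypothetical inputs, not measured values. */
double seconds_until_drift(double slack_ms, double ppm_error)
{
    assert(slack_ms >= 0.0 && ppm_error > 0.0);
    /* A 1 ppm error accumulates 1 microsecond of drift per second. */
    return (slack_ms * 1000.0) / ppm_error;
}
```

For example, with an assumed 100 ppm mismatch, 100ms of slack is used up
after 1000 seconds, which is why a long call eventually overruns.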

> My program is set up to use Speex in wideband mode, transmitting at
> a rate of 25 packets per second (two Speex frames per packet).
> Incoming packets are placed into a queue, and a playback thread
> receives notifications from DirectSound at an interval matching the
> amount of sound in a packet (40ms in this case).  When a notification
> is received, the playback thread attempts to remove a packet from the
> queue and copy it into the playback buffer ahead of the notified
> area.

I also have a high-priority thread, but it wakes up on a fixed polling
interval (using a waitable timer - these are *very* accurate; one will even
split a scheduling quantum to wake the thread on time). On each poll it
either pulls a packet from the network or checks the play position on the
soundcard to see whether it needs to panic before the next polling interval
(that is, the write position is less than 20ms away from the last written
position). Unfortunately I can't sleep in select() because lots of other
traffic comes in on the port and the thread switching was way too slow.
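
The panic check described above boils down to a distance calculation in a
circular buffer. The following is a sketch only: the function names and the
bytes-per-millisecond parameter are assumptions for illustration, and the
positions are taken to be byte offsets such as the ones DirectSound reports
for a playback buffer.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes of audio still queued between the card's write cursor and the
 * last position the application wrote, in a circular buffer.
 * Equal positions are treated as "nothing queued". */
uint32_t bytes_queued(uint32_t write_cursor, uint32_t last_written,
                      uint32_t buf_size)
{
    assert(buf_size > 0);
    assert(write_cursor < buf_size && last_written < buf_size);
    return (last_written + buf_size - write_cursor) % buf_size;
}

/* Nonzero when less than threshold_ms of audio remains queued,
 * i.e. the poller should act before the next polling interval. */
int should_panic(uint32_t write_cursor, uint32_t last_written,
                 uint32_t buf_size, uint32_t bytes_per_ms,
                 uint32_t threshold_ms)
{
    assert(bytes_per_ms > 0);
    return bytes_queued(write_cursor, last_written, buf_size)
           < threshold_ms * bytes_per_ms;
}
```

With 16-bit mono audio at 32kHz (64 bytes per millisecond) and a one-second
buffer, a 20ms threshold corresponds to 1280 bytes of queued audio.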

> Note that playback notifications on a secondary buffer are only
> reliable if you create the buffer as a software buffer!  I have
> witnessed the Audigy drivers screwing up notifications for buffers
> created in hardware.  Getting newer drivers helped but did not
> completely solve the problem.  My SB Live! Platinum, on the other
> hand, seems to work just fine...

Playback notifications are only implemented in WDM drivers, so a driver
update will usually fix that - but I don't use them. Instead I just poll the
write position of a circular buffer.

> Anyway, to handle jitter, the queue removal behavior is like this:
> If the playback buffering thread finds that the incoming packet queue
> is empty, it waits for at least two packets to be present in the queue
> before continuing to withdraw from it.  (In the meantime it inserts
> silence, though I'm going to change this to have Speex fabricate the
> missing data.)  When it successfully withdraws a packet from the
> queue, it will continue withdrawing until there is less than two
> packets in the queue.

I just use the playback buffer as my queue (it's about 1 second long), so
every packet gets decoded and inserted on receipt. I treat late and
out-of-order packets[1] the same as dropped ones, so if a packet is late, I
make Speex fabricate the missing frame - then drop the late packet when it
finally arrives. My header carries a 3-bit sequence number so I can check
for out-of-order packets. When two packets in a row are missing I reset the
stream.
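
A sketch of how a 3-bit sequence number can drive those decisions. This is
an illustration under stated assumptions, not the actual implementation: it
classifies packets on arrival rather than on a timer, and the split of the
8-value window (a gap of 4 or more counts as an old packet) is a choice made
here for the example. The enum and function names are hypothetical.

```c
#include <assert.h>

enum pkt_action {
    PKT_PLAY,              /* the packet we were waiting for */
    PKT_CONCEAL_THEN_PLAY, /* one packet missing: fabricate it, then play */
    PKT_RESET,             /* two or more missing in a row: reset stream */
    PKT_DROP               /* late or out of order: treat as lost */
};

/* Compare a received 3-bit sequence number against the expected one.
 * Arithmetic is modulo 8, so the wrap from 7 to 0 needs no special case. */
enum pkt_action classify_packet(unsigned expected_seq, unsigned recv_seq)
{
    assert(expected_seq < 8 && recv_seq < 8);
    unsigned gap = (recv_seq - expected_seq) & 7u;

    if (gap == 0)
        return PKT_PLAY;
    if (gap == 1)
        return PKT_CONCEAL_THEN_PLAY;
    if (gap < 4)          /* assumption: 2..3 ahead means a burst loss */
        return PKT_RESET;
    return PKT_DROP;      /* assumption: 4..7 means behind us */
}
```

With only eight sequence values the receiver cannot tell a packet that is
far ahead from one that is slightly behind, so the window split has to be
chosen to match how much loss and reordering the network actually shows.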

> The goal is to minimize latency but permit some amount of jitter by
> maintaining a "reserve" of one packet to draw upon in the event of
> delay or loss.  This seems to work well and is easily adjustable to
> more than one packet, if need be...

I use two bits to figure out how much jitter buffer should be used. Win98
machines will suspend your thread for up to 70ms if you wait on something,
so the presence of a Win98 machine on one end adds 50ms of jitter. Other
than that, it's a fixed interval (45ms): playback starts with that amount of
jitter buffer and panics when it's exhausted. Ideally it would tune to the
sender in smaller increments (like adding 5ms of jitter on every underrun
and dividing the difference by two after every successful completion) - but
I didn't have implementation time for that.
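
The incremental tuning idea could look roughly like this. It is a sketch of
something that was not implemented, and it assumes "the difference" means
the amount by which the current jitter buffer exceeds the 45ms baseline;
the names are hypothetical.

```c
#include <assert.h>

#define BASE_JITTER_MS 45  /* fixed starting interval from the text */
#define STEP_MS         5  /* amount added on each underrun */

/* Grow the jitter buffer a little after every underrun. */
int jitter_after_underrun(int jitter_ms)
{
    assert(jitter_ms >= BASE_JITTER_MS);
    return jitter_ms + STEP_MS;
}

/* After a successful completion, halve the excess over the baseline
 * (assumed meaning of "dividing the difference by two"). */
int jitter_after_success(int jitter_ms)
{
    assert(jitter_ms >= BASE_JITTER_MS);
    return BASE_JITTER_MS + (jitter_ms - BASE_JITTER_MS) / 2;
}
```

The buffer grows linearly under repeated underruns and decays geometrically
back toward the baseline once the sender behaves, so a single bad burst does
not leave extra latency in place for long.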

My biggest problem with Speex is that I want it to be a lot faster - but I
have to learn quite a bit more about signal processing before attempting
that.


[1] I keep statistics and have never seen an out-of-order packet.

> John Hayes (jhayes at thereinc.com) wrote:
> >
> > Right - and I deal with that on the receiver end based on an
> > approximation of sender's and receiver's responsiveness - the minimum
> > latency I've been able to get into the system is about 150 ms. Of that,
> > jitter buffering is about 40-100ms. I'd love to figure out how to get
> > that down without killing myself on thread switching or Win32 kernel
> > calls, but ms has to actually implement the DSBCAPS_CTRLPOSITIONNOTIFY
> > capability in direct sound capture
> > ....
> >
> > John
> >
> > > -----Original Message-----
> > > From: owner-speex-dev at xiph.org [mailto:owner-speex-dev at xiph.org]On
> > > Behalf Of Allen Drennan
> > > Sent: Tuesday, May 20, 2003 11:18 AM
> > > To: speex-dev at xiph.org
> > > Subject: RE: [speex-dev] Speex settings and jitter
> > >
> > >
> > > In my experience most of the jitter related issues are because people
> > > are using too small of audio buffer sizes that match the framing size
> > > of Speex - particularly in Windows.  This isn't a problem with Speex,
> > > but as a programmer you should collect and append a few frames to
> > > match the size of your output audio frame buffer before attempting to
> > > play the sound.
> > > ...
