[Speex-dev] New jitter.c, bug in speex_jitter_get?
Jean-Marc Valin
jean-marc.valin at usherbrooke.ca
Wed May 3 18:12:56 PDT 2006
> We just return a frame with the return value JB_DROP, which tells the
> caller to drop this frame, and call jb_get again.
>
> When the caller is done with the jitterbuffer, it calls jb_getall()
> repeatedly, until it's empty, and then it can discard all the frames.
Hmm, looks a bit error-prone to me. Especially considering I still have
to explain that "no, you can't pass ulaw instead of float to
speex_encode and expect 4x better compression" :-)
> Perhaps it's not expensive, but it's unnecessary.. It also means
> that the jitterbuffer's pointers can point to structures, or other
> types of data, and the jitterbuffer doesn't need to understand them.
> In particular, it's designed to be able to buffer and reorder frames
> (things) which aren't audio -- like video and control frames.
How are video frames or control frames different. My jitter buffer only
takes raw packets (i.e. N bytes), it doesn't care about the content or
meaning. Also, why would you want to give it structs? AFAIK, IP packets
can only contain bytes anyway.
> > A couple things I don't understand. Why do you need both the local
> > clock
> > and the remote clock and how do you define those anyway?
>
> There's the "timestamp", which the remote side puts on frames, and
> the local time, which is used for jb_put and jb_get. They're defined
> in milliseconds.
I think this is equivalent to what I'm doing with jitter_buffer_tick(),
except that my approach doesn't require explicit knowledge of the local
time (I don't care what the local time is, just how it's incremented).
Also, I think milliseconds (which I used before) is bad because 1) RTP
uses the sampling clock and 2) in many cases, it's not an integer. I've
even used my jitter_buffer to send raw PCM packets of 1/3 ms each (16
samples at 48 kHz).
>
> > BTW, does your
> > jitter buffer consider that there can be overlaps or holes between
> > frames, especially if using PCM?
>
> There shouldn't be overlaps, but I think it currently deals with some
> types of messy timestamps that are, for example, +- one frame length
> from precise. (i.e. sequences of 20 ms frames with timestamps like
> 0, 18, 42, 55, 82, instead of 0, 20, 40, 60, 80)..
Why no overlap? What if you want to include a bit of redundancy (doesn't
have to be 100% either) to make your app more robust to packet loss? You
could want to send a packet that covers 0-60 ms, followed by 40-100 ms,
followed by 80-140 ms, ...
> For reference, the API I've shown is already used in asterisk and
> iaxclient. There are two other jitterbuffers that I know of which
> are using basically the same API: A non-adaptive jitterbuffer, and
> another one which more closely follows the research models I was
> looking at when I wrote mine, by Jesse Kaijen. The former is in one
> of asterisk's SVN branches and the bugtracker, and I think that
> Jesse's is at speakup.nl.
Well, that API clearly has limitations that mean I can't use them to do
what I need. Unless you're willing to change that (and even then I'm not
sure), there's no way we can use the same API. I still suspect it may be
possible to wrap by current API in that API. Of course, some features
would just not be available.
> My jitterbuffer is by no means perfect -- mainly it seems to still
> have situations where there's garbage input which confuses it in a
> way that I'd hope it could recover from. (i.e. nonsensical
> timestamps that it gets sent).
I have yet to see any situation that confuses my jitter buffer to that
point. For example, if you feed it two streams with different timestamp
sequences, it will just follow one of them. The only case where it can
get a little confused is if there's a sudden jump in the timestamp
sequence, but even then it will automatically re-sync after a second or
two.
Jean-Marc
More information about the Speex-dev
mailing list