[Speex-dev] questions on serving up streaming speex

Sun Dec 26 10:44:45 PST 2004

Hi guys,

I am working on an application that gathers and stores 
toll-quality/narrow-band voice data. It will allow clients to request 
this data and stream it to them on the fly. I plan to store all of this 
data in the Speex format (possibly encapsulated in an Ogg container). I 
was wondering what method the members of this list would 
recommend for streaming the data to clients. Please excuse my ignorance 
in this field, as I'm not an expert with the subject matter.

Here are some requirements, and more detail of the problem domain:
-The client will be a PC, usually on a corporate LAN, but we might also 
be streaming to a remote PC connected through a modem.
-The resultant stream (in whatever format) must be playable in a custom 
client side app (written in python) and/or a java applet on the client. 
So I would need libraries or library bindings to allow me to do this for 
those languages. The streaming implementation on the server can be 
written in straight C or Python (I'm using Pyrex for this, though 
straight C is better for speed purposes).
-Each client will get its own stream. That is, if there are N 
clients listening to streams, there will be N streams (i.e. no 
'broadcasting' will happen).
-Streams must work over TCP (corporate network firewalling).
-The solution should be both fast and simple. There could be many 
streams being sent out at once, and many other things will be happening 
on the box.
-The solution should be open source, preferably LGPL or BSD-licensed.

Truthfully, I'm a bit lost in all the competing ways to do this. I have 
a few possible solutions I've thought about, and would appreciate some 
input on what might be the "best" one for the problem at hand:

*Frame the speex data in Ogg (if not already done) and stream these Ogg 
Speex files to clients via Icecast2: Even though Icecast2 is intended 
more for broadcasting a few streams to many clients, this solution might 
very well work. In the server side of my app, I would use libshout to 
send data to the icecast server. A list of currently unused streams 
would be maintained, and connecting clients would be allocated one of 
these. When they were done, the stream would go back to an unused 
status. AFAIK, Icecast doesn't currently support Ogg Speex, but it does 
support Ogg Vorbis/Ogg Theora, so I could either transcode to Ogg Vorbis 
or try to add Ogg Speex capabilities to Icecast. From looking at the 
source, this doesn't seem too hard, especially now that the Speex Ogg 
encapsulation format is on the Speex website. The other issue is that 
Icecast has a configurable limit on the number of sources it can have, 
but I can deal with that.

*Encapsulate Speex in RTP, and use RTCP and/or RTSP as well: This was a 
possibility at first, but it seems to be overkill for what I need. These 
protocols are quite complex compared to something like the Shoutcast 
streaming protocol, RTP normally runs over UDP (and I need TCP), and I 
wasn't impressed by what I saw of the offerings in this area (they 
didn't really fit the "simple" requirement :).

*What about this Speex jitter buffer I've been hearing about? I'm unsure 
how it fits in with these other media streaming techniques. Does it 
mean I can just send raw Speex data across a TCP socket and let the 
libspeex decoder do the work of jitter buffering for me? If that is the 
case, do I really need a streaming media server, or could I just ferry 
the data to multiple clients using poll()/select() and non-blocking sends?

*Any other possible solutions?
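
To make the poll()/select() idea above a bit more concrete, here is a 
rough Python sketch of what I have in mind. It is only an illustration 
under my own assumptions: the 2-byte length prefix is a made-up framing 
so the client can split the TCP byte stream back into packets, and 
frame_packet, ClientStream, pump and read_packets are hypothetical names, 
not part of libspeex or any existing library. Payloads are treated as 
opaque bytes; no Speex encoding or decoding happens here.

```python
import select
import socket
import struct


def frame_packet(payload: bytes) -> bytes:
    """Prefix one encoded packet with a 2-byte big-endian length.

    This framing is an assumption for the sketch, not a Speex convention.
    """
    if len(payload) > 0xFFFF:
        raise ValueError("packet too large for a 16-bit length prefix")
    return struct.pack("!H", len(payload)) + payload


class ClientStream:
    """Per-client state: a non-blocking socket plus bytes not yet sent."""

    def __init__(self, sock: socket.socket):
        sock.setblocking(False)
        self.sock = sock
        self.pending = bytearray()

    def queue(self, payload: bytes) -> None:
        self.pending += frame_packet(payload)


def pump(clients, timeout=0.0):
    """Do one select() pass over clients that have data queued.

    Sends as much as each writable socket accepts and returns the list
    of clients whose connection failed, so the caller can drop them.
    """
    waiting = {c.sock: c for c in clients if c.pending}
    if not waiting:
        return []
    _, writable, _ = select.select([], list(waiting), [], timeout)
    dead = []
    for sock in writable:
        client = waiting[sock]
        try:
            sent = sock.send(client.pending)
        except BlockingIOError:
            continue  # kernel buffer filled up again; retry on next pass
        except OSError:
            dead.append(client)
            continue
        del client.pending[:sent]  # keep only the unsent tail
    return dead


def read_packets(buffer: bytearray):
    """Client side: pop every complete length-prefixed packet from buffer."""
    packets = []
    while len(buffer) >= 2:
        (size,) = struct.unpack_from("!H", buffer)
        if len(buffer) < 2 + size:
            break  # wait for the rest of this packet to arrive
        packets.append(bytes(buffer[2:2 + size]))
        del buffer[:2 + size]
    return packets
```

The server loop would call queue() on each ClientStream as new data 
becomes available and then pump() over all of them, so a slow modem 
client only grows its own pending buffer and never blocks the others. 
In a real server that buffer would need a size cap.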

Thanks in advance,

Robby