[Speex-dev] Recommended GUI for Speex

Conrad Parker conrad at metadecks.org
Tue Jan 17 21:07:14 PST 2006

On Tue, Jan 17, 2006 at 02:19:17PM -0500, me at sbooth.org wrote:
> Hi all,
> I have recently added Speex support to my app (http://sbooth.org/Max/) and
> am in the process of creating a GUI for the user to control the codec
> parameters.  I am new to Speex and as such I am not really sure which
> parameters are more important than others, and deserve prominent
> placement, etc.  Is there a recommended GUI for people to follow or use as
> an example?

Hi Stephen,

I designed the encoding GUI for Sweep a few years ago, with a lot of
help from Jean-Marc about the various options:


(that screenshot is from sweep 0.9.0 which is being released today ;-)

The first thing is to choose the mode (narrowband, wideband or
ultra-wideband). These modes are designed for 8kHz, 16kHz and 32kHz
sampling rates respectively.
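
As a rough illustration of the mode choice, a GUI could preselect the mode
from the sampling rate of the audio being encoded. This is only a sketch:
the enum and function names are made up for the example (they are not part
of Speex or Sweep), and the rule of picking the smallest mode whose design
rate covers the input rate is an assumption.

```c
/* Hypothetical names for illustration; not part of libspeex or Sweep. */
typedef enum {
  MODE_NARROWBAND,     /* designed for 8 kHz */
  MODE_WIDEBAND,       /* designed for 16 kHz */
  MODE_ULTRAWIDEBAND   /* designed for 32 kHz */
} example_mode;

/* Suggest a default mode for a given input sampling rate.
 * Assumption: choose the smallest mode whose design rate is at least
 * the input rate, and fall back to ultra-wideband above 16 kHz. */
example_mode
suggest_mode (int sample_rate_hz)
{
  if (sample_rate_hz <= 8000)
    return MODE_NARROWBAND;
  if (sample_rate_hz <= 16000)
    return MODE_WIDEBAND;
  return MODE_ULTRAWIDEBAND;
}

/* Label to show in the GUI for each mode. */
const char *
mode_label (example_mode mode)
{
  switch (mode) {
  case MODE_NARROWBAND:
    return "Narrowband (8 kHz)";
  case MODE_WIDEBAND:
    return "Wideband (16 kHz)";
  default:
    return "Ultra-wideband (32 kHz)";
  }
}
```

The user should still be able to override the suggestion; the helper only
picks a sensible starting value for the mode selector.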

Next is a matter of choosing the encoding features. Not all combinations
are meaningful (for example, DTX is ignored unless VAD or VBR is enabled,
and VBR already implies VAD); hence, to make a GUI that only offers
valid options, it's better to combine the feature flags into a single
choice. In Sweep this was done with a drop-down list.

For these feature flags:

/* Feature flags */
#define FEAT_VBR 1
#define FEAT_VAD 2
#define FEAT_DTX 4

These are the meaningful combinations presented in the list:

choices[] = {
  { 0, N_("Constant bitrate (CBR) with no features"),
    NULL },
  { FEAT_VAD, N_("CBR with Voice Activity Detection (VAD)"),
    N_("VAD generates low bitrate comfort noise to replace non-speech") },
  { FEAT_VAD | FEAT_DTX,
    N_("CBR with VAD and Discontinuous Transmission (DTX)"),
    N_("DTX marks extended pauses with a minimum bitrate signal") },
  { FEAT_VBR | FEAT_VAD,
    N_("Variable bitrate (VBR) with VAD"),
    N_("VBR allows the bitrate to adapt to the complexity of the speech; "
       "this selection uses VBR without DTX, which may improve performance "
       "compared to full VBR in the presence of background noise.") },
  { FEAT_VBR | FEAT_VAD | FEAT_DTX,
    N_("Variable bitrate (VBR) with all features"),
    N_("VBR allows the bitrate to adapt to the complexity of the speech, "
       "and handles pauses using VAD and DTX") },
};
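
To show how a selection from that list might be turned back into individual
encoder switches, here is a minimal sketch. The struct and function names
are hypothetical, and the flag value assumed for each list entry is inferred
from its label. With libspeex, the three resulting values would typically be
handed to speex_encoder_ctl() with SPEEX_SET_VBR, SPEEX_SET_VAD and
SPEEX_SET_DTX; that call is left out so the sketch compiles without the
library.

```c
/* Feature flags, as defined above */
#define FEAT_VBR 1
#define FEAT_VAD 2
#define FEAT_DTX 4

/* Hypothetical container for the three on/off encoder switches. */
typedef struct {
  int vbr;
  int vad;
  int dtx;
} example_features;

/* Return 1 if flags is one of the five combinations offered in the
 * drop-down list, 0 otherwise (e.g. DTX on its own). */
int
features_are_meaningful (int flags)
{
  switch (flags) {
  case 0:                                /* CBR, no features */
  case FEAT_VAD:                         /* CBR with VAD */
  case FEAT_VAD | FEAT_DTX:              /* CBR with VAD and DTX */
  case FEAT_VBR | FEAT_VAD:              /* VBR with VAD */
  case FEAT_VBR | FEAT_VAD | FEAT_DTX:   /* VBR with all features */
    return 1;
  default:
    return 0;
  }
}

/* Split a combined flag value into separate 0/1 switches. */
example_features
features_from_flags (int flags)
{
  example_features f;
  f.vbr = (flags & FEAT_VBR) ? 1 : 0;
  f.vad = (flags & FEAT_VAD) ? 1 : 0;
  f.dtx = (flags & FEAT_DTX) ? 1 : 0;
  return f;
}
```

Because the drop-down only ever yields one of the listed values, the
validity check is mostly useful when the flags come from somewhere else,
such as a saved preferences file.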

Encoding quality and complexity are two different settings. Quality
controls how good the encoded audio sounds, and with it the bitrate;
complexity controls how much compute power the encoder spends working
out the encoding. In any case, complexity is apparently less important
than the options above (we hid it in the "extras" tab along with the
number of frames per packet).
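
A small sketch of how those remaining settings might be held and sanity
checked before being passed to the encoder. The struct and function names
are hypothetical, and the ranges (quality 0 to 10, complexity 1 to 10) are
what I understand the Speex manual to specify, so treat them as assumptions
and check them against the version you build with.

```c
/* Hypothetical settings record; the comments note where each field
 * would live in a GUI laid out like the one described above. */
typedef struct {
  int quality;            /* main tab */
  int complexity;         /* "extras" tab */
  int frames_per_packet;  /* "extras" tab */
} example_settings;

/* Clamp value into the inclusive range [lo, hi]. */
int
clamp_int (int value, int lo, int hi)
{
  if (value < lo)
    return lo;
  if (value > hi)
    return hi;
  return value;
}

/* Return a copy of s with every field forced into its assumed range. */
example_settings
sanitize_settings (example_settings s)
{
  s.quality = clamp_int (s.quality, 0, 10);       /* assumed range 0..10 */
  s.complexity = clamp_int (s.complexity, 1, 10); /* assumed range 1..10 */
  if (s.frames_per_packet < 1)
    s.frames_per_packet = 1;                      /* at least one frame */
  return s;
}
```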

Hope that helps :-) though I haven't even described any of the newer
options in Speex 1.1 ...


