[xiph-cvs] cvs commit: speex/src speexenc.c

Thu Mar 13 13:34:26 PST 2003

jm          03/03/13 16:34:26

  Modified:    doc      manual.lyx
               src      speexenc.c
  Log:
  ...

Revision  Changes    Path
1.57      +65 -24    speex/doc/manual.lyx

Index: manual.lyx
===================================================================
RCS file: /usr/local/cvsroot/speex/doc/manual.lyx,v
retrieving revision 1.56
retrieving revision 1.57
diff -u -r1.56 -r1.57

--- manual.lyx	11 Mar 2003 04:45:25 -0000	1.56
+++ manual.lyx	13 Mar 2003 21:34:26 -0000	1.57
@@ -1,11 +1,13 @@
 #LyX 1.3 created this file. For more info see http://www.lyx.org/
 \lyxformat 221
 \textclass article
+\begin_preamble
+\usepackage{times}
+\end_preamble
 \language english
 \inputencoding auto
 \fontscheme default
 \graphics default
-\float_placement h
 \paperfontsize default
 \spacing single 
 \papersize Default
@@ -336,7 +338,7 @@
  (DTX)
 \layout Standard
 
-Discontinuous transmission is an addition to VAD operation, that allows
+Discontinuous transmission is an addition to VAD/VBR operation, that allows
  to stop transmitting completely when the background noise is stationnary.
  In file-based operation, since we cannot just stop writing to the file,
  only 5 bits are used for such frames (corresponding to 250 bps).
@@ -1209,7 +1211,15 @@
 
 \layout Standard
 
-reserved
+Asks (if 1) the encoder to be less 
+\begin_inset Quotes eld
+\end_inset 
+
+agressive
+\begin_inset Quotes erd
+\end_inset 
+
+ due to high packet loss
 \end_inset 
 </cell>
 </row>
@@ -2289,7 +2299,7 @@
 Linear prediction is at the base of may speech coding techniques, including
  CELP.
  The idea behind it is to predict the signal 
-\begin_inset Formula $x(n)$
+\begin_inset Formula $x[n]$
 \end_inset 
 
  using a linear combination of its past samples:
@@ -2427,7 +2437,7 @@
 \end_inset 
 
  is toeplitz hermitian.
- Also, it can be proved that all the roots of 
+ Also, it can be proven that all the roots of 
 \begin_inset Formula $A(z)$
 \end_inset 
 
@@ -2524,6 +2534,9 @@
  (LSP) coefficients which have a much better behaviour with quantization,
  one of them being that it's easy to keep the filter stable.
  
+\layout Comment
+
+Describe LSP's
 \layout Subsection
 
 Pitch Prediction
@@ -2560,15 +2573,7 @@
 \begin_inset Formula $\beta$
 \end_inset 
 
- is the pitch gain and 
-\begin_inset Formula $c(n)$
-\end_inset 
-
- is taken from the 
-\emph on 
-innovation codebook
-\emph default 
-.
+ is the pitch gain.
  We call that long-term prediction since the excitation is predicted from
  
 \begin_inset Formula $e[n-T]$
@@ -2596,7 +2601,12 @@
 \begin_inset Formula $c[n]$
 \end_inset 
 
- taken from a fixed codebook.
+ taken from a fixed codebook, hence the name 
+\emph on 
+Code
+\emph default 
+ Excited Linear Prediction.
+ The final excitation is given by:
 \layout Standard
 
 
@@ -2605,7 +2615,11 @@
 
 \end_inset 
 
-This is where most of the bits in a CELP codec are allocated.
+The quantization of 
+\begin_inset Formula $c[n]$
+\end_inset 
+
+ is where most of the bits in a CELP codec are allocated.
  It represents the information that couldn't be obtained either from linear
  prediction or pitch prediction.
  In the 
@@ -2727,6 +2741,31 @@
 \end_inset 
 
 .
+\layout Standard
+
+Analysis-by-synthesis refers to the fact that when trying to find the best
+ pitch parameters (
+\begin_inset Formula $T$
+\end_inset 
+
+, 
+\begin_inset Formula $\beta$
+\end_inset 
+
+) and innovation signal 
+\begin_inset Formula $c[n]$
+\end_inset 
+
+, we do not work by making the excitation 
+\begin_inset Formula $e[n]$
+\end_inset 
+
+ as close as the original one (which would be simpler), but apply the synthesis
+ (and weighting) filter and try making 
+\begin_inset Formula $X_{w}(z)$
+\end_inset 
+
+ as close to the original as possible.
 \layout Section
 \pagebreak_top 
 Speex narrowband mode
@@ -2877,7 +2916,8 @@
 \end_inset 
 
  instead.
- The period and quantized gains are determined in closed loop.
+ The period and quantized gains are determined in closed loop (analysis-by-synth
+esis).
  In most modes, the pitch period is encoded with 7 bits in the 
 \begin_inset Formula $\left[17,144\right]$
 \end_inset 
@@ -2894,10 +2934,11 @@
 Innovation Codebook
 \layout Standard
 
-In Speex, the innovation signal is quantized using shape-only vector quantizatio
-n (VQ).
- That means that the codebooks that are used represent both the shape and
- the gain at the same time.
+In Speex, the innovation signal is quantized using sub-vector shape-only
+ vector quantization (VQ).
+ That means that the innovation signal is divided in sub-vectors (of size
+ 5 to 20) and quantized using a codebook that represents both the shape
+ and the gain at the same time.
  This save many bits that would otherwise be allocated for a separate gain
  at the price of a slight increase in complexity.
  
@@ -2907,7 +2948,7 @@
 \layout Standard
 
 There are 7 different narrowband bit-rates defined for Speex, ranging from
- 200 bps to 18.15 kbps, although the modes below 5.9 kbps should not be used
+ 250 bps to 24.6 kbps, although the modes below 5.9 kbps should not be used
  for speech.
  The bit-allocation for each mode is detailed in table 
 \begin_inset LatexCommand \ref{cap:bits-narrowband}
@@ -4172,7 +4213,7 @@
 
 \layout Standard
 
-No sound (VBR only)
+No transmission (DTX)
 \end_inset 
 </cell>
 </row>
@@ -4436,7 +4477,7 @@
 
 \layout Standard
 

--
+10.5
 \end_inset 
 </cell>
 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
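Note on the bit-rate figures touched in the manual.lyx hunks above: the "5 bits ... corresponding to 250 bps" statement and the new 24.6 kbps upper bound can be checked with one line of arithmetic. The 20 ms frame duration used here is an assumption (it is not stated anywhere in this diff), so treat this as a sanity check rather than a quotation from the manual.

```latex
% Assumption: one narrowband frame lasts 20 ms (not stated in this diff).
\[
  \frac{5\ \text{bits}}{0.020\ \text{s}} = 250\ \text{bps},
  \qquad
  24.6\ \text{kbps} \times 0.020\ \text{s} = 492\ \text{bits per frame}
\]
```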

1.75      +4 -4      speex/src/speexenc.c

Index: speexenc.c
===================================================================
RCS file: /usr/local/cvsroot/speex/src/speexenc.c,v
retrieving revision 1.74
retrieving revision 1.75
diff -u -r1.74 -r1.75
--- speexenc.c	3 Mar 2003 06:52:41 -0000	1.74
+++ speexenc.c	13 Mar 2003 21:34:26 -0000	1.75
@@ -230,8 +230,8 @@
       {"help", no_argument, NULL, 0},
       {"le", no_argument, NULL, 0},
       {"be", no_argument, NULL, 0},
-      {"lin8", no_argument, NULL, 0},
-      {"lin16", no_argument, NULL, 0},
+      {"8bit", no_argument, NULL, 0},
+      {"16bit", no_argument, NULL, 0},
       {"stereo", no_argument, NULL, 0},
       {"rate", required_argument, NULL, 0},
       {"version", no_argument, NULL, 0},
@@ -341,10 +341,10 @@
          } else if (strcmp(long_options[option_index].name,"be")==0)
          {
             lsb=0;
-         } else if (strcmp(long_options[option_index].name,"lin8")==0)
+         } else if (strcmp(long_options[option_index].name,"8bit")==0)
          {
             fmt=8;
-         } else if (strcmp(long_options[option_index].name,"lin16")==0)
+         } else if (strcmp(long_options[option_index].name,"16bit")==0)
          {
             fmt=16;
          } else if (strcmp(long_options[option_index].name,"stereo")==0)
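For readers who want to try the renamed options outside the full encoder, the following is a minimal sketch of the same getopt_long() pattern, reduced to the two options renamed in this commit. The helper name parse_sample_format() and its default_fmt parameter are hypothetical and do not appear in speexenc.c; only the option names "8bit"/"16bit" and the fmt values 8/16 come from the diff above.

```c
#include <getopt.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of speexenc.c): returns the sample width
   selected by --8bit or --16bit, default_fmt if neither is given, and -1
   if an unrecognized option is seen. */
int parse_sample_format(int argc, char **argv, int default_fmt)
{
   static const struct option long_options[] = {
      {"8bit", no_argument, NULL, 0},
      {"16bit", no_argument, NULL, 0},
      {0, 0, 0, 0}
   };
   int fmt = default_fmt;
   int option_index = 0;
   int c;

   /* Restart scanning so the helper can be called more than once.
      Assumption: resetting optind to 1 is sufficient for this simple
      case; some platforms need additional reset steps. */
   optind = 1;
   opterr = 0; /* suppress getopt's own error messages */

   while ((c = getopt_long(argc, argv, "", long_options, &option_index)) != -1)
   {
      if (c != 0)
         return -1; /* '?' : unknown option such as the old --lin8 */
      if (strcmp(long_options[option_index].name, "8bit") == 0)
         fmt = 8;
      else if (strcmp(long_options[option_index].name, "16bit") == 0)
         fmt = 16;
   }
   return fmt;
}
```

Called with an argument vector containing "--8bit" the helper returns 8, with "--16bit" it returns 16, and with the old spelling "--lin8" it returns -1, which mirrors the fact that the old names are no longer in the option table after this commit.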
