[Vorbis-dev] intro + specification remarks + some questions

Tue Dec 5 06:46:30 PST 2006

hi everyone,

for about 10 days I've been working on an implementation of an Ogg Vorbis 
*decoder* from scratch, and I have a few constructive remarks about 
the specs and some questions.

--------------------------
First a question:

 > 9.2.4. low_neighbor
 >
 > "low_neighbor(v,x)" finds the position n in vector [v] of the greatest
 > value scalar element for which n is less than [x] and vector [v]
 > element n is less than vector [v] element [x].

I'm thoroughly confused by this definition... Does this mean:

A)
   for (n=x-1; n>=0; n--) if (v[n]<v[x]) return n;

B)
low_neighbor(v, x)
{
     max_pos = 0;

     for (n=0...x-1)
         if (v[n] < v[x])
             if (v[n] >= v[max_pos])
                 max_pos = n;

     return max_pos;
}

In case of A), perhaps the specs should be rewritten for clarity, and 
some pseudocode could be added... In case of B), doesn't this mean that, 
depending on the loop direction and on whether the last if( ) uses >= 
or >, one can get different results, as in:

http://bram.smartelectronix.com/tmp/low_neighbor.png

( line = vector values, red cross = x, green dots = possible values of n )
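
To make the two readings concrete, here is a minimal C sketch of both. 
The function names, the int element type, the -1 "nothing found" return 
value and the last_tie_wins flag are my own assumptions for illustration; 
none of them come from the spec, and I am not claiming either reading is 
the intended one.

```c
/* Reading A: the nearest position before x whose value is below v[x].
 * Returns -1 if no such position exists (an assumption, not from the spec). */
static int low_neighbor_nearest(const int *v, int x)
{
    for (int n = x - 1; n >= 0; n--)
        if (v[n] < v[x])
            return n;
    return -1;
}

/* Reading B: the position before x holding the greatest value that is
 * still below v[x]. last_tie_wins is a hypothetical flag that selects
 * which position is kept when several candidates hold the same value. */
static int low_neighbor_greatest(const int *v, int x, int last_tie_wins)
{
    int best = -1; /* -1 means "no candidate yet" */

    for (int n = 0; n < x; n++) {
        if (v[n] >= v[x])
            continue; /* not below v[x], so not a candidate */
        if (best < 0 || v[n] > v[best] ||
            (last_tie_wins && v[n] == v[best]))
            best = n;
    }
    return best;
}
```

For v = {0, 128, 64, 32, 64, 16, 96} and x = 6, reading A gives n = 5, 
while reading B gives n = 2 or n = 4 depending on how the tie between the 
two 64s is broken.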

--------------------------
Secondly, some remarks on the clarity of the specifications.

* some additional pseudocode overview of the whole process would be 
highly appreciated by newbies like me, even something as simple as:

   - read one packet and decode as header information
   - read one packet and decode as comments
   - read one packet and decode as setup:
       - decode codebooks
       - decode floors
       - decode residues
       - decode mappings
       - decode modes
   - read (packets) as audio packet:
       - header
       - 'packet decode' floor curves
       - 'packet decode' residues
       - synthesise

I.e. a bit more high-level structure in the docs would be cool and 
would provide a lot of clarity. The low-level pseudocode is great and 
highly readable, but I did notice that somewhere halfway (after decoding 
the headers) the style changes quite a bit, leading me to suspect two 
authors at work.
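
As an illustration of the kind of overview I mean, the packet ordering 
in the list above could be sketched in C roughly like this. All names 
here are hypothetical and exist only for this example; they are not 
taken from the spec or from any existing decoder.

```c
/* Hypothetical names throughout; for illustration only. */
enum packet_kind {
    PACKET_IDENTIFICATION, /* first packet: header information */
    PACKET_COMMENT,        /* second packet: comments */
    PACKET_SETUP,          /* third packet: codebooks, floors, residues,
                              mappings, modes */
    PACKET_AUDIO           /* every later packet: audio */
};

/* Which kind of packet a decoder expects next, given how many packets
 * of the stream it has already consumed. */
static enum packet_kind expected_packet(unsigned long packets_seen)
{
    switch (packets_seen) {
    case 0:  return PACKET_IDENTIFICATION;
    case 1:  return PACKET_COMMENT;
    case 2:  return PACKET_SETUP;
    default: return PACKET_AUDIO;
    }
}
```

A top-level loop would read one packet, ask expected_packet() what it 
should be, and dispatch to the matching decode routine.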

* The second part (packet decode / synthesis) is probably the hardest 
to read and understand, and some of the pseudocode is on the edge of 
being too difficult to grasp.

* 7.2.2. header decode / 7.2.2.1. packet decode
it is quite confusing for a newbie that 7.2.2.1 is a subsection of 7.2.2 
as it is *not* the logical "next step" when decoding the header. Packet 
decode should really have a line saying something like: "when decoding 
an audio packet, the floor is decoded as follows". This goes for both 
floors and residues.

* it would be interesting to note the size of vectors in the pseudocode. 
Some parts of the pseudocode become unusable without this information:
a particular example from "8.6.2. packet decode": "allocate and zero all 
vectors that will be returned." This is a problem throughout the 
specifications.

I hope you guys like constructive criticism!

kindest regards and thanks for any help offered,

  - bram de jong

-- 
http://www.splicemusic.com
http://freesound.iua.upf.edu
http://www.smartelectronix.com
http://www.musicdsp.org