[ogg-dev] OggPCM proposal feedback

Fri Nov 11 01:10:26 PST 2005

On Fri, Nov 11, 2005 at 07:17:53PM +1100, Erik de Castro Lopo wrote:
> We're talking about a file header here. Even if the header is a kilobyte in
> size, it will be completely **dwarfed** by the audio data following. So why 
> are you counting single bits like this?

Why waste?  You only have to read the header once for a stream, and libogg2 
provides a convenient bitpacker which can easily grab any number of bits and give 
you an int containing their value.

Just because you /CAN/ waste a few bytes, what's the reason to do so?
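
As a rough illustration of the bit-level reading described above, here is a 
minimal sketch.  It is NOT the libogg API: `BitReader` is a hypothetical 
stand-in, the LSB-first bit order is an assumption, and the field layout (a 
3-bit format ID followed by a 1-bit flag) is invented for the example rather 
than taken from the draft.  The format table is the one quoted below.

```python
# Hypothetical stand-in for a bitpacker; not the real libogg interface.
class BitReader:
    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # current position, counted in bits from the start

    def read(self, nbits: int) -> int:
        """Return the next nbits as an int, least significant bit first."""
        value = 0
        for i in range(nbits):
            byte_index, bit_index = divmod(self.pos, 8)
            if byte_index >= len(self.data):
                raise EOFError("ran out of header bits")
            bit = (self.data[byte_index] >> bit_index) & 1
            value |= bit << i
            self.pos += 1
        return value


# Format IDs as listed in the table quoted in this mail (7 is "Extended").
FORMATS = {
    0: ("Int", 8),
    1: ("uInt", 8),
    2: ("Int", 16),
    3: ("Int", 24),
    4: ("Int", 32),
    5: ("Float", 32),
    6: ("Float", 64),
}

# Assumed layout for illustration only: 3 bits of format ID, then 1 flag bit.
reader = BitReader(bytes([0b00001101]))
format_id = reader.read(3)
flag = reader.read(1)
print(format_id, FORMATS[format_id], flag)
```

Both fields come out of a single header byte here, which is the kind of 
saving being argued for.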

> > I really don't like this idea, but I will entertain, formatting it as follows:
> > 
> > ID  Type  Bits  
> > 0   Int   8
> > 1   uInt  8
> > 2   Int   16
> > 3   Int   24
> > 4   Int   32
> > 5   Float 32
> > 6   Float 64
> > 7   Extended - Unsupported by any v1.0 software
> > 
> 
> 
> I have suggested something better.

I went back and re-read your suggestion.  I'll paste it here in an effort to 
ensure we're talking about the same thing:

>   c) I think having separate fields for things like signed/
>      unsigned/float and bit width is a mistake. I would suggest
>      instead a single field that encodes all this information
>      in a enumeration. Ie:
>
>         OGG_PCM_U8          /* Unsigned 8 bit */
>         OGG_PCM_S8         /* Signed 8 bit. */
>         OGG_PCM_S16
>         OGG_PCM_S24
>         OGG_PCM_S32
>         OGG_PCM_FLOAT32
>         OGG_PCM_FLOAT64
>
>      and so on. This scheme makes it very difficult to get
>      signed/unsigned and bitwith messed up.

With the exception of the top two being inverted, these tables look identical.  
You went on to suggest:

>   e) Consider whether the endianness should also be encoded
>      in the enumeration above. I would recommend that it is
>      resulting in:
> 
>          OGG_PCM_U8          /* Unsigned 8 bit */
>          OGG_PCM_S8         /* Signed 8 bit. */
>          OGG_PCM_LE_S16
>          OGG_PCM_BE_S16
>          OGG_PCM_LE_S24
>          OGG_PCM_BE_S24
>          ...
>          OGG_PCM_LE_FLOAT32
>          OGG_PCM_BE_FLOAT32
>          ...

This I did not put into the draft because, while a separate endian flag is 
unused for 8-bit samples, it greatly simplifies implementation when the libogg 
bitpacker is used, which is on a separate layer from data decoding.  Both 
libogg1 and libogg2 allow you to read multi-byte values of up to 32 bits (no, 
this doesn't include 64-bit values) with the endianness you specify.

Encoding endianness in the enumeration also nearly doubles the number of values 
which must be known/tested by implementors, which is something I believe should 
be avoided.
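
To make the comparison concrete, here is a minimal sketch of decoding with a 
format ID plus a separate endian flag.  The function names are hypothetical 
and nothing here is libogg code.  It uses the seven formats from the table 
quoted earlier, and counts how many entries an endian-in-the-enumeration 
scheme would need if every multi-byte format gets an LE and a BE variant.

```python
import struct

# Format IDs from the table quoted earlier in this mail.
FORMATS = {
    0: ("Int", 8),
    1: ("uInt", 8),
    2: ("Int", 16),
    3: ("Int", 24),
    4: ("Int", 32),
    5: ("Float", 32),
    6: ("Float", 64),
}


def decode_sample(raw: bytes, format_id: int, big_endian: bool):
    """Decode one sample given a format ID and a separate endian flag."""
    kind, bits = FORMATS[format_id]
    if len(raw) != bits // 8:
        raise ValueError("sample has the wrong number of bytes")
    if kind == "Float":
        code = "f" if bits == 32 else "d"
        prefix = ">" if big_endian else "<"
        return struct.unpack(prefix + code, raw)[0]
    order = "big" if big_endian else "little"
    # The flag is simply ignored in effect for single-byte samples.
    return int.from_bytes(raw, order, signed=(kind == "Int"))


def enumerated_count() -> int:
    """Entries needed if endianness is folded into the enumeration."""
    return sum(1 if bits == 8 else 2 for _, bits in FORMATS.values())


print(len(FORMATS), enumerated_count())
```

With these seven formats the folded enumeration needs 12 entries instead of 
7, while the flag version keeps one code path per format.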

> A number of people have agreed with my suggestion. As yet, noone has stepped 
> forward to support your proposal.  Why is that?

I just did a scan over the thread, the wiki, and all the private mail on this 
issue - not one person wrote in agreement with e), though many supported a 
method similar to c).  

No, I didn't put suggestion e) verbatim in the current draft.  It's still up for 
discussion and debate; however, the tone of your comments makes this difficult.

I know you're new to our little development community, so I'm responding with 
constructive feedback rather than ignoring your statements.  We don't operate in 
the bullish verbal-fistfight manner that some groups you may have worked with 
before do.  The only thing such behavior results in is either getting ignored 
or, if an issue becomes too "hot", having it dropped for a period of time (often 
months).

We work by rough consensus of affected parties.  I, for example, am writing the 
framework and likely the initial set of codec plugins, so I have a vested 
interest in having something usable.  Monty, JeanMarc, and Josh have all 
written Ogg audio codecs, and will likely be working with this in the future, so 
their approval is sought.  Outside developers who will be implementing the 
format, such as Josh, are involved because the design decisions affect them too.

You were asked to contribute feedback because you have experience in raw audio 
codec implementations.  You are not familiar with Ogg nor, apparently, with the 
existing audio codecs within it, as you yourself disclaimed in the beginning of 
your first email in this thread:

> However even I would not consider myself an expert; there are area to do with 
> channel assignments that I know I am ignorant of.
> I am also quite ignorant of the Ogg container format.

My intention here is not to insult your abilities or knowledge, but to point out 
that you, like the rest of us, have strong points and weak points.  We also have 
different styles of doing things, different philosophies about how they should 
be done, and different intentions for how we plan on using these things.

Debate and compromise, in an open friendly atmosphere, is thus critical.  Please 
help us maintain that atmosphere so this work can continue, as we value your 
contributions, and expect you to value all of ours, even when we disagree.

> > Now - channel association/scheme, or whatever a more appropriate name would be.. 
> > this has got to be a lookup table, too.  I'm thinking:
> >  0 Mono
> >  1 Stereo
> >  2 Quadraphonic
> >  3 Ambisonic
> >  4 Dolby 5.1
> >  5 Dolby 6.1 (used in Dolby Digital EX)
> >  6 Dolby 7.1
> >  7 Extended
> > 
> > This is 3 bits, leaving one bit free for an extra flag, if needed, for one other 
> > bit of configuration which may be needed.  We could also extend this to 4 bits, 
> > leaving many of them "Extended" for future minor versions.
> 
> Until yesterday, you were not even aware of this issue and now you think you
> know enough to design something that implements it? Thats nuts! Your chances
> of getting it wrong are close to 100%.

To avoid repeating myself about maintaining a friendly atmosphere..

The suggestion I made in that email was based on feedback from four audio 
developers plus research I conducted during the day.  My personal previous 
experience in the matter of surround sound is irrelevant to my ability to ask 
for and research the necessary information.

Furthermore, as I hope you've checked the wiki and more recent emails before 
sending this reply, there's a movement away from implementing surround sound on 
the codec level anyways, especially since none of the current Ogg codecs has any 
markings regarding channel assignment or method in their headers.  

One of the solutions that's been mentioned is a larger mechanism to provide this 
information, one which includes support for channels of different sample sizes 
and sample rates, as well as codecs, instead of implementing this on a per-codec 
basis.  Another is to encode it into the Comments header as a standard comment.  
Neither of these requires further minding surround sound or channel labels in 
the OggPCM protocol, so unless someone can make an argument against this, it's 
time to drop the issue and take these ideas to the metainfo codec sylvia 
suggested instead.

-- 

The recognition of individual possibility,
 to allow each to be what she and he can be,
  rests inherently upon the availability of knowledge;
 The perpetuation of ignorance is the beginning of slavery.

from "Die Gedanken Sind Frei": Free Software and the Struggle for Free Thought
 by Eben Moglen, General Counsel of the Free Software Foundation