[ogg-dev] OggPCM proposal feedback
Arc
arc at Xiph.org
Fri Nov 11 01:10:26 PST 2005
On Fri, Nov 11, 2005 at 07:17:53PM +1100, Erik de Castro Lopo wrote:
> We're talking about a file header here. Even if the header is a kilobyte in
> size, it will be completely **dwarfed** by the audio data following. So why
> are you counting single bits like this?
Why waste? You only have to read the header once for a stream, and libogg2
provides a convenient bitpacker which can easily grab any number of bits and
give you an int containing their value.
Just because you /CAN/ waste a few bytes, what's the reason to do so?
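To illustrate how little work a bit-packed header costs the reader, here is a
minimal sketch of an LSb-first bit reader in Python. It is not the libogg2 API:
the `BitReader` class, its `read` method, and the 3-bit ID plus 1-bit flag
layout are all assumptions made up for this example.

```python
class BitReader:
    """Hypothetical LSb-first bit reader (illustration only, not libogg)."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # absolute bit position within data

    def read(self, nbits: int) -> int:
        """Return the next nbits bits as an unsigned int, least significant bit first."""
        if self.pos + nbits > len(self.data) * 8:
            raise EOFError("not enough bits left in the header")
        value = 0
        for i in range(nbits):
            byte = self.data[self.pos >> 3]
            bit = (byte >> (self.pos & 7)) & 1
            value |= bit << i
            self.pos += 1
        return value


# Assumed layout: a 3-bit format ID followed by a 1-bit endian flag.
header = bytes([0b00001010])  # low three bits are 0b010, the next bit is 1
reader = BitReader(header)
format_id = reader.read(3)    # the 3-bit field
big_endian = reader.read(1)   # the 1-bit field
```

Reading a field is one call regardless of its width, so a narrow field costs
the implementor nothing extra.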
> > I really don't like this idea, but I will entertain, formatting it as follows:
> >
> > ID  Type      Bits
> > 0   Int       8
> > 1   uInt      8
> > 2   Int       16
> > 3   Int       24
> > 4   Int       32
> > 5   Float     32
> > 6   Float     64
> > 7   Extended  - Unsupported by any v1.0 software
> >
>
>
> I have suggested something better.
I went back and re-read your suggestion; I'll paste it here in an effort to
ensure we're talking about the same thing:
> c) I think having separate fields for things like signed/
> unsigned/float and bit width is a mistake. I would suggest
> instead a single field that encodes all this information
> in a enumeration. Ie:
>
> OGG_PCM_U8 /* Unsigned 8 bit */
> OGG_PCM_S8 /* Signed 8 bit. */
> OGG_PCM_S16
> OGG_PCM_S24
> OGG_PCM_S32
> OGG_PCM_FLOAT32
> OGG_PCM_FLOAT64
>
> and so on. This scheme makes it very difficult to get
> signed/unsigned and bitwith messed up.
With the exception of the top two being inverted, these tables look identical.
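A rough Python sketch of that comparison follows. The ID table is copied from
the draft quoted above. The numeric values given to the enumeration members
are an assumption (only names were proposed), and `FORMAT_TABLE`,
`OggPcmFormat`, and `ENUM_MEANING` are names invented for this example.

```python
from enum import IntEnum

# ID -> (sample type, bits per sample), as in the draft table.
FORMAT_TABLE = {
    0: ("int", 8),
    1: ("uint", 8),
    2: ("int", 16),
    3: ("int", 24),
    4: ("int", 32),
    5: ("float", 32),
    6: ("float", 64),
}
EXTENDED_ID = 7  # reserved; unsupported by any v1.0 software


class OggPcmFormat(IntEnum):
    """The proposed enumeration; numeric values are assumed, in listed order."""
    OGG_PCM_U8 = 0
    OGG_PCM_S8 = 1
    OGG_PCM_S16 = 2
    OGG_PCM_S24 = 3
    OGG_PCM_S32 = 4
    OGG_PCM_FLOAT32 = 5
    OGG_PCM_FLOAT64 = 6


# What each enumeration member means, in the same (type, bits) terms.
ENUM_MEANING = {
    OggPcmFormat.OGG_PCM_U8: ("uint", 8),
    OggPcmFormat.OGG_PCM_S8: ("int", 8),
    OggPcmFormat.OGG_PCM_S16: ("int", 16),
    OggPcmFormat.OGG_PCM_S24: ("int", 24),
    OggPcmFormat.OGG_PCM_S32: ("int", 32),
    OggPcmFormat.OGG_PCM_FLOAT32: ("float", 32),
    OggPcmFormat.OGG_PCM_FLOAT64: ("float", 64),
}

# Both schemes describe the same set of sample formats...
same_formats = set(FORMAT_TABLE.values()) == set(ENUM_MEANING.values())

# ...and under the assumed numbering they differ only in the first two entries.
differing_ids = [i for i in FORMAT_TABLE
                 if FORMAT_TABLE[i] != ENUM_MEANING[OggPcmFormat(i)]]
```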
You went on to suggest:
> e) Consider whether the endianness should also be encoded
> in the enumeration above. I would recommend that it is
> resulting in:
>
> OGG_PCM_U8 /* Unsigned 8 bit */
> OGG_PCM_S8 /* Signed 8 bit. */
> OGG_PCM_LE_S16
> OGG_PCM_BE_S16
> OGG_PCM_LE_S24
> OGG_PCM_BE_S24
> ...
> OGG_PCM_LE_FLOAT32
> OGG_PCM_BE_FLOAT32
> ...
This I did not put into the draft because, while a separate endian flag is
unused for 8-bit samples, it greatly simplifies implementation when the libogg
bitpacker is used, which is on a separate layer from data decoding. Both
libogg1 and libogg2 allow you to read multi-byte values of up to 32 bits (no,
this doesn't include 64-bit values) with the endianness you specify.
It also nearly doubles the number of values which must be known/tested by
implementors, which is something I believe should be avoided.
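As a sketch of why the separate flag keeps decoding simple, the Python below
treats endianness as one parameter that is independent of the sample type. It
uses only the standard library, not the libogg bitpacker, and the function
name `decode_samples` and its arguments are assumptions for illustration.

```python
import struct


def decode_samples(payload: bytes, kind: str, bits: int, big_endian: bool) -> list:
    """Decode raw samples given a (kind, bits) format plus a separate endian flag."""
    width = bits // 8
    if len(payload) % width:
        raise ValueError("payload is not a whole number of samples")
    chunks = [payload[i:i + width] for i in range(0, len(payload), width)]
    if kind == "float":
        code = {32: "f", 64: "d"}[bits]
        prefix = ">" if big_endian else "<"
        return [struct.unpack(prefix + code, c)[0] for c in chunks]
    order = "big" if big_endian else "little"
    return [int.from_bytes(c, order, signed=(kind == "int")) for c in chunks]


# Values an implementor must recognise under each scheme.
format_ids_with_flag = 7              # seven format IDs, plus a single endian flag
multibyte_formats = 5                 # S16, S24, S32, FLOAT32, FLOAT64
merged_enum_values = 2 + 2 * multibyte_formats  # U8, S8, then LE and BE of each
```

The flag is consulted in one place for every format, whereas folding it into
the enumeration takes the list from 7 entries to 12.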
> A number of people have agreed with my suggestion. As yet, noone has stepped
> forward to support your proposal. Why is that?
I just did a scan over the thread, the wiki, and all the private mail on this
issue - not one person wrote in agreement with e), though many supported a
method similar to c).
No, I didn't put suggestion e) verbatim on the current draft. It's still up for
discussion and debate; however, the tone of your comments makes this difficult.
I know you're new to our little development community, so I'm responding with
constructive feedback rather than ignoring your statements. We don't operate in
the bullish verbal-fistfight manner that some groups you may have worked with
before do. The only result of such behavior is that the speaker gets ignored or,
if an issue becomes too "hot", that it's dropped for a period of time (often
months).
We work by rough consensus of affected parties. I, for example, am writing the
framework and likely the initial set of codec plugins, so I have a vested
interest in having something usable. Monty, JeanMarc, and Josh have all
written Ogg audio codecs, and will likely be working with this in the future, so
their approval is sought. Outside developers who will be implementing the
format, such as Josh, are involved because the design decisions affect them too.
You were asked to contribute feedback because you have experience in raw audio
codec implementations. You are not familiar with Ogg nor, apparently, with the
existing audio codecs within it, as you yourself disclaimed at the beginning of
your first email in this thread:
> However even I would not consider myself an expert; there are area to do with
> channel assignments that I know I am ignorant of.
> I am also quite ignorant of the Ogg container format.
My intention here is not to insult your abilities or knowledge, but to point out
that you, like the rest of us, have strong points and weak points. We also have
different styles of doing things, different philosophies about how they should
be done, and different intentions for how we plan on using these things.
Debate and compromise, in an open friendly atmosphere, is thus critical. Please
help us maintain that atmosphere so this work can continue, as we value your
contributions, and expect you to value all of ours, even when we disagree.
> > Now - channel association/scheme, or whatever a more appropriate name would be..
> > this has got to be a lookup table, too. I'm thinking:
> > 0 Mono
> > 1 Stereo
> > 2 Quadraphonic
> > 3 Ambisonic
> > 4 Dolby 5.1
> > 5 Dolby 6.1 (used in Dolby Digital EX)
> > 6 Dolby 7.1
> > 7 Extended
> >
> > This is 3 bits, leaving one bit free for an extra flag, if needed, for one other
> > bit of configuration which may be needed. We could also extend this to 4 bits,
> > leaving many of them "Extended" for future minor versions.
>
> Until yesterday, you were not even aware of this issue and now you think you
> know enough to design something that implements it? Thats nuts! Your chances
> of getting it wrong are close to 100%.
To avoid repeating myself about maintaining a friendly atmosphere..
The suggestion I made in that email was based on feedback from four audio
developers plus research I conducted during the day. My own previous
experience with surround sound is irrelevant to my ability to ask about and
research the necessary information.
Furthermore, as I hope you saw if you checked the wiki and more recent emails
before sending this reply, there's a movement away from implementing surround
sound at the codec level anyway, especially since none of the current Ogg codecs
has any markings regarding channel assignment or method in their headers.
One of the solutions that's been mentioned is a larger mechanism to provide
this information, one which supports channels of different sample sizes and
sample rates, as well as codecs, instead of implementing this on a per-codec
basis. Another is to encode it into the Comments header as a standard
comment. Neither of these requires further minding surround sound or channel
labels in the OggPCM protocol, so unless someone can make an argument against
this, it's time to drop the issue - take these ideas to the metainfo codec
sylvia suggested instead.
--
The recognition of individual possibility,
to allow each to be what she and he can be,
rests inherently upon the availability of knowledge;
The perpetuation of ignorance is the beginning of slavery.
from "Die Gedanken Sind Frei": Free Software and the Struggle for Free Thought
by Eben Moglen, General Counsel of the Free Software Foundation