[ogg-dev] OggPCM proposal feedback
Erik de Castro Lopo
mle+xiph at mega-nerd.com
Wed Nov 9 15:13:19 PST 2005
Hi all,
Silvia contacted me about this OggPCM proposal and asked me
to join in. For those who don't know me, I am the main author
and maintainer of libsndfile and therefore know quite a bit
about how uncompressed audio is stored in sound files. However,
even I would not consider myself an expert; there are areas
to do with channel assignments that I know I am ignorant of.
I am also quite ignorant of the Ogg container format.
I have now read:
http://wiki.xiph.org/OggPCM
and find that it has a number of shortcomings.
a) There is no marker to distinguish little endian data
from big endian data.
b) There is no mention of audio data being held in double
precision (64 bit) floating point. Currently this is
supported in libsndfile by WAV, AIFF, AU, IRCAM and the
two different Matlab/Octave file formats (I may also
have overlooked some).
c) I think having separate fields for things like signed/
unsigned/float and bit width is a mistake. I would suggest
instead a single field that encodes all this information
in an enumeration, i.e.:
OGG_PCM_U8 /* Unsigned 8 bit */
OGG_PCM_S8 /* Signed 8 bit. */
OGG_PCM_S16
OGG_PCM_S24
OGG_PCM_S32
OGG_PCM_FLOAT32
OGG_PCM_FLOAT64
and so on. This scheme makes it very difficult to get
signed/unsigned and bit width messed up.
d) Don't bother implementing unsigned PCM for bit widths
greater than 8 bits. No other common file format uses
it and those unsigned formats are a pain to work with.
e) Consider whether the endianness should also be encoded
in the enumeration above. I would recommend that it is,
resulting in:
OGG_PCM_U8 /* Unsigned 8 bit */
OGG_PCM_S8 /* Signed 8 bit. */
OGG_PCM_LE_S16
OGG_PCM_BE_S16
OGG_PCM_LE_S24
OGG_PCM_BE_S24
...
OGG_PCM_LE_FLOAT32
OGG_PCM_BE_FLOAT32
...
f) Encoding of channel information. In a two channel file,
is the audio data a stereo image or two distinct mono
channels? For a file with N (> 2) channels, are there
pairs of channels which should be considered as stereo
pairs, or do you want to place these stereo pairs as
separate streams within a single ogg container? What
about multi channel surround sound (there are a number
of different formats like 5.1 and 7.1) or quadraphonic?
How are you going to specify which channel is which?
Being able to encode this stuff easily is **vital**.
g) With things like surround sound, are you going to allow
24 bit audio for the main stereo pair and 16 bits for
the side channels? This might best be achieved using
separate streams, but that would make channel information
all the more important. Is it useful to have PCM for the
main stereo pair and say vorbis encoding for the side
channels?
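To make points c) to e) a little more concrete, here is a rough sketch in C
of what a single combined format field might look like. The enumerator
names follow the lists above; the entries that the "..." lines leave out
(S32 and FLOAT64 with explicit endianness), the numeric values, and the
two helper functions are my own assumptions for illustration and are not
part of the OggPCM proposal.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical combined format code: sample type, bit width and
   endianness in one field. Values are illustrative only. */
typedef enum {
    OGG_PCM_U8,
    OGG_PCM_S8,
    OGG_PCM_LE_S16,
    OGG_PCM_BE_S16,
    OGG_PCM_LE_S24,
    OGG_PCM_BE_S24,
    OGG_PCM_LE_S32,      /* assumed, elided in the list above */
    OGG_PCM_BE_S32,      /* assumed */
    OGG_PCM_LE_FLOAT32,
    OGG_PCM_BE_FLOAT32,
    OGG_PCM_LE_FLOAT64,  /* assumed */
    OGG_PCM_BE_FLOAT64   /* assumed */
} ogg_pcm_format;

/* Hypothetical helper: bytes occupied by one sample of the given
   format, or 0 if the code is not recognised. */
static size_t ogg_pcm_bytes_per_sample(ogg_pcm_format fmt)
{
    switch (fmt) {
    case OGG_PCM_U8:
    case OGG_PCM_S8:
        return 1;
    case OGG_PCM_LE_S16:
    case OGG_PCM_BE_S16:
        return 2;
    case OGG_PCM_LE_S24:
    case OGG_PCM_BE_S24:
        return 3;
    case OGG_PCM_LE_S32:
    case OGG_PCM_BE_S32:
    case OGG_PCM_LE_FLOAT32:
    case OGG_PCM_BE_FLOAT32:
        return 4;
    case OGG_PCM_LE_FLOAT64:
    case OGG_PCM_BE_FLOAT64:
        return 8;
    }
    return 0;
}

/* Hypothetical helper: decode one signed 16 bit sample from two raw
   bytes, independent of host byte order. Returns 0 on success and -1
   if fmt is not one of the two 16 bit formats. */
static int ogg_pcm_read_s16(const unsigned char *p, ogg_pcm_format fmt,
                            int16_t *out)
{
    uint16_t u;

    if (fmt == OGG_PCM_LE_S16)
        u = (uint16_t)(p[0] | (p[1] << 8));
    else if (fmt == OGG_PCM_BE_S16)
        u = (uint16_t)((p[0] << 8) | p[1]);
    else
        return -1;

    /* Map the unsigned bit pattern to two's complement without
       relying on implementation-defined narrowing. */
    *out = (int16_t)((int32_t)(u ^ 0x8000u) - 0x8000);
    return 0;
}
```

With one field, a reader only has to switch on a single value, and a
combination such as "unsigned 24 bit big endian" simply has no code and
cannot appear in a file by accident.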
Please realize that this is all just off the top of my head.
There may be a bunch of other stuff I have overlooked.
Is it OK if I get some other people who know more about
this stuff involved?
Erik
--
+-----------------------------------------------------------+
Erik de Castro Lopo
+-----------------------------------------------------------+
"I'm not proud .... We really haven't done everything we could
to protect our customers ... Our products just aren't engineered
for security." -- Brian Valentine, Senior Vice President of
Microsoft's Windows development team