[ogg-dev] OggPCM proposal feedback
Arc
arc at Xiph.org
Thu Nov 10 13:35:47 PST 2005
On Thu, Nov 10, 2005 at 07:03:43PM +1100, Erik de Castro Lopo wrote:
>
> WAV is usually little endian but there is also a (very rare) big endian
> version. AIFF is usually big endian but also supports little endian
> encoding. CAF, AU, IRCAM and a number of others support both endian-nesses
> equally.
This doesn't seem to be a large issue - a single bit in the header could
specify it, 0=MSB, 1=LSB, or vice versa.
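As a rough sketch of how little a reader needs in order to honour such a bit, assuming 16-bit signed samples and the 0=MSB-first, 1=LSB-first polarity (the flag constants and the function name are made up for illustration, not taken from any draft):

```python
import struct

# Hypothetical flag values; the polarity is an assumption ("or vice versa").
ENDIAN_MSB_FIRST = 0  # big endian
ENDIAN_LSB_FIRST = 1  # little endian


def decode_s16(payload, endian_flag):
    """Decode signed 16-bit samples using the byte order named by the flag."""
    if len(payload) % 2:
        raise ValueError("payload is not a whole number of 16-bit samples")
    # struct uses "<" for little endian and ">" for big endian.
    prefix = "<" if endian_flag == ENDIAN_LSB_FIRST else ">"
    count = len(payload) // 2
    return list(struct.unpack("%s%dh" % (prefix, count), payload))
```

The same two bytes decode to different values depending only on the flag, which is all the header bit has to convey.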
VorbisFile will export either endianness, this seems to be the end of
this part of the debate.
> "Please don't make determination of the data format depend on
> multiple fields. Instead use an enumeration so that something
> like little endian 16 bit PCM can be specified as OGG_PCM_LE_PCM_16
> and big endian 64 bit doubles can be specified as OGG_PCM_BE_FLOAT_64.
> This scheme is far more transparent and self documenting. If the
> format field is 8 bits, this scheme supports 256 formats; if it's 16
> bits it will support 65536 formats.
You're still working with the philosophy of the FourCC world, where
whether a plugin or application supports a 32-bit identifier tells you
that it has either full support or no support.
We aren't working by that philosophy. We do not need to maintain a
table of predefined formats, extended each time someone wants to use a
new format, since no application needs to support every combination of
encoding parameters.
Honestly, as far as I'm concerned unsigned samples can go away... almost
nothing uses 8-bit samples anymore, and unsigned 8-bit even less so.
However, support for (e.g.) 48-bit float should not have to be specially
created; the values for how many bits to use and whether it's int or float
should be separate, as should the number of channels, etc.
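To illustrate what separate fields buy, here is a minimal sketch in which every name (PcmFormat and its attributes) is hypothetical and sample sizes are assumed to be whole bytes. A 48-bit float stream is described without anyone having to add an entry to a format table:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PcmFormat:
    """Hypothetical header fields, each one independent of the others."""
    bits_per_sample: int
    is_float: bool
    big_endian: bool
    channels: int

    def bytes_per_frame(self):
        # One frame holds one sample for every channel.
        return (self.bits_per_sample // 8) * self.channels


# No predefined constant is needed for an unusual combination.
fmt = PcmFormat(bits_per_sample=48, is_float=True, big_endian=False, channels=2)
```

An enumeration would need a new constant, and a spec revision, for each such combination.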
On Thu, Nov 10, 2005 at 03:44:53PM +0800, illiminable wrote:
>
> I think this is the wrong approach, flac and other codecs operate on a
> tighter subset, because they have to perform complex transformations on the
> data, and supporting too many types increases complexity. A raw format
> essentially needs no processing, it just needs copying into a buffer that
> supports that type of data.
The complexity isn't increased by added flexibility, and that
flexibility completely eliminates the issue FLAC created -
FLAC was designed to losslessly support every common audio format, and
yet you find its subset of formats too tight.
Don't you see the inherent issue here? It comes back to someone deciding
which formats should be valid and which ones won't be, and enforcing that
by using an index# to a table of supported formats vs leaving it
freeform for future implementors to use.
Changing the spec a bit, so that the sample size must be a multiple of 8
bits and may not exceed 128 bits (a 4-bit field), seems worthwhile
to eliminate the padding issue.
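One way such a 4-bit field could work, purely as an assumption on my part (the mapping field = bits/8 - 1 and both function names are illustrative, not part of any draft): sixteen field values cover exactly 8 through 128 bits in steps of 8.

```python
def encode_sample_size(bits):
    """Map a sample size in bits to a 4-bit field value (0..15)."""
    if bits % 8 != 0 or not 8 <= bits <= 128:
        raise ValueError("sample size must be a multiple of 8 from 8 to 128")
    return bits // 8 - 1


def decode_sample_size(field):
    """Map a 4-bit field value back to a sample size in bits."""
    if not 0 <= field <= 0xF:
        raise ValueError("field must fit in 4 bits")
    return (field + 1) * 8
```

With this mapping there is no value that describes a size needing padding, so a sample always occupies a whole number of bytes.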
But between float and int, why /not/ allow someone to do something
insane like 96-bit audio? 20 years ago, we thought that 16 bit, or
perhaps 24 bit, was the maximum we could do. Why would anyone want more
than 24 bit? And yet, the issue was raised that 64-bit audio samples
are necessary. In another 20 years, will people be arguing that 128-bit
samples are necessary? Or that 48 bit is a good tradeoff between 32 bit
and 64 bit?
No - it does not increase complexity, nor does it impose any
requirements on implementations, since instead of a 32-bit identifier we
use the entire first packet of the stream to check for compatibility.
No, your media player does -NOT- have to support 256 channel audio, nor
must it support ambisonics, or 64-bit audio, etc. There's no reason,
however, to force everything into artificial, arbitrary limitations
based on what we believe is reasonable for today.
If a media player only supports a subset of what the codec supports,
that's completely fine and expected.
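A minimal sketch of that check, assuming the first packet has already been parsed into a few fields (StreamInfo, Capabilities and their attributes are all hypothetical names): the player compares the stream's parameters against its own subset and simply declines anything outside it.

```python
from collections import namedtuple

# Hypothetical result of parsing the first packet of the stream.
StreamInfo = namedtuple("StreamInfo", "bits_per_sample is_float channels")


class Capabilities:
    """The subset of parameters a particular player chooses to handle."""

    def __init__(self, int_bits, float_bits, max_channels):
        self.int_bits = set(int_bits)
        self.float_bits = set(float_bits)
        self.max_channels = max_channels

    def supports(self, info):
        # Pick the allowed sample sizes for the stream's sample type.
        allowed = self.float_bits if info.is_float else self.int_bits
        return (info.bits_per_sample in allowed
                and 1 <= info.channels <= self.max_channels)


# A basic player: 16-bit signed int only, mono or stereo.
basic_player = Capabilities(int_bits=[16], float_bits=[], max_channels=2)
```

A stream the player cannot handle is rejected up front; nothing in the format obliges the player to grow.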
> I have little more experience than you. I sent invitations for people
> to join this discussion to the music-dsp mailing list. I hope somebody
> knowledgeable will show up.
There's a difference between experience and differences of design
philosophy. This isn't an issue of right or wrong, but of two different
styles of designing codecs.
Raw FourCC codecs are each set up for a different format, or small set of
formats. RIFF/WAVE uses a subset of formats, expecting all applications
which support its FourCC to understand all those formats. Again, this
is done under the concept that a codec should either be fully supported
or unsupported.
Whereas, not all audio codecs are going to support even the subset that
you provided (64-bit float, for example). Nor are all applications
which use Ogg going to support anything but 16-bit signed int, nor
should they be expected to.
I think it's reasonable to do away with unsigned because modern codecs
just aren't going to use it, but I'm not going to try to predict whether
someone will want to use 48-bit audio, or 128-bit audio, and whether
they'll use int or float.
> Different bitwidth makes sense. You need high dynamic range
> on your main stereo signal, but probably not on the side channels.
>
> Different sample rates also makes sense. If the main stereo pair
> is sampled at 96kHz it makes sense to have the sub bass signal
> (ie all the low frequencies) sampled at a much lower rate. For a
> sub-bass signal 8kHz might be appropriate.
I think, for these, given Ogg's use of granulepos and the syncing
complexity that allowing different channels to have different rates and
sizes would introduce, this is best left to muxed raw channels, with any
codec which supports this drawing from the different raw channels.
> Not yet, but we haven't heard from anyone else yet. I would like
> to see input (or at least an OK) from a large number of people in
> the audio field.
I think this is good to emphasize - it's ok to support some combinations
of formats which are not used, since they'll simply be ignored if
they're unfavorable to implement, but missing something necessary is
something we need to make sure not to do.
I've put a reduced config set on the wiki.
--
The recognition of individual possibility,
to allow each to be what she and he can be,
rests inherently upon the availability of knowledge;
The perpetuation of ignorance is the beginning of slavery.
from "Die Gedanken Sind Frei": Free Software and the Struggle for Free Thought
by Eben Moglen, General Counsel of the Free Software Foundation