[ogg-dev] OggPCM proposal feedback

Arc arc at Xiph.org
Thu Nov 10 16:49:13 PST 2005

On Fri, Nov 11, 2005 at 09:13:01AM +1100, Erik de Castro Lopo wrote:
> > However, support for (ie) 48-bit-float should not have to be created, 
> Where are you going to find a 48 bit float? Is there an IEEE
> standard for that?

It's not about what exists now, it's about what could exist, and nobody has been 
able to make good predictions about that.  Your logic here is the exception:

> If the RGB explanation is not what you are thinking about, do
> you realise that 96 bits gives a dynamic range of over 500 
> decibels? That means that if the largest sample corresponds to
> 1000 volts, the smallest sample will correspond to 5e-26 volts.
> I think you'll find that this is less than the voltage on an
> electron which is approx. 1e-19 volts.

This is an excellent argument.  It's still not a hard cap, since spintronics or 
all sorts of other forms of transference are possible, but it's not feasible 
that those will be needed for this revision.
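
As a rough sanity check on the numbers quoted above, here is a small sketch of 
the arithmetic.  It assumes linear integer samples (about 6.02 dB per bit) and 
reuses the 1000 volt full-scale figure from the quote; the function names are 
just illustrative.

```python
import math

def dynamic_range_db(bits):
    """Theoretical dynamic range of a linear integer sample, in decibels."""
    return 20 * math.log10(2 ** bits)

def smallest_step_volts(full_scale_volts, bits):
    """Size of one quantization step when full scale spans 2**bits levels."""
    return full_scale_volts / 2 ** bits

if __name__ == "__main__":
    for bits in (16, 24, 32, 64, 96):
        print(f"{bits:3d} bits: {dynamic_range_db(bits):6.1f} dB, "
              f"one step at 1000 V full scale = "
              f"{smallest_step_volts(1000.0, bits):.2e} V")
```

For 96 bits this comes out at roughly 578 dB, with a step size far below 
1e-19 volts, which is consistent with the point being made.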

OK, so we cap it at 64 bits, since much more than that doesn't make sense (96 
bits would be a "long double" C type on some platforms).

I really don't like this idea, but I will entertain it, formatted as follows:

ID  Type  Bits  
0   Int   8
1   uInt  8
2   Int   16
3   Int   24
4   Int   32
5   Float 32
6   Float 64
7   Extended - Unsupported by any v1.0 software

This is what I consider a compromise: an additional minor revision can be made 
later to add a set of types not covered by this layout without losing 
compatibility with these types.

I added this to the wiki for people to look at.  It requires 3 bits to encode, 
which, when combined with the MSB/LSB flag bit, leaves a nibble for extended 
types or channel association.
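
To make the bit budget concrete, here is a minimal sketch of packing the format 
ID and the MSB/LSB flag into one byte.  The bit positions, the meaning of a set 
flag (MSB first), and the function names are all assumptions for illustration; 
nothing here is fixed by the proposal.

```python
# Format IDs from the table above: (type name, bits per sample).
FORMATS = {
    0: ("Int", 8),
    1: ("uInt", 8),
    2: ("Int", 16),
    3: ("Int", 24),
    4: ("Int", 32),
    5: ("Float", 32),
    6: ("Float", 64),
    7: ("Extended", None),  # unsupported by any v1.0 software
}

# ASSUMPTION: these bit positions are illustrative, not part of the proposal.
FORMAT_MASK = 0x07   # bits 0-2: format ID
ENDIAN_FLAG = 0x08   # bit 3: MSB/LSB flag (set = MSB first, by assumption)
NIBBLE_SHIFT = 4     # bits 4-7: the nibble left over

def pack_format_byte(format_id, msb_first, nibble=0):
    """Combine format ID, endian flag and the spare nibble into one byte."""
    if format_id not in FORMATS:
        raise ValueError(f"unknown format ID: {format_id}")
    if not 0 <= nibble <= 0x0F:
        raise ValueError(f"nibble out of range: {nibble}")
    flag = ENDIAN_FLAG if msb_first else 0
    return (nibble << NIBBLE_SHIFT) | flag | format_id

def unpack_format_byte(value):
    """Split a byte back into (format_id, msb_first, nibble)."""
    format_id = value & FORMAT_MASK
    msb_first = bool(value & ENDIAN_FLAG)
    nibble = (value >> NIBBLE_SHIFT) & 0x0F
    return format_id, msb_first, nibble
```

A reader that only understands v1.0 would look up the unpacked ID in `FORMATS` 
and refuse the stream if it comes back as "Extended".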

Now, channel association/scheme, or whatever a more appropriate name would be: 
this has got to be a lookup table, too.  I'm thinking:
 0 Mono
 1 Stereo
 2 Quadraphonic
 3 Ambisonic
 4 Dolby 5.1
 5 Dolby 6.1 (used in Dolby Digital EX)
 6 Dolby 7.1
 7 Extended

This is 3 bits, leaving one bit free for an extra flag if some other piece of 
configuration turns out to be needed.  We could also extend this to 4 bits, 
leaving many of the values "Extended" for future minor versions.
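
A sketch of how the 3-bit scheme plus one spare flag could be read out of a 
nibble.  Which bit holds the spare flag is an assumption, and 
`decode_scheme_nibble` is a hypothetical name.

```python
# Channel scheme IDs from the list above.
CHANNEL_SCHEMES = {
    0: "Mono",
    1: "Stereo",
    2: "Quadraphonic",
    3: "Ambisonic",
    4: "Dolby 5.1",
    5: "Dolby 6.1",
    6: "Dolby 7.1",
    7: "Extended",
}

# ASSUMPTION: the scheme sits in the low 3 bits and the spare flag in the top bit.
SCHEME_MASK = 0x07
SPARE_FLAG = 0x08

def decode_scheme_nibble(nibble):
    """Return (scheme_id, scheme_name, spare_flag) for a 4-bit value."""
    if not 0 <= nibble <= 0x0F:
        raise ValueError(f"not a nibble: {nibble}")
    scheme_id = nibble & SCHEME_MASK
    spare_flag = bool(nibble & SPARE_FLAG)
    return scheme_id, CHANNEL_SCHEMES[scheme_id], spare_flag
```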

Following these should be a 1-byte ID for each channel, with the ID table 
specific to the format being used.  This is used, rather than establishing a 
"standard order", so that multiple streams can supply different bitrates/etc 
for different channels if such support is needed.  Even mono audio should do 
so, simply to specify that the audio is "center".
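
Roughly, reading those per-channel IDs might look like the sketch below.  The 
ID tables are entirely hypothetical placeholders (no values have been agreed 
on), as is the `read_channel_ids` helper.

```python
# HYPOTHETICAL per-scheme channel ID tables; the actual values are undecided.
CHANNEL_IDS = {
    "Mono": {0: "center"},
    "Stereo": {0: "left", 1: "right"},
}

def read_channel_ids(scheme, channel_count, payload):
    """Map the 1-byte IDs at the start of `payload` to channel names."""
    table = CHANNEL_IDS[scheme]
    if len(payload) < channel_count:
        raise ValueError("header too short for the declared channel count")
    names = []
    for channel_id in payload[:channel_count]:  # iterating bytes yields ints
        if channel_id not in table:
            raise ValueError(f"unknown channel ID {channel_id} for {scheme}")
        names.append(table[channel_id])
    return names
```

Because each stream names its channels explicitly, a stream carrying only the 
right channel of a stereo source would simply list the single ID for "right".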

On Thu, Nov 10, 2005 at 04:23:23PM -0500, John Koleszar wrote:
> Basically I was trying to provide a method where a logical bitstream
> could contain only a subset of the total number of channels of the
> source. 
> This is similar to the page serial number, I know, but I don't think that's 
> sufficient if data from the same source is spread across multiple logical 
> streams, and you want to support multiple sources in the same overall stream. 

Hmm.  If they're going to reference each other, it should be through their 
stream's serialno, not by the order in which they appear in the physical stream.  
But I'm not convinced they either need to or should reference each other; this 
is something we need to work out with the long term in mind.

Another solution to this would be to leave it at channel #, ignoring what I 
wrote above re: channel association, and instead have these implemented via a 
VorbisComments header page.  That way, since Vorbis and FLAC both implement the 
same VorbisComments (and Speex, too?), standard comment fields could be used to 
identify the surround sound method and the channels used.

There are many situations in multitrack recording where your microphone layout 
is per-instrument or per-vocalist, and you want to save this data for mixing, 
where none of this applies.  You would, however, want some way of labeling those 
tracks, and I'm thinking one possibility is with VorbisComments, too.
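
As a sketch of what that could look like, the code below serializes and parses 
a list of "FIELD=value" entries using the Vorbis comment layout (a 
little-endian 32-bit length before the vendor string and before each entry).  
The field names in the usage note are hypothetical, and how OggPCM would wrap 
this block in a header packet is left open.

```python
import struct

def build_comment_block(vendor, comments):
    """Serialize (field, value) pairs in the Vorbis comment layout."""
    vendor_bytes = vendor.encode("utf-8")
    parts = [struct.pack("<I", len(vendor_bytes)), vendor_bytes,
             struct.pack("<I", len(comments))]
    for field, value in comments:
        entry = f"{field}={value}".encode("utf-8")
        parts.append(struct.pack("<I", len(entry)))
        parts.append(entry)
    return b"".join(parts)

def parse_comment_block(data):
    """Inverse of build_comment_block: return (vendor, [(field, value), ...])."""
    offset = 0
    (vendor_len,) = struct.unpack_from("<I", data, offset)
    offset += 4
    vendor = data[offset:offset + vendor_len].decode("utf-8")
    offset += vendor_len
    (count,) = struct.unpack_from("<I", data, offset)
    offset += 4
    comments = []
    for _ in range(count):
        (entry_len,) = struct.unpack_from("<I", data, offset)
        offset += 4
        entry = data[offset:offset + entry_len].decode("utf-8")
        offset += entry_len
        field, _sep, value = entry.partition("=")
        comments.append((field, value))
    return vendor, comments
```

A multitrack session could then label its tracks with entries such as 
`("CHANNEL_1_LABEL", "lead vocal")` and `("CHANNEL_2_LABEL", "snare")`; those 
field names are made up here and would need to be standardized.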

> I know that splitting a source up across multiple logical streams is
> ugly, but I can't think of any clean way to provide multiple sampling
> parameters within a single stream. In most cases, all channels will have
> the same sample parameters, so they will all be in a single logical
> stream. I really think that requiring fixed sample parameters per
> logical stream is a smart constraint to make.

That makes two of us.  I think that multiple logical streams are also good 
across the board, such that (e.g.) Vorbis could encode 5.1 channels the same 
way.  I think that's a project for another time, though, since it's an entirely 
different layer: presentation.


The recognition of individual possibility,
 to allow each to be what she and he can be,
  rests inherently upon the availability of knowledge;
 The perpetuation of ignorance is the beginning of slavery.

from "Die Gedanken Sind Frei": Free Software and the Struggle for Free Thought
 by Eben Moglen, General Counsel of the Free Software Foundation
