[ogg-dev] OggPCM (uncompressed Ogg audio)

Silvia.Pfeiffer at csiro.au Silvia.Pfeiffer at csiro.au
Wed Nov 9 03:21:04 PST 2005


Hi Arc, illi,

I think it would help to take the emotion out of this discussion, so let's just argue technically. My experience with Xiph is that we are a very friendly community whose members try to help and listen to each other, so let's keep that culture up.

I think we all agree: it is a good idea to have a media mapping for Ogg for uncompressed PCM.

As for what is required in a media mapping, the RFC (http://www.ietf.org/rfc/rfc3533.txt) and the file that used to be at http://www.xiph.org/ogg/doc/vorbis-stream.html specify it:
- a BOS (beginning of stream) page (the "header" page) that identifies the codec and sets it up
- potentially secondary header pages (e.g. for vorbiscomment type stuff)
- and a granule pos scheme.

Data pages are identified as part of a logical bitstream by their serial number, so they don't need any additional identifiers. Thus, Arc, I don't quite understand why you would require another 32 bits at the beginning of each data packet when Ogg pages already carry that information.

As for the information that goes into the BOS page, I can see at least an endianness indicator missing. I actually think we should ask somebody like the author of libsndfile, Erik de Castro Lopo, to give his input on what fields should be in the BOS page for such a media mapping. His vast experience in parsing different audio formats should provide us with more complete information.

As for the last bit required for specifying media mappings: I cannot see a specification of what the granule pos means in an Ogg PCM stream. My suggestion is the total number of PCM samples encoded once all frames that finish on this Ogg page are counted. Was that the intention?
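
To make that suggestion concrete, here is a minimal Python sketch. It is not taken from the OggPCM draft: it assumes a "sample" means one sample frame (one sample for every channel), that bits per sample is a multiple of 8, and that packets hold only raw PCM with no per-packet header. The function name and its inputs are made up for illustration. The -1 value for a page on which no packet finishes follows RFC 3533.

```python
def granule_positions(pages, channels, bits_per_sample):
    """Return the suggested granule position for each page.

    `pages` is a list of pages; each page is a list of byte counts, one for
    every packet that finishes on that page (hypothetical input format).
    """
    bytes_per_frame = channels * bits_per_sample // 8
    total_frames = 0
    positions = []
    for packet_sizes in pages:
        if not packet_sizes:
            # RFC 3533: -1 means no packet finishes on this page.
            positions.append(-1)
            continue
        for size in packet_sizes:
            if size % bytes_per_frame:
                raise ValueError("packet does not hold a whole number of frames")
            total_frames += size // bytes_per_frame
        # Running total of frames completed up to the end of this page.
        positions.append(total_frames)
    return positions
```

With 16-bit stereo and 4096-byte packets, each packet adds 1024 frames, so a page carrying two packets followed by a page carrying one would get granule positions 2048 and 3072.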

By the way, I think it would be advantageous to specify the BOS page mapping as a bit pattern, as is common within the IETF. This helps identify any fields that cross byte boundaries and would therefore be hard to parse (I haven't seen any such fields in the Ogg PCM spec, but this format still makes for a nicer specification).

e.g.  the Ogg page header has the following format:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| capture_pattern: Magic number for page start "OggS"           | 0-3
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| version       | header_type   | granule_position              | 4-7
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               | 8-11
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                               | bitstream_serial_number       | 12-15
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                               | page_sequence_number          | 16-19
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                               | CRC_checksum                  | 20-23
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                               |page_segments  | segment_table | 24-27
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| ...                                                           | 28-
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Regards,

Silvia.


-----Original Message-----
From:	ogg-dev-bounces at xiph.org on behalf of Arc
Sent:	Wed 11/9/2005 8:10 PM
To:	ogg-dev at Xiph.org
Cc:	
Subject:	[ogg-dev] OggPCM (uncompressed Ogg audio)
Moved off OggYUV thread as this is off-topic for it..

On Wed, Nov 09, 2005 at 04:51:51PM +0800, illiminable wrote:
> >
> >Um, data packets consist of a 32-bit header followed by PCM data.  Where 
> >did you get the idea that "sync" data was in the packet?
> 
> That's what the first 32 bits of "header" are.

Um, no, the first 32 bits of the header are a packet type ID followed by the 
3-byte codec label.  What part of "\xFFPCM" could possibly be considered sync 
information?  

Seriously, I've gotten the strong impression you're arguing from the hip without 
actually reading any of the draft specs we're talking about.

Here's a link, since you've obviously not read it sufficiently:
 http://wiki.xiph.org/OggPCM


> >KISS - Keep It Simple, Stupid.  This is a raw data format.  Its value to
> >everyone is primarily how simple it is to implement.
> 
> I don't know whether to laugh or cry. Yes... it's a raw data format... and 
> look at the header you've given it.

Number of channels, bits per sample, samples per second, sample format.. exactly 
what in this would you consider extraneous?

The data packet header was given because with Ogg codecs you need a one-byte 
identifier to differentiate a data packet from a header packet, and the "PCM" is 
free because otherwise we'd be padding out to the next 32-bit word anyway.

Again, I don't know what you've been reading, or if you're seeing the same spec, 
how you could have possibly misunderstood the fairly plain data layout, but this 
is the smallest formatted PCM codec I've seen, especially compared to RIFF/WAVE.

I'm interested in hearing your specific criticisms on this.
 

-- 

The recognition of individual possibility,
 to allow each to be what she and he can be,
  rests inherently upon the availability of knowledge;
 The perpetuation of ignorance is the beginning of slavery.

from "Die Gedanken Sind Frei": Free Software and the Struggle for Free Thought
 by Eben Moglen, General Counsel of the Free Software Foundation
_______________________________________________
ogg-dev mailing list
ogg-dev at xiph.org
http://lists.xiph.org/mailman/listinfo/ogg-dev


