[vorbis-dev] clarifications on comments spec

Ralph Giles giles at xiph.org
Mon Jun 30 04:17:24 PDT 2003



On Sunday, June 29, 2003, at 02:37 am, Scott Wheeler wrote:

> I've been hacking on an audio metadata library (creatively called
> TagLib) which we'll be using in KDE in the next release ( == I need to
> get around to finishing it).  I've got the Vorbis comment
> implementation working, but just wanted to clarify a couple of things:

cool!

> *) The vendor ID -- "vendor" is ambiguous here.  Is this the "vendor"
> of the standard?  The encoder?  (Seems to be implied by the value
> mentioned in the spec.)  The tool writing the comment?  In my current
> implementation, I've assumed the second and as such don't modify the
> vendor ID, but I didn't know if I should stuff something in there
> mentioning TagLib.

This is meant to identify the 'encoder', so it's a little ambiguous when 
you're rewriting the file. Inasmuch as this is metadata describing 
the compressed content itself, I'd leave it alone if you're just 
editing the tags.
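
To make "leave it alone" concrete, here is a minimal sketch of a tag
rewrite that copies the vendor string through byte for byte. It assumes
the usual comment layout (32-bit little-endian lengths in front of the
vendor string and of each comment, plus a 32-bit comment count). The
function name replace_comments is made up for illustration and is not
part of any Xiph library.

```python
import struct


def replace_comments(body, new_comments):
    """Rebuild a comment body, copying the vendor string through untouched.

    `body` is the packet content after any codec-specific preamble.
    Bytes after the old comment list (such as the Vorbis framing bit)
    are carried over unchanged.
    """
    (vendor_len,) = struct.unpack_from("<I", body, 0)
    vendor_end = 4 + vendor_len

    # Walk the old comment list only to find where it ends.
    (count,) = struct.unpack_from("<I", body, vendor_end)
    pos = vendor_end + 4
    for _ in range(count):
        (length,) = struct.unpack_from("<I", body, pos)
        pos += 4 + length
    trailer = body[pos:]

    # Vendor length and vendor bytes are reused verbatim.
    out = [body[:vendor_end], struct.pack("<I", len(new_comments))]
    for text in new_comments:
        data = text.encode("utf-8")
        out.append(struct.pack("<I", len(data)) + data)
    out.append(trailer)
    return b"".join(out)
```

The result is generally a different length from the input, which is
why the rest of the stream has to be rewritten as well.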

> *) There seems to be no provision for "padding" (to use the ID3v2
> word).  Since the comment is very near to the beginning of a Vorbis
> stream, assuming a file based implementation, the overwhelming
> majority of the file's contents will have to be moved when the comment
> size changes.  Is this correct?

That's correct.

> *) To what extent is/will a Vorbis comment be an Ogg comment?  Will/do
> the other Ogg formats (FLAC and Speex come to mind) use the same
> format?  In my (C++) implementation I have separate classes for files
> and tags.  i.e. With mp3 there are two tag types -- id3v1 and id3v2.
> TagLib::VorbisFile and TagLib::Vorbis::Comment are independent; as
> such it would be possible for the comment to become something that
> could be mapped to multiple Ogg based file formats (which would
> extract the comment from the format's stream) -- i.e.
> TagLib::Ogg::Comment -- if appropriate.

Well, specification of the packet structure is *entirely* up to the 
codec, so I wouldn't use Ogg::Comment. However, we do by convention use 
this structure in our designs for speex, vorbis and theora (I'm not 
sure about Ogg FLAC), so maybe Xiph::Comment?

There is one hitch in that the various codecs have different preambles 
before the common decode. Vorbis starts with 0x03,'vorbis'; Theora 
begins 0x81,'theora'; Speex has no preamble at all and begins the 
packet directly with the vendor length. You'll have to handle this 
variation somehow.
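
One way to handle the variation is to strip a per-codec prefix and then
run a single shared parser. The sketch below assumes the packet has
already been reassembled from its Ogg pages and that the common part is
a 32-bit little-endian vendor length, the vendor string, a 32-bit
comment count, and then length-prefixed UTF-8 comments. The names
PREAMBLES and parse_comment_packet are invented for the example.

```python
import struct

# Preambles as listed above; Speex has none.
PREAMBLES = {
    "vorbis": b"\x03vorbis",
    "theora": b"\x81theora",
    "speex": b"",
}


def parse_comment_packet(packet, codec):
    """Return (vendor, comments) from one complete comment header packet."""
    preamble = PREAMBLES[codec]
    if not packet.startswith(preamble):
        raise ValueError("not a %s comment packet" % codec)
    pos = len(preamble)

    def read_string(pos):
        # A 32-bit little-endian length followed by that many UTF-8 bytes.
        (length,) = struct.unpack_from("<I", packet, pos)
        pos += 4
        if pos + length > len(packet):
            raise ValueError("truncated comment packet")
        return packet[pos:pos + length].decode("utf-8"), pos + length

    vendor, pos = read_string(pos)
    (count,) = struct.unpack_from("<I", packet, pos)
    pos += 4
    comments = []
    for _ in range(count):
        text, pos = read_string(pos)
        comments.append(text)
    # Anything left over (e.g. the Vorbis framing bit) is ignored here.
    return vendor, comments
```

Called as parse_comment_packet(packet, "speex"), the same code handles
the case with no preamble, since an empty prefix always matches.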

> And now for the obligatory comments section --
>
> *) First, I feel compelled to thank you guys for coming up with a
> scheme that's relatively (i.e. relative to ID3v2) easy to parse and
> render -- thanks.

You're certainly welcome!

> *) Presuming there's no scheme for "padding" it would be nice if some
> convention could be adopted.  This makes "tagging" much faster since
> in most cases it won't require rewriting the entire file.  This could
> be as simple as a standard comment field with an obvious name -- but I
> think I've come up with a better solution; more on that later.

It may help to think of ogg as a bitstream rather than a file format. 
That's really the point of view from which it was designed. This is 
actually true of mp3 as well, but the folks who designed the tagging 
system didn't appreciate that.

> *) There's no location to read the complete length of the header from.
> This makes parsing in a streaming situation, where data is being
> pushed rather than pulled, more complicated.

Aye, you have to buffer a bit. Look at icecast2 if you're curious. 
That's why we specify a page flush after the last header packet; you 
can just watch the granulepos and know when you've got the headers 
without parsing pages.
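
As a rough illustration of watching the granulepos, the sketch below
reads only the fixed 27-byte page header and the segment table, never
the packet contents. It assumes a single logical stream, that header
pages carry a granulepos of 0, and that the first data page carries a
nonzero one (typical for Vorbis, but check your codec). It does not
verify the page CRC. The function names are made up for the example.

```python
import struct

# Fixed part of an Ogg page header: capture pattern, version, flags,
# granulepos, serial number, page sequence number, CRC, segment count.
PAGE_HEADER = struct.Struct("<4sBBqIIIB")


def iter_pages(data):
    """Yield (offset, granulepos, body_length) for each whole page in data."""
    pos = 0
    while pos + PAGE_HEADER.size <= len(data):
        fields = PAGE_HEADER.unpack_from(data, pos)
        magic, granulepos, nsegs = fields[0], fields[3], fields[7]
        if magic != b"OggS":
            raise ValueError("lost page sync at offset %d" % pos)
        table_end = pos + PAGE_HEADER.size + nsegs
        if table_end > len(data):
            break  # segment table not fully buffered yet
        # The body length is the sum of the lacing values.
        body_len = sum(data[pos + PAGE_HEADER.size:table_end])
        if table_end + body_len > len(data):
            break  # page body not fully buffered yet
        yield pos, granulepos, body_len
        pos = table_end + body_len


def first_data_page_offset(data):
    """Offset of the first page with a nonzero granulepos, or None."""
    for offset, granulepos, _ in iter_pages(data):
        if granulepos != 0:
            return offset
    return None
```

Everything before the returned offset is header material. A result of
None means more data needs to be buffered.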

> *) Rerendering the full Ogg page(s) seems to be a requirement of the
> current scheme.  This isn't particularly difficult, but could be
> simplified.

Yes, but yours is really the only application where that makes any 
sense. The page mechanism or something like it is required to limit the 
overhead, so you have to be able to handle all of Ogg to deal with the 
actual data. As you say, it's not particularly difficult, and libogg is 
available if you want some help. :-)

> [...]
>
> If it was required, as is true with the Vorbis identification header,
> that future implementations put the comment in an Ogg stream page by
> itself this could solve the above issues.  It would then be possible
> to deduce the total size of the comment from the Ogg page header.
> Though I haven't tried it yet, I also presume that empty space at the
> end of an Ogg page would simply be ignored, thus providing a mechanism
> for padding.

/page/packet/ here. It usually takes a while to grok the two levels.

The spec doesn't say either way about extra data in the header packet, 
but presumably a good decoder would handle that. The reason we won't 
put it on a page by itself is that doing so limits the length to 64k. 
Is that enough for everyone? That will also be a significant fraction 
of a low-bitrate file if you always use the full length for padding.
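
For the arithmetic behind that limit: a page's segment table holds at
most 255 lacing values of at most 255 each, so a page body tops out at
255 * 255 = 65025 bytes, and the largest packet that both starts and
ends on one page is 65024 bytes (a packet whose length is an exact
multiple of 255 needs one extra terminating lacing value of 0). A small
sketch, with helper names invented for the example and 16 kbit/s chosen
only as a sample low bitrate:

```python
MAX_SEGMENTS = 255       # lacing values in one page's segment table
MAX_SEGMENT_BYTES = 255  # largest value a single lacing byte can hold


def lacing_values(packet_len):
    """Segment sizes for one packet: 255s, then a final value below 255."""
    full, rest = divmod(packet_len, MAX_SEGMENT_BYTES)
    return [MAX_SEGMENT_BYTES] * full + [rest]


def fits_on_one_page(packet_len):
    """True if the whole packet can start and end on a single page."""
    return len(lacing_values(packet_len)) <= MAX_SEGMENTS


def seconds_of_audio(num_bytes, bitrate_bps):
    """How much audio the same number of bytes would hold at this bitrate."""
    return num_bytes * 8 / bitrate_bps


print(fits_on_one_page(65024), fits_on_one_page(65025))
print(seconds_of_audio(65024, 16000))
```

At the sample bitrate, a comment packet always padded out to the
single-page maximum costs as many bytes as roughly half a minute of
audio.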

We have considered ideas of this sort in the past, particularly when we 
wrote our own example tag editor. So you're not the first to have to 
deal with this stuff. Generally our conclusion has been that it's not 
worth adjusting the spec for the convenience of only that application.

My thoughts, FWIW,
  -r
