[Vorbis-dev] Cover art

Ian Malone ibmalone at gmail.com
Wed Apr 1 11:00:10 PDT 2009

2009/4/1 Ian Malone <ibmalone at gmail.com>:
> 2009/4/1 Mathias Kunter <mathiaskunter at yahoo.de>:
>>> arbitrary binary content is most likely not a valid UTF-8 sequence and may
>>> cause current software to fail.

To be honest, software should be robust against that for security
reasons if nothing else, but we should keep to the spec.

>>> the UTF-8 encoding will add an overhead of 50%, while the Base64
>>> encoding adds an overhead of 33%. [...] as is the case with e.g. WinAmp, not only is the
>>> BINARY_COVERART not shown in the file info dialog, but it's removed from
>>> the file if other comments are edited with WinAmp.

>> So can we say that the best suggestion for including cover art within ogg vorbis files in terms of compatibility is to put the FLAC picture block as base64-encoded text comment into the file?
> So far as I can see, base-64 coding will be invariant under UTF-8?  In
> which case it won't break anything to use it for this and it has the
> nice property that the decoding length is predictable.

So what we have is base64 encode FLAC block, place in contents of

On the wiki the points against the Audio Shell etc. COVERART approach were:
    * no additional information like a description about the cover
art is provided
    * the base64 string is displayed within many tag editors as plain
text because of their missing support for this "COVERART" field
    * it may break playback on hardware players because of a
large vorbis comment header
    * the cover art can't be linked
On the other hand it is at least supported in a few places.

The BINARY_COVERART proposal (with initial nul) had:
    * Easy to use for developers since the identical (or similar)
structure is also used by FLAC and MP3, which means that chances are
good that people and software programmers are willing to support this.
    * Old C / C++ based implementations don't display the binary data
as string since it always starts with a zero byte at the first
position, which is an empty string when interpreted as UTF-8.
    * The cover art can either be linked or embedded within the stream.
    * All common picture file formats are supported (jpg, gif, whatever).
    * Additional information like a description or the picture type
(front cover, back cover...) is supported.

If we change to base64 encoded we lose the advantages that the initial
nul was supposed to gain (hiding binary data), but retain the
advantages using the FLAC block provides (similarity to FLAC, MP3).
Is the possibility of linking the art provided by the FLAC structure
(I can't see it in the format) or was it supposed to be an alternative
use for the COVERART tag?

Compared with storing the binary data as UTF-8, Base64 has a size
advantage, predictable encode-decode lengths, and '\0' will not occur.
It's possible that the '\0' in the stream is what's provoking the
Winamp bug.  However, having encoded binary data appear when reading
comments (e.g. on the command line) is very annoying.  So we could either:
a. Insert a '\0' before the encoded data (at the start of the comment)
to preserve this hiding behaviour.  Readers expecting the data could
skip past this byte.  It may result in mangling by poorly designed
comment editors.
b. Modify the Xiph tools that display comments to hide these tags
unless told otherwise.
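
To make the Base64 properties mentioned above concrete, here is a
rough Python sketch.  The function name b64_properties and the test
payload are mine, purely for illustration; nothing here comes from
the Xiph tools.

```python
import base64
import math


def b64_properties(payload: bytes) -> dict:
    """Check the Base64 properties discussed above for one payload."""
    encoded = base64.b64encode(payload)
    return {
        # Every output byte is 7-bit ASCII, so it is already valid UTF-8.
        "ascii_only": all(b < 0x80 for b in encoded),
        # The Base64 alphabet contains no nul byte.
        "has_nul": b"\x00" in encoded,
        # Decoding as ASCII and re-encoding as UTF-8 changes nothing.
        "utf8_invariant": encoded.decode("ascii").encode("utf-8") == encoded,
        "encoded_len": len(encoded),
        # With padding, every 3 input bytes become 4 output characters.
        "predicted_len": 4 * math.ceil(len(payload) / 3),
    }


# Payload that deliberately contains nul and high bytes.
props = b64_properties(bytes(range(256)) * 12)
print(props)
```

For the 3072-byte payload the encoded length is 4096 bytes, which is
the 33% overhead quoted earlier in the thread.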

Further comment on case a: we still need to be able to get this info
out using the standard command line tools, so if we go for a. then
vorbiscomment etc. would benefit from an option to bypass nul bytes
(for example, replace them with spaces, this can be done as the length
of the comment is known).
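
As a sketch of what that display option could do (this is not how
vorbiscomment works today; the function names and the sample comment
are assumptions of mine), compare a nul-terminated view of a comment
with one that replaces nul bytes by spaces:

```python
def c_string_view(raw: bytes) -> str:
    """What a nul-terminated-string reader would show: text up to the first nul."""
    return raw.split(b"\x00", 1)[0].decode("utf-8", errors="replace")


def printable_comment(raw: bytes) -> str:
    """Show the whole comment, with nul bytes replaced by spaces.

    This relies on the comment length being known from the header,
    so the data after the nul is still available to the tool.
    """
    return raw.replace(b"\x00", b" ").decode("utf-8", errors="replace")


# Hypothetical comment: field name, '=', a leading nul, then Base64 text.
comment = b"COVERART=\x00AAAA"
print(c_string_view(comment))
print(printable_comment(comment))
```

The first view hides the encoded data, the second exposes it for
anyone who asks for it explicitly.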

To me, keeping the FLAC block and inserting a nul before the encoded
data seems like the best option. It helps to prevent the coded
data showing up where not wanted (although it does not always do so,
particularly on platforms which do not use nul-terminated strings).
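
A minimal sketch of that option in Python, assuming the picture block
layout from the FLAC format (big-endian 32-bit fields: picture type,
MIME type, description, width, height, depth, colour count, data).
The helper names are hypothetical and the "picture" is three dummy
bytes, not a real image.

```python
import base64
import struct


def build_picture_block(picture_type, mime, description,
                        width, height, depth, colors, data):
    """Pack the fields in FLAC picture block order, all big-endian."""
    mime_b = mime.encode("ascii")
    desc_b = description.encode("utf-8")
    return b"".join([
        struct.pack(">I", picture_type),
        struct.pack(">I", len(mime_b)), mime_b,
        struct.pack(">I", len(desc_b)), desc_b,
        struct.pack(">IIII", width, height, depth, colors),
        struct.pack(">I", len(data)), data,
    ])


def encode_value(block: bytes) -> bytes:
    """Comment value as proposed here: a leading nul, then Base64 text."""
    return b"\x00" + base64.b64encode(block)


def decode_value(value: bytes) -> bytes:
    """Reader side: skip the leading nul if present, then decode."""
    if value[:1] == b"\x00":
        value = value[1:]
    return base64.b64decode(value)


# Type 3 is "front cover" in the FLAC / ID3v2 picture type list.
block = build_picture_block(3, "image/jpeg", "front", 0, 0, 0, 0,
                            b"\xff\xd8\xff")
value = encode_value(block)
print(len(block), len(value), decode_value(value) == block)
```

A reader that tolerates a missing nul, as decode_value does, would
also cope with files written without it.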

Any further suggestions?

We should probably refer to http://tools.ietf.org/html/rfc4648#page-5.

