[vorbis] Win32 multiple file tag editor?
Beni Cherniavsky
cben at techunix.technion.ac.il
Thu Dec 27 03:48:09 PST 2001
On 2001-12-26, Glenn Maynard wrote:
> On Wed, Dec 26, 2001 at 05:37:34PM -0600, Gregg Mattingly wrote:
> > reads IDv2/3 tag's on my ogg's( I attach the tags to my wave file's when I
>
> Err. ID3v2 on Ogg? Guess you can do it, but ...
>
I think, from the end of the sentence, that he meant ID3v2/3 tags on his
MP3s. But what is ID3v3??? Or was ID3v1/2 meant?
> > rip 'em from the cd before I encode to ogg) and puts the tag info into a
> > mySQL database. What I would like to do is also have the ogg tags available
> > with ogg file info as well.
>
> I'd like to see a way to put Ogg tags on MP3s in some form, so I can
> have a single tag set. (Of course, id3v2 offers a lot that ogg tags
> currently don't, but that can be added later, separately. Little of
> that is used, since neither editors nor players tend to support them. I
> don't know if this is a simple chicken-and-egg problem or if people
> wouldn't use it if it was there. I would use untimed lyrics and might
> use loosely timed lyrics once in a while--another thing that could use
> multiple language support, but that's a bit more difficult.)
>
[Ideas go from the ones I liked the least to the best one IMO in the end.
You can just skip the first ones.]
The simplest way is just adding some undefined experimental frame with the
exact content of the Vorbis comment header. Call it something like XVTG
(Vorbis TaG) and you are done. Or maybe use a General Encapsulation
Frame. How about attaching the ogg version of the file in this frame ;-)?
The only problem is that nobody will see it in their players. I
understand that you try to put it in some readable field.
The TXXX frames can do it (almost) straightforwardly:
From id3v2.4.0-frames.txt:
| 4.2.6. User defined text information frame
|
| This frame is intended for one-string text information concerning the
| audio file in a similar way to the other "T"-frames. The frame body
| consists of a description of the string, represented as a terminated
| string, followed by the actual string. There may be more than one
| "TXXX" frame in each tag, but only one with the same description.
|
|   <Header for 'User defined text information frame', ID: "TXXX">
|   Text encoding     $xx
|   Description       <text string according to encoding> $00 (00)
|   Value             <text string according to encoding>
If not for the "only one with the same description" restriction, one could
just put one Vorbis tag per such frame. UTF-8 is one of the possible
encodings:
From id3v2.4.0-structure.txt:
| $03 UTF-8 [UTF-8] encoded Unicode [UNICODE]. Terminated with $00.
However, this can't represent Vorbis comments, which can contain zero
bytes. Worse yet, the phrase <text string according to encoding> is defined
in id3v2.4.0-structure.txt to mean text without newlines!
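To make the two restrictions concrete, here is a rough Python sketch that
builds only the body of a TXXX frame as laid out in the excerpt above (no
frame header, no unsynchronisation). The function name txxx_body is made up
for illustration, and refusing bad input with an exception is my own
assumption, not something the spec prescribes.

```python
def txxx_body(description: str, value: str) -> bytes:
    """Build a TXXX frame body (without the frame header) using UTF-8.

    Hypothetical helper for illustration only.
    """
    for label, text in (("description", description), ("value", value)):
        # A zero byte would be read as the string terminator.
        if "\0" in text:
            raise ValueError(f"{label} contains a zero character")
        # Plain text strings in ID3v2.4 may not contain newlines.
        if "\n" in text or "\r" in text:
            raise ValueError(f"{label} contains a newline")
    # $03 selects UTF-8; the description is terminated with a single $00.
    return b"\x03" + description.encode("utf-8") + b"\x00" + value.encode("utf-8")
```

A multi-line or zero-containing Vorbis comment is rejected here, which is
exactly why TXXX is only "almost" good enough.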
Unsynchronised lyrics/text (USLT) is indeed a better choice since it allows
newlines - but it still doesn't allow many tags with the same descriptor.
One approach could be mangling the content descriptor to include a number:
PERFORMER=1\0Foo Bar
PERFORMER=2\0Baz Quux
(`=' is one of only two printable ASCII characters that can't appear in
Vorbis tag names; the other is `~'.) It's quite ugly, whichever way you do
it.
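The numbering scheme could be sketched in Python roughly like this. Both
function names are hypothetical, and the comments are assumed to arrive as
(name, value) pairs in file order.

```python
from collections import Counter


def mangle_descriptors(comments):
    """Give every (name, value) pair a unique descriptor of the form NAME=N."""
    seen = Counter()
    out = []
    for name, value in comments:
        seen[name] += 1
        out.append((f"{name}={seen[name]}", value))
    return out


def unmangle_descriptor(descriptor):
    """Recover the Vorbis tag name by dropping the trailing =N counter."""
    name, _sep, _index = descriptor.rpartition("=")
    return name
```

Since `=' can't occur in a tag name, splitting at the last `=' always
recovers the original name.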
I still have some doubts, though, that many programs can display multiple
lyrics frames, if any at all ;-/. Besides, it's not quite lyrics. I think
the COMM (Comments) frame is more appropriate and more likely to be
displayed.
From id3v2.4.0-frames.txt:
| 4.10. Comments
|
| This frame is intended for any kind of full text information that
| does not fit in any other frame. It consists of a frame header
| followed by encoding, language and content descriptors and is ended
| with the actual comment as a text string. Newline characters are
| allowed in the comment text string. There may be more than one
| comment frame in each tag, but only one with the same language and
| content descriptor.
|
|   <Header for 'Comment', ID: "COMM">
|   Text encoding          $xx
|   Language               $xx xx xx
|   Short content descrip. <text string according to encoding> $00 (00)
|   The actual text        <full text string according to encoding>
I haven't seen any programs that show multiple comments either, though.
Even fewer :) can edit them. And I have seen none (a trend is visible in
ID3v2 support, isn't it?) that do anything with content descriptors, such
as showing them or letting you choose them when editing...
Maybe the best way is to encode all data in a single comments frame in a
simple text format? For example:
TITLE=Here goes some UTF-8 text. It
  can continue on the many lines. It can contain zeroes
  (why\0not\0?).
TITLE=And here goes another one... (not quite titles but who cares)
PERFORMER=Foo Bar
  TITLE= <--- that's part of his name :-)
PERFORMER=Baz Quux
A `:' would be better than `=' for readability. But only `=' and newline
can mark the end of a Vorbis comment name well. Also, the first line looks
different from the rest, which is slightly ugly. Maybe this:
TITLE:
  Tagging example.
TITLE:
  An example whose drawback is that it wastes a line,
  which is bad.
OTOH, a `=' is a better marker that these are vorbis comments. A person
seeing `=' will understand (hopefully) that some well-defined format is
being used.
Or this:
TITLE=An example that is better.
      The problem with it is that the indentation changes.
PERFORMER=Foo Bar (see?)
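Writing this last variant out is easy enough. A minimal Python sketch,
assuming the comments come as (name, value) pairs and that continuation
lines are padded with spaces up to the column after the `=' (the function
name serialize_comments is made up):

```python
def serialize_comments(comments):
    """Render (name, value) pairs as NAME=value text for one comment frame.

    Continuation lines are indented to line up after the '=' sign, so the
    indentation depends on the length of the tag name.
    """
    lines = []
    for name, value in comments:
        indent = " " * (len(name) + 1)
        first, *rest = value.split("\n")
        lines.append(f"{name}={first}")
        lines.extend(indent + line for line in rest)
    return "\n".join(lines)
```

Note that a value containing `=' needs no escaping on output, because only
an unindented line can start a new comment.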
The challenge is to design a format that is well defined, yet likely to
stay intact when someone edits it without knowing the rules. What should
the program do when reading broken fields? A good parsing algorithm is
probably to scan all lines after the first, find the smallest common
indentation and remove it from these lines. But what about representing:
TITLE=First line is supposed to not be indented.
        But these lines *are* supposed
        to be indented.
Maybe also allow the `:' form as an alternative syntax for such rare cases:
TITLE:
  First line is supposed to not be indented.
    But these lines *are* supposed
    to be indented.
Using `=' would mean there is an empty first line.
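Here is how the parsing rule might look in Python, covering both the `='
and the `:' syntax. This is only a sketch under my own assumptions: the
name parse_comments is made up, an unindented line starts a new comment,
and a broken line simply raises an error (the open question of what to do
with broken fields is not settled by this).

```python
def parse_comments(text):
    """Parse the single-frame text format into a list of (name, value) pairs.

    Assumes the text has no trailing newline; an empty line counts as part
    of the current value.
    """
    fields = []  # each entry: [name, first line or None, continuation lines]
    for line in text.split("\n"):
        if line == "" or line[0] in " \t":
            if not fields:
                raise ValueError("continuation line before any field")
            fields[-1][2].append(line)
        elif "=" in line:
            name, first = line.split("=", 1)
            fields.append([name, first, []])
        elif line.endswith(":"):
            # NAME: form, where the value starts on the next line.
            fields.append([line[:-1], None, []])
        else:
            raise ValueError(f"unrecognised line: {line!r}")

    result = []
    for name, first, rest in fields:
        # Smallest common indentation over the non-blank continuation lines.
        indents = [len(l) - len(l.lstrip(" \t")) for l in rest if l.strip()]
        common = min(indents, default=0)
        body = [l[common:] for l in rest]
        if first is not None:
            body.insert(0, first)
        result.append((name, "\n".join(body)))
    return result
```

With the `=' form the extra indentation of the later lines is lost, since
it is all common indentation. With the `:' form the first line takes part
in the minimum, so the relative indentation survives.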
--
Beni Cherniavsky <cben at tx.technion.ac.il>
(also scben at t2 in Technion)
I like Common Lisp* more than Common Source
and Open Source more than Open Collector.
(*)Scheme is better than any scheme.