[Vorbis-dev] padding in comment header / implementation of re-commenting code

Daniel Holth dholth at fastmail.fm
Sun Feb 5 15:48:32 PST 2006


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Ian Malone wrote:

> By sheer coincidence[1] the Speex comment header is identical to
> the Vorbis one and the Theora comment header is very similar (as
> in: the documentation suggests there is a difference but I haven't
> checked exactly what it is yet).
>
> [1] No, I suppose not.

The comment headers are all slightly different. I figured it out
last December with some Python code that uses oggz to detach and
re-attach edited comments. The same thing could be written in C,
with greater difficulty.

# Vorbis comment header begins with \x03vorbis and ends with \x01.
# Theora comment header begins with \x81theora and has no leftover byte.
# Speex comment header begins with (len)vendor string, no preamble.
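
A minimal sketch of telling the three framings apart, based only on the
three rules above. It is written for Python 3 byte strings; the names
KNOWN_PREAMBLES and split_preamble are made up for illustration and are
not part of oggzcomment. Treating every packet without a known preamble
as Speex is an assumption.

```python
# Illustrative only: KNOWN_PREAMBLES and split_preamble are hypothetical
# names, not taken from oggzcomment.
KNOWN_PREAMBLES = {
    b"\x03vorbis": "vorbis",
    b"\x81theora": "theora",
}


def split_preamble(packet):
    """Return (codec, preamble, body) for a raw comment packet."""
    for preamble, codec in KNOWN_PREAMBLES.items():
        if packet.startswith(preamble):
            return codec, preamble, packet[len(preamble):]
    # Assumption: a packet with no known preamble is a Speex comment
    # packet, which starts directly with the vendor-length field.
    return "speex", b"", packet
```

The body returned here starts with the little-endian vendor length in
all three cases, so one parser can handle the rest of the packet.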

http://dingoskidneys.com/~dholth/oggzcomment-0.0.1.tar.gz

The code reads and writes comments from streams with any
combination of Vorbis, Theora, and Speex comment headers. It may
serve as documentation for someone else.

Not sure what my code thinks about padding :-)
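
On the padding question, one possible approach (a sketch, not a
statement of what any particular tool does) is to read exactly as many
comments as the count field announces and to carry everything after
them along untouched. For Vorbis that trailer would hold the \x01
framing byte plus any padding. The function names below are made up
for illustration, and the code is Python 3.

```python
import struct


def _read_field(buf):
    """Read one <uint32 little-endian length><bytes> field.

    Returns (value, remaining bytes).
    """
    (length,) = struct.unpack("<I", buf[:4])
    return buf[4:4 + length], buf[4 + length:]


def parse_comment_body(body):
    """Split a comment body (codec preamble already removed) into
    (vendor, comments, trailer)."""
    vendor, rest = _read_field(body)
    (count,) = struct.unpack("<I", rest[:4])
    rest = rest[4:]
    comments = []
    # Trust the count field rather than reading until the data runs out,
    # so trailing padding is never mistaken for zero-length comments.
    for _ in range(count):
        comment, rest = _read_field(rest)
        comments.append(comment)
    return vendor, comments, rest


def build_comment_body(vendor, comments, trailer=b""):
    """Inverse of parse_comment_body; the trailer is appended verbatim."""
    out = struct.pack("<I", len(vendor)) + vendor
    out += struct.pack("<I", len(comments))
    for comment in comments:
        out += struct.pack("<I", len(comment)) + comment
    return out + trailer
```

Appending a comment to the parsed list and rebuilding leaves the
trailer byte-for-byte unchanged. Whether the padding should instead
shrink to absorb the new comment is a separate design choice that this
sketch does not address.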

- - Daniel Holth
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (Darwin)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFD5o7QVh4W2pVfoMsRAqEvAKDndTkjcTHSLtynBm+JCiZ4RwIHwwCfQSq7
Y68GD+Pm3HXO01nHvjzrWiA=
=N+TH
-----END PGP SIGNATURE-----

-------------- next part --------------
#!/usr/bin/python
#
# Parse Vorbis/Theora/Speex comments from a comments-only file made
# with oggzdc.  Note: the stripped comment file is partly in native byte
# order, so it may not work on a different kind of machine.

# Copyright (C) 2005 Daniel Holth <dholth at fastmail.fm>
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.

import struct
import sys

# Vorbis comment header begins with \x03vorbis and ends with \x01.
# Theora comment header begins with \x81theora and has no leftover byte.
# Speex comment header begins with (len)vendor string, no preamble.

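class VorbisComment(object):
    """One comment header packet (Vorbis, Theora or Speex framing).

    preamble and postamble hold the codec-specific bytes around the
    vendor string and comment list, so pack() can reproduce them.
    """
    def __init__(self, serialno, granulepos, packetno, data = None):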
        self.serialno = serialno
        self.granulepos = granulepos
        self.packetno = packetno
        self.preamble = ""
        self.vendor = ""
        self.comments = []
        self.postamble = ""

        if data:
            self.unpack(data)


    def __str__(self):
        rep = ""
        rep += "s %d, g %d, p %d\n" % (self.serialno, self.granulepos, self.packetno)
        rep += self.vendor + "\n"
        for c in self.comments:
            rep += c.strip() + "\n"

        return rep


    def pack(self):
        packet = self.preamble
        packet += struct.pack("<I", len(self.vendor))
        packet += self.vendor
        packet += struct.pack("<I", len(self.comments))
        for c in self.comments:
            packet += struct.pack("<I", len(c))
            packet += c
        packet += self.postamble
        return packet


    def unpack(self, cmt):
        for preamble in ("\x03vorbis", "\x81theora"):
            if cmt.startswith(preamble):
                self.preamble = preamble
                cmt = cmt[len(preamble):]
                break

        vendor_length = struct.unpack("<I", cmt[:4])[0]
        cmt = cmt[4:]
        vendor = cmt[:vendor_length]
        self.vendor = vendor
        cmt = cmt[vendor_length:]

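        count = struct.unpack("<I", cmt[:4])[0]
        cmt = cmt[4:]

        # Read exactly `count` comments.  Whatever follows them (the
        # Vorbis framing byte, any padding) ends up in self.postamble
        # instead of being misread as extra zero-length comments.
        for _ in range(count):
            if len(cmt) < 4:
                break  # truncated packet; keep what was read so far
            length = struct.unpack("<I", cmt[:4])[0]
            cmt = cmt[4:]
            note = cmt[:length]
            cmt = cmt[length:]
            self.comments.append(note)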

        self.postamble = cmt


if __name__ == "__main__":
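    # Per-packet record header: serialno, granulepos, packetno, byte count.
    # "=" keeps native byte order but disables alignment padding, so the
    # header is always 24 bytes (plain "iqqi" can be 28 on 64-bit systems).
    # Assumes oggzdc wrote the four fields back to back.
    record = "=iqqi"
    record_size = struct.calcsize(record)

    # Binary mode, so the packet data is not altered on any platform.
    infile = open(sys.argv[1], "rb")
    cmt = infile.read()
    infile.close()
    out = open("newcomment.cmt", "wb")  # Comments with new comment added

    comment_headers = []

    while cmt:
        serialno, granulepos, packetno, nbytes = struct.unpack(
            record, cmt[:record_size])
        cmt = cmt[record_size:]
        single = cmt[:nbytes]
        decoded = VorbisComment(serialno, granulepos, packetno, single)
        comment_headers.append(decoded)
        cmt = cmt[nbytes:]

    for decoded in comment_headers:
        # Add our custom comment 1000 times
        for i in range(1000):
            decoded.comments.append("COMMENTED=oggzdc")

        print(decoded)

        packed = decoded.pack()
        out.write(struct.pack(record, decoded.serialno,
                                      decoded.granulepos,
                                      decoded.packetno, len(packed)))
        out.write(packed)

    out.close()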

