[ogg-dev] Fixing ogg vorbis corruption caused by bad metadata

Adam Rosi-Kessel adam at rosi-kessel.org
Wed Jul 15 11:34:41 PDT 2009


Adam Rosi-Kessel wrote, on 7/14/2009 2:12 PM:
>> Cool that you've gotten them fixed!
> Yes, and I take it you are the rogg author, so thanks for saving my
> 1000+ files! For posterity, I will post my ultimate unified solution
> here -- I think it should be almost entirely automated.

So I've written a script to do the following:

(1) Set the serial numbers on all packets for the "good" ogg and the 
"bad" ogg to the same value (shelling out to rogg_serial)

(2) Scrape strings from the "bad" ogg to extract the metadata such as 
artist, album, genre, etc. This metadata appears to be intact in all of 
my "bad" oggs.

(3) Extract all packets with granulepos 0 from the "good" ogg

(4) Extract all packets with granulepos >0 from the "bad" ogg

(5) Concatenate the results of 3 & 4

(6) Put the correct metadata back in with vorbiscomment
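
For illustration, steps 3 to 5 could be sketched in Python roughly as 
follows. This is not the actual script, only a minimal sketch: the names 
iter_pages and splice are made up for the example, it assumes both files 
fit in memory and already share a serial number (step 1), and it works on 
whole Ogg pages, since granulepos is stored in the page header.

```python
import struct

def iter_pages(data):
    """Yield (granulepos, raw_page_bytes) for each Ogg page in data."""
    pos = 0
    while pos + 27 <= len(data):
        if data[pos:pos + 4] != b"OggS":
            raise ValueError(f"no OggS capture pattern at offset {pos}")
        # The 64-bit granule position sits at byte 6 of the 27-byte page header.
        (granulepos,) = struct.unpack_from("<q", data, pos + 6)
        nsegs = data[pos + 26]
        seg_table = data[pos + 27:pos + 27 + nsegs]
        # The page body length is the sum of the lacing values.
        end = pos + 27 + nsegs + sum(seg_table)
        yield granulepos, data[pos:end]
        pos = end

def splice(good, bad):
    """Header pages (granulepos 0) from good, audio pages (>0) from bad."""
    headers = b"".join(p for g, p in iter_pages(good) if g == 0)
    # Note: a page on which no packet ends has granulepos -1 and is
    # dropped by this rule, exactly as in the steps described above.
    audio = b"".join(p for g, p in iter_pages(bad) if g > 0)
    return headers + audio
```

Each page carries its own CRC, so copying pages unmodified keeps the 
checksums valid; only the serial numbers have to agree beforehand.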

This appears to work flawlessly with some files. For others, although 
the output is a valid ogg, the sound is scrambled. I'm assuming this is 
because I need to swap in a different "good" header for those oggs, 
presumably because they were, e.g., ripped with different settings.

Any ideas on how to find the proper "good" header to go with each 
corrupted ogg? My first idea was to grep for the ripper/codec 
identification string (e.g., "Xiph.Org libVorbis I 20050304"), which 
sometimes works but not always. In other words, there are some bad files 
with "Xiph.Org libVorbis I 20050304" where a good file with that string 
in the header still generates distorted sound.

Anything I could drill down to on the byte level in the headers to 
automate the matching process? The container documentation is pretty 
clear, but I'm not sure what I should be looking for at the codec level 
to try to make the match.
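
One candidate at the codec level is the Vorbis identification header, 
which starts with the byte 0x01 followed by "vorbis" and then holds the 
version, channel count, sample rate, three bitrate fields and the two 
block sizes. A rough Python sketch for pulling those fields out so they 
can be compared between files (parse_ident and VorbisIdent are made-up 
names, and it assumes that packet is still readable in the file being 
examined, which may not hold for the corrupted ones):

```python
import struct
from collections import namedtuple

VorbisIdent = namedtuple(
    "VorbisIdent",
    "version channels sample_rate bitrate_max bitrate_nominal "
    "bitrate_min blocksize_0 blocksize_1",
)

def parse_ident(data):
    """Return the identification header fields, or None if not found."""
    start = data.find(b"\x01vorbis")
    if start < 0 or start + 7 + 22 > len(data):
        return None
    # u32 version, u8 channels, u32 sample rate, three i32 bitrates,
    # then one byte holding both block size exponents.
    fields = struct.unpack_from("<IBIiiiB", data, start + 7)
    exponents = fields[6]
    return VorbisIdent(*fields[:6],
                       1 << (exponents & 0x0F),   # low nibble: blocksize_0
                       1 << (exponents >> 4))     # high nibble: blocksize_1
```

Matching on these fields is at best a necessary condition. The codebooks 
that the audio packets actually depend on live in the setup header 
(packet type 5), so two files can agree on everything above and still 
not be interchangeable.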

Adam

