[vorbis-dev] using vorbis for finding structure in music

Kasper Souren Kasper.Souren at ircam.fr
Sat Jan 25 05:45:24 PST 2003



Hi,

The topic of my PhD is finding structure in a musical piece, starting from 
audio. The idea is to use audio descriptors to find repetition and maybe 
also transformations. Currently I am using descriptors that can be described 
as a small subset of the FFT of the FFT, which gives me a fairly 
low-dimensional vector for each one-second window of sound, with a step size 
of 0.1 second. I am also thinking of using MFCCs, but it might be more 
interesting to start from the Vorbis data and process that directly, maybe 
applying an FFT to it in order to reduce the amount of data.
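
As a rough illustration of the kind of descriptor described above (a sketch 
only, not the actual implementation), the following Python takes one-second 
windows with a 0.1 second step, computes the magnitude spectrum of each 
window, transforms that spectrum again, and keeps the first few magnitudes. 
The function names, the choice of the first n_coeffs values as the "small 
subset", and the naive DFT (used to avoid external libraries) are all 
assumptions made for the example.

```python
import cmath
import math


def dft_magnitudes(values):
    """Return the magnitudes of a naive O(N^2) DFT of a real sequence."""
    n = len(values)
    mags = []
    for k in range(n):
        acc = 0j
        for t, v in enumerate(values):
            acc += v * cmath.exp(-2j * cmath.pi * k * t / n)
        mags.append(abs(acc))
    return mags


def fft_of_fft_descriptors(samples, sample_rate, window_s=1.0, step_s=0.1,
                           n_coeffs=8):
    """Return one low-dimensional vector per window position.

    Assumption: the "small subset" is simply the first n_coeffs magnitudes
    of the second transform; the original post does not say which are kept.
    """
    window = int(round(window_s * sample_rate))
    step = int(round(step_s * sample_rate))
    if window <= 0 or step <= 0:
        raise ValueError("window and step must cover at least one sample")

    descriptors = []
    for start in range(0, len(samples) - window + 1, step):
        frame = samples[start:start + window]
        # First transform: keep only the non-negative frequencies.
        spectrum = dft_magnitudes(frame)[: window // 2 + 1]
        # Second transform: the "FFT of the FFT".
        second = dft_magnitudes(spectrum)
        descriptors.append(second[:n_coeffs])
    return descriptors


if __name__ == "__main__":
    # Toy signal: two seconds of a 10 Hz sine at a (deliberately tiny)
    # 40 Hz sample rate, so the naive DFT stays fast.
    rate = 40
    signal = [math.sin(2 * math.pi * 10 * t / rate) for t in range(2 * rate)]
    vectors = fft_of_fft_descriptors(signal, rate)
    print(len(vectors), "vectors of length", len(vectors[0]))
```

For real audio the naive DFT would be far too slow and would normally be 
replaced by a proper FFT routine; the windowing logic stays the same.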

One advantage is that all the sound files I want to process can be stored as 
.ogg, and the decoding into raw PCM audio can be skipped.

Another advantage could be that the signal has already been stripped of the 
information that is irrelevant to human hearing. This could also turn out to 
be a drawback, however, since sounds that are supposedly masked might be 
heard anyway, and thus might also contribute to the cognition of structure. 
(Maybe I should think of an example for this, to make myself clearer.)

So now I am studying the Vorbis specs. I am looking for a simple overview 
(preferably graphical!) to make things clearer: a bit less technical than 
the files in libvorbis-1.0/doc/, but more technical than most web articles 
about Ogg Vorbis.
I searched http://citeseer.nj.nec.com/cs (which is excellent for finding 
scientific papers) for "ogg" and "vorbis", but unfortunately I found nothing.

bye,
Kasper