[vorbis-dev] using vorbis for finding structure in music
Kasper Souren
Kasper.Souren at ircam.fr
Sat Jan 25 05:45:24 PST 2003
Hi,
The topic of my PhD is finding structure in a musical piece, starting from
audio. The idea is to use audio descriptors in order to find repetition and
maybe also transformations. Currently I am using descriptors that can be
described as a small subset of the FFT of the FFT, which gives me a
fairly low-dimensional vector for each second of sound, with a step size of
0.1 second. I am also thinking of using MFCCs, but it might be more
interesting to start from the Vorbis data and process that instead, perhaps
by taking an FFT of it in order to reduce the amount of data.
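
To make the "FFT of the FFT" descriptor concrete, here is a rough sketch in Python with NumPy. It is only an illustration under assumptions, not my actual code: the function name, the Hann window, the use of magnitudes, and the choice of keeping the first 16 coefficients are all hypothetical. Only the one-second window and the 0.1 second step come from the description above.

```python
import numpy as np


def fft_of_fft_descriptors(signal, sample_rate, window_s=1.0, step_s=0.1,
                           n_coeffs=16):
    """Return one descriptor vector per analysis window (hypothetical helper).

    window_s and step_s follow the text (1 s window, 0.1 s step);
    n_coeffs and the Hann taper are assumptions made for this sketch.
    """
    win = int(round(window_s * sample_rate))
    hop = int(round(step_s * sample_rate))
    taper = np.hanning(win)
    frames = []
    for start in range(0, len(signal) - win + 1, hop):
        frame = signal[start:start + win] * taper
        # First FFT: magnitude spectrum of the one-second window.
        spectrum = np.abs(np.fft.rfft(frame))
        # Second FFT: magnitude of the transform of that spectrum.
        second = np.abs(np.fft.rfft(spectrum))
        # Keep a small subset of coefficients (which subset is an assumption).
        frames.append(second[:n_coeffs])
    return np.array(frames)


# Usage: three seconds of a 440 Hz sine sampled at 8 kHz.
sample_rate = 8000
t = np.arange(3 * sample_rate) / sample_rate
descriptors = fft_of_fft_descriptors(np.sin(2 * np.pi * 440.0 * t), sample_rate)
print(descriptors.shape)  # (21, 16): 21 windows, 16 coefficients each
```

Each row is the vector for one window, so neighbouring rows describe overlapping seconds of sound and can be compared to look for repetition.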
The advantages are that all the sound files I want to process can be stored as
.ogg, and that the decoding into raw PCM audio can be skipped.
Another advantage could be that the signal has already been stripped of the
information that is irrelevant to human hearing. This could, however, also
turn out to be a drawback, since sounds that are considered masked might be
heard anyway, and thus might also contribute to the cognition of structure.
(Maybe I should think of an example of this, to make myself clearer.)
So now I am studying the Vorbis specs. I am looking for a simple overview
(preferably graphical!) to make things clearer: something a bit less technical
than the documentation in libvorbis-1.0/doc/, but more technical than most web
articles about Ogg Vorbis.
I searched http://citeseer.nj.nec.com/cs (which is excellent for finding
science stuff) for "ogg" and "vorbis", but unfortunately I found nothing.
bye,
Kasper