[vorbis-dev] Re: Speex: Open-source, patent-free speech coding

Mon Apr 1 13:29:19 PST 2002

Hi Kevin, Jean-Marc, all --

I've been following this thread with great
interest, since object-based coding is one of my
research interests.

If we're thinking about creating Ogg encodings for
Speex and for describing language-specific mixing, it
would be very interesting to go one step further and
create a full-fledged just-in-time audio mastering
capability like the one in MPEG-4.

What this could consist of would be a set of header
instructions that describe the number of tracks in the
file and the way they get mixed together.  A simple
DAG (directed acyclic graph) structure with the right
set of nodes would be straightforward to implement and
would still allow a lot
of very interesting functionality -- not only multilingual
soundtracks, but interactive ones where the author can
permit (or not, if she chooses) dynamic remixing.
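
To make the mixing-graph idea concrete, here is a minimal Python sketch
of a DAG that is evaluated at playback time.  Everything in it is an
assumption made for illustration: the node names (Source, Gain, Mix,
Select), the listener_may_change flag, and the render() function are
hypothetical, and they are neither the AudioBIFS node set nor a proposed
Ogg header layout.  Tracks are plain lists of samples standing in for
already-decoded audio.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Sequence, Union

Samples = List[float]


@dataclass
class Source:
    """Leaf node: refers to one decoded track in the file (hypothetical)."""
    track: int


@dataclass
class Gain:
    """Scales its child by a constant factor."""
    child: "Node"
    gain: float


@dataclass
class Mix:
    """Sums its children sample by sample."""
    children: Sequence["Node"]


@dataclass
class Select:
    """Picks one child by name, e.g. a dialogue language.

    The author sets listener_may_change to False to forbid remixing.
    """
    name: str
    choices: Dict[str, "Node"]
    default: str
    listener_may_change: bool = True


Node = Union[Source, Gain, Mix, Select]


def render(node: Node, tracks: Sequence[Samples],
           settings: Optional[Dict[str, str]] = None,
           _cache: Optional[Dict[int, Samples]] = None) -> Samples:
    """Evaluate the graph bottom-up; shared nodes are rendered only once."""
    settings = settings or {}
    cache = {} if _cache is None else _cache
    key = id(node)
    if key in cache:
        return cache[key]
    if isinstance(node, Source):
        out = list(tracks[node.track])
    elif isinstance(node, Gain):
        child = render(node.child, tracks, settings, cache)
        out = [node.gain * s for s in child]
    elif isinstance(node, Mix):
        rendered = [render(c, tracks, settings, cache) for c in node.children]
        out = [sum(frame) for frame in zip(*rendered)]
    elif isinstance(node, Select):
        choice = node.default
        if node.listener_may_change:
            choice = settings.get(node.name, node.default)
        out = render(node.choices[choice], tracks, settings, cache)
    else:
        raise TypeError(f"unknown node type: {type(node).__name__}")
    cache[key] = out
    return out


# Example: one music-and-effects track plus two dialogue tracks.
tracks = [
    [0.5, 0.5, 0.5],     # track 0: music & effects
    [0.25, 0.5, 0.75],   # track 1: English dialogue
    [0.75, 0.5, 0.25],   # track 2: French dialogue
]
music = Gain(Source(0), 0.5)
dialogue = Select("language", {"en": Source(1), "fr": Source(2)}, default="en")
scene = Mix([music, dialogue])

english = render(scene, tracks)                      # author's default mix
french = render(scene, tracks, {"language": "fr"})   # listener's choice
```

In a real file the graph would be serialized into the headers and the
leaves would be fed by the Vorbis or Speex decoders; the sketch only
shows that the evaluation itself is a small recursive walk.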

I wrote a paper on how this is done in MPEG-4, where
it's called AudioBIFS (BIFS is "Binary Format For Scene
Description"):

  E. D. Scheirer, J. Huopaniemi, and R. Vaanaanen,
  "AudioBIFS: Describing Audio Scenes with the MPEG-4
  Multimedia Standard."  IEEE Transactions on Multimedia
  1:3 (Sept. 1999), pp. 237-250.

It's on my WWW page at http://web.media.mit.edu/~eds/papers/
if someone's interested.

I'd love to help design this as an optional capability or
extension to Ogg and Vorbis and Speex if there are other
interested people that want to work on it.  I already have
public domain source code for the MPEG-4 version.  (And
the technology is free and clear of patents, too.)

Best to all,

 -- Eric

----
Eric D. Scheirer, Ph.D.
edsmedia at alum.mit.edu
+1 617 666 8905
http://sound.media.mit.edu/~eds

----- Original Message -----
From: "Kevin Marks" <kmarks at apple.com>
To: <vorbis-dev at xiph.org>
Cc: "Jean-Marc Valin" <jean-marc.valin at hermes.usherb.ca>
Sent: Monday, April 01, 2002 4:17 PM
Subject: Re: [vorbis-dev] Re: Speex: Open-source, patent-free speech coding

>
> On Saturday, March 30, 2002, at 08:58 AM, DSPguru wrote:
> >>> the first step is :
> >>> - decide how we extract the 'common' track
> >>
> >> I'll leave that one to you. I have no idea about the properties of the
> >> different tracks.
> >
> > ok. maybe someone else on the list has ideas?
>
> When a TV program is edited, the Music & Effects is a separate track
> from the speech precisely so that different languages can be added
> afterwards. They are only mixed at the mastering stage.
>
> You should start from the unmixed tracks and compress them separately.
>
>
> --- >8 ----
> List archives:  http://www.xiph.org/archives/
> Ogg project homepage: http://www.xiph.org/ogg/
> To unsubscribe from this list, send a message to
> 'vorbis-dev-request at xiph.org'
> containing only the word 'unsubscribe' in the body.  No subject is needed.
> Unsubscribe messages sent to the list will be ignored/filtered.
