[vorbis] Why 64kbps is important

Erato erato at musemagic.com
Tue Aug 7 16:50:50 PDT 2001



MPEG-4, T/F coder profile, VERSION 2, LOW DELAY AAC for interactive...
(otherwise known as MPEG2-AAC, LD)

This profile was designed to bridge the gap between the speech and audio
coding schemes....

The max delay of the MPEG2-AAC, LD profile is 20ms...the average POTS
(telephone) delay is 50ms.

The standard is currently being used by NTT DoCoMo and others for IP-type
delivery structures, INCLUDING 3G (3rd generation wireless).

In other words...the VoIP requirements in the original post describe work
going on right now in other standards bodies and manufacturers' consortiums.
There appears to be a technical strategy of interactive speech + audio in one
codec (44.1kHz SR), with LD for interactive use.

max delay 20ms...average POTS (telephone delay)  50ms.

In other words, what the original post describes (64kb/s, VoIP, and the
merging of speech and audio signals into one audio codec, with interactive use
beyond the LD ACELP, etc. type video conferencing standards) is going on in
MPEG-4 right now.

 MuseMagic
 1.240.209.7765 voicemail/fax
  thalia at musemagic.com
  www.musemagic.com

"What am I homeless and all of the idiots have the BMW's, oh yeah, I
forgot...chicks can't code, better wipe my HD again thank you"

-----Original Message-----
From: owner-vorbis at xiph.org [mailto:owner-vorbis at xiph.org]On Behalf Of
Gregory Maxwell
Sent: Tuesday, August 07, 2001 4:26 PM
To: vorbis at xiph.org
Subject: Re: [vorbis] Why 64kbps is important

On Tue, Aug 07, 2001 at 03:59:11PM -0700, Michael Paine wrote:
> > I find this very hard to believe: Both MP3Pro and WMA (and Vorbis too)
> > have *way* too much codec delay to be at all suitable for VoIP.
>
> Using superscalar techniques, I know someone who can decode MP3 @ 300x on
> a PIII 500MHz and encode @ 40x [ I just wish he would release his code :( ]

It's not about speed (and even if it were, you aren't going to put a PIII 500
in every IP phone, they would melt! :) ).

It's about codec delay. I didn't say codec speed. Delay. MP3/WMA/Vorbis are
all transform codecs. They require a fairly large block of samples, e.g. in
Vorbis you need to input 3072 samples (or is it 4096) from each channel
before you can get any output. This could be reduced, trading off quality,
but could not be reduced enough.

This delay (3072 samples) is 139ms at 22050Hz, giving a minimum round trip
delay of 278ms, assuming an infinitely fast network. This is unacceptably
long, and would make echo cancellation very difficult.
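
The arithmetic behind those figures can be checked with a short sketch. The
function name block_delay_ms is made up for illustration, and the model is
deliberately simplified: it only counts the time needed to collect one block
of samples and ignores processing time, overlap, and network latency.

```python
def block_delay_ms(block_samples: int, sample_rate_hz: int) -> float:
    """Time needed to collect one block of samples, in milliseconds.

    Hypothetical helper: it only models the buffering delay of a
    transform codec, not processing or network time.
    """
    return 1000.0 * block_samples / sample_rate_hz


if __name__ == "__main__":
    # Block sizes mentioned above; which one applies depends on the encoder.
    for block in (3072, 4096):
        one_way = block_delay_ms(block, 22050)
        print(f"{block} samples @ 22050Hz: "
              f"{one_way:.1f} ms one way, {2 * one_way:.1f} ms round trip")
```

Even with a zero-latency network, the buffering delay for 3072 samples alone
is well above the 20ms and 50ms figures quoted earlier in this thread.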

WMA and MP3 have the same problem. (Though mp3's delay is smaller, it's
still too big for realtime full duplex voice).

> But people aren't satisfied with their quality (especially with
> speakerphones or conference calls).  If they could hear stereo 44kHz
> instead, they would never want to go back!

I'm not so sure of that. 8-bit ulaw is perceptually as good as 16 bits for
voice, and almost all the energy of voice is below 4kHz. Yes, 16kHz would be
an improvement (and give better results for people with shrill voices), but
for VOICE, you aren't going to sound much better than a regular telephone
with more bits.
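
As a rough illustration of why 8-bit ulaw holds up for voice, here is a
minimal sketch of the continuous mu-law companding curve with mu = 255. The
function names are made up for illustration, and real G.711 codecs use a
piecewise-linear approximation of this curve rather than the formula itself,
so treat this as a sketch and not as the telephone network's implementation.

```python
import math

MU = 255  # value used for 8-bit mu-law


def mulaw_compress(x: float) -> float:
    """Map a sample in [-1, 1] onto a logarithmic scale in [-1, 1]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mulaw_expand(y: float) -> float:
    """Inverse of mulaw_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


def quantize(v: float, levels: int = 127) -> float:
    """Uniform quantization of a value in [-1, 1] (about 8 bits with sign)."""
    return round(v * levels) / levels


if __name__ == "__main__":
    quiet = 0.01  # a quiet sample, 1% of full scale
    linear = quantize(quiet)
    companded = mulaw_expand(quantize(mulaw_compress(quiet)))
    print(f"linear 8-bit error: {abs(linear - quiet):.6f}")
    print(f"mu-law 8-bit error: {abs(companded - quiet):.6f}")
```

The logarithmic curve spends most of its 8-bit code space on quiet samples,
which is where uniform 8-bit quantization does worst.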

I'll assume you're right for the purposes of this conversation, but that
doesn't change the fact that a codec with 139ms delay is not going to be
very fun, esp. for speakerphones, and I'll totally ignore the practicality of
stereo on normal interfaces.

>  I agree with you on the fallback audio issue, but do
> you disagree that cable softphones aren't just around the corner?

>  Would you
> agree that 2.5G+ cell phones aren't going to look into supporting > uLaw
> audio in a year or two (some cell phones already support MP3 codecs,
> although separated)?  Would you agree that there is a trend of PDAs to
> merge with cellphones, giving > uLaw possibilities (especially with ARM
> chips), while supporting "cleaning" code (such as Clarity and Wavemakers)?

Today cell phones intentionally sacrifice quality to achieve lower bitrate
operation. The RF nature makes it impossible to have flawless
communications, so the users will tolerate some lossage, and they take it a
step beyond that with their aggressive compression. I would be truly
surprised to see better than toll quality cell phones.

People might pay more for CD quality phones, but not as much as more
customers would pay regular rates to receive sub-toll over the same
spectrum.

> Time will tell, but since I'm in a position to add Vorbis to major telecom
> equipment, I'm going to say "wanna bet?"

Not trying to be rude, but if you don't understand codec delay, I'd have a
hard time believing that you are in any such position.

Obviously, I'd never be one to dissuade someone from using Vorbis... But
real-time full-duplex voice? Square-peg-round-hole. Not that WMA or MP3pro
are any better fits.

--- >8 ----
List archives:  http://www.xiph.org/archives/
Ogg project homepage: http://www.xiph.org/ogg/
To unsubscribe from this list, send a message to 'vorbis-request at xiph.org'
containing only the word 'unsubscribe' in the body.  No subject is needed.
Unsubscribe messages sent to the list will be ignored/filtered.
