[vorbis-dev] AMBISONIC critique

Ralph Giles giles at snow.ashlu.bc.ca
Mon Aug 14 23:55:27 PDT 2000



On Tue, 15 Aug 2000, Thomas Marshall Eubanks wrote:

[wavelength analysis]

> SO there are 3 rough frequency ranges for localization:
> 
> 20 to 200 Hz              - no localization
> 200 Hz - 8 to 10 kHz    - phase interferometry
> 8 kHz   - 20 kHz        - intensity interferometry

Note that the brain also uses intensity interferometry at lower
frequencies. In general it picks the 'most believable' information out of
what's available, which is why the traditional stereo pan works so well.
In the case of real world sources and film, this includes influence from
visual cues as well.

If anyone's wondering, up/down vs. in-plane directionality is determined
from the way the sound passes through and diffracts around the head (which
is only left-right symmetric) and ears, and the frequency dependence of
same. Otherwise, the two-detector description above only tells you the
angle relative to a line passing through each ear.
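
To put numbers on that last point, here is a minimal sketch, assuming two
point detectors in free space (no head, no pinnae) and a far-field source.
The spacing, the speed of sound and the function names are all illustrative
assumptions, not measurements:

```python
import math

SPEED_OF_SOUND = 343.0   # m/s, assumed room-temperature air
EAR_SPACING = 0.18       # m, assumed rough figure for the detector spacing

def direction(azimuth_deg, elevation_deg):
    """Unit vector toward the source: x forward, y left, z up."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    return (math.cos(el) * math.cos(az),
            math.cos(el) * math.sin(az),
            math.sin(el))

def arrival_time_difference(azimuth_deg, elevation_deg):
    """Delay in seconds between two detectors placed on the y axis.

    Only the component of the direction along the line joining the
    detectors matters, so front/back and up/down drop out entirely.
    """
    _, y, _ = direction(azimuth_deg, elevation_deg)
    return EAR_SPACING * y / SPEED_OF_SOUND

if __name__ == "__main__":
    # Front-left, back-left, and up-and-to-the-left: all 60 degrees off the ear axis.
    for az, el in [(30, 0), (150, 0), (90, 60)]:
        print(az, el, arrival_time_difference(az, el))
```

All three directions give the same delay (roughly 0.26 ms with these
numbers), so the delay alone can't tell them apart. The head and ear
filtering is what breaks the tie.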

[conventional practice snipped]

> Surround sound in theaters is intended for two purposes :
> 1.) So that the speech is better localized at the screen, even if you don't sit in the
> center of the auditorium (i.e., the "sweet spot" is expanded.)
> 2.) To have the occasional sound come from "behind" (like the creak of a door in a
> thriller).
> 
> In a home stereo, #1 is not thought to be so important, but #2 (for
> reflected sounds) is. Reflected sounds have lower SNR, so the "surround"
> part of the sound does not take as many bits.

It's not just that. For a long time there's been interest in creating
"soundscapes" which form a more immersive environment. The listener can be
in the middle of the sound, not off to the side. That's not really
possible with 5.1 or conventional stereo.

> So, the conventional practice skews the coordinate system forward, and reserves
> most of the bits for the conventional stereo.

[comments on ambisonics]

> 3.) In NO case will you have the localization ability of the 5 channel or even 4 channel
> Dolby scheme ( for a similar total bit rate).

I'd been assuming the complaints about traditional quadraphonic
recordings in the ambisonic documentation were against four *speakers* as
inadequate, not four channels. By mixing the WXY channels into six or more
speakers, you get better reproduction of the soundfield, so the directional
discrimination you've described so well works better, even if there's
actually *less* information in the recording.

But you may well be right that we can't get the same accuracy of
localization with only 3 channels. I guess that means we need to allow
the quadrupole terms as well? Again, not all of them, just what we want.
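
As a rough feel for what the extra terms would buy in the planar case:
the sketch below assumes a basic, unweighted decode onto an evenly spaced
ring of speakers, and that "just what we want" means the two horizontal
second-order terms (cos 2a and sin 2a of the source angle a). The function
name is made up for illustration.

```python
import math

def panning_gain(offset_deg, order):
    """Relative feed of a speaker offset_deg away from the source direction.

    Normalized so that a speaker exactly on-axis gets 1.0. Order 1 is the
    WXY case; order 2 adds the horizontal quadrupole pair.
    """
    d = math.radians(offset_deg)
    total = 1.0 + 2.0 * sum(math.cos(m * d) for m in range(1, order + 1))
    return total / (2 * order + 1)

if __name__ == "__main__":
    for order in (1, 2):
        print(order, [round(panning_gain(off, order), 3) for off in (0, 30, 60, 90)])
```

Under those assumptions a speaker 60 degrees off the source still gets
about 2/3 of the on-axis feed at first order, but only 1/5 with the
second-order pair added, so the image is noticeably tighter for two more
channels.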

I'm perfectly fine with allowing "just" planar surround as an option.
What I don't like is skewing the samples like Dolby Surround does without
it being general. Our sole target is *not* the installed hardware in
theatres.
 
[snip]

> "These [X, Y and Z] signals are equivalent to
> three figure-of-eight microphones at right angles to each other, together with an
> omnidirectional unit, all of which have to be effectively coincident over the frequency range of
> interest"
> 
> This reliance on a particular transceiver is BAD. If you don't have these, what
> sort of sweet spot will you have (you can always make things work at a point).

No no, you do an additional decode step to map the spherical harmonics
onto speakers. This is a very good thing. It means we can tune the output
to the hardware, instead of always failing to achieve the perfect speaker
positioning. No matter how many speakers you've got. If you really care,
we can take into account your exact speaker positions, frequency response,
and room acoustics. We can use a head-model transfer function to give
excellent reproduction on headphones. And in the default case, an extra
matrix multiply at the end is hardly a serious complication.
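
For the default case, a minimal sketch of that matrix multiply. It assumes
horizontal-only WXY, W carrying the signal scaled by 1/sqrt(2) (one common
convention, not something we've settled on), and an evenly spaced ring of
at least three speakers. All names here are hypothetical.

```python
import math

def encode_wxy(sample, azimuth_deg):
    """Encode one mono sample arriving from azimuth_deg into (W, X, Y)."""
    az = math.radians(azimuth_deg)
    return (sample / math.sqrt(2.0),
            sample * math.cos(az),
            sample * math.sin(az))

def decode_matrix(speaker_azimuths_deg):
    """One row per speaker; only valid for an evenly spaced ring (assumed)."""
    n = len(speaker_azimuths_deg)
    rows = []
    for deg in speaker_azimuths_deg:
        phi = math.radians(deg)
        rows.append((math.sqrt(2.0) / n,
                     2.0 * math.cos(phi) / n,
                     2.0 * math.sin(phi) / n))
    return rows

def decode(wxy, matrix):
    """The extra step at the end: speaker feeds = matrix times (W, X, Y)."""
    return [sum(c * m for c, m in zip(wxy, row)) for row in matrix]

if __name__ == "__main__":
    hexagon = [0, 60, 120, 180, 240, 300]
    feeds = decode(encode_wxy(1.0, 40.0), decode_matrix(hexagon))
    print([round(f, 3) for f in feeds])
```

The recording never mentions the hexagon. Swapping in a square or an
octagon only changes the matrix, and an irregular layout would need a
properly derived matrix rather than this one.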

> In summary, I simply do not think that the Ambisonic scheme is particularly efficient, nor does
> it scale. It seems way too mathematically rote to me, not tuned to the actual
> physics of the situation.

You must be an experimentalist. :^)

 -ralph

Speaking of delegation, does anyone have enough equipment to actually test
any of the above? What we really need is a way to drive multiple (cheap) 
soundcards with sufficient sync to not cause phase errors. Too bad the 8
channel ones are $1000...


--
giles at ashlu.bc.ca
gourmet leatherman biker sysadmins, yes.

--- >8 ----
List archives:  http://www.xiph.org/archives/
Ogg project homepage: http://www.xiph.org/ogg/



