[opus] ambisonics formats and channel mappings

Marc Lavallée marc at hacklava.net
Sun May 29 00:50:43 UTC 2016

On Sat, 28 May 2016 16:21:33 -0700,
Michael Graczyk <mgraczyk at google.com> wrote :

> Hi Marc,

Hi Michael.

> On Sat, May 28, 2016 at 10:44 AM, Marc Lavallée <marc at hacklava.net>
> wrote:
> > I subscribed because your discussion on the IETF draft ("Ambisonics
> > in an Ogg Opus Container") was mentioned on the sursound list.
> Thanks for your interest! Please feel free to voice your support for
> this work on the codec at ietf.org mailing list. The more support the
> better.

I will!

> > I tried Opus for ambisonics more than a year ago. It does work with
> > uncoupled channels (I had to patch the encoder). I don't know what
> > else could be done to optimize support for ambisonics, as I'm not a
> > codec expert.
> I agree. We are decoupling channels and setting the bitrate
> allocation to give progressively fewer bits to higher order channels.
> We have also explored dynamic interchannel bitrate allocations based
> on a worst case analysis (over all head rotations) of rendered
> binaural masking levels. We did not find dynamic allocations to
> improve subjective quality enough to explore that potential encoder
> optimization in depth.

So, channels will stay uncoupled for Ambisonics? That would be great,
because considering how good Opus already is, I don't think that more
optimization is required, and it would put content creators at ease;
they often believe that any compression or channel coupling could
break ambisonic decoding.

> > The allowed number of channels should not be restrained to a list
> > like 1,4,9,16,etc, because ambisonics can support mixed-order
> > schemes.
> The way the Ogg headers work, it is possible to send 2,3,5,etc
> channels by sending 4,4,9 respectively, and setting the channel index
> to 255 for those channels which are not included ({2,3}, {3},
> {5,6,7,8} in this example). We reserved actual channel number "C" ==
> 2,3,5,etc so that if mixed order schemes with different basis
> functions become common in the future, we could add them without
> breaking existing implementations.

I see, thanks for the explanation. I'm not familiar enough with Ogg, as
I was using AAC with the m4a container for my "ambisonics on the web"
experiments. I'm now confident that Opus will be largely adopted.
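
To make the padding scheme described above concrete, here is a minimal
sketch, not code from the draft. The function name and the assumption that
the included channels are identified by their ACN numbers are mine; the
value 255 for omitted channels comes from the quoted explanation.

```python
import math

SILENT = 255  # index value used for channels that are not included


def padded_channel_mapping(present_acn):
    """Hypothetical helper: pad a partial channel set to the next full order.

    present_acn lists the ACN numbers of the channels actually sent.
    The result has (order + 1)**2 entries; each entry is either the
    position of that channel among the sent channels, or 255.
    """
    present = sorted(set(present_acn))
    if not present:
        raise ValueError("at least one channel is required")
    order = math.isqrt(present[-1])  # ACN k belongs to order floor(sqrt(k))
    total = (order + 1) ** 2
    mapping = []
    next_index = 0
    for acn in range(total):
        if acn in present:
            mapping.append(next_index)
            next_index += 1
        else:
            mapping.append(SILENT)
    return mapping


# 2, 3 and 5 channels padded to 4, 4 and 9 entries, as in the quoted example
two = padded_channel_mapping([0, 1])
three = padded_channel_mapping([0, 1, 2])
five = padded_channel_mapping([0, 1, 2, 3, 4])
```

With these inputs, the entries set to 255 are {2,3}, {3} and {5,6,7,8}
respectively, matching the example in the quoted text.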

> > The Ambix format was adopted by Google, but it's a new format;
> > the FuMa format is widely used and could easily be supported as
> > well. https://en.wikipedia.org/wiki/Ambisonic_data_exchange_formats
> Although FuMa conventions are commonly used, the ambix conventions of
> ACN channel ordering and SN3D normalization are increasingly becoming
> the norm. FuMa generally uses maxN normalization, which becomes
> complicated for fourth order and above. I decided to require ACN and
> SN3D to limit fracturing and potential confusion.

I understand your point, and content creators will simply convert their
FuMa sources to the Ambix format. That's not a major problem, and most
will agree.
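
For first order, the conversion is small enough to sketch here. This is an
illustration under stated assumptions, not a reference converter: it only
covers first-order B-format (FuMa W, X, Y, Z), and the function names are
mine. It reorders the channels to ACN (W, Y, Z, X) and scales W by sqrt(2)
to undo the FuMa -3 dB weighting on W; the first-order X, Y, Z gains are
the same in FuMa and SN3D.

```python
import math


def acn_index(degree, index):
    """ACN channel number for spherical harmonic degree l and index m."""
    return degree * degree + degree + index


def fuma_to_ambix_first_order(w, x, y, z):
    """Convert one first-order FuMa sample (W, X, Y, Z) to ambiX order.

    Returns a tuple in ACN order (W, Y, Z, X) with SN3D normalization.
    Higher orders need per-channel gains that are not covered here.
    """
    return (w * math.sqrt(2.0), y, z, x)


# A unit-amplitude source straight ahead: FuMa W = 1/sqrt(2), X = 1
ambix = fuma_to_ambix_first_order(1.0 / math.sqrt(2.0), 1.0, 0.0, 0.0)
```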

> > Down-mixing to stereo is a sort of ambisonic decoding; it is simple
> > and could be included in the Opus decoder. But decoding to binaural
> > or 5.1 is not trivial. Usually, decoding ambisonics is the job of a
> > dedicated decoder, and depends on the role, number and positions of
> > output channels, so I don't think that the Opus decoder should be
> > involved, unless approximate methods could be considered good
> > enough as default.
> I included downmixing to stereo because it is simple and there is a
> clear right way to do it. For 5.1 or other surround setups, there is
> no obvious best solution so I did not include those downmixing
> matrices. The purpose of this downmixing matrix is to give guidelines
> for what to do in the likely case that an ambisonic Opus stream is
> encountered, but no ambisonic decoder is available. In that case, the
> best thing to do (arguably besides giving an error) is to decode with
> the provided downmixing matrix. Are you suggesting we exclude
> the stereo downmixing matrix?

I'm certainly suggesting to *include* stereo down-mixing, because it's
trivial to do, and it would allow compatibility, something very
important to promote Ambisonics. But I would not include decoding to
other output schemes like 5.1, because it would be like putting a square
peg in a round hole; decoding Ambisonics for irregular arrays can be an
art form, it's not like working with common surround formats.
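
To show how trivial the stereo case is, here is a rough sketch. It is not
the matrix from the draft (I am not reproducing its coefficients here); it
assumes ACN/SN3D input and simply forms two virtual cardioids pointing left
and right from W and Y. The function name and the 0.5 gains are my own
choice.

```python
def downmix_to_stereo(w, y, z, x):
    """Down-mix one first-order ambiX sample (ACN order W, Y, Z, X).

    Assumption: left/right virtual cardioids, L = (W + Y) / 2 and
    R = (W - Y) / 2. Z and X are ignored by this particular choice.
    """
    left = 0.5 * (w + y)
    right = 0.5 * (w - y)
    return left, right


# Source hard left (W = 1, Y = 1) and source straight ahead (W = 1, X = 1)
hard_left = downmix_to_stereo(1.0, 1.0, 0.0, 0.0)
front = downmix_to_stereo(1.0, 0.0, 0.0, 1.0)
```

A source on the left ends up only in the left channel, and a frontal source
is split equally between both channels.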

One important reason to include support for mixed-order ambisonics:
with only 3 channels, horizontal surround is possible and works very
well with headphones or 4 speakers (even 3); while it does not support
height information, I believe that horizontal-only surround will be a
valid option for a long time, despite the sudden interest in
"VR" (which is why Ambisonics is becoming popular). Horizontal HOA
needs 5 channels for 2nd order, one channel less than 5.1! Using the
extra channel adds FOA vertical information, for a much better
experience than plain 5.1.
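
The channel counts above can be checked with a few lines. This is only a
sketch: the function names are mine, and the mixed-order rule assumes the
common "#H#V" convention (full sphere up to the vertical order, then only
the two horizontal components for each additional order).

```python
def full_sphere_channels(order):
    """Channels for full-sphere (periphonic) ambisonics of a given order."""
    return (order + 1) ** 2


def mixed_order_acn(h_order, v_order):
    """ACN numbers kept by an assumed #H#V mixed-order scheme."""
    if v_order > h_order:
        raise ValueError("vertical order cannot exceed horizontal order")
    acn = list(range(full_sphere_channels(v_order)))
    for degree in range(v_order + 1, h_order + 1):
        # Horizontal components are those with index m = -l and m = +l
        acn.append(degree * degree)
        acn.append(degree * degree + 2 * degree)
    return acn


horizontal_first = mixed_order_acn(1, 0)   # W, Y, X
horizontal_second = mixed_order_acn(2, 0)  # 5 channels
second_plus_height = mixed_order_acn(2, 1) # adds Z as a 6th channel
```

The second-order horizontal set is ACN {0, 1, 3, 4, 8}; adding ACN 2 (Z)
gives the six-channel set with first-order height.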

I would also suggest including support for the UHJ format, which is
roughly equivalent to 3-channel FOA, but using only 2 channels; even
soundfield rotation is possible with UHJ (although less good), enabling
horizontal "VR audio" with only 2 channels.

Ok, enough proselytism... VR+Ambisonics+Opus is a winning combination.

