[opus] Channel Mapping Family for Ambisonics

Michael Graczyk mgraczyk at google.com
Tue Apr 26 00:50:18 UTC 2016


On Mon, Apr 25, 2016 at 9:32 AM, Timothy B. Terriberry
<tterribe at xiph.org> wrote:
> Jean-Marc Valin wrote:
>>
>> Would it make sense to allow an arbitrary number of channels and just
>> "truncate" the list of channels. For example, two-channel ambisonics
>> would be W plus X and three-channel would be W, X and Y. The idea is
>> that you would get these mappings for free -- if there's any use for
>> them anyway.
>
>
> Well, more specifically, the channel count is stored in the main header, not
> in the channel mapping table, so it's possible to put any value in there, up
> to 255 (instead of, for example, simply storing l). We should say what a
> decoder is expected to do in the case it encounters a stream with such a
> value. "Treat the stream as invalid," is a perfectly acceptable option, as
> is Jean-Marc's, "pretend the missing channels are silent," but we have to
> pick one.
>
It would still be possible to send WYZ by sending 4 channels and
setting the mapping entry for X (the fourth channel in ACN order) to
the "pure silence" value 255 in the channel mapping array. I would
consider it a special-case optimization for an ambisonic renderer to
decide to simplify computation because X is silent. In fact, most
cases would not involve truncation but rather "holes" in the mapping,
i.e. WY_X instead of WYZX for horizontal-only content.

Allowing channel counts other than (l + 1)^2 would also let us
include rules for "hemispherical" ambisonics. These are sets of
ambisonic components with differently shaped basis functions, more
suitable for encoding certain content. An example is described in the
Ambix paper [1]. I need to talk with some of the VR people here to
figure out whether these hemispherical encodings are likely to be
used by anyone (at Google or elsewhere). I could not find a commonly
used set of basis functions like the spherical harmonics used in
ordinary ambisonics, so we may want to keep things simple and avoid
including them.

If we were to include hemispherical ambisonics, streams with 3, 6,
10, etc. channels could signal their use.
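
If we did go that way, a decoder might classify channel counts
roughly as below, assuming full-sphere sets use (l + 1)^2 channels
and the hemispherical sets use (l + 1)(l + 2)/2 channels (which
matches the 3, 6, 10 progression above); the hemispherical formula is
my assumption, not anything settled:

    /* Sketch only: map a channel count to an ambisonic order. */
    static int ambisonic_order(int channels, int hemispherical)
    {
        for (int l = 0; l < 32; l++) {
            int count = hemispherical ? (l + 1) * (l + 2) / 2
                                      : (l + 1) * (l + 1);
            if (count == channels)
                return l;
            if (count > 255)
                break;      /* channel count is limited to 255 anyway */
        }
        return -1;          /* not a recognized count */
    }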

>>> 1. Should I be more explicit about what "ambisonics" is, what
>>> normalization is, and what each channel really means?
>>
>>
>> For these kinds of things, you can just use references.
>
>
> Yes, hopefully this is all well-described somewhere (at least enough for
> someone to implement it), so we don't have to reinvent the wheel.
>
I will include a reference to the Ambix paper, which describes
ambisonics, SN3D normalization, and ACN channel order.
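
For anyone following along, my reading of those conventions from [1]
is roughly the following; this is just a summary for the list, not
proposed draft text:

    #include <math.h>
    #include <stdlib.h>

    /* ACN index of the spherical harmonic with degree l and order m,
     * so W=(0,0)->0, Y=(1,-1)->1, Z=(1,0)->2, X=(1,1)->3. */
    static int acn_index(int l, int m)
    {
        return l * (l + 1) + m;
    }

    /* SN3D weight: sqrt((2 - delta_m0) * (l-|m|)! / (l+|m|)!). */
    static double sn3d_factor(int l, int m)
    {
        int am = abs(m);
        double two_minus_delta = (m == 0) ? 1.0 : 2.0;
        return sqrt(two_minus_delta * tgamma(l - am + 1.0)
                                    / tgamma(l + am + 1.0));
    }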

>>> 2. Do I need to list the specific meaning of each possible channel
>>> count? For example, should I write "4 channels: First order
>>> ambisonics" similar to what is done for surround sound?
>>
>>
>> I'm not sure I understand what you're asking here.
>
>
> I think he's just asking if he needs to make a bulleted list as we did in
> 5.1.1.1 and 5.1.1.2 of the Ogg Opus draft. To which I think the answer is
> "no". Those are in lists because it was the easiest way to present the
> information. When you have a parametric mapping, a description in terms of
> the parameters makes more sense.
Yes, I meant a bulleted list. Great, I'll stick with the parametric description.

>>> 3. Should we define whether downmixing should occur when Ogg Opus
>>> players do not support the channel mapping? Since channel 1 is a mono
>>> channel, it is always possible to play something reasonable.
>>
>>
>> It's not absolutely required, but it would indeed be nice if you could
>> define a simple way that players can downmix. I guess mono is just "take
>> the W channel", but maybe a reasonable stereo would be nice too.
>
>
> Well, there will always be players that don't support the channel mapping at
> all, just because there's 4 years' worth of deployed software that knows
> nothing about it. I think the actual question is, "Should we give advice on
> how players with common speaker configurations (instead of a custom
> ambisonics rig or a virtual environment or whatever) can render the stream?"
> If you want to see support in common media players like VLC, I think we
> should.
>
Okay, I will include mixing matrices for stereo and common surround setups.
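
As a placeholder until I have checked with people here, the sort of
stereo matrix I have in mind is a simple mid/side-style decode from W
and Y (ACN order, SN3D, with +Y pointing left); the gains below are
illustrative only, not the values that will go into the draft:

    /* Placeholder stereo downmix for first-order ambisonics in ACN
     * order (0=W, 1=Y, 2=Z, 3=X), SN3D normalized. */
    static void stereo_downmix(const float ambi[4],
                               float *left, float *right)
    {
        *left  = 0.5f * (ambi[0] + ambi[1]);   /* W + Y, +Y is left */
        *right = 0.5f * (ambi[0] - ambi[1]);   /* W - Y             */
    }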


I will talk with the VR people here about hemispherical ambisonics and
the mixing matrices, then follow up with a revised version of the
mapping description.

[1] http://iem.kug.ac.at/fileadmin/media/iem/projects/2011/ambisonics11_nachbar_zotter_sontacchi_deleflie.pdf

-- 

Thanks,
Michael Graczyk

