[Vorbis] Proposal for Ambisonics format in vorbis comment.
Martin Leese
martin.leese at stanfordalumni.org
Mon Nov 30 12:57:23 PST 2015
"Gabriel I." wrote:
> Greetings,
>
> I apologize if I posted this in the wrong list, I wasn't sure where to post
> it, but seeing as the tags are called "vorbis comments" I thought vorbis,
> rather than ogg-dev, would be the right choice. (actually, I'm not even a
> developer anyway)
Hi Gabriel,
I doubt whether the Xiph community would
promote a file format for Ambisonics without
first seeing whether it had the support of the
Ambisonic community *and* seeing it used in
the wild. The Ambisonic crowd all hang out on
the sursound list,(1) so you should post your
proposal over there. (Links have been
collected together at the end.) However, there
was a heated discussion in August/September
2008 on that list about a new file format.
Despite several hundred posts, no consensus
emerged. My guess is that nobody over there
has the stomach for another round (I know I
don't). This also might explain why you
received no replies when you posted your
proposal on the Ambisonics list.(2) (So many
lists.)
I have interspersed some comments below. If
you do not understand anything, feel free to
e-mail me off list.
> What I'd like to propose is a simple way to encode ambisonic files in vorbis
> comments as simple tags. By this I don't mean a single change to the format
> itself or the codec, but a simple "official" tag so that hopefully, in the
> future, we'll have decoders complying with it. Nobody ever wants to take
> ambisonic storage off the ground in an *universal* fashion because there's
> no standard in encoding the *channel orderings*, what *channels are
> present*, and the *normalization*, and people don't agree on one thing for
> some reason. (perhaps being stubborn)
I was a little surprised that you did not discuss
the ".amb" format as this is the official file
format for Ambisonic B-Format.(3) This has
also been used in the wild for many years,
particularly at Ambisonia.com.(4) It uses
Furse-Malham component ordering and MaxN
normalization. (It also only works up to
third-order.)
Note that Vorbis is lossy. Ambisonics is picky
about low-frequency phase and, as far as I
know, nobody has checked the extent to which
this is preserved in Vorbis. This may or may
not be a problem. (It is obviously not a
problem with lossless FLAC.) Your proposal
can utilize all 256 Vorbis channels, which is
good; not all proposals for Ambisonics in
Vorbis have allowed this.
> My proposal is different because it solves all issues: it allows only
> Pantophonic (or planar/2D) signals if you wish, as probably most music and
> people will not even have a 3D system which includes height... at the same
> time you can specify a full 3D sphere encoding, or somewhere in between. The
> former is especially important because it needs far less number of channels
> and thus consumes far less amount of space and bandwidth, so instead the
> order of ambisonic or quality of the audio itself can be increased.
The ".amb" format has the same property, so
your proposal is not *that* different.
> Note that this proposal is *infinitely* extensible to an arbitrary
> "ambisonic order", *and* it can specify the normalization. I haven't decided
> on the default normalization scheme, I'd like it to be N3D (why? well, just
> because? none is objectively superior but we have to agree on *something*
> for a standard) but it doesn't really matter as it can be specified.
The ".amb" format is limited to third-order and
uses MaxN normalization (with the exception
of a -3dB correction factor for W).(5)
Unfortunately, the coefficients for the latter
cannot be specified algebraically above
third-order (but they can be calculated
numerically). Dumping this normalization is
therefore probably a good thing.
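For reference, the two normalizations named in
the proposal differ only by a per-degree gain:
an N3D component of degree l is its SN3D
counterpart scaled by sqrt(2l+1). A minimal
Python sketch follows; the function name is
purely illustrative and is not part of any
existing library or of the proposal.

```python
import math


def sn3d_to_n3d_gain(degree):
    """Return the gain that converts an SN3D component to N3D.

    `degree` is the mathematical degree l (the Ambisonic "order").
    The name of this helper is an assumption made for illustration.
    """
    if degree < 0:
        raise ValueError("degree must be non-negative")
    return math.sqrt(2 * degree + 1)


# Gains for degrees 0 to 3 (first- to third-order material).
for l in range(4):
    print(l, round(sn3d_to_n3d_gain(l), 4))
```

Going the other way (N3D to SN3D) is simply a
division by the same factor.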
> Basically, it uses the ACN channel ordering described here:
> http://ambisonics.ch/standards/channels/ (it is mathematically defined by
> the relationship l*(l+1) + m; where l is the mathematical degree, and m is
> the mathematical order). (note that in ambisonics jargon, the 'order' of
> ambisonics actually refers to the mathematical degree)
The ".amb" format uses Furse-Malham
channel ordering.(6) This is complicated and
counter-intuitive, and was used only for
compatibility with (then) current practice.
Dumping it is therefore probably a good thing.
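By contrast, the ACN relationship quoted above,
l*(l+1) + m, is trivial to compute in both
directions. The following Python sketch shows
this; the function names are illustrative
assumptions, not an established API.

```python
import math


def acn_index(degree, order):
    """ACN channel number for mathematical degree l and order m."""
    if degree < 0 or abs(order) > degree:
        raise ValueError("need degree >= 0 and -degree <= order <= degree")
    return degree * (degree + 1) + order


def acn_to_degree_order(acn):
    """Invert acn_index: recover (degree, order) from an ACN number."""
    if acn < 0:
        raise ValueError("ACN must be non-negative")
    degree = math.isqrt(acn)  # largest l with l*l <= acn
    order = acn - degree * (degree + 1)
    return degree, order
```

For example, acn_index(3, -3) gives 9 and
acn_to_degree_order(15) gives (3, 3).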
> However the filetypes are described here:
> http://ambisonics.ch/standards/filetypes/
>
> (Please note I have no affiliation with that site, I just found it and it is
> the best way to describe ambisonics material)
>
> This allows us to *uniquely* identify the channels used without wasting
> space on empty channels at all. Because you specify both the "degree" of the
> Pantophony and the "degree" of the height individually. The value (3,0)
> would thus mean "third order ambisonics pantophony" having channels
> 0,1,3,4,8,9,15 present with no height component at all because it is degree
> 0 for height, which means a 2D/Planar signal requiring *only* 7 channels
> instead of 16! Of course if you wanted a full-sphere 3D field, then you'd
> use (3,3) and get all 16 channels in the file. Lowering the second degree
> simply lowers the "order" or "resolution" of the height component.
>
> The important thing to remember is that by just these two values, the
> decoder knows *exactly* which channels are present and in what order,
> because they are defined precisely from it. No empty channels that waste
> space and bandwidth. Plus, the decoder is not confused as it knows exactly
> how and which channels and in what order they are present (there are only 7
> in the 2D case).
Your proposal for mixed order is the same as
the ".amb" format. It has the disadvantage
that as a source leaves the horizontal, its
sharpness degrades rapidly to that of the
height-order. An alternative scheme, which
does not have this problem, is "Complete
mixed-order sets".(7) However, I don't know
of anybody who has experience with decoding
such sets.
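To make the channel sets concrete, here is a
small Python sketch that lists the ACN channels
present for a given (horizontal, height) pair.
It assumes the traditional mixed-order rule, in
which every component is kept up to the height
order and only the two horizontal components
(m = +l and m = -l) are kept above it. The
function name is hypothetical, and the rule is
my reading of the examples quoted above rather
than a definition taken from the proposal.

```python
def mixed_order_acn(h_order, v_order):
    """List the ACN channels present for a (h_order, v_order) pair.

    Assumes the traditional mixed-order scheme: full sphere up to
    v_order, horizontal-only components from v_order + 1 to h_order.
    """
    if not 0 <= v_order <= h_order:
        raise ValueError("need 0 <= v_order <= h_order")
    channels = []
    for degree in range(h_order + 1):
        for order in range(-degree, degree + 1):
            if degree <= v_order or abs(order) == degree:
                channels.append(degree * (degree + 1) + order)
    return channels


print(mixed_order_acn(3, 0))  # the 7-channel planar case quoted above
```

Under this rule (3,0) yields channels
0,1,3,4,8,9,15 and (3,3) yields 0 to 15, which
matches the counts given in the proposal.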
...
> The last thing to add is the normalization which I think can simply be added
> after a colon. Thus finally, my proposal would be to add tag like this as a
> vorbiscomment:
>
> AMBISONIC=(3,0):N3D
>
> The above tag defines a 2D planar file with "third order ambisonics" and no
> height at all, using the N3D normalization scheme. Thus, when a decoder sees
> this, it knows this file has 7 channels and they are ACN 0,1,3,4,8,9,15. The
> following tag:
>
> AMBISONIC=(3,3):SN3D
>
> defines a full-sphere 3D field using the SN3D normalization scheme. When the
> decoder sees it, it knows the file has 16 channels, them being ACN 0...15.
> (of course the decoder can refuse to decode if it cannot! that's beside the
> point!)
>
> Would acknowledging such a tag as official format be much trouble and to be
> added to the spec?
Adding new VorbisComment tags to the Vorbis
spec does not happen lightly; Xiph has an
official policy of neglect with respect to tags.
Back in July 2009 Xiph *asked* me to survey
what tags were being used in the wild, and to
propose additions to the Vorbis spec.(8) Even
these were not added. (Me, bitter and twisted?
Never!)
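As for the tag syntax itself, the two examples
quoted above are easy to parse. The Python
sketch below assumes the grammar is exactly
"AMBISONIC=(H,V):NORM" with no optional parts,
and that V may not exceed H; both are
assumptions, as the proposal does not spell
them out. The function name is hypothetical.
Field names in Vorbis comments are
case-insensitive, so matching ignores case.

```python
import re

# Assumed grammar: AMBISONIC=(H,V):NORM, e.g. AMBISONIC=(3,0):N3D
_TAG_RE = re.compile(r"^AMBISONIC=\((\d+),(\d+)\):([A-Za-z0-9]+)$", re.IGNORECASE)


def parse_ambisonic_tag(comment):
    """Parse one comment string into (h_order, v_order, normalization)."""
    match = _TAG_RE.match(comment.strip())
    if match is None:
        raise ValueError("not an AMBISONIC tag: %r" % comment)
    h_order = int(match.group(1))
    v_order = int(match.group(2))
    if v_order > h_order:
        raise ValueError("height order exceeds horizontal order")
    return h_order, v_order, match.group(3).upper()


print(parse_ambisonic_tag("AMBISONIC=(3,0):N3D"))
```

A decoder that does not recognize the
normalization name, or cannot handle the
orders, would simply refuse to decode.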
> I simply want an *official* way to send this very simple information
> requiring no more than just two values and the normalization scheme and
> store it in a file. I already use this tag format on my things right now
> (unreleased because I need to know it is the best way) because I really want
> to take Ambisonics off the ground (even for music which is what I do). I
> want it officially because then decoders will hopefully be made to comply
> with it. Alone, I have no power to influence that, sadly, so I turned to
> you.
>
> I need your help here. This can work in FLAC too with vorbiscomments. Maybe
> other formats will follow if they see this take off. And if possible it
> should work on any other format that can specify tags, like Opus, I just
> need the official recognition. There is zero change in the codec itself or
> the format, it's just an officially recognized tag in a way declared in the
> spec, so decoders can know how to comply. Please if you do take this to
> heart, and decide to implement it, feel free to describe it in much better
> detail or technical terms as needed. I just wanted to explain it in an easy
> to understand manner.
Note that Native FLAC is limited to 8 channels.
Obviously you could include two or more
FLAC streams in an Ogg container
(Ogg FLAC), and so have an unlimited
number of channels. However, in this case,
the metadata should not be in the FLAC
stream(s) but in the Ogg container. There is
no simple way to do this, but possibilities
include name-value pairs in an Ogg Skeleton
stream and an XMLEmbedding stream.(9)
Finally, your proposal only considers
Ambisonic B-Format. UHJ and G-Format are
also part of Ambisonics. There is an official
file format for UHJ,(10) and a proposal for
G-Format.(11) UHJ could be accommodated
into your proposal quite simply using the
VorbisComment "AMBISONIC=UHJ".
G-Format is a lot more complicated, and
should probably be ignored.
> If you have an alternative way to do this officially or a superior method
> (but this one proposed has *zero* shortcomings as far as storing ambisonic
> material is concerned that I'm aware of), please tell me so I will use it
> instead! Even if rejected, I will continue to use it just because I want to
> see it off the ground. I truly hope you'll consider it as an official tag
> format (I will encourage its use if so).
>
> Thank you for your time and once again I am sorry if I mailed this in the
> wrong section, as this isn't necessarily about the codec, but I did not know
> where to put it (because I'm not a developer).
Regards,
Martin
(1) https://mail.music.vt.edu/mailman/listinfo/sursound
(2) http://ambisonics.ch/mailman/listinfo/ambisonics
(3) http://members.tripod.com/martin_leese/Ambisonic/B-Format_file_format.html
(4) http://www.ambisonia.com/
(5) https://en.wikipedia.org/wiki/Ambisonic_data_exchange_formats#maxN
(6) https://en.wikipedia.org/wiki/Ambisonic_data_exchange_formats#Furse-Malham
(7) https://en.wikipedia.org/wiki/Mixed-order_Ambisonics#Complete_mixed-order_sets_.28.23H.23V.29
(8) https://wiki.xiph.org/Field_names
(9) https://wiki.xiph.org/Metadata#Ogg_Skeleton
(10) http://members.tripod.com/martin_leese/Ambisonic/UHJ_file_format.html
(11) http://members.tripod.com/martin_leese/Ambisonic/G-Format_chunk.html
--
Martin J Leese
E-mail: martin.leese stanfordalumni.org
Web: http://members.tripod.com/martin_leese/