[Vorbis-dev] Muxing vorbis in WebM

Chris Cunningham chcunningham at chromium.org
Thu Jun 18 18:56:21 PDT 2015


Hi Vorbis-Dev,

I'm investigating various WebM/Vorbis bugs in Chromium. AFAIK there is no
official specification for muxing Vorbis in WebM, so I'm using FFmpeg's
implementation to try to answer two questions:

   1. Under what circumstances is it valid to find WebM Blocks containing
   Vorbis data with zero duration? (This would mean the next Block in the
   Cluster has the exact same timecode).

   2. FFmpeg seems to use granulepos for presentation timestamp - is this
   correct? See here:

   https://github.com/FFmpeg/FFmpeg/blob/master/libavcodec/libvorbisenc.c#L345
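
To make question 1 concrete, here is a minimal sketch of what I mean by
"zero duration". It assumes a Block's duration is inferred from the gap to
the next Block's timecode. The function name and the timecode values are
made up for illustration; they are not taken from any real file or demuxer.

```python
def block_durations(timecodes):
    """Infer each block's duration as the gap to the next block's timecode.

    The last block has no successor, so it gets no entry. Two consecutive
    blocks with the same timecode produce an inferred duration of 0.
    """
    return [nxt - cur for cur, nxt in zip(timecodes, timecodes[1:])]


# Hypothetical cluster-relative timecodes in milliseconds.
timecodes_ms = [0, 0, 23, 46]
durations_ms = block_durations(timecodes_ms)

# Indices of blocks whose inferred duration is zero.
zero_duration_blocks = [i for i, d in enumerate(durations_ms) if d == 0]
```

With these made-up timecodes, only the first block ends up with an inferred
duration of zero.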

To me it seems the breakdown of ffmpeg's libvorbis_encode_frame is:

   - get an ogg_packet from libvorbis (I think this contains a *single*
   vorbis block, right?)
   - store just the data from that packet (no header) in ffmpeg's own
   AVPacket struct
   - eventually this AVPacket data will be inserted as the contents of a
   WebM (Matroska) block:

   https://github.com/FFmpeg/FFmpeg/blob/master/libavformat/matroskaenc.c#L1584

Starting with question 1, some important context comes from this excerpt of
the vorbis spec:


*Data is not returned from the first frame; it must be used to ’prime’ the
decode engine. The encoder accounts for this priming when calculating PCM
offsets; after the first frame, the proper PCM output offset is ’0’
- http://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-190001.3.1*

Am I right that "frame" and "block" are interchangeable in the Vorbis spec?
If so, I would then expect the granulepos of the first ogg_packet processed
by ffmpeg to be 0. I'm using this definition of granulepos:

*This is the last sample, frame or other unit of information ('granule')
that can be completely decoded from this packet
- https://xiph.org/ogg/doc/libogg/ogg_packet.html*

IIUC, we can derive duration from granulepos by dividing the count of
samples by the sample rate (samples per time unit). So my answer to
question 1 would be: the first block could / should have 0 duration. And my
answer to question 2 would be: they should not use granulepos for
presentation time - it marks the end of a packet, so timestamps will always
be off by the duration of that packet.
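
Here is a rough sketch of the arithmetic I have in mind. It assumes every
packet carries a granulepos that marks the end of its decodable samples, and
that the first packet's granulepos is 0. The function name, the 48000 Hz
sample rate and the granulepos values are hypothetical, chosen only to
illustrate the calculation.

```python
from fractions import Fraction


def packet_timing(granulepos_list, sample_rate):
    """Return (start_time, duration) in seconds for each packet.

    Each granulepos is treated as the END position of its packet, so a
    packet starts where the previous one ended and its duration is the
    difference between the two positions, divided by the sample rate.
    """
    timing = []
    prev_end = 0
    for end in granulepos_list:
        start = Fraction(prev_end, sample_rate)
        duration = Fraction(end - prev_end, sample_rate)
        timing.append((start, duration))
        prev_end = end
    return timing


# Hypothetical values: the first packet only primes the decoder.
sample_rate = 48000
granulepos_list = [0, 576, 1600]
timing = packet_timing(granulepos_list, sample_rate)

# Using granulepos directly as a timestamp gives each packet's END time.
end_times = [Fraction(gp, sample_rate) for gp in granulepos_list]
```

Under these assumptions the first packet has zero duration, and for every
packet the granulepos-based time equals its start time plus its duration.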

What do you guys think? What am I missing?

Thanks!
Chris