[vorbis-dev] Importing Ogg Vorbis

Fri May 18 23:58:44 PDT 2001

Steve Nicolai wrote:

> I'm moving this discussion to the QuickTime-API list in hopes to
> reach a wider audience of QuickTime knowledge.
> 
> First a little background, to make sure that we're all on the same
> page (if I've got some of these facts wrong, please correct me):
> 
> 0.  Ogg Vorbis is a variable bitrate audio compression format.  Details
>     at http://www.xiph.org/ogg/vorbis/index.html

Correct.

> 1.  Neither Ogg nor Vorbis has information at the front of the stream
>     telling how long it is.  You need to look at the granulepos on the
>     last page in the Ogg framing to learn that.  The granulepos is the
>     absolute frame number in the raw stream of the last frame encoded in
>     that packet/page.

You need the granulepos of both the first and the last sample in the file;
Ogg Vorbis allows sample-precise stream cutting/editing, so a stream does
not have to start at sample 0.

> 2.  Vorbis has a large (~8K-16K) code book near the front of the stream
>     that must be received error-free to decode the rest of the stream.

The codebook can be several megabytes in size, although we won't normally
do that, especially because we want Vorbis to be playable on embedded
devices as well.  When streaming, the codebook (and the other headers) are
normally transmitted out-of-band, over a reliable (e.g., TCP) transport.
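
As an illustration of "the codebook and other headers": a Vorbis stream
starts with three header packets (identification, comment, setup; the setup
packet carries the codebooks), each beginning with a type byte followed by
the string "vorbis". The following Python sketch tells header packets from
audio packets and reads the fields of the identification header. The function
names are invented for this example, and the sketch does no validation beyond
a length check.

```python
import struct

# Header packet type byte -> name. Audio packets have an even first byte.
_HEADER_TYPES = {1: "identification", 3: "comment", 5: "setup"}


def classify_packet(packet):
    """Return the header name for a Vorbis header packet, else 'audio'."""
    if (len(packet) >= 7 and packet[1:7] == b"vorbis"
            and packet[0] in _HEADER_TYPES):
        return _HEADER_TYPES[packet[0]]
    return "audio"


def parse_identification(packet):
    """Extract channel count, sample rate and block sizes."""
    if classify_packet(packet) != "identification" or len(packet) < 30:
        raise ValueError("not a Vorbis identification header")
    # After the 7-byte prefix: version, channels, sample rate, three
    # bitrate hints, packed block size exponents, framing flag.
    (_version, channels, rate, _br_max, _br_nominal, _br_min,
     sizes, _framing) = struct.unpack_from("<IBIiiiBB", packet, 7)
    return {
        "channels": channels,
        "sample_rate": rate,
        "blocksize_0": 1 << (sizes & 0x0F),   # short block
        "blocksize_1": 1 << (sizes >> 4),     # long block
    }
```

A receiver that gets the headers out-of-band could run each packet through
classify_packet and refuse to start decoding until all three have arrived.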

> 3.  Vorbis packets are small (generally less than 200 bytes).

Generally, yes.

> 4.  Ogg framing groups vorbis packets into pages, preserving the
>     packet boundaries.  Pages are around 4K.

Don't depend on this, either...

> 5.  Ogg framing is allowed to fragment a packet across pages.

Yes.

> 6.  Ogg framing discards the granulepos information for all packets
>     except for the last complete packet on that page.

Note that you can still recompute the granulepos for every packet.
This needs the Vorbis data, however; you can't do it at the Ogg level.

> 7.  Either an Ogg page or a Vorbis packet can be dropped/corrupt,
>     and the reader can get resynced at the start of the next page or
>     packet.

Yes.
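
A rough Python sketch of resyncing at the page level: search for the "OggS"
capture pattern and accept a candidate only if the page checksum matches. The
checksum is a CRC-32 with polynomial 0x04C11DB7, zero initial value and no
bit reflection, computed over the whole page with the CRC field zeroed. The
function names (ogg_crc, find_next_page) are invented for this example.

```python
import struct


def _make_table():
    """Build the byte-wise lookup table for polynomial 0x04C11DB7."""
    table = []
    for i in range(256):
        r = i << 24
        for _ in range(8):
            r = ((r << 1) ^ 0x04C11DB7) if r & 0x80000000 else (r << 1)
            r &= 0xFFFFFFFF
        table.append(r)
    return table


_CRC_TABLE = _make_table()


def ogg_crc(page):
    """CRC over 'page'; the caller must zero the CRC field first."""
    crc = 0
    for byte in page:
        crc = ((crc << 8) & 0xFFFFFFFF) ^ _CRC_TABLE[((crc >> 24) ^ byte) & 0xFF]
    return crc


def find_next_page(data, start=0):
    """Offset of the next page with a valid checksum, or -1."""
    pos = data.find(b"OggS", start)
    while pos >= 0:
        if pos + 27 <= len(data):
            nsegs = data[pos + 26]
            table = data[pos + 27:pos + 27 + nsegs]
            size = 27 + nsegs + sum(table)
            page = data[pos:pos + size]
            if len(table) == nsegs and len(page) == size:
                stored, = struct.unpack_from("<I", page, 22)
                zeroed = page[:22] + b"\0\0\0\0" + page[26:]
                if ogg_crc(zeroed) == stored:
                    return pos
        # False hit or damaged page: keep searching one byte further on.
        pos = data.find(b"OggS", pos + 1)
    return -1
```

After a damaged page, calling find_next_page with the offset just past the
bad capture pattern gives the point where decoding can pick up again.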

8.  Not all packets in a stream contain the same number of samples; the
    first Vorbis packet, for example, won't give you any samples in the
    current Vorbis.  (Vorbis uses an overlap/add transform, so the first
    packet alone can't completely decode any samples.)
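
Points 6 and 8 fit together, and a small Python sketch may help. With the
overlap/add transform, decoding an audio packet completes a number of samples
equal to a quarter of the previous packet's block size plus a quarter of its
own, and the first packet completes none. Given the block size of every
packet (which requires parsing the Vorbis setup header and each packet's mode
bits, not shown here), the granulepos of each packet on a page can be worked
out backwards from the one value Ogg stores. Both function names are invented
for this example, and the sketch ignores the first and last pages of a
stream, where the stored granulepos may trim samples.

```python
def packet_sample_counts(blocksizes):
    """Samples completed by each packet, given block sizes in stream order."""
    counts = []
    prev = None
    for cur in blocksizes:
        # The first packet only primes the overlap; it completes nothing.
        counts.append(0 if prev is None else prev // 4 + cur // 4)
        prev = cur
    return counts


def packet_granulepos(page_granulepos, blocksizes, prev_blocksize):
    """Per-packet granulepos for the packets that end on one page.

    page_granulepos: value stored in the page header (position after the
                     last packet completed on the page).
    blocksizes:      block sizes of the packets completed on this page.
    prev_blocksize:  block size of the packet just before the first one.
    """
    sizes = [prev_blocksize] + list(blocksizes)
    result = [0] * len(blocksizes)
    pos = page_granulepos
    for i in range(len(blocksizes) - 1, -1, -1):
        result[i] = pos
        # Step back over the samples that packet i completed.
        pos -= sizes[i] // 4 + sizes[i + 1] // 4
    return result
```

For example, with block sizes 256 and 2048, a long block following a long
block completes 1024 samples, while a short block following a long one
completes 576.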

> Features I would like the QuickTime import and decompression component
> pair to have (because other tools have them :-)
> 
> A.  Opens/Imports files quickly and efficiently.  I.e. a 5MB .ogg file
>     on my iDisk shouldn't read the entire 5MB until I hit play.  (I'd
>     live with an in the background import)
> 
> B.  Seeks efficiently in the file.  Decoding vorbis takes a substantial
>     portion of the CPU on my 8500/120 MHz 604e, I don't want to wait

Expect big speed improvements over the next few months, esp. for PPC.

>     a minute to seek to the middle of the song, because QT has to decode
>     from the beginning.

I don't know anything about how QT handles seeking, so no comments from me
for this one.

> C.  Support fast-start HTTP streaming playback.
> 
> I realize that the goals are at odds with each other.  Creating a single
> sample description for the whole file would make A easy, but conflicts
> with B & C.  Dividing the file up into multiple sample descriptions
> pretty much means reading the whole file.

I'm a bit lost here.  I'd really like to see an example QT VBR codec :-)

> Sample descriptions in QuickTime are independent, they don't depend on
> any other sample description for state (at least as far as the sound media
> handler is concerned).
> 
> I've found that there is a set of newly "officially" undocumented import
> functions (in the QT 5 reference).  I'm guessing that MovieImportIdle
> and friends are the answer to C, and maybe A.  Any hints on how to implement
> these?

And do these exist in 4.1 as well?

> The bigger problem is the disconnect between the fragmenting of packets
> and the independence of sample descriptions.  I pretty much need a sample
> description per page, but the only way I can think of making that work
> with a fragmented packet is to extend the sample description to include
> a portion of the previous page that begins the fragment.  If a sample

The standard Ogg/Vorbis libraries allow you to get any number of samples
from a stream, not just complete packets or pages.  They do internal
buffering to achieve this.
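
The buffering idea, sketched in Python purely as an illustration: decoded
blocks of varying length go in, and the caller pulls out however many samples
it wants. This is not the libraries' actual API; the class SampleReader and
its read method are invented for this example, and the decoder is stood in
for by any iterable of sample lists.

```python
from collections import deque


class SampleReader:
    """Hand out samples in caller-chosen amounts from uneven blocks."""

    def __init__(self, decoded_blocks):
        self._blocks = iter(decoded_blocks)  # stand-in for the decoder
        self._pending = deque()              # decoded but not yet delivered

    def read(self, n):
        """Return up to n samples; fewer only at the end of the stream."""
        out = []
        while len(out) < n:
            if not self._pending:
                block = next(self._blocks, None)
                if block is None:
                    break                    # decoder has nothing more
                self._pending.extend(block)
                continue
            out.append(self._pending.popleft())
        return out
```

A usage example: SampleReader([[1, 2, 3], [4]]).read(2) returns [1, 2], and
the next read(2) returns [3, 4], regardless of where the block boundary fell.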

> description ended with a fragment that would be discarded.  To properly
> validate the checksum on the page, I'd need to allow the sample descriptions
> to overlap, would that cause problems?
> 
> What I really want is a data format layer between the media handler
> and the data handler.  It could deal with these issues.
> 
> Suggestions?

CC:'ed to vorbis-dev, to have people double check on the Vorbis facts.

Cheers,

Segher
