[xiph-cvs] cvs commit: ogg/doc ogg-multiplex.html
Monty
xiphmont at xiph.org
Thu Feb 12 22:20:58 PST 2004
xiphmont 04/02/13 01:20:58
Added: doc ogg-multiplex.html
Log:
In progress; avoid losing work
Revision Changes Path
1.1 ogg/doc/ogg-multiplex.html
Index: ogg-multiplex.html
===================================================================
<HTML><HEAD><TITLE>xiph.org: Ogg documentation</TITLE></HEAD>
<BODY bgcolor="#ffffff" text="#202020" link="#006666" vlink="#000000">
<nobr><a href="http://www.xiph.org/ogg/index.html"><img src="white-ogg.png" border=0><img src="vorbisword2.png" border=0></a></nobr><p>
<h1><font color=#000070>
Page Multiplexing and Ordering in a Physical Ogg Stream
</font></h1>
<em>Last update to this document: February 13, 2004</em><br>
<p>
The low-level mechanisms of an Ogg stream (as described in the Ogg
Bitstream Overview) provide means for mixing multiple logical streams
and media types into a single linear-chronological stream. This
document discusses the high-level arrangement and use of page
structure to multiplex multiple streams of mixed media type within a
physical Ogg stream.
<h2>Design Elements</h2>
<h3>Chronological arrangement</h3>
The Ogg bitstream is designed to provide data in a chronological
(time-linear) fashion. This design is such that an application can
encode and/or decode a full-featured bitstream in one pass with no
seeking and minimal buffering. Seeking to provide optimized encoding
(such as two-pass encoding) or interactive decoding (such as scrubbing
or instant replay) is neither disallowed nor discouraged; however, no
bitstream feature may require nonlinear operation on the
bitstream.<p>
<i>As an example, this is why Ogg specifies bisection-based exact seeking
rather than building an index; an index requires two-pass encoding and
as such is not acceptable according to original design requirements.
Even making an index optional then requires an application to support
multiple methods (bisection search for a one-pass stream, indexing for
a two-pass stream), which adds no additional functionality as
bisection search delivers the same functionality for both stream
types.</i><p>
<h4>Multiplexing</h4>
Ogg bitstreams multiplex multiple logical streams into a single
physical stream at the page level. Each page contains an abstract
time stamp (the Granule Position) that represents an absolute time
landmark within the stream. After the pages representing stream
headers (all logical stream headers occur at the beginning of a
physical bitstream section before any logical stream data), logical
stream data pages are arranged in order of chronological absolute time
as specified by the granule position. <p>
The only exception to arranging pages in strictly ascending time order
by granule position is pages that do not set the granule
position value. This is a special case when exceptionally large
packets span multiple pages; the specifics of handling this special
case are described later under 'Continuous and Discontinuous
Streams'.<p>
<h4>Buffering</h4>
Ogg's multiplexing design minimizes extraneous buffering required to
maintain audio/video sync by arranging audio, video and other data in
chronological order. Thus, a normally streamed file delivers all
data for decode 'just in time'; pages arrive in the order they must
be consumed.<p>
Buffering requirements need not be explicitly declared or managed for
the encoded stream; the decoder simply reads as much data as is
necessary to keep all continuous stream types gapless (also ensuring
discontinuous data arrives in time) and no more, resulting in optimum
buffer usage for free. Because all pages of all data types are
stamped with absolute timing information within the stream,
inter-stream synchronization timing is always explicitly
maintained.<p>
<h2>Granule Position</h2>
<h3>Description</h3>
The Granule Position is a signed 64 bit field appearing in the header
of every Ogg page. Although the granule position represents absolute
time within a logical stream, its value does not necessarily directly
encode a simple timestamp. It may represent frames elapsed (as in
Vorbis), a simple timestamp, or a more complex bit-division encoding
(such as in Theora). The exact meaning of the granule position is up
to a specific codec.<p>
The granule position is governed by the following rules:
<ul>
<li>Granule Position must always increase forward from page to page,
be unset, or be zero for a header page.<br>
<li>Granule position may only be unset if there is no packet defining a
time boundary on the page (that is, if no packet in a continuous
stream ends on the page, or no packet in a discontinuous stream begins
on the page. This will be discussed in more detail under Continuous
and Discontinuous streams).<br>
<li>A codec must be able to translate a given granule position value
to a unique, exact absolute time value through direct calculation. A
codec is not required to be able to translate an absolute time value
into a unique granule position value.<br>
<li>Codecs should choose a granule position definition that gives the
codec a means to seek as directly as possible to an immediately
decodable point. For example, the bit-divided granule position encoding
of Theora allows the codec to seek efficiently to keyframes without
using an index.
</ul>
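The ordering rules above can be sketched as a small check over the granule positions of successive pages. This is an illustrative sketch, not part of the libogg API; here <tt>-1</tt> stands in for an unset granule position, and repeated zeros for header pages are permitted:

```python
# Sketch of the page-ordering rules; not a libogg function.
UNSET = -1  # stands in for an unset granule position field

def granules_well_ordered(granules):
    """True if set granule positions never move backward from page to page."""
    last = None
    for g in granules:
        if g == UNSET:
            continue  # no packet boundary on this page: no time landmark
        if last is not None and g < last:
            return False
        last = g
    return True

# A large packet spanning several pages leaves the middle pages unset:
print(granules_well_ordered([0, 0, 1024, UNSET, UNSET, 4096]))  # True
print(granules_well_ordered([0, 4096, 1024]))                   # False
```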
<h3>granule position, packets and pages</h3>
Although each packet of data in a logical stream theoretically has a
unique granule position, only one granule position is encoded per
page. It is possible to encode a logical stream such that each page
contains only a single packet (so that granule positions are preserved
for each packet), however a one-to-one packet/page mapping is not
intended for the general case.<p>
A granule position represents the instantaneous time location
<em>between two pages</em>. In a continuous stream, the granulepos
represents the point in time immediately after the last data decoded
from a page. In a discontinuous stream, it represents the point in
time immediately before the first data decoded from the page.<p>
Because Ogg functions at the page, not packet, level, this
once-per-page time information provides Ogg with the finest-grained
time information it can use. Ogg passes this granule positioning data
to the codec (along with the packets extracted from a page); it is
intended to be the responsibility of codecs to track timing
information at granularities finer than a single page.<p>
<h3>Example: timestamp</h3>
In general, a codec/stream type should choose the simplest granule
position encoding that addresses its requirements. The examples here
are by no means exhaustive of the possibilities within Ogg.<p>
A simple granule position could encode a timestamp directly. For
example, a granule position that encoded milliseconds from beginning
of stream would allow a logical stream length of over 100,000,000,000
days before beginning a new logical stream (to avoid the granule
position wrapping).<p>
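The figure above is easy to verify with a quick back-of-the-envelope check:

```python
# A signed 64-bit count of milliseconds covers more than
# 100,000,000,000 days before wrapping.
MS_PER_DAY = 1000 * 60 * 60 * 24
days = (2**63 - 1) // MS_PER_DAY
print(days)            # 106751991167
assert days > 100_000_000_000
```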
<h3>Example: framestamp</h3>
A simple millisecond timestamp granule encoding might suit many stream
types, but millisecond resolution is inappropriate for, e.g., most
audio encodings, where exact single-sample resolution is generally a
requirement. A millisecond is both too coarse a granule and often does
not represent an integer number of samples.<p>
In the event that audio frames always encode the same number of
samples, the granule position could simply be a linear count of frames
since the beginning of the stream. This has the advantages of being exact and
efficient. Position in time would simply be <tt>[granule_position] *
[samples_per_frame] / [samples_per_second]</tt>.
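For instance, the conversion above with illustrative numbers (1152 samples per frame at 44100 Hz; these values are examples, not drawn from any particular codec):

```python
# Framestamp granule position -> absolute time, per the formula above.
def frames_to_seconds(granule_position, samples_per_frame, samples_per_second):
    return granule_position * samples_per_frame / samples_per_second

# 100 frames of 1152 samples at 44100 Hz:
print(frames_to_seconds(100, 1152, 44100))  # about 2.612 seconds
```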
<h3>Example: samplestamp (Vorbis)</h3>
Frame counting is insufficient in codecs such as Vorbis where an audio
frame [packet] encodes a variable number of samples. In Vorbis's
case, the granule position is a count of the number of raw samples
from the beginning of stream; the absolute time of
a granule position is <tt>[granule_position] /
[samples_per_second]</tt>.
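Under this samplestamp mapping, absolute time is a single division (44100 Hz here is just an example rate):

```python
# Vorbis-style samplestamp: the granule position counts raw PCM samples.
def samples_to_seconds(granule_position, samples_per_second):
    return granule_position / samples_per_second

print(samples_to_seconds(441000, 44100))  # 10.0
```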
<h3>Example: bit-divided framestamp (Theora)</h3>
Some video codecs may be able to use the simple framestamp scheme for
granule position. However, most modern video codecs introduce at
least the following complications:<p>
<ul>
<li>video frames are relatively far apart compared to audio samples;
for this reason, the point at which a video frame changes to the next
frame is usually a strictly defined offset within the frame 'period'.
That is, video at 50fps could just as easily define frame transitions
at &lt;.015, .035, .055...&gt; as at &lt;.00, .02, .04...&gt;
<li>frame rates often include drop-frames, leap-frames or other
rational-but-non-integer timings
<li>Decode must begin at a 'keyframe' or 'I frame'. Keyframes usually
occur relatively infrequently.
</ul>
<p><p>This bit-divided encoding has the following consequences:
<ul>
<li>A demuxer can seek quickly to any keyframe without an index.
<li>A naive seeking algorithm remains available, just with lower performance.
<li>Bisection seeking is used in either case.
</ul>
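A sketch of the bit-divided granule position in the style Theora uses: the high bits carry the frame number of the most recent keyframe, and the low bits the count of frames since it. The shift of 8 is an illustrative value; a real stream declares its shift in the codec's setup headers.

```python
KEYFRAME_SHIFT = 8  # illustrative; a real stream declares this in its headers

def encode_granpos(keyframe_number, frames_since_keyframe):
    # High bits: frame number of the last keyframe. Low bits: offset past it.
    return (keyframe_number << KEYFRAME_SHIFT) | frames_since_keyframe

def decode_granpos(granpos):
    keyframe = granpos >> KEYFRAME_SHIFT
    offset = granpos & ((1 << KEYFRAME_SHIFT) - 1)
    return keyframe, offset

# An IPPPPP pattern with a keyframe at frame 6 yields granule positions
# 0x600, 0x601, 0x602, ... for the frames following that keyframe.
g = encode_granpos(6, 2)      # the second frame after the keyframe at frame 6
print(hex(g))                 # 0x602
print(decode_granpos(g))      # (6, 2)
# To display this frame, a seek first lands on the keyframe, frame 6.
```

This is why seeking to a target granule position always lands at or before the needed keyframe: the high bits of the target already name it.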
<h2>Multiplex/Demultiplex Division of Labor</h2>
The Ogg multiplex/demultiplex layer provides mechanisms for encoding
raw packets into Ogg pages, decoding Ogg pages back into the original
codec packets, determining the logical structure of an Ogg stream, and
navigating through and synchronizing with an Ogg stream at a desired
stream location. Strict multiplex/demultiplex operations are entirely
in the Ogg domain and require no intervention from codecs.<p>
Implementation of more complex operations does require codec
knowledge, however. Unlike other framing systems, Ogg maintains
strict separation between framing and the framed bitstream data; Ogg
does not replicate codec-specific information in the page/framing
data, nor does Ogg blur the line between framing and stream
data/metadata. Because Ogg is fully data agnostic toward the data it
frames, operations which require specifics of bitstream data (such as
'seek to keyframe') also require interaction with the codec layer
(because, in this example, the Ogg layer is not aware of the concept
of keyframes). This is different from systems that blur the
separation between framing and stream data in order to simplify the
separation of code. The Ogg system purposely keeps the distinction in
data simple so that later codec innovations are not constrained by
framing design.<p>
For this reason, however, complex seeking operations require
interaction with the codecs in order to decode the granule position of
a given stream type back to absolute time or in order to find
'decodable points' such as keyframes in video.
<h2>Continuous and Discontinuous Streams</h2>
<h3>continuous description</h3>
A stream that provides a gapless, time-continuous media type is
considered to be 'Continuous'. Clear examples of continuous data
types include broadcast audio and video. Such a stream should never
allow a playback buffer to starve, and Ogg implementations must buffer
ahead sufficient pages such that all continuous streams in a physical
stream have data ready to decode on demand.<p>
<h3>discontinuous description</h3>
A stream that delivers data in a potentially irregular pattern or with
widely spaced timing gaps is considered to be 'Discontinuous'. An
example of a discontinuous stream type would be captioning.
Although captions still occur on a regular basis, the timing of a
specific caption is impossible to predict with certainty in most
captioning systems.<p>
<h3>declaration</h3> An Ogg stream type is defined to be continuous or
discontinuous by its codec. A given codec may support both continuous
and discontinuous operation so long as any given logical stream is
continuous or discontinuous for its entirety, and the codec is able to
ascertain which (and inform the Ogg layer) after decoding the
initial stream header. The majority of codecs will always be
continuous (such as Vorbis) or discontinuous (such as Writ).
<h3>continuous granule position</h3>
<p><p><h3>discontinuous granule position</h3>
it is able to definitively from the Ogg layer
<p><p><p>Topics:
<ul>
<li>Granpos mapping set by decoder
<li>header decode (codec plugin) required to decode granpos
<li>rationale: must map back to absolute time
<li>Examples of granpos mappings:
a) Vorbis (fixed rate)
b) Theora (bit-field for keyframe)
c) absolute time
<li>Continuous stream type
<li>Discontinuous stream type
<li>MNG: variable framerate, possibly discontinuous; two code mappings?
<li>flushes around keyframes? RFC suggestion: repaginating or building a
stream this way is nice but not required
</ul>
<p><h2>Appendix A: discussion excerpts</h2>
Developers at Xiph.Org have discussed the details of Ogg multiplexing
on many occasions on Internet Relay Chat. The earliest conversations
regarding discontinuous streams and granule ordering between Monty
&lt;xiphmont&gt; and Jack Moffitt from 1999 weren't logged, but much
of the same material is rehashed in the three excerpts below.<p>
The primary purpose of these excerpts is to illuminate a number of
subtle points through logged conversations. The cornerstones of the
Ogg muxing specification were long set at this point, however the
excerpts capture discussion of proposed innovations within the
original specification and the reasoning behind each proposal as well
as discussing long-decided details.<p>
These excerpts have been edited from the original verbatim IRC log to
remove off-topic chatter and correct occasional typos.<p>
<h3>excerpt one</h3>
This excerpt discusses:
<ol>
<li>video keyframe flagging via granule position bit-division technique.
<li>Division of labor during seeking between codec and Ogg demuxer
</ol>
<pre>
<mau> guys, how can we test seeking, etc? are changes needed in the
ogg framework?
<mau> like seeking to keyframes?
<rillian> mau: nope, just player support
<mau> ok, so what would be the strategy? seek to an arbitrary time,
and wait for a keyframe?
<mau> yeah, currently there is the hack in granulepos, right?
<mau> maybe just a macro?
<danx0r> I've heard about it -- some sort of bitfield division
<danx0r> lower bits are frames after a key
<xiphmont> you can seek to a given location. the hack in granpos
gives you the number for every keyframe.
<danx0r> keyframes increase by some set increment -- can someone confirm?
<xiphmont> yes
<rillian> xiphmont: I thought it wasn't necessarily fixed
<mau> or is it up to the player?
<xiphmont> it's fixed for a given stream section.
<danx0r> so if you seek naively now, you'll get garbage until the next kf?
<mau> I think it is up to the player to freeze the last known good image
<mau> until a keyframe passes, much like windows media, etc
<xiphmont> you know if you're not in sequence.
<danx0r> the right thing is to go to the previous keyframe and parse up to
your seek frame faster than realtime, but...
<danx0r> for now, something like what WMP does should be fine
<Mike> mau: or, if it's a smart player (and the data source allows it),
to deliberately seek forwards to the next keyframe.
<rillian> are you talking about the radix rather than the actual keyframe
rate?
<mau> mike: going forward is ok, but in wmp you can still read audio
for example, until the next video keyframe, where video resumes
<mau> it is also a good strategy, guess it depends on the player
<xiphmont> rillian: the stream is set up to have a maximum keyframe spacing.
Granpos is updated by a fixed amount at each keyframe. The
granpos is not [necessarily] monotonically increasing
<Mike> true.
<rillian> it's monotonic, but not (necessarily) linear
<mau> xiphmont: so ideally the player would look at the granulepos and
count how many frames since the last key, and seek back that many
pages?
<xiphmont> mau: Ogg seeking is all done as predicted bisection search.
<xiphmont> look in vorbisfile to see code that does it.
<derf> If one encodes in a frame how many frames it has been since a
keyframe, couldn't you do the same thing?
<derf> Without imposing a maximum keyframe spacing?
<xiphmont> that data does not exist in an ogg header.
<xiphmont> Ogg headers use absolute counters.
<derf> I meant in the packet data, but I see what you're saying.
<xiphmont> you get that out of the granpos hack anyway.
<derf> You have to start decoding the packet to tell where to get the
keyframe.
<xiphmont> Seeking in an ogg stream does not look at packets.
<rillian> (except you have to parse the header to do granulepos conversion)
<xiphmont> yes.
<xiphmont> although it may be sensible to change that.
<derf> You already need at least a page worth of data to check the CRC
on the ogg header to seek.
<derf> It would seem reasonable to require a full packet instead, and
pass this to the codec when asking where to seek next.
<xiphmont> derf: a page does not necessarily give you a packet.
<derf> xiphmont: I know.
<derf> xiphmont: But, allowing the codec to look at the packet better
supports embedding codecs which might not be able to determine
the position of a keyframe from their granpos alone.
<xiphmont> derf: why wouldn't they? Blind refusal to use the mechanisms at
hand?
<derf> The reason this concerns me is that the case where you want to
have really long spaces between key frames (streaming) is also
exactly the place where you want to allow very long streams.
<xiphmont> you have a 64 bit granpos.
<derf> And if I never want a keyframe except at the first frame, I now
have only 32.
<xiphmont> ...and you're welcome to use as many logical sections as you want.
<xiphmont> so, now you have 96 bits.
<derf> Okay. I guess I can live with a keyframe every 4 billion frames.
<xiphmont> if you want unique serialnos; you're allowed to wrap them in
streaming, so it becomes infinite.
<xiphmont> if you're streaming with one keyframe every 4G, you'll have no
viewers anyway :-)
<derf> That's what out-of-band synch points are for.
<xiphmont> sure, that works.
<xiphmont> Now, it's possible to do a 'seek requests are handed to the codec,
not to ogg' infrastructure, then the codec makes bisection calls
into the ogg layer.
<xiphmont> it's more complex, and I'm not sure what I really get out of it.
<derf> Well, the codec doesn't really need to do that.
<xiphmont> in fact, I'm beginning to wonder if moving the granpos parsing
away from relying on header at all might be a good idea.
<derf> The codec really just wants "give me the packet at this granpos"
<derf> The bisection can still be done in the ogg layer to find that
packet.
<xiphmont> derf: same basic division of labor.
<xiphmont> the request still originates at the codec.
</pre>
<p><h3>excerpt two</h3>
This excerpt discusses:
<ol>
<li>keyframe pagination in video
<li>keyframe seeking using granule position bit-division
<li>alternate keyframe location proposals
</ol>
<pre>
<rillian> afaik that's just a detail of smpte timecode
<xiphmont> ...and preserving pulldown and non-interval-centered frames.
<rillian> ugh
<xiphmont> (ie, what offset in the sample period is the frame)
<xiphmont> yeah, ugliness.
<xiphmont> but not really representationally difficult.
<rillian> speaking of, do you see any advantage to doing page flushes
before or after keyframes?
<rillian> either to simplify seeking or initialization retention in
something like icecast
<xiphmont> it doesn't affect seeking any, really. It makes streaming
slightly easier for lazy programmers.
<rillian> xiphmont: do you mean icecast should pull out the keyframe packet
and repage it?
<xiphmont> rillian: if there's no flush, then it should as an optimization.
It's not necessary, but it's nice.
<xiphmont> either the streamer or the source should be smart enough to start
streaming at a nice sync point for a and v.
<rillian> xiphmont: so how would you do frame-accurate seeking with the
current design?
<rillian> the concern as I understand was that there wasn't a page/packet
that was specifically labelled 'this is a keyframe' at the ogg layer
<xiphmont> rillian: same way vorbis does. Each frame does have a granpos,
they're just not monotonic.
<rillian> s/wasn't/might not be/
<xiphmont> ah, yes there is.
<derf> Wait, they're not monotonic?
<xiphmont> no, just guaranteed to increase.
<derf> Oh... whew.
<derf> Different definitions of monotonic.
<mau> sorry for being slow, but when you say "Frame" is this a packet,
a page?
<derf> I thought the encoding was
frame_number_of_keyframe<<n|frames_since_keyframe
<xiphmont> right now, each theora frame is one packet.
<xiphmont> derf: yes.
<derf> As far as I can see, we can work backwards and reconstruct a
packet-level granpos for each packet so long as that is still true.
<derf> Once you include data partitioning a la MPEG, you lose that ability.
<mau> k, but if you put many packets in a page, then you do not have one
for each, right? It is just a matter of counting up, and not
allowing keyframes in the middle of a page?
<xiphmont> 'monotonically increasing' == 'increasing by one'
<derf> mau: No.
<derf> You can still put keyframes anywhere.
<xiphmont> actually, my Ogg algos counts forward from previous page generally.
<mau> simple question: if there are multiple frames in a page, does the
ogg layer maintains a granulepos for each?
<xiphmont> mau: It could, it doesn't.
<xiphmont> (requires being even more in bed with the codec. And that is
currently the greatest point of contention in my own mind)
<mau> ok. and how to detect when a keyframe arrives in the middle of a
page?
<xiphmont> mau: the codec knows. Ogg doesn't.
<mau> that's what I needed to know. So the codec initiates the seeking
request
<xiphmont> Ogg knows only how to get to a requested granpos.
<derf> Oh, no, you can't always get a granpos back for every packet.
<xiphmont> mau: it doesn't have to; that's one possible way to do it, yes.
<derf> You can still put keyframes in the middle of pages, but if you put
two of them in one page...
<xiphmont> derf: you can, but only going forward.
<xiphmont> Ogg is built on the idea of chronological decode; data propagates
forward in time.
<derf> If I encode PIPPIP in one page, I have no way of knowing the first
I is there just by looking at granposes.
<xiphmont> no, but you have other data in the page; namely, the codec should
be able to tell by looking at first byte.
<xiphmont> It is a consequence of Ogg having no codec-specific awareness.
<derf> Yes, but even the codec cannot tell with just the granposes.
<xiphmont> correct, but the codec need not function only with granpos.
<xiphmont> the codec knows its own keyframes.
<derf> If the codec need not function only with granposes, then why are
we trying to build a seeking mechanism that works with just them?
<xiphmont> division of labor; Ogg is able to hand you any *page*, not any
*packet*.
<xiphmont> even Vorbis does this.
<mau> ok, wouldn't it be better to require each new keyframe to start a
new page then?
<xiphmont> Ogg hands you the nearest preceding page for the codec to then
discard the minimum amount of page data to get to the packet it
wants.
<mau> to make seeking easier/faster/lazier?
<xiphmont> but it doesn't.
<xiphmont> Seek to page. Start grabbing packets.
<derf> xiphmont: Yes, I understand this, but...
<xiphmont> Discard packets until you see a keyframe
<mau> k
<xiphmont> Ogg would have to do the same thing.
<mau> I see
<xiphmont> You *can* if you want to, certainly.
<derf> Say that page I gave above starts on frame n.
<xiphmont> There's nothing stopping or even discouraging you ;-)
<xiphmont> derf: OK
<derf> I want to seek to frame n+3.
<xiphmont> OK
<derf> I get that page's granpos, and discover there's a keyframe at frame
n+4.
<xiphmont> Ogg, in seeking, hands you the page that is guaranteed to have the
start of n+3.
<derf> I know nothing about the type of packets n to n+3.
<xiphmont> (or, more importantly, hands you the page guaranteed to have the
keyframe you need to decode n+3)
<derf> Without physically examining the packets.
<xiphmont> true. Neither does Ogg.
<derf> So I have to go all the way back to the previous keyframe to
decode them.
<xiphmont> No.
<xiphmont> You already have it for free.
<xiphmont> Assume the keyframe shift in granpos is 8.
<derf> Okay.
<xiphmont> (you get a new keyframe at most every 256 packets)
<derf> Yeah, I know what this translates to.
<xiphmont> but the current actual pattern is: IPPPPPIPPPPPIPPPP....
<xiphmont> your granposes are:
<xiphmont> 0 1 2 3 4 5 600 601 602 603 604 605 c00 c01 c02....
<xiphmont> you want to decode frame 602; seek to 600.
<xiphmont> and you know you have to seek directly to 600 because you know how
the granpos works.
<xiphmont> 600 is your keyframe.
<xiphmont> if 600 does not start the page, ogg hands you the page with 600 on
it.
<rillian> so you get a page with, for example, the end of 4, 5, 600, and the
start of 601
<rillian> you start pulling out packets
<rillian> discard until you get to 600, which you decode
<derf> xiphmont: But, I don't know the frame is called 602.
<rillian> pull in the next page, pull out 601 and discard it
<derf> I want to seek to frame 8.
<rillian> then pull out 602 and resume normal decode
<derf> All I know is that its granpos is <= 800.
<xiphmont> now, you're right; always having a keyframe start a page
eliminates some amount of inspect/discard; but you can
inspect/discard in a few processor cycles.
<rillian> xiphmont: aye. seems a requirement to avoid the discard isn't needed
<xiphmont> derf: OK, then it's a 2-stage bisection. you ask ogg for 'page
before 800'; you see that the granpos is 600+whatever.
then seek to 600.
<xiphmont> (or, Ogg could do that internally with knowledge of the granpos
structure)
<mau> k, this last one explained it for me
<derf> xiphmont: Right, but here's the issue:
<derf> In my PIPPIP example, Ogg doesn't know the granpos of the first 4
packets.
<xiphmont> sure.
<derf> And the codec can reconstruct them just from the granpos of the
page.
<derf> s/can/can't
<xiphmont> sure it can.
<derf> How?
<xiphmont> the count is *reducible* to a monotonically increasing function :-)
<xiphmont> (assuming you have two granposes)
<xiphmont> you're always counting up or down one frame.
<rillian> i.e. you actually need the previous page in derf's example
<derf> rillian: But the previous page doesn't tell you anything about
packets 1-4.
<xiphmont> yes, the first 'P' is undefined granpos without previous page.
<xiphmont> ...but if your stream is not starting with a keyframe, that P
frame is not decodable anyway.
<derf> Let's say the previous granpos is 0|F0
<rillian> derf: ok, I see. I was misunderstanding the granulepos hack.
<xiphmont> derf: yes it does. If gives you the granpos of the first packet.
<xiphmont> (ie, it gives you the granpos of the last frame of the previous
packet, and you can always count forward)
<derf> Then the granpos for those frames can be F1|00 F1|01 F1|02 F1|03
or 0|F1 F2|00 F2|01 F2|02 or ...
<xiphmont> you [the codec] knows if they're keyframes or not.
<derf> Only if I look at the packets themselves.
<xiphmont> yes.
<derf> My claim was that there was no way to do it without looking at the
packets.
<xiphmont> blow 10 cycles on inspecting, and avoid the need for a 64 bit
timestamp on every packet :-)
<derf> I'm not arguing for a timestamp.
<xiphmont> Oh. Yes, your claim is correct. Apologies.
<rillian> but it still doesn't matter much, because discarding as you go
through a single page is cheap
<xiphmont> You need to inspect the packets. It is the responsibility of the
codec definition to make that easy.
<derf> My argument is this: If I have to inspect the packets ANYWAY for
this to work right, why am I going through this complicated granpos
scheme instead of just using a normal, sane mapping of
frame=granpos, and storing an offset to the keyframe in the packet?
<xiphmont> (Vorbis places that information in the first byte)
<xiphmont> derf: the information is redundant.
<xiphmont> Yes, you certainly *can* do it that way.
<xiphmont> I'm even still considering it. it does have advantages.
<mau> monty: if the granulepos hack is made "official" and mandatory
for other video codecs however, you could have ogg doing the
inspection, right?
<xiphmont> OTOH, I'm also considering hardwiring a number of granpos
mechanisms into Ogg such that it can seek without any codec
knowledge.
<xiphmont> the two approaches are mutually exclusive (at least, rationally so)
<xiphmont> mau: yes, what you said.
<derf> I do not see how you're going to be able to accomplish seeking
without codec knowledge.
<derf> I thought I had just demonstrated why your current scheme cannot
do this.
<xiphmont> derf: not entirely; however, you could achieve enough to avoid
the need for two-way feedback between the mux and codec layers.
The current proposal (which includes this two way feedback) is
very unusual and causing outside developers fits.
<xiphmont> for example, it means the Ogg demux has to interface with an
Ogg-like codec glue.
<derf> I had always assumed this was part of the design.
<derf> By saying, to begin with, "the codec decides what granpos means".
<xiphmont> the current normal division of demux and decode has a different
division; it would make it hard to use Ogg as a generic demux
system in something like xine, where the 'vorbis' codec could
just as easily handle the output from AVI or Ogg demux.
<xiphmont> derf: it always has been. That doesn't mean I'm ignoring the
advantages of alternatives.
<xiphmont> it is not yet at the point where changing my mind would break
existing installations, so it's still worth debating. That said,
I've seen nothing yet to change my mind.
<derf> The vorbis "codec" really has two pieces.
<derf> One manages decoding the packets.
<xiphmont> one manages the Ogg mapping.
<derf> Right.
<derf> The first can be separated out and used for other container formats.
<derf> The other containers are then responsible for providing an
equivalent of the second.
<xiphmont> ...and we probably can't escape needing *some* glue for any given
codec.
<xiphmont> even if we strive to make the division similar.
<xiphmont> 'similar' is not 'identical'.
<xiphmont> that is the primary reason I've not changed my mind. Being in
bed with the codec makes possible demux/decode lib APIs with some
very nice features.
<xiphmont> (ala Vorbisfile)
<xiphmont> So, it sounds like we're entirely on the same page.
<xiphmont> [pun not intended]
<derf> Yes, except that if you're in bed with the Theora codec, you
shouldn't need this complicated of a granpos mapping.
<derf> And I still don't see what it gets you.
<mau> let me see if I understand you derf: if you are going to have to
inspect the packets anyway
<mau> why don't you use a linear count?
<mau> is this it?
<derf> mau: Correct.
<mau> guess the hack can possibly give you a closer location
<rillian> the case with mng is interesting. it's natively variable framerate
(or more properly can be) so some realtime base (it has a field for
mapping 'ticks' to seconds) is the obvious granulepos. Except it
has the same keyframe problem theora does, and it's worse because
while identifying a restart point is easy (there's a special chunk
type) the codec has to do quite a bit more work to determine which
pieces are skippable
<derf> Actually, it gives you a farther one.
<xiphmont> derf: it wastes space.
<xiphmont> you certainly can do it that way. You'll sink additional bitrate
to do it.
<derf> xiphmont: Yes, it does move a few bits that are currently in the
granpos into the packets.
<derf> mau: If I want to seek to frame 8, and I ask for the granpos
closest to 800, I get 605... three packets beyond where I want to
be.
<xiphmont> yeah, you'll lose ~ half a kilobit to it.
<xiphmont> depending on framerate/keyframe freq.
<derf> I don't have my H.264 spec on hand, but IIRC, they do the same
thing.
<xiphmont> However:
<xiphmont> If you're a minimalist demux layer without precise seek....
<xiphmont> you can go straight to a keyframe with the granpos hack.
<xiphmont> (without asking the codec)
<xiphmont> that's probably the last minor perq.
<derf> "without precise seek" can be up to 2**keyframe_shift frames off.
<xiphmont> ...which is exactly what mplayer and xine do.
<xiphmont> you get the next following keyframe past what you ask for.
<xiphmont> ...and they could continue to use their demux framework.
<xiphmont> ...and it will give the results they're already getting.
<xiphmont> (something tells me there will be outside devs wedded to their
current libs)
<rillian> which is why you did this in the first place?
<xiphmont> well, yeah.
<xiphmont> *I* want everything to always be perfect and correct :-)
<xiphmont> you can do it either way. Which is not to say derf doesn't have a
point.
<derf> xiphmont: Perfection can take an awful lot of effort, as exhibited
by this long drawn out conversation, which I'm sure is not the first
one.
<xiphmont> you could still do the Xine way with explicit keyframe offset in
the packet, you just get a blank video until you hit a keyframe,
or just discard a lot.
<xiphmont> (note that xine/mplayer also do that in a lot of codecs. Actually
xine has an annoying tendency to start decoding P and B frames
starting with a uniform green field)
<derf> Heh.
<xiphmont> and not bothering to wait for keyframe.
<xiphmont> So, in summary, derf's offset gives a much simpler mechanism, but
eats a bit of bitrate (.5-1 kilobit) and makes it harder for
pansy-ass demux layers to get to keyframes. The granpos hack
method has the drawback of conceptual complexity although I
maintain the code isn't actually any more difficult.
<xiphmont> you need to know the additional information of 'keyframe shift'.
<derf> It also adds a limit on the number of frames between keyframes.
<derf> One which, unlike MPEG, the underlying codec doesn't actually need.
<xiphmont> yes, but for seekable video, if you're only having a keyframe
every 30,000 frames, you're being a little too 1337.
<xiphmont> it is also the case that if we settle on one mapping, and it
turns out to be a bad idea, we change the glue. Supporting both
would require little.
<xiphmont> it looks like a 'new' codec, but uses all the same infrastructure.
<derf> That just means you have all the software inadequacies of both,
since players will then be required to support both.
<derf> So any arguments of "simpler" become meaningless.
<xiphmont> you were just now arguing 'more flexible' (no keyframe spacing
restriction)
<derf> I didn't say the other arguments were meaningless.
<xiphmont> no.
<xiphmont> you didn't.
<xiphmont> I'm just saying the penalty for being wrong is pretty mild.
<derf> I'm suggesting that the reality of the situation is that whatever
you decide now is going to be it, because no one will want to
complicate matters that much for the relatively mild gains of
"slightly more flexible".
<derf> Or, for that matter, "slightly easier braindead demuxers".
<xiphmont> In any case, I don't actually want to cut the lightweight
mplayer style approach out of the picture.
<xiphmont> the granpos hack does give him slightly more rope, should he
choose to use it. I realize it's a weak argument, but it's there.
<derf> Oh, and if you really wanted to, you could eliminate the stream
space overhead for the keyframe offset.
<derf> You have to load all the previous pages ANYWAY, to decode back to
that point.
<derf> So you could load them, scan them backwards for keyframes, and
then turn around and decode them forward.
<derf> The only overhead is the additional buffer space. Or time for
multiple I/Os if you run out of that.
<xiphmont> derf: seeking backward is more expensive than forward.
</pre>
<h3>excerpt three</h3>
This excerpt discusses:
<ol>
<li>introduction of discontinuous streams
<li>ordering of pages in a multiplexed Ogg stream
<li>ordering differences between continuous and discontinuous streams
<li>text/captioning streams and captioning examples
<li>seeking within a multiplexed Ogg stream
</ol>
<pre>
<Arc> hey monty
<Arc> have some questions about oggfile w/ streaming servers
<Arc> and how codecs get interlaced in a physical bitstream
<Arc> first, whats the process for codecs to get concurrently
multiplexed. i know how pages etc etc, but how do the pages get
paced?
<xiphmont> chronological order by granpos.
<Arc> the granulepos of vorbis means nothing in relationship to theora
<Arc> and in the case of writ, it means nothing at all. they're ordered
by granulepos but they're needed by their start time, which is
something only libwrit would know
<Arc> how is theora and vorbis being synced, i mean, their pages as
close to each other as needed by the player?
<xiphmont> chronological order. Ogg will ask the codec to translate granpos
to absolute time if it needs to know.
<Arc> um ok so that isn't going to work at all for writ
<Arc> granulepos = end time, not start time.
<Arc> but for seeking it needs end time
<xiphmont> granpos *is* end-time :-)
<xiphmont> granpos is 'timing of last valid data to come out of this page'.
<Arc> but if writ packets are put into the stream in the chronological
position of their end time they wont be available for their start
time, which is a variable length before their end time
<Arc> writ packets cover time ranges. "this packet is valid between this
granule and this granule", so there's a start and end time
<xiphmont> right.
<xiphmont> so do vorbis packets.
<Arc> currently the spec is setup to allow overlap of these times by
different phrases and page granulepos = endtime, packets ordered
by end time (so some phrases may be put into the bitstream before
they're started)
<xiphmont> the seeking alg depends on end time.
<Arc> yes im not concerned with seeking, we have seeking in the bag
except for long term phrases + streaming, lets ignore that for now
tho
<Arc> im concerned about their ordering in the logical bitstream
<xiphmont> You may have opened too large a can of worms with overlapping.
<Arc> if a writ phrase lasts 10 seconds it needs to be in the physical
bitstream close to or before its start time, relative to the
vorbis/theora, you can expect the vorbis + theora layer to be
buffered for ten seconds
<derf> xiphmont: Overlapping does not complicate the problem at all.
<xiphmont> derf: actually it kills the current seeking algo.
<Arc> no it doesn't actually
<derf> You can replace any group of overlapped captions by a single
caption that lasts the entire duration of it.
<derf> And reproduce any problems.
<Arc> the granulepos's are in order. the granulepos's are ordered by end
time, their start times are not in order, but they must be defined
before they're needed (or close to it) in relation to the other
logical bitstreams for them to be useful
<xiphmont> One caption that begins before and ends after another.
<derf> xiphmont: Which exhibits the exact same problems as just one
caption.
<xiphmont> design a seeking algo that works for that.
<derf> Conceptually, you can take any group of overlapping captions and
stick them all in one packet.
<Arc> we do. you seek to the position that you need and begin processing
from there. you'll have everything.
<xiphmont> actually, yes, you're right.
<Arc> my first question (these are very related) is how OggFile,
oggmerge, whatever - how does that sync. do they ask the codec to
pace per realtime, or does it ask the codec for a granulerate
<xiphmont> if the packet ended after the seek point, it wouldn't have
appeared yet.
<Arc> because the latter will break our current spec bigtime
<xiphmont> there are two possibilities; still working out which to use.
<xiphmont> One is two codec types: continuous and discontinuous.
<xiphmont> a continuous codec specifies 'buffer as much as you need to
prevent any time gaps in my data presentation'. A discontinuous
stream type has to 'fall out' of the stream; seeking and sync are
according to continuous streams, and the stream assembly has to
make sure the discontinuous pages magically arrive in time
<xiphmont> [as the buffering/sync algo will not look arbitrarily far ahead for
them]
<derf> This sounds much like what I suggested to Arc.
<xiphmont> the second possibility is to require a hint in the metaheader for
how long each stream type has to look ahead.
<xiphmont> Audio and video would be obvious continuous types.
<xiphmont> discontinuous types would not be used for sync; the granpos is
allowed to appear out of order.
<Arc> well my question is, will libwrit/etc be asked "where does this
packet belong in the physical bitstream" or will OggFile/etc place
it by granulepos
<xiphmont> Oggfile will place it.
<Arc> yes but how
<Arc> will it ask the codec?
<xiphmont> You don't muck with pages and raw ogg stream in Oggfile. packets
in, packets out.
<xiphmont> In encode, all packets are submitted with timing info.
<xiphmont> Oggfile builds and places pages as needed to obey timing magically.
<xiphmont> [it would be a serious asspain to require each app to do it]
<Arc> yes I know that. but I see two ways for OggFile to place it.
by asking the codec for a granulerate (ie, 88200 granules per
second with 44.1/stereo vorbis or 29.97 granules per second with
NTSC theora) and calculate its position based on granulepos or
will the codec tell OggFile "this belongs at 19.23 seconds"
<derf> Assuming a fixed granulerate is bad.
<Arc> because the prior would require a spec rewrite, the latter is
perfect
<derf> Current Theora's granulerate is not constant.
<Arc> derf, yea but assuming API for something that isn't public yet is
also bad :-)
<xiphmont> Arc: we can have a packet show up with begin and end timing.
<Arc> xiphmont, awesome. thanks :-)
<xiphmont> Ogg won't necessarily know that on decode side (it will have to
ask the codec), but on encode side, just have codec provide it.
<xiphmont> It makes no sense for continuous streams, but for discontinuous it
seems handy.
<Arc> second question, do you feel it would be a good idea for OggFile
(which I very much assume icecast2/libshout will use) to put the
job of keeping track of and reporting "state information", ie,
headers
<xiphmont> yes
<Arc> vorbis would just spit out the headers for state information
<xiphmont> Actually, your grammar doesn't parse.
<Arc> writ, however, could spit out any pages whose granulepos has not
expired yet (to current) thus preventing the need in the spec to
have phrases "expire" by time and need to be "refreshed" every few
seconds for streaming clients
<xiphmont> well, without readahead hinting, you still have an issue.
<xiphmont> You either see a long-time caption too late.... or you miss it on
seek.
<Arc> thus the writ codec on icecast's side could buffer the last few
pages (those that are still valid), on a new client connecting,
spit out the header + however many packets are in the buffer
<xiphmont> [eg... how does Oggfile need to know it has to buffer a full
minute of video?]
<Arc> how big is that window?
<xiphmont> in continuous/discont... there is no window.
<derf> The problem is that icecast needs to buffer some data from a
discontinuous stream.
<xiphmont> A discontinuous stream will need a hint.
<derf> i.e., it needs to know the granpos<->time mapping.
<Arc> or it could be outside icecast
<Arc> right now icecast is buffering the vorbis headers
<xiphmont> yes. But it will also need to know window ahead of time without
reading the whole file.
<derf> So it can tell if it has to buffer packets from a stream if they
appear in the stream long before the granpos time.
<Arc> but if icecast is using OggFile this could be part of the API, the
stream state info, a buffer of pages which are needed to bring a
new client "up to speed"
<xiphmont> yes
<xiphmont> It should be.
<derf> I don't see why it needs any kind of window.
<Arc> i don't understand the "hint" as you call it, why does it need to
read ahead at all?
<derf> With cont/discont streams.
<xiphmont> you have a ten minute caption with a 20 minute gap ahead of where
it appears.
<xiphmont> Do you really want to buffer 20 minutes of video to find it?
<Arc> with seeking or streaming? two different things
<xiphmont> What if the caption stream ends early? You stop and wait for the
whole stream to buffer to figure that out.
<xiphmont> I'm speaking streaming.
<Arc> why would you stop playing audio/video? you either receive a
caption or you don't
<derf> Arc's idea was that packets always appear before they're needed.
<derf> In the stream.
<xiphmont> OK. Now seeking. If it appeared early, you miss em when you seek.
<derf> So if you haven't seen it when you find audio/video that comes
later, then they're not there.
<Arc> xiphmont, how so? wont it seek each logical bitstream based on
its granulepos?
<xiphmont> No.
<xiphmont> Seeking is global.
<Arc> what is the window then?
<xiphmont> You seek to *one* point in the stream, based on all granposes.
<derf> xiphmont: I can't see how that one point is well-defined.
<xiphmont> right. what is the window then?
<Arc> but discontinuous streams..
<xiphmont> derf: granposes are all in chronological order.
<xiphmont> Arc: discontinuous streams do not contribute to sync, they
piggyback off of it.
<xiphmont> A continuous stream is just a stream with a readahead window of
'infinite'.
<Arc> yes but they're going to vary by a certain %, sometimes the audio
will be ahead of the video, sometimes vice versa. they're both VBR
so there needs to be a window of some sort
<xiphmont> A continuous stream has a readahead window of infinite. "Buffer
as much as necessary to keep all queues nonempty"
<Arc> the continuous/discontinuous status of a stream is provided by the
OggFile codec, right?
<xiphmont> yes
<xiphmont> that's current design.
<Arc> ok. then, whats the window for discontinuous
<xiphmont> exactly.
<xiphmont> It would need to be set somewhere.
<Arc> see it's easy, in writ, for us just to say "this is the maximum
realtime length of a caption compared to its placement in the
stream" and then prematurely end then refresh the phrases that
need it
<derf> And this limits the length of your captions.
<xiphmont> sure.
<Arc> exactly.
<Arc> not the apparent, or source, length of the captions. its all
internal to libwrit
<derf> Right.
<xiphmont> ...but be careful; your maximum duration/gap will set the
buffering requirements of the entire stream.
<Arc> "if a caption's end-time minus physical placement time is greater
than x, then terminate all current phrases early, then immediately
redefine them in the same order and location"
<xiphmont> sorry, no, just duration.
<Arc> well thats why I'm asking you about this, because this is global to
Ogg
<xiphmont> OK, I think we're on the same page right now.
<Arc> well it has to be physical placement time because some captions
will need to be defined before other captions, remember they need
to be ordered by their end time. that will determine if they get
cut, and if they get cut before their start time, they wont need
to be defined yet at all.
<Arc> i was up to 6am this morning running through different projections
for how this could work with seeking/streaming. derf's overlapping
durations idea does play out well
<derf> Except that if you want to cut, you may need to drop out packets
from the middle (or just keep the extraneous data).
<Arc> see I originally had it "all captions are FIFO, the first to be
defined are also the first to end, otherwise they need to be cut
and recreated, always". that can become a very bloated mess with
text constantly getting redefined
<xiphmont> that's the same with other codec types.
<derf> When cutting off the end of something.
<Arc> ?
<xiphmont> derf: that's the same with other codec types.
<xiphmont> editing is always messy. Ogg is not intended to be easy to edit.
<derf> Yes.
<derf> Editing is messy in general.
<Arc> by cut I mean "while encoding the bitstream, if such conditions
exist, split a single set of phrases into two butted end to end,
ie, ending and immediately re-defining it"
<derf> Just having global headers with different codebooks makes
combining different streams hard.
<Arc> i don't mean ala vcut
<derf> (without imposing overhead of adding new headers in each segment)
<Arc> the logical bitstream wont get a EOS/BOS
<derf> Oh, I was talking about someone actually cutting an
already-multiplexed stream into two pieces.
<Arc> its just the phrases, the captions, that will get cut. their
durations split at the window mark, processed as needed,
redefined/copied to start at the same time their original was
prematurely terminated, process repeated as needed so a single
very long phrase (aka caption/subtitle) can be split-copied into
hundreds of phrases, each redefining the same data for another
X second window
<Arc> derf, yea lets not get too complicated here :-)
<derf> Well, it is still a use case to consider.
<derf> People might want to actually do such a thing.
<Arc> I'm not concerned with cutting, this is just text. lossless.
<derf> Even though there are currently no tools for it.
<Arc> people could use the same mechanism icecast does for cutting a
bitstream. each OggFile codec keeps track of "state information",
which typically is just the header but for discontinuous streams
could be the last few buffered pages..
<Arc> if OggFile has such an API it would make cutting child's play.
<Arc> monty, so, is this going to be variable? or is it going to get set
at some point? because i might as well build functionality for
that into the design here while I'm working on it
<xiphmont> Ogg needs to be able to ask the codec what the readahead window is.
<xiphmont> the codec can have that set inherently or get it from the logical
stream header.
<Arc> yea but what should this be
<Arc> are we talking a minute? 10 seconds? 1 second?
<xiphmont> actually thinking a sec...
<Arc> ok :-)
<derf> A second could be as much as 700k of video.
<derf> Which is probably reasonable.
<xiphmont> OK, thinking over, no change in state.
<derf> But captions typically last 3 to 6 seconds.
<xiphmont> 'what derf said'.
<derf> Which means you've quadrupled to quintupled the size of your
caption stream.
<xiphmont> Or you could just decide 'losing last one is no big deal'.
<Arc> yea, exactly.
<xiphmont> ...and go to placing in the bitstream according to start time.
<derf> xiphmont: That's what current DVD players do, IIRC.
<xiphmont> derf: good to know.
<Arc> the smaller the window the less buffering on the player's side,
but the greater the codec size grows
<xiphmont> yes.
<derf> A player that really did care could do a separate seek for each
discontinuous stream instead of one global one.
<Arc> it makes things so much easier to have it ordered by end time
<xiphmont> So.... perhaps the window should be set... and left up to the
application if it cares to use it or not. We go to ordering
discontinuous stream types by begin time, and make sure we're
tolerant of losing 'the one before' if the application chooses to
do it that way.
<Arc> i mean, coding is easier by start time, duh, no buffering, no
changing the order, just drop it in and let it fly or not
<derf> And then buffer just the discontinuous data (which one would
expect to be far less than the continuous) until it caught up to
the global seek point.
<xiphmont> derf: yes.
<xiphmont> no, you don't want to do separate seek... for example, in the
streaming case... you can't, whether you care or not.
<Arc> ok but if they're ordered by start time we still need a "window"
for very long captions, otherwise seeking would never have them
appear
<xiphmont> ...so don't turn it into supporting multiple cases. Make it
multiple possibilities in a single case.
<xiphmont> Arc: yes.
<xiphmont> And the application can decide to mind the window or not...
<Arc> the encoding application
<xiphmont> A PC software player will always want to mind. An embedded
player may simply not be able to.
<xiphmont> No, decoding.
<xiphmont> encoding always requests a hint... but the decoder can ignore
the readahead hint without ill-effect if it wishes.
<Arc> no i mean, the encoder would have to "refresh" a phrase periodically
<xiphmont> unless you want to miss a few, yes.
<Arc> if ordered by start time, the player simply seeks and runs.
<Arc> well its not missing a few that bothers me, its missing a very
long one
<xiphmont> you can't have everything you want here :-) Very long would need
to refresh in either case.
<Arc> ok so there would need to be a refresh window variable that the
encoder could set, but could default to a certain number
<Arc> yes I know, refresh is unavoidable.
<xiphmont> ok
<Arc> yea for all cases ordering discontinuous streams by start time is
easier.
<Arc> less elegant, tho
<xiphmont> 'however the codec wants to do it'. It could be a hardcoded
number in the codec for all I care (I know that's not really
sensible)
<derf> Placed in the stream by start time can have a much longer refresh
time than placed by end time.
<xiphmont> derf: yes.
<xiphmont> lookin' like a win all around.
<xiphmont> ...and this can be added to spec without breaking a single thing.
<Arc> if the encoding application chose it could set this window
extremely high, understanding that long term captions would never
appear if it's seeked
<Arc> or streamed.
<xiphmont> Arc: yes.
<xiphmont> If ordered by start time, I think the granpos should also be
start-time.
<Arc> and this would eliminate the need to monitor "state information"
with streaming, it'd act no different from a seek
<xiphmont> but that's a minor detail I'd rather debate another time.
<Arc> well yea that'd have to be the case or you'd have out of order
granulepos and that'd create chaos
<Arc> ok so, the behavior part of the spec should change so that
packets are ordered by start time, in sequence, and it doesn't
matter if they overlap
<xiphmont> Arc: yes, seems like it.
<derf> One could always look at that stuff to see how it wound up being
implemented.
<Arc> derf, you had a great idea, in any case, on how to handle
overlapping when granulepos was by end time
<Arc> i hate to erase it all, I'm going to copy this to another location
on the wiki...
<derf> I don't know how you make seeking work with granpos as the start
time.
<xiphmont> OggFile would need to distinguish between cont and discont.
<xiphmont> It needs to ask codecs for granpos mappings anyway.
<Arc> easy. you seek to a point, you only display new phrases. long term
phrases are periodically refreshed, so the player just displays
them as they come in.
<xiphmont> if it's end-time and packets are in chron order, discont streams
are useless for sync and seek. If it's start-time, they can
contribute.
<xiphmont> I think derf was concerned about complicating the seeking algo.
<derf> Mostly.
<xiphmont> I don't think this would complicate it much.
<xiphmont> It just changes the 'boundary was at head or at tail' of page.
The bisection is identical.
<xiphmont> ...and the meaning of seek points is the same.
<xiphmont> you still seek to the largest granpos in the stream preceding
requesting time position.
<xiphmont> [preceding or equal to]
<derf> "the page with..."
<xiphmont> well, you either seek *to* that page [if it's discont] or just
past that page [if it's cont]
<i>Actually, if it's continuous, you seek just past that page if the
last packet is not continued, or to that page if the packet is
continued -- Monty</i>
<xiphmont> you have both those boundaries. You just use cont/discont to
decide which.
<xiphmont> I think of seeking as an operation of going to a specific page
boundary, not a specific page.
<xiphmont> [and that makes this extension much cleaner]
<derf> Okay, I think I see now... I was holding the definition of what a
granpos meant fixed as a design constraint.
<xiphmont> derf: well, it had been. This is actually a new innovation within
the machinery.
<derf> But I agree this is a reasonably simple special case.
<Arc> so discontinuous streams, granulepos is the start time of the packet
<xiphmont> what complication do you see?
<xiphmont> Arc: start time of the first packet beginning in the page
<xiphmont> [not a continued packet]
<xiphmont> Oh, continued packets.
<xiphmont> No continued packets in discont streams.
<xiphmont> You think that's reasonable?
<Arc> not really, because a discontinuous packet could be quite large
and you'd want it split across page borders
<derf> xiphmont: It gives a hard limit on packet size, doesn't it?
<xiphmont> yeah, you're right.
<derf> xiphmont: It's an "if"... I'm not worried about it either.
<xiphmont> OK, a restriction:
<xiphmont> Continued packets must be continued in an immediately following
page.
<xiphmont> derf: it is an if.
<Arc> that sounds healthy
<xiphmont> OK
<xiphmont> See the nice part about *all* of this is...
<xiphmont> If a third-party impl screws up, it doesn't break the code, it
just munges playback slightly.
<xiphmont> We can extend the spec to include them...
<xiphmont> the stream format need not rev.
<xiphmont> we already know existing code isn't up to discontinuous anyway.
<xiphmont> OggFile is intended to do this for the app. I do not expect most
apps to implement this. It is purely a mux-layer operation.
</pre>