[vorbis-dev] granulepos start/end revisited

Sat May 22 21:29:54 PDT 2004

On Sun, May 23, 2004 at 01:39:12PM +1000, Conrad Parker wrote:
> 
> The only relevant difference between the two schemes is that the state-change
> style injects an extra packet at the end-time of the subtitle's presentation.
> All subtitle buffering features that you discuss are identical, however
> by going from a state-change style to a duration-only style you've lost
> end-confirmation packets.

Since you've effectively argued the same point my previous reply addressed, and haven't 
responded to its examples, I'm going to give you the same reply again.  I've appended 
more below, but please read through this first.

On Fri, May 21, 2004 at 10:53:45AM -0400, Arc Riley wrote:
> 
> Subtitle A lasts 0 to 31
> Subtitle B lasts 5 to 20
> Subtitle C lasts 28 to 35
> 
> Putting this on a chart:
> 00  02  04  06  08  10  12  14  16  18  20  22  24  26  28  30  32  34 
> AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA       
>           BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB                            
>                                                         CCCCCCCCCCCCCCCC
> 
> Long phrases are redefined every few seconds, so it's considered OK for 
> an unknown phrase to not be shown for a few seconds after seek.  
> However, if a phrase was already known before seek, it should still be 
> shown after seek if it's still valid.  
> 
> Say you played the above stream until 10, then seeked to 25.  In the 
> current method you would get A, know immediately that A lasts until 
> granule 31, then get B and know immediately that it lasts until granule 
> 20.  When you seek, the new granulepos will drop B (because it's 
> expired) but A is still valid for a while so it's kept.  C will be 
> received further down and display properly.  Nice and tidy.
> 
> Now say this were to be replaced by the start-stop method you described.  
> A would "turn on" at granule 0, then B would "turn on" at granule 5, then 
> you seek.  Uh-oh!  Did the seek skip A getting turned off?  B getting 
> turned off?  Better destroy all known phrases, just to be sure.  When 
> seeking is completed, even though we would have otherwise known that A is 
> still valid, A would not be shown.  If it's especially long you may see it 
> get redefined later down the stream, but it won't be displayed until then.
> 
> No, it's FAR better to know when things are going to stop ahead of time.  
> In either case the decoder has to keep track of phrases, because in the 
> first you keep track of them so they're turned off at the right time, in 
> the second you keep track of them so they can be turned off when the 
> "off" packet for them is received.  Complexity-wise, within the code, 
> neither is really more complicated if we don't consider the seeking 
> issue.  In the first the codec is responsible for clean-up, in the 
> second the stream is responsible for clean-up.  The former is much better 
> since you should never trust the Ogg stream to be without errors, and 
> once you throw in the seeking issue, the way we're doing things now 
> becomes the obvious optimal solution.

Now, do you understand the problem at hand?
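
In case the quoted example is easier to follow as code, here is a rough sketch in Python of 
the current (start granule plus known end) behaviour for phrases A, B and C.  It is not 
Writ's actual code: the Phrase class and the play/seek/shown helpers are names made up for 
illustration, and the end granule is assumed to be inclusive.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Phrase:
    name: str
    start: int  # granule the phrase's packet is stamped with
    end: int    # last granule on which the phrase is shown (assumed inclusive)

# The three phrases from the chart in the quoted message.
STREAM = [Phrase("A", 0, 31), Phrase("B", 5, 20), Phrase("C", 28, 35)]

def play(table, first, last):
    """Add every phrase whose packet falls in [first, last] to the table."""
    known = set(table)
    return table + [p for p in STREAM if first <= p.start <= last and p not in known]

def seek(table, granule):
    """Drop expired phrases; everything else is still known to be valid."""
    return [p for p in table if p.end >= granule]

def shown(table, granule):
    """Names of the phrases that should be on screen at `granule`."""
    return [p.name for p in table if p.start <= granule <= p.end]

table = play([], 0, 10)                       # play from 0 to 10
known_before_seek = [p.name for p in table]   # A and B are both known
table = seek(table, 25)                       # B has expired, A is kept
shown_after_seek = shown(table, 25)
table = play(table, 25, 30)                   # C arrives at granule 28
shown_at_30 = shown(table, 30)
```

Because each phrase carries its own end, the seek only has to compare granules; nothing 
that is still valid is thrown away.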

In the implementation of Writ, switching to end-pos doesn't work.  The seek mechanism needs 
to know which granules are provided by a given page, so in either case the granule of the 
page needs to be the start granule of that phrase, not the end.

In the start-stop mechanism you describe, to repeat myself, you lose the ability to know 
which phrases are "still on" after a seek.  In order to do the start-stop method that you're 
recommending, you have to clear your entire phrase table during a seek because you don't 
know whether you've skipped a "stop" packet.
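
A rough sketch of that failure, again using phrases A, B and C from the quoted example.  The 
packet layout, the phrase IDs and the StartStopDecoder class are hypothetical, made up only 
to illustrate the start-stop style; periodic redefinition of long phrases is left out.

```python
class StartStopDecoder:
    """Hypothetical decoder for a start/stop (state-change) subtitle stream."""

    def __init__(self):
        self.active = set()

    def on_packet(self, kind, phrase_id):
        if kind == "start":
            self.active.add(phrase_id)
        elif kind == "stop":
            # A stop for an unknown phrase is silently ignored.
            self.active.discard(phrase_id)

    def on_seek(self):
        # A "stop" packet may have been skipped, so nothing in the table
        # can be trusted any more.
        self.active.clear()

# (granule, kind, phrase id); the stop granules are an assumed encoding.
PACKETS = [
    (0, "start", "A"), (5, "start", "B"), (20, "stop", "B"),
    (28, "start", "C"), (31, "stop", "A"), (35, "stop", "C"),
]

decoder = StartStopDecoder()
for granule, kind, pid in PACKETS:
    if granule <= 10:                      # play from 0 to 10
        decoder.on_packet(kind, pid)
before_seek = sorted(decoder.active)       # A and B are on

decoder.on_seek()                          # seek to 25
after_seek = sorted(decoder.active)        # table is empty

for granule, kind, pid in PACKETS:
    if 25 <= granule <= 30:                # resume and play to 30
        decoder.on_packet(kind, pid)
shown_at_30 = sorted(decoder.active)       # only C; A is missing
```

A should still be on screen at granule 30, but the decoder had no way to know that.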

It also results in more overhead.  Not only would phrases need IDs (perhaps referencing them 
by start granule, though?) and stop packets, but you lose functionality by doing so.

Writ isn't the only case that the start-time rule applies to.  Take, for instance, MNG.  
The delay between frames is variable, and the frames can be widely separated.  
In some cases (using it as a video codec) it's continuous, but in other cases (as a subtitle 
codec) it's discontinuous.  Using start-time in the latter case eliminates a lot of issues.

Yet another example is MIDI.  It defines notes.  Said notes are not simple "turn on, turn 
off" either; a lot of them have elements of sustain-release and so on.  Each note has a start 
granule and an end granule.  The notes can overlap.  The entire end-granule scheme falls 
apart in this case, and trying to make the end-granule scheme work by forcing MIDI to 
fit into the rigid rules of "start-stop only" limits its functionality for no reason.  
Rillian has already implemented start-granule times in OggMIDI, which has been available for 
some time (though his orders by end-time, which breaks muxing, and needs to be changed).
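
To illustrate the overlapping-notes point, here is a small sketch.  It is not OggMIDI's 
packet format; the Note tuple, the pitch values and the granule numbers are invented, and 
the end granule is assumed to be inclusive.

```python
from typing import NamedTuple

class Note(NamedTuple):
    pitch: int
    start: int  # granule the note begins on
    end: int    # last granule the note sounds on (assumed inclusive)

# Three overlapping notes, deliberately listed out of order.
notes = [Note(67, 4, 12), Note(60, 0, 8), Note(64, 2, 6)]

# Order by start granule, the way packets stamped with start time would arrive.
by_start = sorted(notes, key=lambda n: n.start)
start_order = [n.pitch for n in by_start]

def sounding_at(granule):
    """Pitches of every note active at `granule`; no "off" events are needed."""
    return [n.pitch for n in by_start if n.start <= granule <= n.end]

chord_at_5 = sounding_at(5)    # all three notes overlap here
chord_at_7 = sounding_at(7)    # the middle note has ended
chord_at_10 = sounding_at(10)  # only the last note remains
```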

Now apply this to a subtitle format.  Say you wanted a text codec which fades in/out.  With 
state changes, you would have to redefine the color (or something) every granule, with a 
separate packet for each to ensure that it can be muxed properly.  With the current system, 
you could simply say "this phrase starts at this granule and fades out for this duration".  
Not that such a system is being put into Writ anytime soon, but that's another good example.
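
As a sketch of what that single declaration buys you: with a start granule, a total 
duration and a fade length, the decoder can work out the opacity at any granule on its own.  
The opacity function and its parameters are hypothetical (Writ has no such feature), and a 
linear fade over the last fade_out granules is just one possible reading.

```python
def opacity(granule, start, duration, fade_out):
    """Opacity in [0.0, 1.0] of a phrase at `granule`.

    The phrase is fully visible from `start`, then fades linearly to zero
    over its final `fade_out` granules, ending at start + duration.
    """
    end = start + duration
    if granule < start or granule >= end:
        return 0.0                     # not on screen
    fade_begin = end - fade_out
    if fade_out == 0 or granule < fade_begin:
        return 1.0                     # fully visible
    return (end - granule) / fade_out  # linear fade-out

# One phrase: starts at granule 100, lasts 50 granules, fades over the last 10.
before = opacity(99, 100, 50, 10)
steady = opacity(120, 100, 50, 10)
halfway = opacity(145, 100, 50, 10)
gone = opacity(150, 100, 50, 10)
```

One packet describes the whole fade; the state-change style would need a packet for every 
granule between 140 and 150.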
