[vorbis] FREE at last

Wed Oct 11 05:43:38 PDT 2000

Kenneth Arnold wrote:
> 
> On Sun, Oct 08, 2000 at 11:05:38AM -0700, Ralph Giles wrote:
> > On Sun, 8 Oct 2000, Aleksandar Dovnikovic wrote:
> >
> > > > Now a question about VORBIS
> > > > I wonder when the next release of VORBIS is scheduled for?
> > > > Are there gonna be some more options to choose from (like LAME)?
> > > > Joint stereo perhaps?
> > >
> > > Yes, I would like to know that too.
> > > Monty, can you supply us with some answers?
> >
> > Last I spoke with monty, the features on the todo list for vorbis 1.0
> > are generally:
> >
> > Channel Coupling. Meaning joint stereo and hopefully also (joint)
> >       ambisonic surround.
> >
> > Cascading support. This is the promised 'bitrate peeling' feature for
> >       trivial transcoding to lower bitrates.
> >
> > Lower-bitrate Modes. Combines with the above.
> >
> > I don't know that there's any firm schedule for this, beyond the
> > aforementioned new year deadline from mp3 licensing. The above will
> > probably take at least two months together though, so I don't expect 1.0
> > before December at the earliest. Unless they punt. :)
> >
> > Makes sense to release another couple of betas in the meantime though. We
> > could actually do a beta3 real soon with just the bug fixes and code
> > re-org from the postbeta2 branch, but I'd wait until one of the above
> > features is added.
> >
> > <plug>
> > It would also be nice if we could get stream-description metadata in there
> > as well, if only to make it a less traumatic upgrade when tarkin happens.
> 
> Tarkin? Where is that anyway?
> 
> I've found some video codec stuff myself, and am seriously considering
> porting them over to the Ogg framework to ease playing around. Having
> not yet delved into code, I wonder about frame-sync issues -- how can
> I get a frame of video to match up with a position in the audio stream?
> Forgive me for asking if this is blatantly simple.

Here's my 2ct worth:

I don't think it's blatantly simple, at least for the decoder, because
that's just where you will find the synchronization problems. Not every
video frame will always have exactly the same amount of associated audio
samples, for the simple fact that the audio sampling clock may not be an
exact integer multiple of the video frame rate (especially when the
audio ADC and the video ADC each use their own crystals for the clock).
So, a frame of video can have a variable number of samples assigned to
it on the encoder side, plus a decoder may require a variable number of
samples.
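
To put some numbers on that, here is a rough Python sketch. The 44100 Hz
sample rate and the 30000/1001 (about 29.97) frames per second are just
example values I picked, and samples_per_frame is a made-up helper, not
anything from the Ogg code. It counts how many whole audio samples fall
inside each video frame when both clocks are assumed perfect:

```python
from fractions import Fraction

def samples_per_frame(sample_rate, frame_rate, n_frames):
    """Count the audio samples that fall inside each of the first n_frames video frames."""
    counts = []
    for i in range(n_frames):
        # Index of the first audio sample at or after each frame boundary,
        # computed exactly (no floating point rounding).
        start = (i * sample_rate) // frame_rate
        end = ((i + 1) * sample_rate) // frame_rate
        counts.append(int(end - start))
    return counts

ntsc = Fraction(30000, 1001)  # assumed example frame rate, about 29.97 fps
print(samples_per_frame(44100, ntsc, 6))  # [1471, 1471, 1472, 1471, 1472, 1471]
print(samples_per_frame(44100, 25, 3))    # [1764, 1764, 1764]
```

So even before any crystal is off, the count per frame already varies for
some rate combinations; independent clocks only make it less predictable.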

First an obvious but wrong way to do it, and then I'll suggest a correct
way to do it...

What you may do is let the encoder assign samples to video frames at
will (it should be sufficient to indicate between which audio samples
the frame boundaries are, maybe numbering the frames so that recovery is
possible after packet loss). Then, the decoder can deal with it in
various ways, and I think it depends on the platform and application
which way should be used (there is no reason not to leave the option
open in the standard).

For example, the decoder can make the video frame display dependent on
the time-base as given by the audio samples (N audio samples per second)
and display the video frames synchronized with the markers in the audio
stream. Or, the decoder can use the video frames as an absolute time
base (N video frames per second), and resample the audio samples so that
the audio stream stays synchronized.

However, if an encoder then produces a stream specified as 44.1 kHz audio
samples while its crystal is off, so that 44.2 kHz audio samples are
actually being transmitted, then the video stream will be structurally
delayed by about 0.2%. For a live video stream this results in a gradual
increase of the delay, and the buffers at the receiver fill up.
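
As a back-of-the-envelope check in Python (drift_seconds is a hypothetical
helper, and the one hour duration is only an example), this is how much
surplus audio piles up when the decoder trusts the nominal rate while the
encoder really runs at a slightly different one. For 44.1 kHz nominal and
44.2 kHz actual, the mismatch is 100/44100, or roughly 0.23%:

```python
def drift_seconds(nominal_hz, actual_hz, wall_seconds):
    """Seconds of surplus audio, measured at the nominal rate, after wall_seconds."""
    produced_samples = actual_hz * wall_seconds
    # The decoder believes every nominal_hz samples are one second of audio.
    return produced_samples / nominal_hz - wall_seconds

rate_error = 44200 / 44100 - 1
print(f"rate error: {rate_error:.4%}")
print(f"surplus after one hour: {drift_seconds(44100, 44200, 3600):.2f} s")
```

A small error per second, but for a live stream it never stops adding up.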

So, that's the wrong way... Here is what I think is the best way:

Define an absolute time base in seconds (or milliseconds, whatever). For
example, the number of milliseconds elapsed since the beginning of the
stream. But I guess there is no real reason not to use the system time
of the computer for it either.

Then, insert the absolute time codes of each video frame with each video
frame, and insert the absolute time codes of each block of audio samples
with each block of audio samples.

Then, the decoder can compare the absolute time codes of the video
frames and audio samples, and determine when to display the frames, and
whether or not to resample the audio (or skip or insert samples when
resampling is too computationally intensive).
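
A minimal sketch of the decoder side in Python, under my own assumptions:
time codes are in milliseconds, the audio being played is the reference,
and the 20 ms tolerance is an arbitrary example value. The names are made
up for illustration; nothing here is defined by Ogg. A decoder could just
as well take the video as reference and resample the audio instead.

```python
def video_action(frame_ts_ms, audio_ts_ms, tolerance_ms=20):
    """Decide what to do with a video frame, given the time code of the audio now playing."""
    delta = frame_ts_ms - audio_ts_ms
    if delta > tolerance_ms:
        return "wait"  # frame is early: hold it until the audio catches up
    if delta < -tolerance_ms:
        return "drop"  # frame is late: skip it so the video catches up
    return "show"      # close enough: display it now

print(video_action(1000, 900))   # wait
print(video_action(1000, 1100))  # drop
print(video_action(1000, 1010))  # show
```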

In the event that the encoder clock is off, it's still not a huge
problem. For example if the clock at the encoder is 10% faster, then
after 10 seconds, according to the time stamps, the decoder will have
both 1 second worth of video data and 1 second worth of audio data in
its input buffers. It can then reduce the latency in case of a live
stream by simply discarding the buffer content and/or compensating for
the time base differences (by playing 1100ms worth of video and audio
data each second instead of 1000ms). Time base variations at the decoder
side are dealt with automagically too in that case.
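
The arithmetic of that example, as a small Python sketch (both function
names are hypothetical, and the encoder error is given as a whole
percentage to keep the numbers exact):

```python
def buffered_ms(percent_fast, wall_ms):
    """Stamped media (in ms) piled up at a decoder playing at normal speed after wall_ms."""
    return wall_ms * percent_fast / 100

def playback_ms_per_second(percent_fast):
    """Stamped media (in ms) the decoder must consume per real second to keep up."""
    return 1000 * (100 + percent_fast) / 100

print(buffered_ms(10, 10_000))     # 1000.0
print(playback_ms_per_second(10))  # 1100.0
```

The decoder can work out percent_fast by itself, by comparing how fast the
time codes advance against its own clock.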

All that the Ogg standard needs is to define a place in the stream where
to add the time codes for the video and audio blocks/frames. The encoder
simply adds the time code when it grabs the samples or video frames, and
each decoder deals with the 'problems' by itself.

Hope this helps,

Cya,

Jelle

> > For my part, I'm just not sure Robert and I will have anything stable by
> > then. Some help with it would be greatly appreciated! :) We seem to
> > generally have consensus on the metadata elements, but not on how to
> > encode them, and I don't have a good handle on what we need to support in
> > the stream-description part.
> >
> > If we do it the way I want to, we need mng and xml substream support at
> > least, to span the required feature set, as it were. See my todo list at
> > http://snow.ashlu.bc.ca/ogg/todo.html for details.
> > </plug>
> >
> > Hope that helps,
> >  -ralph
> >
> > --
> > giles at ashlu.bc.ca
> > *crackle* The Director is a Humpback whale. Hold all calls. *crackle*
> >
> >
> >
> >
> > --- >8 ----
> > List archives:  http://www.xiph.org/archives/
> > Ogg project homepage: http://www.xiph.org/ogg/
> > To unsubscribe from this list, send a message to 'vorbis-request at xiph.org'
> > containing only the word 'unsubscribe' in the body.  No subject is needed.
> > Unsubscribe messages sent to the list will be ignored/filtered.
> 
> --
> Kenneth Arnold <ken at arnoldnet.net>
> Slashdot and K5: kcarnold
> Proudly registered Linux user #180115! See http://counter.li.org/
> 
> -----BEGIN GEEK CODE BLOCK-----
> Version: 3.1
> GCM/CS/E/IT/M/S d?(-)(pu) s:-(:--) a15>? C++(+++) UL+++ P+ L++ E W++(+)
> N? o? K? w--(-) O? M+ V? PS+(++) PE+ Y+ PGP- t+ 5? X? R? tv-(--) b+ DI
> D G e- h! !r !y
> ------END GEEK CODE BLOCK------
