[vorbis] FREE at last

Kenneth Arnold ken at arnoldnet.net
Wed Oct 11 06:40:33 PDT 2000



On Wed, Oct 11, 2000 at 02:43:38PM +0200, Jelle Foks wrote:
> Kenneth Arnold wrote:
> > 
> > On Sun, Oct 08, 2000 at 11:05:38AM -0700, Ralph Giles wrote:
> > > On Sun, 8 Oct 2000, Aleksandar Dovnikovic wrote:
> > >
> > > > > Now a question about VORBIS
> > > > > I wonder when the next release of VORBIS is scheduled for?
> > > > > Are there gonna be some more options to choose from (like LAME)?
> > > > > Joint stereo perhaps?
> > > >
> > > > Yes, I would like to know that too.
> > > > Monty, can you supply us with some answers?
> > >
> > > Last I spoke with monty, the features on the todo list for vorbis 1.0
> > > are generally:
> > >
> > > Channel Coupling. Meaning joint stereo and hopefully also (joint)
> > >       ambisonic surround.
> > >
> > > Cascading support. This is the promised 'bitrate peeling' feature for
> > >       trivial transcoding to lower bitrates.
> > >
> > > Lower-bitrate Modes. Combines with the above.
> > >
> > > I don't know that there's any firm schedule for this, beyond the
> > > aforementioned new year deadline from mp3 licensing. The above will
> > > probably take at least two months together though, so I don't expect 1.0
> > > before December at the earliest. Unless they punt. :)
> > >
> > > Makes sense to release another couple of betas in the meantime though. We
> > > could actually do a beta3 real soon with just the bug fixes and code
> > > re-org from the postbeta2 branch, but I'd wait until one of the above
> > > features is added.
> > >
> > > <plug>
> > > It would also be nice if we could get stream-description metadata in there
> > > as well, if only to make it a less traumatic upgrade when tarkin happens.
> > 
> > Tarkin? Where is that anyway?
> > 
> > I've found some video codec code myself, and am seriously considering
> > porting it over to the Ogg framework to ease playing around. Having
> > not yet delved into code, I wonder about frame-sync issues -- how can
> > I get a frame of video to match up with a position in the audio stream?
> > Forgive me for asking if this is blatantly simple.
> 
> Here's my 2ct worth:
> 
> I don't think it's blatantly simple, at least for the decoder, because
> that's exactly where you will find the synchronization problems. Not every
> video frame will always have exactly the same number of associated audio
> samples, for the simple reason that the audio sampling clock may not be an
> exact integer multiple of the video frame rate (especially when the
> audio ADC and the video ADC each use their own crystals for the clock).
> So, a frame of video can have a variable number of samples assigned to
> it on the encoder side, and a decoder may require a variable number of
> samples as well.
> 
> First an obvious but wrong way to do it, and then I'll suggest a correct
> way to do it...
> 
> What you may do is let the encoder assign samples to video frames at
> will (it should be sufficient to indicate between which audio samples
> the frame boundaries are, maybe numbering the frames so that recovery is
> possible after packet loss). Then, the decoder can deal with it in
> various ways, and I think it depends on the platform and application
> which way should be used (there is no reason not to leave the option
> open in the standard).
> 
> For example, the decoder can make the video frame display dependent on
> the time-base as given by the audio samples (N audio samples per second)
> and display the video frames synchronized with the markers in the audio
> stream. Or, the decoder can use the video frames as an absolute time
> base (N video frames per second), and resample the audio so that
> the audio stream stays synchronized.

The markers in the audio stream would in this case be the timestamped Vorbis
audio frames. We talked about this earlier: audio should be the master and
video &c. the slave, or everything should be slaved to some master clock.
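
To make that concrete, here's a rough sketch of what slaving video to the
audio clock might look like in a player loop. All the names are made up for
illustration; nothing like this exists in libogg or libvorbis today:

#include <stdint.h>

/* The decoder's notion of "now" is simply how much audio has been
   played so far; audio is the master clock. */
double audio_clock_seconds(uint64_t samples_played, long sample_rate)
{
    return (double)samples_played / (double)sample_rate;
}

/* A video frame is shown once the audio position has reached its
   timestamp; until then the player keeps feeding audio. */
int frame_is_due(double frame_timestamp_sec,
                 uint64_t samples_played, long sample_rate)
{
    return audio_clock_seconds(samples_played, sample_rate)
           >= frame_timestamp_sec;
}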

> However, if an encoder then makes a stream specified as 44.1 kHz audio
> samples, and its crystal is off, resulting in 44.2 kHz audio samples
> being transmitted, the video stream will be structurally delayed by
> roughly 0.2%, which for a live video stream results in a gradual
> increase of the delay, and the buffers at the receiver fill up.
> 
> So, that's the wrong way... Here is what I think is the best way:

Which is basically SMPTE timecode. Exact conformance with that standard
should not be difficult and, as I have said before, would possibly be
beneficial for integrating Ogg into a system that speaks SMPTE.
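
For reference, non-drop-frame SMPTE timecode at an integer frame rate is
just a base conversion of the frame count; here's a rough sketch (real
29.97 fps drop-frame timecode needs extra frame-number skipping, which I
omit here):

#include <stdio.h>

/* Convert a frame count into hh:mm:ss:ff, assuming a constant integer
   frame rate. Purely illustrative. */
void frames_to_timecode(long frame, int fps, char out[16])
{
    int  ff  = (int)(frame % fps);
    long sec = frame / fps;
    snprintf(out, 16, "%02ld:%02ld:%02ld:%02d",
             sec / 3600, (sec / 60) % 60, sec % 60, ff);
}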

How to store this timecode data in the stream is a different question.
We already have Vorbis, which has loose timestamping (as much as Ogg
allows), but it can be difficult to seek to an exact time (i.e., to the
exact 30th or 60th of a second where a video frame sits), which requires
more smarts in the decoder. The decoder must also keep track of the video,
which may or may not (likely not) have a related timestamp system. Even if
it does, there is more video information per second than audio, and that
gets hard to keep track of, especially when seeking arbitrarily. My
suggestion, then, is an (optional?) metadata stream that maps audio, video,
and other time-dependent data from whatever units they are broken down into
(frames for video, blocks for audio) onto a standard, high-resolution time
format, doing what Jelle suggested earlier with frame-numbering but
assigning times to the numbers. Perhaps this stream could be removed when
the stream is sent to the end viewer, and the decoder could reconstruct it
if it needs it (it's all just convenience information; the timestamps would
do the job just as well with some more decoder work).
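
As a very rough illustration of what one entry in such a metadata stream
might carry (a hypothetical layout, not any existing Ogg structure):

#include <stdint.h>

/* Hypothetical mapping entry: "unit N of substream S starts at time T".
   A seeking decoder could binary-search these by time_ns to land on the
   right video frame or audio block directly. */
struct time_map_entry {
    uint8_t  substream_id;  /* which logical stream: audio, video, ... */
    uint64_t unit_number;   /* frame number (video) or block number (audio) */
    uint64_t time_ns;       /* position on a shared high-resolution clock */
};

If the mapping stream is stripped before delivery, the decoder could
rebuild the same table from the per-packet timestamps as it reads; that is
the "more decoder work" I mentioned above.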

> Define an absolute time base in seconds (or milliseconds, whatever). For
> example, the number of milliseconds elapsed since the beginning of the
> stream. But I guess there is no real reason not to use the system time
> of the computer for it either.
> 
> Then, insert the absolute time codes of each video frame with each video
> frame, and insert the absolute time codes of each block of audio samples
> with each block of audio samples.

(the metadata stream)

> Then, the decoder can compare the absolute time codes of the video
> frames and audio samples, and determine when to display the frames, and
> whether or not to resample the audio (or skip or insert samples when
> resampling is too computationally intensive).

Audio first, video second. In any case, it's a lot easier on both the
decoder and the person watching for the decoder to just drop video frames.
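
A sketch of that drop-late-frames policy (the threshold and names are
invented for illustration):

/* If the frame's timestamp has already fallen behind the audio clock by
   more than some threshold, drop it; the audio keeps playing untouched. */
int keep_frame(double frame_ts_sec, double audio_clock_sec)
{
    const double late_threshold = 0.040;  /* 40 ms, roughly one frame */
    return frame_ts_sec >= audio_clock_sec - late_threshold;
}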

> In the event that the encoder clock is off, it's still not a huge
> problem. For example if the clock at the encoder is 10% faster, then
> after 10 seconds, according to the time stamps, the decoder will have
> both 1 second's worth of video data and 1 second's worth of audio data in
> its input buffers. It can then reduce the latency in the case of a live
> stream by simply discarding the buffer content and/or compensating for
> the time base differences (by playing 1100ms worth of video and audio
> data each second instead of 1000ms). Time base variations at the decoder
> side are dealt with automagically too in that case.

A live input source should have some sort of synchronization that can be
trusted to be reasonably accurate; e.g. a good NTSC source will always run
at 29.97 fps (30000/1001), driven by a crystal likely more accurate than
the computer's, and the audio is, of course, synced to the time it enters
the computer in the first place (which, in a live recording, ought to be
very close to the time it was generated by the source, or something's
messed up with our concept of physics and quantum mechanics). And the
encoder should notice that the input buffer is growing and compensate its
clock well before it reaches 1 second of data. In a live performance, you
don't want 1 sec of buffer latency if it can be avoided.
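
Something like the following is what I have in mind for the encoder side:
watch the input backlog and nudge the effective rate long before a full
second piles up (all the numbers here are made-up illustrations):

/* Nudge the encoder's effective sample rate in proportion to how much
   uncoded input has accumulated, capped so the correction stays small. */
double corrected_rate(double nominal_rate, double buffered_seconds)
{
    double correction = buffered_seconds * 0.1;   /* 100 ms backlog -> 1% */
    if (correction > 0.02)
        correction = 0.02;                        /* never more than 2% */
    return nominal_rate * (1.0 + correction);
}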

> All that the Ogg standard needs is to define a place in the stream to put
> the time codes for the video and audio blocks/frames. The encoder
> simply adds the time code when it grabs the samples or video frames, and
> each decoder deals with the 'problems' by itself.

See my description above.

> Hope this helps,
> 
> Cya,
> 
> Jelle

You seem to know a lot. Why?

Kenneth

> > > For my part, I'm just not sure Robert and I will have anything stable by
> > > then. Some help with it would be greatly appreciated! :) We seem to
> > > generally have consensus on the metadata elements, but not on how to
> > > encode them, and I don't have a good handle on what we need to support in
> > > the stream-description part.
> > >
> > > If we do it the way I want to, we need mng and xml substream support at
> > > least, to span the required feature set, as it were. See my todo list at
> > > http://snow.ashlu.bc.ca/ogg/todo.html for details.
> > > </plug>
> > >
> > > Hope that helps,
> > >  -ralph
> > >
> > > --
> > > giles at ashlu.bc.ca
> > > *crackle* The Director is a Humpback whale. Hold all calls. *crackle*
> > >


-- 
Kenneth Arnold <ken at arnoldnet.net> / kcarnold / Linux user #180115
http://arnoldnet.net/~kcarnold/

--- >8 ----
List archives:  http://www.xiph.org/archives/
Ogg project homepage: http://www.xiph.org/ogg/
To unsubscribe from this list, send a message to 'vorbis-request at xiph.org'
containing only the word 'unsubscribe' in the body.  No subject is needed.
Unsubscribe messages sent to the list will be ignored/filtered.



