[vorbis] tarkin (was Re: FREE at last)

Kenneth Arnold ken at arnoldnet.net
Mon Oct 9 16:56:36 PDT 2000



On Mon, Oct 09, 2000 at 02:46:57PM -0700, Ralph Giles wrote:
> On Mon, 9 Oct 2000, Kenneth Arnold wrote:
> 
> > Hmmm... good idea. Any data suggesting how good a compression this achieves
> > and how it compares to MPEG-whatever?
> 
> Chad said the test code was in the same ballpark as mpeg-1. Wavelets
> generally do better than block-DCT.

Well, it looks like we've got some work to do, since MPEG is 10 years (?)
ahead... but still, that looks too promising to ignore at the moment.
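
(For anyone who hasn't poked at wavelets: below is roughly the kind of
transform involved, a single level of a 2D Haar decomposition, the simplest
possible wavelet, just to show the averaging/differencing that replaces the
block DCT. Tarkin's actual filters and the 3D extension are of course more
involved; this is only a sketch.)

/* One level of a 2D Haar wavelet transform on an 8-bit grayscale frame,
 * purely for illustration; real codecs use longer filters and several
 * decomposition levels. */

#include <stdlib.h>

/* in:  w x h samples (w, h even)
 * out: same size; the low-pass (average) coefficients land in the
 *      top-left quadrant, the detail coefficients in the other three. */
void haar2d_level(const unsigned char *in, float *out, int w, int h)
{
    float *tmp = malloc(w * h * sizeof(float));
    int x, y;

    /* horizontal pass: average | difference */
    for (y = 0; y < h; y++)
        for (x = 0; x < w / 2; x++) {
            float a = in[y * w + 2 * x];
            float b = in[y * w + 2 * x + 1];
            tmp[y * w + x]         = (a + b) * 0.5f;  /* low  */
            tmp[y * w + w / 2 + x] = (a - b) * 0.5f;  /* high */
        }

    /* vertical pass on the result of the horizontal pass */
    for (x = 0; x < w; x++)
        for (y = 0; y < h / 2; y++) {
            float a = tmp[(2 * y) * w + x];
            float b = tmp[(2 * y + 1) * w + x];
            out[y * w + x]           = (a + b) * 0.5f;
            out[(h / 2 + y) * w + x] = (a - b) * 0.5f;
        }

    free(tmp);
}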

> > I saw a link earlier on vorbis-dev for a chirplet-based system -- I'll have
> > to dig that up and look at it some more. Can't remember who it was, but
> > it was probably Ralph because he has done a lot of (NS)MLP. :)
> 
> What's MLP?

Read K5? Mindless Link Propagation.

> > Finally, did it include motion comp. or not? Could this improve things?
> 
> The algorithm Jack described is 3D wavelet, which is equivalent to motion
> comp. I'd expect that to be even more effective with chirplets. OTOH,

...  well... uh... I guess in some ways. Yes, the optimal wavelet/chirplet
could catch all the similarities, but what the heck is optimal, and how is
the encoder supposed to figure that out 30-60 times per second? A good 2D
motion compensation algorithm should catch a lot more, though I'm not
completely sure about that. In any case, wavelets look very promising, and
chirplets more so, so let's jump on that bandwagon. Maybe encode what's best
encoded with [wave,chirp]lets with those beasts, and use 2D motion comp for
what's left. No, I am not an expert, and I hope I'm not pretending to be.
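
(By "a good 2D motion compensation algorithm" I mean the usual MPEG-style
block matching, i.e. something along the lines of the sketch below. This is
just the conventional approach for illustration, not anything Tarkin does.)

/* Exhaustive block-matching motion search: find the offset (mx, my) within
 * +/- range that minimizes the sum of absolute differences between a 16x16
 * block of the current frame and the previous frame.  A sketch of
 * conventional 2D motion compensation, nothing more. */

#include <limits.h>
#include <stdlib.h>

#define BLOCK 16

void motion_search(const unsigned char *cur, const unsigned char *prev,
                   int w, int h, int bx, int by, int range,
                   int *best_mx, int *best_my)
{
    int best = INT_MAX, mx, my;

    *best_mx = *best_my = 0;

    for (my = -range; my <= range; my++)
        for (mx = -range; mx <= range; mx++) {
            int sad = 0, x, y;

            /* skip candidates that would read outside the frame */
            if (bx + mx < 0 || by + my < 0 ||
                bx + mx + BLOCK > w || by + my + BLOCK > h)
                continue;

            for (y = 0; y < BLOCK; y++)
                for (x = 0; x < BLOCK; x++)
                    sad += abs(cur[(by + y) * w + bx + x] -
                               prev[(by + my + y) * w + bx + mx + x]);

            if (sad < best) {
                best = sad;
                *best_mx = mx;
                *best_my = my;
            }
        }
}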

> someone pointed out that a problem with 3D transforms is that framerates
> are *too low* to get good continuous 'tones', so it's possible motion-
> compensated 2D wavelets would work better below, say 15 fps. Still, I'd
> try subpictures or the like as well; they might be just as good while
> staying in the same framework.

I personally like as much flexibility as is practical; if it's not too costly
to do some simple motion compensation on everything, go ahead and do it in the
encoder and just drop it when it doesn't improve things enough. Finding some
other use for the motion-compensated data would be nice too, but I'm not sure
what that would be.
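
Concretely, something like the hypothetical encoder-side check below: measure
whether the motion-compensated residual actually beats the plain frame
difference for a block, and only spend the motion vector when it does. The 5%
margin is a number pulled out of thin air, not anything worked out.

/* Hypothetical per-block decision: keep the motion vector only when the
 * motion-compensated residual is noticeably cheaper than the plain frame
 * difference.  Assumes (bx+mx, by+my) has already been bounds-checked. */

static long long block_energy(const unsigned char *cur,
                              const unsigned char *ref,
                              int w, int bx, int by, int mx, int my)
{
    long long e = 0;
    int x, y;

    for (y = 0; y < 16; y++)
        for (x = 0; x < 16; x++) {
            int d = cur[(by + y) * w + bx + x] -
                    ref[(by + my + y) * w + bx + mx + x];
            e += (long long)d * d;
        }
    return e;
}

int use_motion_comp(const unsigned char *cur, const unsigned char *prev,
                    int w, int bx, int by, int mx, int my)
{
    long long plain = block_energy(cur, prev, w, bx, by, 0, 0);
    long long mc    = block_energy(cur, prev, w, bx, by, mx, my);

    return mc * 100 < plain * 95;   /* keep MC only if at least 5% better */
}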

>  [...]
> > Maybe rtlinux could do sync issues better. But anyway, there needs to be some
> > sort of marker, either in the metadata or built into the stream itself, to
> > synchronize all the elements of the stream in time. So far we have been able
> > to ignore all that, because it's just audio, but there can be video, lyrics,
> > closed-captioning, links, markers, etc. that all will need to be synced to
> > something, probably the audio. From what you describe, it looks like the
> > facilities to do that are pretty minimal.
> 
> rtlinux would help more with bounded-latency for realtime encoding, I
> think. Given that you want at least 1/10 sample accuracy, the audio
> clocking is best left in hardware. Contrariwise, because of the vast
> sample-rate difference, video is much less of a problem. (again, unless
> you're writing a camera controller :)

Best left in hardware, huh? So what about me playing moviez at home with
nothing more than my run-of-the-mill graphics card?

> Re minimal facilities, all the compressed stream can really do is provide
> timestamps (or ranges). It's easy enough to keep track of exactly where
> you are in the decoder (libvorbis provides sample-accurate seeking, for
> example). In the trivial implementation, it's just a matter of lining up
> the numbers.

Right... more pains in decoders' butts. Ah well, I guess we're going to
have to live with it. It's not like there's anything much better, other than
SMPTE-timecoding the whole system (which I think should possibly be done
as metadata, since the video may have to interface with systems that use
SMPTE, e.g. broadcast).
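
For the "lining up the numbers" part, the arithmetic itself is trivial;
something like the helpers below, which map an audio sample count and a video
frame number onto one clock and produce an SMPTE-style timecode for the
metadata case. The names and interface here are made up for illustration,
not any real Ogg or Vorbis API.

/* Hypothetical sync math: put audio (sample count) and video (frame number)
 * on one clock, in nanoseconds, plus an hh:mm:ss:ff SMPTE-style string.
 * Names are invented for illustration only. */

#include <stdio.h>

typedef long long ns_t;

ns_t audio_time_ns(long long samples, int rate)          /* rate e.g. 44100 */
{
    return samples * 1000000000LL / rate;
}

ns_t video_time_ns(long long frame, int fps_num, int fps_den)
{
    return frame * 1000000000LL * fps_den / fps_num;     /* e.g. 30000/1001 */
}

void smpte_timecode(ns_t t, int fps, char out[16])
{
    long long s = t / 1000000000LL;                           /* whole seconds */
    int ff = (int)((t % 1000000000LL) * fps / 1000000000LL);  /* frame in sec. */

    sprintf(out, "%02lld:%02lld:%02lld:%02d",
            s / 3600, (s / 60) % 60, s % 60, ff);
}

Two streams are "in sync" when those numbers agree to within whatever
tolerance the player cares about.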

> A more sophisticated approach is to write some sort of scheduler that
> works to (1) keep the audio buffer full (2) display video and text frames
> at the appropriate time (3) issue packets to the various decoders so the
> above can happen as smoothly as possible given available system resources.

True: some smarts there can get the same result.
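
The "smarts" I'd imagine boil down to a small decision function like the
sketch below, driven by the audio clock (i.e. how much audio has actually
been played). The thresholds are guesses pulled out of a hat; as Ralph notes
below, picking them is the black magic.

/* Sketch of the frame scheduling decision only: given where the audio
 * clock is and when the next video frame is due, decide whether to show,
 * drop, or wait.  Thresholds are arbitrary guesses. */

typedef enum { SHOW_FRAME, DROP_FRAME, WAIT } frame_action;

frame_action schedule_frame(long long audio_now_ns, long long frame_due_ns)
{
    const long long late_ns  = 80 * 1000000LL;  /* >80 ms behind: drop   */
    const long long early_ns =  5 * 1000000LL;  /* within 5 ms: show now */

    if (frame_due_ns < audio_now_ns - late_ns)
        return DROP_FRAME;      /* too late; skip decode and display     */
    if (frame_due_ns <= audio_now_ns + early_ns)
        return SHOW_FRAME;
    return WAIT;                /* early; go keep the audio buffer full  */
}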

> Some suggested heuristics:
> 
> maintaining audio is the most important thing. drop video frames as
> necessary down to 1-per-5-seconds or so. then drop audio.

But what about inter-frame coding? Wouldn't this totally mess up the 3D
wavelets?

> Text (closed captions/subtitles) is better run 'fast-forward' or displayed
> in aggregate than dropped.

Yes.

> the black magic is in the thresholds which decide when not to
> decode/display a frame. 
> 
> 
> BTW, there's been lots of discussion on the livid-dev list
> (linuxvideo.org) but they haven't worked out a solution yet.

Might be a good idea to join that myself, considering that I'm trying to
use that exact program. DVD playback under Windows won't work at all, and I
want to see The Matrix, so I'm going with the operating system that works.

Kenneth
>  -r


-- 
Kenneth Arnold <ken at arnoldnet.net>
Slashdot and K5: kcarnold
Proudly registered Linux user #180115! See http://counter.li.org/

-----BEGIN GEEK CODE BLOCK-----
Version: 3.1
GCM/CS/E/IT/M/S d?(-)(pu) s:-(:--) a15>? C++(+++) UL+++ P+ L++ E W++(+)
N? o? K? w--(-) O? M+ V? PS+(++) PE+ Y+ PGP- t+ 5? X? R? tv-(--) b+ DI
D G e- h! !r !y
------END GEEK CODE BLOCK------




