The integer wavelet codec, and Re: [vorbis-dev] A different take at video encoding - I'm stuck though

Sun Feb 11 13:22:59 PST 2001

On Sun, Feb 11, 2001 at 11:25:52AM +0100, Holger Waechtler wrote:
> 
> On Sun, 11 Feb 2001, Lourens Veen wrote:
> 
> > As I said in the other thread (for people using threaded mail)/in my
> > last mail (for the rest), I'm working on the "split up the frames into
> > textured triangles and output textures and coordinates" idea. The idea
> > is to find edges in the image. So far I have a Sobel edge detector, a
> > wavelet transformer (to compress the textures, still very incomplete,
> > transform only),
> 
> As I already mentioned before, I'd really like to see developed many
> approaches. It's the same as the .gz <-> .bz2 thing, your approach will be
> considerably slower on the encoder side (on the receiver side too if
> you don't have hardware OpenGL), but may achieve better compression
> results. Since the 3d wavelet transform is a symmetric process, this is
> probably the only approach which will allow a realtime encoder or harddisk
> video recorder. But whenever compression ratio is more important than
> encoding time, other algorithms may perform better.

Obviously having the encoder as fast as possible is a good thing, but to what
extent do you sacrifice compression for speed? In my opinion the
codec, or at least bitstream format, should be flexible enough to
handle everything from storing raw RGB / YUV data, through fast but
not as efficient lossless, slow but small lossless, MPEG-style DCT and
motion compensation (where possible while avoiding the patent issues),
to your single-pass wavelet, then adding textures, to various levels
of Lourens's polygon-based coder. Getting a bitstream format to
accommodate all of those, possibly simultaneously, has been what I've
been keeping myself busy with.

And for Holger, and anyone else who knows these things: thanks to some
of the good tutorials and references several people on this list have
referred to, I now have a decent idea as to how wavelets work. My
question now is how to apply them. I've been looking at (studying?
pondering?) your code, and though I think I understand what you are
doing, I don't get how it all fits together. I have specific
questions, but in a general sense (and this could answer all of them
at once): How do you actually apply wavelets to real data in more than
one dimension? Perhaps this is elementary for the author, but none of
the references I have read cover that. Once I understand everything
involved in the coder, perhaps I can write up something to describe it
to the less math-inclined?
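
To make the question concrete, here is my best guess, which may well be
wrong: apply a 1-D transform along each axis in turn (rows, then
columns, and for video presumably then across frames). The sketch below
uses one level of an integer Haar step (the "S-transform") because it is
the simplest reversible integer wavelet I know of. All function names
are made up for illustration, and none of this is taken from Holger's
code, which may use a different filter and ordering.

```python
def haar_fwd_1d(x):
    """One level of integer Haar on an even-length list.

    Returns the low-pass half (floored pair averages) followed by the
    high-pass half (pair differences).
    """
    lo = [(x[i] + x[i + 1]) >> 1 for i in range(0, len(x), 2)]
    hi = [x[i] - x[i + 1] for i in range(0, len(x), 2)]
    return lo + hi


def haar_inv_1d(c):
    """Exact inverse of haar_fwd_1d."""
    half = len(c) // 2
    out = []
    for s, d in zip(c[:half], c[half:]):
        a = s + ((d + 1) >> 1)  # recover the first sample of the pair
        out += [a, a - d]       # the second sample follows from the difference
    return out


def transpose(m):
    return [list(col) for col in zip(*m)]


def haar_fwd_2d(img):
    """Separable 2-D step: transform every row, then every column."""
    rows = [haar_fwd_1d(r) for r in img]
    cols = [haar_fwd_1d(c) for c in transpose(rows)]
    return transpose(cols)


def haar_inv_2d(coef):
    """Undo the columns first, then the rows."""
    cols = [haar_inv_1d(c) for c in transpose(coef)]
    return [haar_inv_1d(r) for r in transpose(cols)]


img = [[10, 12, 200, 202],
       [11, 13, 201, 203],
       [50, 50, 50, 50],
       [0, 255, 0, 255]]
assert haar_inv_2d(haar_fwd_2d(img)) == img  # lossless round trip
```

If this guess is right, a third dimension would just be the same 1-D
step applied once more along the time axis, and further levels would
repeat the whole thing on the low-pass corner.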

Also, as an implementation question, what are you doing with "scales"?
It seems to be a rough upper bound on the maximum log base 2 of the
width, height, and number of frames... am I missing something or is
this part of multidimensional wavelets? And though I think I get what
you're doing in __fwd_xform__ (just not the algorithm used to do it),
I don't get what it's returning -- the min and max of what? and why?
I haven't actually implemented a wavelet myself, though, so this could
be pretty obvious.
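
For reference, this is how I am reading "scales", written out as a
sketch. The function name is mine and the formula is only my
interpretation (the smallest n such that 2**n covers the largest of the
three dimensions), not something I have confirmed against the code.

```python
def guess_scales(width, height, frames):
    """Smallest n with 2**n >= max(width, height, frames).

    This is only my reading of what "scales" means in the codec.
    """
    return (max(width, height, frames) - 1).bit_length()


print(guess_scales(352, 288, 400))  # 9, since 2**9 = 512 >= 400
```

Under this reading, n would also be the number of times the longest
dimension can be halved before reaching a single coefficient.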

I'm currently running the compiled code on a 400-frame sequence of
352x288 frames. Aside from the fact that it ate almost all of my 256
MB of memory in one fell swoop (and is still partially in swap but
fortunately not thrashing), it's still working on it. I had no idea
what to set the Y U and V bit cutoffs to, so I just picked some
arbitrary numbers to try it out; any suggestions?
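
On the memory use, a back-of-the-envelope estimate seems to account for
it, if I assume the whole sequence is held in memory at once. The sample
size and chroma subsampling below are my assumptions, not something I
checked in the code.

```python
def sequence_bytes(width, height, frames, bytes_per_sample=4,
                   subsampling="4:2:0"):
    """Rough size of a whole YUV sequence held in memory.

    bytes_per_sample and subsampling are assumptions for illustration.
    """
    # Total samples per luma pixel, as (numerator, denominator).
    factors = {"4:4:4": (3, 1), "4:2:2": (2, 1), "4:2:0": (3, 2)}
    num, den = factors[subsampling]
    samples = width * height * frames * num // den
    return samples * bytes_per_sample


total = sequence_bytes(352, 288, 400)
print(total, "bytes, about", total // 2**20, "MiB")
```

With 32-bit coefficients and 4:2:0 chroma that comes to roughly 232 MiB,
which would be consistent with what I am seeing; with 16-bit
coefficients it would be about half that.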

I do notice one probably troubling thing with the implementation here:
you're coding Y U and V independently (unless I'm missing something
big). The problem is that Y U and V are not independent; in my test
images, Y U and V have at least the same or very similar edges; often
the values are highly correlated also (this should be obvious to
anyone working with imaging). What to do about this? Could you expand
the wavelet to another dimension, i.e. predict the U and V data of a
frame from its Y data?
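
To illustrate the kind of prediction I mean, here is a deliberately
naive sketch: fit a straight line U ~ slope * Y + offset over a block and
code only the residual. It works on raw samples rather than wavelet
coefficients, assumes Y and U are at the same resolution, and uses names
I made up; it is not a proposal for the actual bitstream.

```python
def fit_linear(y, u):
    """Least-squares fit of u ~ slope * y + offset over two sample lists."""
    n = len(y)
    mean_y = sum(y) / n
    mean_u = sum(u) / n
    var = sum((a - mean_y) ** 2 for a in y)
    cov = sum((a - mean_y) * (b - mean_u) for a, b in zip(y, u))
    slope = cov / var if var else 0.0
    return slope, mean_u - slope * mean_y


def residuals(y, u, slope, offset):
    """What is left of U after subtracting the rounded prediction from Y."""
    return [b - round(slope * a + offset) for a, b in zip(y, u)]


def reconstruct(y, res, slope, offset):
    """Decoder side: rebuild U from Y, the residual, and the line."""
    return [r + round(slope * a + offset) for a, r in zip(y, res)]


y = [16, 50, 100, 200]   # made-up luma samples
u = [120, 110, 95, 60]   # made-up chroma samples
slope, offset = fit_linear(y, u)
res = residuals(y, u, slope, offset)
assert reconstruct(y, res, slope, offset) == u
```

A real coder would need the slope and offset in fixed-point form so
that encoder and decoder round identically, and would have to send them
as side information.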

I'll save my streaming and scene-cutting concerns and ideas for
another email.

Thanks all,

-- 
Kenneth Arnold <ken at arnoldnet.net> / kcarnold / Linux user #180115
http://arnoldnet.net/~kcarnold/
