[Theora-dev] Problems with Theora DirectShow filters
illiminable
ogg at illiminable.com
Wed Sep 15 23:20:00 PDT 2004
Thanks for the information... looks like i need to do a lot more reading !
:)
Cheers,
Zen.
----- Original Message -----
From: "Timothy B. Terriberry" <tterribe at vt.edu>
To: "illiminable" <ogg at illiminable.com>
Cc: <theora-dev at xiph.org>
Sent: Thursday, September 16, 2004 10:04 AM
Subject: Re: [Theora-dev] Problems with Theora DirectShow filters
> illiminable wrote:
>> Most of that went over my head ! I'm pretty much just winging it here :-P
>
> As I said, see Chapter 4 of the spec. It is quite explicit.
>
> http://v2v.cc/~j/Theora_I_spec.pdf
>
>> I'll go on a google mission !
>>
> >> These are the two references I'm using for YUV/RGB wrt DirectShow.
>>
>> http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnwmt/html/YUVFormats.asp
>> http://msdn.microsoft.com/library/default.asp?url=/library/en-us/directshow/htm/uncompressedrgbvideosubtypes.asp
>
>
>> Yeah... i've seen this part... the sampling positions of the various DS
>> formats are shown on the links above.
>>
>> I'll have to look a bit more closely !
>
> The first document contains some good information. However, some nitpicks:
>
> The color spaces should properly be referred to as Y'CbCr, not YUV. When
> people discuss YUV, they often invert the meanings of U and V, or use some
> other definition of chroma entirely, and it is best to avoid the
> confusion. The meaning of Y'CbCr is explicit.
>
> I find introducing the notion of "studio RGB" somewhat confusing, and
> unnecessary. One issue that _is_ important, however, is the nominal range
> of the Y'CbCr values. The document describes the values for video, but
> does not seem to mention that the full range 0..255 is often used for
> still images (e.g., in JPEG). Theora only supports the ITU-R BT.601
> ranges, Y': 16..235, Cb,Cr: -112..112 (stored as 16..240).
>
> The document seems to make the common mistake of assuming there is only
> one set of 4:2:2 sampling locations. JFIF (the standard for encapsulating
> JPEG in a file) is quite explicit about introducing a half-pixel
> horizontal shift in the chroma sampling locations for this format (I do
> not believe MPEG1, H.261, or H.263---the other video standards that use
> JPEG sampling locations in 4:2:0 mode---support 4:2:2 data). Theora uses
> JPEG-style 4:2:2, as this makes converting back and forth between the
> JPEG-style 4:2:0 sampling more convenient, and the same operations can be
> used in the other direction to convert back and forth between 4:4:4 and
> 4:2:2. (Theora inherited JPEG-style 4:2:0 from VP3.)
>
> For the upsampling from 4:2:0 to 4:2:2 or 4:2:2 to 4:4:4, the document
> claims that introducing the required half-pixel phase shift is more
> computationally burdensome than the method they describe, which simply
> ignores the problem and claims it "doesn't look that bad". If one is going
> to use an anti-aliasing filter to do the upsampling, this is true, though
> the increase is not that large. It still requires processing at least 4
> samples, but the filter is no longer symmetric, and must be applied to
> both even and odd sample locations, instead of just the odd ones.
>
> But if one is concerned mostly about speed, then in the JPEG-style
> sampling case, using a box filter is not that bad (translation: each
> chroma value is simply duplicated). Upsampling for the MPEG2 case can
> never be as simple.
>
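
A minimal sketch of the box-filter case described above, assuming a single row of chroma samples in a Python list and a 2x horizontal upsampling. The function name upsample_box_2x is made up for illustration and does not come from the Theora sources. Duplication is a reasonable choice here because, with JPEG-style siting, each chroma sample sits midway between the two luma positions it covers.

```python
def upsample_box_2x(chroma_row):
    """Double the horizontal chroma resolution by repeating each sample.

    Hypothetical helper for illustration only: with JPEG-style siting each
    chroma sample is centred on a pair of luma samples, so both output
    positions simply take the same value.
    """
    out = []
    for c in chroma_row:
        out.append(c)  # left output position
        out.append(c)  # right output position
    return out


if __name__ == "__main__":
    print(upsample_box_2x([100, 120, 140]))
```

The same routine applied to columns instead of rows would cover the vertical step from 4:2:0 to 4:2:2.
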
> If one _is_ going to apply an alias-correcting filter to upsampling, one
> can do better than the Catmull-Rom interpolation suggested in that
> article. Mitchell and Netravali
> (http://portal.acm.org/citation.cfm?id=378514) investigated an entire
> family of bicubic interpolating filters:
>
> k(x)=(1/6)*{(12-9B-6C)|x|^3+(-18+12B+6C)|x|^2+(6-2B),          |x|<1,
>             (-B-6C)|x|^3+(6B+30C)|x|^2+(-12B-48C)|x|+(8B+24C), 1<=|x|<2,
>             0,                                                 2<=|x| }
>
> The values B and C are parameters. (1,0) corresponds to the traditional
> cubic spline. (0,C) is the 1-parameter family of splines which exactly
> interpolate their sample locations ("cardinal cubics"), and (0,1/2)
> corresponds to the Catmull-Rom spline. They performed tests with human
> observers and found a region of the parameter space which provides a good
> trade-off between blurring, ringing, and anisotropic artifacts, centered
> around the values (1/3,1/3).
>
> This yields the standard Mitchell filter:
> k(x)=(1/6)*{7|x|^3-12|x|^2+16/3,                |x|<1,
>             -(7/3)|x|^3+12|x|^2-20|x|+32/3,     1<=|x|<2,
>             0,                                  2<=|x| }
>
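
A small sketch of the filter above, written in the general two-parameter (B, C) form so that the Mitchell, Catmull-Rom and cubic B-spline cases can all be evaluated. The function names bc_kernel and tap_weights and the choice of a 4-tap window at offsets -1, 0, 1, 2 are assumptions for illustration; this is not taken from encoder_example.c.

```python
def bc_kernel(x, b=1.0 / 3.0, c=1.0 / 3.0):
    """Evaluate the (B, C) bicubic kernel at x.

    With the defaults B = C = 1/3 the coefficients reduce to
    7, -12, 16/3 and -7/3, 12, -20, 32/3 (the Mitchell filter).
    """
    x = abs(x)
    if x < 1.0:
        return ((12 - 9 * b - 6 * c) * x ** 3
                + (-18 + 12 * b + 6 * c) * x ** 2
                + (6 - 2 * b)) / 6.0
    if x < 2.0:
        return ((-b - 6 * c) * x ** 3
                + (6 * b + 30 * c) * x ** 2
                + (-12 * b - 48 * c) * x
                + (8 * b + 24 * c)) / 6.0
    return 0.0


def tap_weights(phase, b=1.0 / 3.0, c=1.0 / 3.0):
    """Weights for the samples at offsets -1, 0, 1, 2 (hypothetical layout).

    `phase` is how far the output position lies to the right of sample 0,
    in units of the input sample spacing, with 0 <= phase < 1.
    """
    return [bc_kernel(phase - k, b, c) for k in (-1, 0, 1, 2)]


if __name__ == "__main__":
    # A quarter-sample phase, as an example of a non-integer shift.
    for w in tap_weights(0.25):
        print(w)
```

The four weights for any phase add up to 1, so flat regions keep their value. Passing b=0, c=0.5 gives the Catmull-Rom spline, and b=1, c=0 gives the cubic B-spline.
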
> This is what is used for upsampling in the experimental encoder_example,
> at
> http://svn.xiph.org/experimental/derf/theora-exp/examples/encoder_example.c
>
>