[Theora-dev] Re: [ogg-dev] OggYUV

Tue Nov 8 05:36:52 PST 2005

> But chroma subsampling? no.  And this is where much of the complexity 
> comes.
>
> If we were to combine them, we would be, essentially, doing it something
> like this:
> Value Meaning
>   0    RGB
>   1    YUV444
>   2    YUV422
>   3    YUV420
>   4    YUV411
>   .....

Yes.

> And then spend an additional field on bits/channel, whereas both chroma
> channels in YUV are going to have the same size.  Oh, please, let there
> not be an exception to this.

Well, it depends on what you want to do... not only are there sampling
differences, there are also ordering differences, e.g. interleaved or planar,
and if interleaved, in which order. Among the Windows FourCCs, for example,
there is YV12 (which is most similar to Theora's output), and then there is
IYUV, which is the same except that the U and V planes are in the opposite
order.
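
To make the plane-order difference concrete, here is a minimal sketch of
where each plane starts in a 4:2:0 planar frame. The function name
plane_offsets is made up for illustration, and it assumes 8-bit samples,
even dimensions and no row padding, which real buffers may not guarantee.

```python
def plane_offsets(fourcc, width, height):
    """Return the byte offset of each plane in a 4:2:0 planar frame.

    Assumes 8-bit samples, even width and height, and no row padding.
    """
    if width % 2 or height % 2:
        raise ValueError("width and height must be even for 4:2:0")
    y_size = width * height                 # full-resolution luma plane
    c_size = (width // 2) * (height // 2)   # each chroma plane is quarter size
    first, second = y_size, y_size + c_size
    if fourcc == "YV12":    # Y plane, then V, then U
        return {"Y": 0, "V": first, "U": second}
    if fourcc == "IYUV":    # Y plane, then U, then V
        return {"Y": 0, "U": first, "V": second}
    raise ValueError("unsupported fourcc: " + fourcc)
```

The two layouts hold the same data; only the offsets of U and V swap.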

Then there's YUY2, which is interleaved as Y0 U0 Y1 V0 Y2 U1 Y3 V1, and YVYU
(Y0 V0 Y1 U0 Y2 V1 Y3 U1), and UYVY (U0 Y0 V0 Y1 U1 Y2 V1 Y3)... and then
there's AYUV, which has a 4th alpha channel.
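
As a rough illustration of why the byte order matters, the sketch below
unpacks one 4-byte macropixel (two pixels sharing one U and one V) for each
of the three packed 4:2:2 layouts. The table and function names are
hypothetical, and 8-bit samples are assumed.

```python
# Byte order inside one 4-byte macropixel (two pixels sharing one U and one V).
PACKED_422_LAYOUTS = {
    "YUY2": ("Y0", "U", "Y1", "V"),
    "YVYU": ("Y0", "V", "Y1", "U"),
    "UYVY": ("U", "Y0", "V", "Y1"),
}

def unpack_macropixel(fourcc, data):
    """Split 4 bytes into two (Y, U, V) pixels that share their chroma."""
    if len(data) != 4:
        raise ValueError("a 4:2:2 macropixel is exactly 4 bytes")
    # Pair each byte with the sample name at that position in the layout.
    s = dict(zip(PACKED_422_LAYOUTS[fourcc], data))
    return [(s["Y0"], s["U"], s["V"]), (s["Y1"], s["U"], s["V"])]
```

Feeding the same 4 bytes through two different layouts gives different
pixels, so the layout has to be known before the data can be used.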

Then there's the issue of where the samples lie on the grid in relation to
the pixel centres: do the samples sit over the pixel centres in the
horizontal or vertical direction, or do they fall at the midpoint between 2
pixel centres?
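
A small sketch of the two siting choices along one axis, with positions
measured in luma-pixel units. The function and parameter names are
assumptions for illustration only and don't come from any spec.

```python
def chroma_positions(count, factor, cosited):
    """Positions of `count` chroma samples along one axis, in luma-pixel units.

    factor  -- subsampling factor along this axis (2 for 4:2:0, both axes)
    cosited -- True if each chroma sample sits on a luma pixel centre,
               False if it sits midway across the luma pixels it covers
    """
    offset = 0.0 if cosited else (factor - 1) / 2.0
    return [i * factor + offset for i in range(count)]
```

With a factor of 2, the co-sited case gives 0, 2, 4, ... and the midpoint
case gives 0.5, 2.5, 4.5, ..., so a scaler that assumes the wrong one shifts
the chroma by half a pixel.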

And then there are the colour spaces (which I don't know all the details of,
but I'm sure derf or rillian can tell you all about them).

The way I see it, if you are suggesting having a bits per sample/pixel field,
a planar or interleaved field, a field for the subsampling in the horizontal
and vertical directions, some way to denote the ordering of interleaved
channels, a field to specify whether there is an alpha channel and, if there
is, where and how it's represented, and perhaps something to accurately
specify the colour space, then basically what you are doing is opening up
millions of possibilities, most of which are completely useless. And if you
have a bits per channel field for RGB, what about RGB24 data (3 channels, 8
bits each) that is padded into 32 bits, as in RGB32? Or RGB555: 15 bits,
padded to 16.

There are thousands of invalid possibilities, and only 15-20 or fewer valid
ones... really only 3-5 are commonly used.

If someone wants to go crazy and design a franken-yuv format for some
bizarre reason, then they can easily make another stream format... but you
can pretty much count on one hand the ones that people actually care about
and that are used in 90% of cases: YV12 (4:2:0), YUY2 (4:2:2), RGB24, ARGB
and maybe RGB555.

And if the format is enumerated, then all the other fields will be the same
for both RGB and YUV. It will be simple, and it doesn't open up the
possibility that people can specify bizarre combinations of things that you
have to check for, and then either fail on or do some unknown amount of
transformation on to get to something you can display.
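
Something like the following is what I mean by enumerated, as a sketch
only. The numeric values, the table contents and the names PixelFormat,
FORMAT_INFO and frame_size are all hypothetical; a real header would have to
fix them in a spec. The point is that one lookup either gives you a known
layout or fails cleanly.

```python
from enum import IntEnum

class PixelFormat(IntEnum):
    # Hypothetical values, chosen only for this sketch.
    YV12 = 0
    YUY2 = 1
    RGB24 = 2
    ARGB = 3
    RGB555 = 4

# Per format: (stored bits per pixel, planar?, has alpha?)
FORMAT_INFO = {
    PixelFormat.YV12: (12, True, False),
    PixelFormat.YUY2: (16, False, False),
    PixelFormat.RGB24: (24, False, False),
    PixelFormat.ARGB: (32, False, True),
    PixelFormat.RGB555: (16, False, False),   # 15 bits used, padded to 16
}

def frame_size(value, width, height):
    """Bytes in one frame, assuming no row padding. Rejects unknown values."""
    try:
        fmt = PixelFormat(value)
    except ValueError:
        raise ValueError("unknown pixel format %r" % (value,))
    bits_per_pixel = FORMAT_INFO[fmt][0]
    return width * height * bits_per_pixel // 8
```

Padding cases like RGB555 are handled by the table entry itself, so there is
no separate bits per channel field to get wrong.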

Also, on another issue, I already find the method of codec identification
pretty ad hoc... I think having ident fields that are only 3 or 4 bytes is a
very bad idea.

Zen.