[Theora-dev] & [ogg-dev] OggYUV
arc at Xiph.org
Tue Nov 8 10:24:37 PST 2005
On Tue, Nov 08, 2005 at 12:36:44PM -0500, John Koleszar wrote:
> If you want cameras to source this format someday, it'd be wise to
> support the packed formats. A lot of cameras, DV in particular, use YUY2.
Not the camera, only the application which goes between the camera and
OggStream. Plus, changing between packed and planar is easy; my reason for
wanting to avoid packed is that there are simply too many different ways to do it.
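
To illustrate how small the packed/planar conversion is, here is a rough sketch
for one packed layout only, YUY2 (byte order Y0 U Y1 V). It assumes 8-bit
samples and rows with no padding; the function names are made up for this
example and are not part of any Ogg library.

```python
def yuy2_to_planar_422(packed: bytes, width: int, height: int):
    """Split a YUY2 frame (Y0 U Y1 V per pixel pair) into Y, U and V planes."""
    if width % 2:
        raise ValueError("YUY2 needs an even width")
    if len(packed) != width * height * 2:
        raise ValueError("buffer size does not match width * height * 2")
    y = packed[0::2]  # every even byte is a luma sample
    u = packed[1::4]  # U is byte 1 of each 4-byte group
    v = packed[3::4]  # V is byte 3 of each 4-byte group
    return y, u, v


def planar_422_to_yuy2(y: bytes, u: bytes, v: bytes) -> bytes:
    """Interleave Y, U and V planes back into a YUY2 buffer."""
    if len(y) != 2 * len(u) or len(u) != len(v):
        raise ValueError("plane sizes are not consistent with 4:2:2")
    out = bytearray(len(y) * 2)
    out[0::2] = y
    out[1::4] = u
    out[3::4] = v
    return bytes(out)
```

UYVY and YVYU would need the same code with different offsets, which is the
"too many ways" problem in miniature.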
> >Media players don't have to support every format, nor does any video
> >codec. If a video codec (ie, DV) can only output 4:1:1 and the media
> >player only takes 4:4:4 then an intermediary plugin will be needed to do
> >the conversion.
> Agreed.. But the problem becomes how to send data to the color
> conversion plugin. If you don't specify a way to put other formats into
> this stream, your plugin has to do the work of extracting the data from
> the original stream and performing a color conversion.
Um, other formats? I don't grok your statement. This is what I'm talking about:
[Theora] -> Theora plugin -> [OggYUV:420] -> Colorspace Plugin -> [OggYUV:444] -> Media Player
The current design of OggStream lets you build a chain, if needed, to decode.
The Colorspace plugin only needs to understand OggYUV with the chroma formats it
needs to convert, Theora only needs to be able to export to the file's native
format, and the media player only needs to be able to import YUV444.
A more intelligent Theora plugin is, thus, not necessary.
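
As a sketch of the chain idea (not OggStream's actual negotiation code), the
lookup can be thought of as a shortest-path search over what each plugin
accepts and emits. The registry and the format strings below are hypothetical
and only mirror the example chain above.

```python
from collections import deque

# Hypothetical plugin registry: (plugin name, input format, output format).
PLUGINS = [
    ("Theora plugin", "Theora", "OggYUV:420"),
    ("Colorspace plugin", "OggYUV:420", "OggYUV:444"),
    ("Colorspace plugin", "OggYUV:422", "OggYUV:444"),
]


def build_chain(source, wanted, plugins=PLUGINS):
    """Breadth-first search for the shortest plugin chain from source to wanted.

    Returns a list of plugin names, or None if no chain exists.
    """
    queue = deque([(source, [])])
    seen = {source}
    while queue:
        fmt, chain = queue.popleft()
        if fmt == wanted:
            return chain
        for name, fmt_in, fmt_out in plugins:
            if fmt_in == fmt and fmt_out not in seen:
                seen.add(fmt_out)
                queue.append((fmt_out, chain + [name]))
    return None
```

Each plugin only declares one input and one output format here, so none of
them needs to know about the rest of the chain.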
> I think the design is cleaner if you have a stream extraction plugin
> just extract the data from whatever the source is
> (decodec/http/whatever) and put it into an ogg stream, then you let your
> plugin negotiation find the "one true xxx->theora/yv12" converter.
> Greatly simplifies writing extraction plugins if you don't have to
> understand anything beyond bytes and buffers.
Our terminology is different, but I think you just argued for the same thing
I've been promoting. :-)
OggStream in its current form is very, VERY simple. Ogg packets in, Ogg
packets out. Ogg packets are convenient because they're just buffers of
timestamped data and libogg2 supplies a nice set of bitpacking/unpacking and
bytepacking/unpacking functions to use on them. libogg2 also provides a very
nice zerocopy memory management for these buffers, which is why I'm using it.
OggFile (or your favorite media framework, network protocol handler, etc) takes
the data from whatever format it's in, whether that be an Ogg file, RTP,
Quicktime, etc and extracts the packets from it, sending them to OggStream for
decoding and setting up the plugin chain using information provided by
I divide functionality along these lines. In my experience, establishing
"layers" to divide what different pieces do makes implementing, optimising,
and debugging simpler, and makes the whole thing more flexible overall.
> >So what I propose for OggYUV is to cover the capabilities of Ogg video
> >codecs, everything Theora is capable of and perhaps a bit more that we've
> >seen from other codecs. 4:4:4, as I recall, is supported by the Theora
> >spec (even if the current implementation doesn't).
> I can't speak for what Theora can support today, but the VP3 source it
> derived from supported UYVY, YVYU, YUY2, and RGB24/32 source data as well.
I think much of that was cleaned up and extended during its conversion to
Theora. I believe additional chroma sampling methods were offered, at least in
the specification for a header value...
So what exactly is lacking from the current YUV spec, and what is overkill? It
sounds like everyone is basically on the same page: we shouldn't try to support
everything /possible/, but we should try to support everything /common/.
I think 5, as illum. suggested, is too few. Perhaps we pick the 7 most popular
YUV chroma subsampling formats (ignoring RGB for now, I still think it's best
left separate), have an 8th option for "extended" (for later minor versions,
defined with an additional header field which doesn't exist now), have the bit
for whether it's interlaced or not, and have 15 possibilities for packing (with
the 16th being extended) which, as I've read, commonly include funky packed bit
values. So something like this for these three fields:
1 Interlaced
3 Chroma Subsampling
4 Bit Packing
Chroma Subsampling (these are my suggestions used for example only)
2 4:2:2 doubled
3 4:2:2 blended
4 4:2:0 doubled
5 4:2:0 blended
6 <something else, maybe 4:1:0>
Bit Packing (again, only examples)
0 Planar Y8 UV8
1 Planar Y16 UV8
2 Planar Y16 UV16
3 Planar Y24 UV12
4 Packed Y8 UV8
5 Packed Y16 UV8
6 Packed Y16 UV16
7 Packed Y12 UV10
8 <something else>
Some of the combinations between these will be weird, but again, nothing is
required to support any specific format. When acceptability is being
determined, page0 is sent to the codec, and later to the web if nothing local is
found, and the colorspace/packing/etc are all in page0, so a codec or converter
plugin (which are the same in OggStream's eyes) can give a certain answer.
A plugin can look at page0 and know "I only support planar 4:4:4, 4:2:2, 4:2:0,
or packed 4:4:4, which this is one of, so yeah, I can handle it."
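
A minimal sketch of that check, assuming page0 has already been parsed into a
layout ("planar" or "packed") and a subsampling string. The dictionary keys
and the capability set are invented for illustration; the real header layout
is not settled in this thread.

```python
# Hypothetical capability set for the plugin described above.
SUPPORTED = {
    ("planar", "4:4:4"),
    ("planar", "4:2:2"),
    ("planar", "4:2:0"),
    ("packed", "4:4:4"),
}


def can_handle(page0: dict, supported=SUPPORTED) -> bool:
    """Answer yes or no from the header fields alone, without decoding any frames."""
    return (page0["layout"], page0["subsampling"]) in supported
```

Because the answer depends only on page0, the same test works for a local
plugin or for deciding whether to go looking for one elsewhere.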
I suggest the above because it results in these three fields being byte aligned.
If put last, they do not need to be byte aligned, and it may even be beneficial
to leave one padding bit at the end for a future minor revision to use if we
don't need all of it (i.e., if we want only 7 bit packing formats).
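
For what it's worth, the byte-aligned case can be sketched as below: a 1-bit
interlace flag, 3 bits of chroma subsampling and 4 bits of bit packing add up
to exactly 8 bits. The bit order (interlace flag in the top bit) and the
function names are my assumptions for the example, not a proposal for the
header.

```python
def pack_format_byte(interlaced: bool, chroma: int, packing: int) -> int:
    """Pack the three fields into one byte: 1 + 3 + 4 bits (order assumed)."""
    if not 0 <= chroma <= 7:
        raise ValueError("chroma subsampling must fit in 3 bits")
    if not 0 <= packing <= 15:
        raise ValueError("bit packing must fit in 4 bits")
    return (int(interlaced) << 7) | (chroma << 4) | packing


def unpack_format_byte(value: int):
    """Inverse of pack_format_byte: returns (interlaced, chroma, packing)."""
    if not 0 <= value <= 0xFF:
        raise ValueError("expected a single byte")
    return bool(value >> 7), (value >> 4) & 0x7, value & 0xF
```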
We may want a bit that says nothing more than "packed or planar", then a separate
few bits for how many bits-per-sample are needed for luminance and chroma. I
feel that, at the very least, 8:8, 16:8, and 16:16 are needed. I've seen weird
12:10 and 24:12 formats used by other codecs, which is why I included them in
the example table above.
The recognition of individual possibility,
to allow each to be what she and he can be,
rests inherently upon the availability of knowledge;
The perpetuation of ignorance is the beginning of slavery.
from "Die Gedanken Sind Frei": Free Software and the Struggle for Free Thought
by Eben Moglen, General Counsel of the Free Software Foundation