[vorbis] Video codec

Jelle Foks jelle-foks at list.dedris.nl
Mon Sep 11 04:07:11 PDT 2000



Just for clarity, so that we share the correct terminology and numbers,
here is some background. I'll also raise some issues that I think
should be considered when designing Ogg Video.

Digital Broadcast Quality Video is described in CCIR601/656, which is
basically the following:

Format | Active Frame Size | Frame Rate (fps) | Subsampling | Active pixels per second
-------+-------------------+------------------+-------------+-------------------------
NTSC   | 720x480           | 30*1000/1001     | 4:2:2       | ~10.3M
PAL    | 720x576           | 25               | 4:2:2       | ~10.3M
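
The last column follows directly from frame size times frame rate. A
minimal sketch of that arithmetic (the function name is only
illustrative):

```python
from fractions import Fraction

def active_pixels_per_second(width, height, frame_rate):
    """Active (visible) pixels per second for a given frame size and rate."""
    return width * height * frame_rate

# NTSC runs at 30 * 1000/1001 frames per second, PAL at exactly 25.
ntsc = active_pixels_per_second(720, 480, Fraction(30000, 1001))
pal = active_pixels_per_second(720, 576, 25)

print(f"NTSC: {float(ntsc) / 1e6:.2f} M pixels/s")
print(f"PAL:  {float(pal) / 1e6:.2f} M pixels/s")
```

Both come out just under 10.4M, which the table rounds to ~10.3M.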

A 'frame' is a full image of video. In interlaced video, a frame
consists of two fields, the even field and the odd field.

The video signals are encoded in the YCbCr color space (Luminance +
Chrominance-Blue + Chrominance-Red). Each of the color components Y, Cb,
or Cr is called a 'subpixel'. A subpixel in CCIR601/656 has a precision
of 8 bits.

The subsampling of CCIR601/656 is called '4:2:2' subsampling in 'mpeg
terms', and means that the chrominance pixels are decimated by a factor
of two in the horizontal direction. The result is that color has only
half resolution in the horizontal direction (360x480 for NTSC, 360x576
for PAL). To be honest, this subsampling is already a first step of
lossy compression, by a factor of ((1+1+1)*8)/((1+0.5+0.5)*8)=1.5,
because a 24bpp image is described with an average of 16 bits per pixel
after reduction of the chrominance resolution.
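
The factor of 1.5 can be checked with a few lines. This is only a
sketch of the arithmetic; the helper name and its chroma_fraction
parameter are made up for illustration:

```python
def average_bits_per_pixel(chroma_fraction, bits_per_subpixel=8):
    """Average bits per pixel for one Y plane plus Cb and Cr planes.

    chroma_fraction is the share of chroma samples kept relative to
    luma samples: 1.0 for no subsampling, 0.5 for 4:2:2.
    """
    return (1 + 2 * chroma_fraction) * bits_per_subpixel

full = average_bits_per_pixel(1.0)    # no subsampling
sub422 = average_bits_per_pixel(0.5)  # 4:2:2, chroma halved horizontally

print(full, sub422, full / sub422)
```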

The figure of 216Mbit/s mentioned in the quoted message below is
CCIR601/656 video data including the blanking and retrace interval
overhead (a CCIR601/656 video stream also contains non-active pixels,
because it carries the timing as well, so that the video data can easily
be transformed to and from the analog domain).

My opinion is that, when discussing video compression, it is confusing
to speak of 'compression ratios', because it is never clear whether
compression ratio before or after subsampling is meant, and whether or
not non-active pixels were counted in the non-compressed stream.

A factor of 100 compression of the 216Mbit/s stream would result in a
2.16Mbit/s stream. However, a factor of 100 compression of the active
CCIR601/656 video pixels would result in a 10.3*16/100=1.65Mbit/s
stream. The second number is about 24% lower than the first.
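
To make the ambiguity concrete, here is the same "100x" applied to both
baselines. It is a small sketch using the rounded figures from this
mail; the variable names are only illustrative:

```python
TOTAL_STREAM_MBIT_S = 216.0   # CCIR601/656 stream including blanking
ACTIVE_MPIXELS_S = 10.3       # active pixels per second, in millions
BITS_PER_PIXEL_422 = 16       # average bits per pixel after 4:2:2

def compressed_rate(uncompressed_mbit_s, ratio):
    """Bit rate left after compressing by the given ratio."""
    return uncompressed_mbit_s / ratio

from_total = compressed_rate(TOTAL_STREAM_MBIT_S, 100)
from_active = compressed_rate(ACTIVE_MPIXELS_S * BITS_PER_PIXEL_422, 100)
gap = (from_total - from_active) / from_total

print(f"100x of the full stream:   {from_total:.2f} Mbit/s")
print(f"100x of the active pixels: {from_active:.2f} Mbit/s")
print(f"relative gap: {gap:.0%}")
```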

I suggest using the term 'bits per pixel' to quantify the compression
instead. That number is unambiguous, and it makes it easy to calculate
the resulting video bit-rate for a given image resolution and frame
rate. 'D1 at 1.5Mbps' is approx 0.15 bits per pixel, 'D1 at 3Mbps' is
approx 0.3 bits per pixel.
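
The conversion in both directions is a one-liner. A minimal sketch,
assuming the rounded 10.3M active pixels per second from the table
above (function names are made up for illustration):

```python
ACTIVE_PIXELS_PER_S = 10.3e6  # rounded D1 figure from the table above

def bpp_from_bitrate(bitrate_bps, pixels_per_s=ACTIVE_PIXELS_PER_S):
    """Bits per pixel for a given video bit rate."""
    return bitrate_bps / pixels_per_s

def bitrate_from_bpp(bpp, width, height, fps):
    """Video bit rate in bits per second for a given bits-per-pixel target."""
    return bpp * width * height * fps

print(f"D1 at 1.5 Mbps: {bpp_from_bitrate(1.5e6):.2f} bits per pixel")
print(f"D1 at 3 Mbps:   {bpp_from_bitrate(3.0e6):.2f} bits per pixel")

# The same 0.15 bits per pixel applied to SIF (352x240) at 30 frames/s.
sif_rate = bitrate_from_bpp(0.15, 352, 240, 30)
print(f"SIF at 0.15 bits per pixel: {sif_rate / 1e3:.0f} kbit/s")
```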

Rough numbers: With JPEG compression, you get between 1-5 bits per
pixel; JPEG is mostly used in the range of 1-2 bits per pixel. JPEG2000
claims to get 4-8 times better compression than JPEG; if that is true,
it's about the range of 0.15-1.25 bits per pixel. With MPEG compression,
you can get between 0.15-1.5 bits per pixel, depending on the encoder
and image quality of course (and the MPEG version: MPEG1, MPEG2, or
MPEG4). When counting uncompressed video as 24 bits per pixel, this
explains the claimed 100x compression of MPEG video at 0.24 bits per
pixel. Below 0.15 bits per pixel is often very aggressive coding for
applications such as video conferencing, in which case large parts of
the image are left completely unchanged (H.263/H.26L).

I think if we want to compete based on compression ratio, then we should
somehow get at 0.1 bits per pixel or below. A CDROM is approx
650x8=5.2Gbits, so for an hour of video you have 5200/3600=1.44Mbits/s,
which would dictate a compression to below approx (1.44/10.3)=0.14 bits
per pixel if there is to be any room left for audio etc.
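
The CD-ROM budget can be written out as follows. This is a sketch only:
it takes 1 MB as 10^6 bytes to match the arithmetic above, and the
optional audio rate (96 kbit/s, the figure from the quoted message) is
just an example value:

```python
def video_budget(capacity_mbytes, duration_s, pixels_per_s, audio_bps=0.0):
    """Return (total bit rate, bits per pixel left for video)."""
    total_bps = capacity_mbytes * 8e6 / duration_s
    video_bpp = (total_bps - audio_bps) / pixels_per_s
    return total_bps, video_bpp

total_bps, bpp_no_audio = video_budget(650, 3600, 10.3e6)
_, bpp_with_audio = video_budget(650, 3600, 10.3e6, audio_bps=96e3)

print(f"total rate:       {total_bps / 1e6:.2f} Mbit/s")
print(f"video only:       {bpp_no_audio:.3f} bits per pixel")
print(f"with 96k audio:   {bpp_with_audio:.3f} bits per pixel")
```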

Of course it's easy to get 0.14 bits per pixel if there is no quality
requirement... When comparing compression methods, image quality is
often measured in PSNR (dB) or MSE (mean squared error). A compression
method can be considered better if it achieves better PSNR/MSE at
similar bit rates, or lower bit rates at similar PSNR/MSE. So, when
introducing a video compression method with amazing bit-rates, it can be
proven to have better quality than the alternatives by comparing the
PSNR/MSE at various bit-rates. Of course, the effectiveness of PSNR or
MSE as an image quality measure is a point of discussion, so there is
always still room for interpretation of the numbers (note that there are
other measurement methods that attempt to give better numbers; there's
even an expert group (www.crc.ca/vqeg)).
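
For reference, MSE and PSNR for 8-bit samples are usually computed as
below. This is a plain-Python sketch over flat lists of sample values;
a real comparison would run over whole frames, and often over luma only:

```python
import math

def mse(reference, decoded):
    """Mean squared error between two equally long sample sequences."""
    if len(reference) != len(decoded) or not reference:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum((a - b) ** 2 for a, b in zip(reference, decoded)) / len(reference)

def psnr(reference, decoded, peak=255):
    """Peak signal-to-noise ratio in dB; infinite for identical inputs."""
    err = mse(reference, decoded)
    if err == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / err)

original = [16, 128, 235, 90]
lossy = [17, 126, 235, 93]
print(f"MSE = {mse(original, lossy):.2f}, PSNR = {psnr(original, lossy):.1f} dB")
```

A higher PSNR at the same bit-rate, or the same PSNR at a lower
bit-rate, is what would count as 'better' in the sense described above.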

Ok, then there is the issue of variable or fixed bit-rate and variable
or fixed quality and encoder and buffering latency. If you have a
variable bit-rate encoder for a fixed quality stream, or a fixed
bit-rate encoder for a variable quality stream, then you can keep the
buffers small to reduce the latency. However, if you put a maximum on
the bit-rate, and don't want to accept occasionally reduced image
quality of the video, then you will need buffering to even out the
bit-rate on the hard-to-encode pieces of video, which of course
introduces latency. When buffering is needed, the decoder must know how
much to fill the buffer before starting to display to ensure that later
on, during display it never has to wait for compressed data to be
received during the hard-to-compress video scenes. Additionally, there
may be a limitation on the buffer size that is economical in the decoder
(especially in hardware, RAM=money). The MPEG standards include a scheme
to control this, centered around the 'video buffering verifier (VBV)'.
I think Ogg video should address this issue as well.
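
To illustrate the buffering trade-off, here is a toy model. It is not
the MPEG VBV itself, only a simplified sketch under these assumptions:
the channel delivers a constant number of bits per frame period, the
decoder removes one whole frame per period, and all names are
hypothetical:

```python
def min_startup_delay(frame_bits, channel_bits_per_frame):
    """Smallest number of frame periods to prebuffer without underflow."""
    delay = 0
    needed = 0
    for i, size in enumerate(frame_bits):
        needed += size
        # Frame i is decoded at time delay + i, so the channel must have
        # delivered 'needed' bits by then: (delay + i) * rate >= needed.
        periods = -(-needed // channel_bits_per_frame)  # integer ceiling
        delay = max(delay, periods - i)
    return delay

def peak_buffer_bits(frame_bits, channel_bits_per_frame, delay):
    """Largest decoder buffer occupancy, sampled just before each decode."""
    total = sum(frame_bits)
    consumed = 0
    peak = 0
    for i, size in enumerate(frame_bits):
        received = min(total, (delay + i) * channel_bits_per_frame)
        peak = max(peak, received - consumed)
        consumed += size
    return peak

# Two expensive frames (100 and 200 bits) among cheap ones, 60 bits/period.
frames = [100, 20, 20, 200, 20, 20]
delay = min_startup_delay(frames, 60)
peak = peak_buffer_bits(frames, 60, delay)
print(f"prebuffer {delay} frame periods, peak buffer {peak} bits")
```

The expensive frame in the middle is what forces both the start-up
latency and the buffer size, which is why the stream has to tell the
decoder about them in some form.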

Cheers,

Jelle.

Chrissy and Raul wrote:
> 
> Hi,
> 
> I guess this is a good time to start putting together a wish list for a
> video codec.
> 
> I see that for audio the compression is around 10X for reasonable quality.
> I am sure this will start its own thread of conversation.
> 
> For video you can do 40X fairly easily and the big task is to go to 80X or
> 100X with reasonable picture quality, say, a peak luma SNR of more than 30
> dB.  Uncompressed Professional Quality video (called "D1" see below), like
> the one at TV stations before broadcast, is 216 Mbps.  Smaller resolutions
> have less bps of course.
> 
> One of the many tasks is going to be to work around the existing patents but
> if the audio guys can do it, the video guys should be able to as well.  Not
> everything has been discovered or patented for video compression.
> 
> I suggest focusing on "SIF", quarter-screen video (352x240 for NTSC
> rectangular, TV, pixels and 320x240 for square, computer, pixels).  For
> nomenclature purposes full screen "VGA" is 640x480p progressive and "D1"
> is 720x480i interlaced (some people also use "Half D1" 360x480i for some
> products).  Computers are progressive, TV is interlaced.  30 Frames/second,
> or 60 fields/second yields a natural moving image that does not suffer too
> much from "jumpiness" during pans.  Movie film is 24 fps sometimes
> presenting each frame three times for a net 72 fps.
> 
> Experiments show good quality for SIF, 30 progressive frames/second at 512
> Kbps system bitrate (Audio is 96 Kbps).  There are some experiments on D1
> (full) resolution at 1.5 Mbps video-only but the quality is not good.  D1
> resolution at 3 Mbps can look good today.
> 
> Any input on desired resolutions, bitrates, color resolution (color
> subsampling), frame rates, etc?
> 
> I will be unavailable until Monday, Sep 11th so if you send e-mail or post
> questions I will not be able to get back to you until Sep 11th.
> 
> Excuse the apparent lack of order, I just want to start throwing
> ideas/concepts to the list.  All can be clarified and classified in due
> time.
> 
> RAUL LOPEZ
> 
> --- >8 ----
> List archives:  http://www.xiph.org/archives/
> Ogg project homepage: http://www.xiph.org/ogg/
> To unsubscribe from this list, send a message to 'vorbis-request at xiph.org'
> containing only the word 'unsubscribe' in the body.  No subject is needed.
> Unsubscribe messages sent to the list will be ignored/filtered.
