[xiph-cvs] r6633 - trunk/theora/doc/spec
giles at xiph.org
Fri May 7 17:40:39 PDT 2004
Author: giles
Date: 2004-05-07 20:40:38 -0400 (Fri, 07 May 2004)
New Revision: 6633
Modified:
trunk/theora/doc/spec/spec.tex
Log:
Include reasoning behind the edits made in the previous commit.
Also fix two typos.
Modified: trunk/theora/doc/spec/spec.tex
===================================================================
--- trunk/theora/doc/spec/spec.tex 2004-05-06 23:56:08 UTC (rev 6632)
+++ trunk/theora/doc/spec/spec.tex 2004-05-08 00:40:38 UTC (rev 6633)
@@ -308,6 +308,10 @@
{\em and} the co-located blocks in the chroma planes.
Thus macro blocks can represent anywhere from six to twelve blocks, depending
on how the chroma planes are subsampled.
+% the whole super vs. macro blocks thing is a little confusing, and it can be
+% hard to remember which is what initially. A figure would/will help here,
+% but I tried to add some text emphasizing the difference in terms of
+% functionality.
Super blocks describe an independent group of blocks within a single plane
while macro blocks group blocks from all the planes that cover a specific
area of the frame.
@@ -318,6 +322,7 @@
This coded order proceeds by examining each super block in the luma plane in
raster order, and traversing the four macro blocks inside using a smaller
Hilbert curve, as shown in Figure~\ref{fig:hilbert-mb}.
+%r: I rearranged the wording to make a more formal idiom here
If the luma plane does not contain a complete super block on the top or right
sides, the same ordering is still used, with any macro blocks outside
the frame boundary simply omitted.
@@ -371,6 +376,7 @@
To each block's predictor, a \term{residual} is added to form the final
contents of the block.
+%r: I reworded this next sentence to try and make it more decoder-oriented
The residual is stored as a set of quantized coefficients from an integer
approximation of a two-dimensional Type II Discrete Cosine Transform.
The DCT takes an an $8\times 8$ array of pixel values as input and returns an
@@ -436,6 +442,8 @@
pixel format, and a version number.
The version number is divided into a major version, a minor version, amd a
minor revision number.
+%r: afaik the released vp3 codec called itself 3.1 and is compatible w/ theora
+%r: even though we received the in-progress 3.2 codebase
For the format defined in this specification, these are `3', `2', and
`0', respectively, in reference to Theora's origin as a successor to the VP3.1
format.
@@ -446,6 +454,7 @@
each \term{quantization type} (intra or inter), \term{color plane}
($Y'$, $C_b$, or $C_r$), and \term{quantization index}, \qi, which ranges from
zero to 63, inclusive.
+%r: I think 'nominally' is more specific than 'generally' here
The quantization index nominally represents a progressive range of quality
levels, from low quality near zero to high quality near 63.
However, the interpretation is arbitrary, and it is possible, for example, to
@@ -467,6 +476,8 @@
There is a separate set of base matrices for each quantization type and each
color plane, with up to 64 possible base matrices in each set, one for each
\qi value.
+%r: we will mention that the given matricies must bound the \qi range
+%r: in the detailed section. it's not important at this level.
Typically the bitstream contains matrices for only a sparse subset of the
possible \qi values.
The base matrices for the remainder of the \qi values are computed using linear
@@ -653,7 +664,7 @@
\paragraph{Motion Vectors}
-Intra frames are all centirely in INTRA mode, and this stage is skipped.
+Intra frames are coded entirely in INTRA mode, and this stage is skipped.
Some inter coding modes, however, require one or more motion vectors to be
specified for each macro block.
These are decoded in this stage, and an appropriate motion vector is assigned
@@ -951,7 +962,7 @@
s_bB
\end{array}\right]
\end{eqnarray*}
-Parameters: $x_r,x_g,x_b,x_q, y_r,y_g,y_b,y_w$.
+Parameters: $x_r,x_g,x_b,x_w, y_r,y_g,y_b,y_w$.
\end{description}
@@ -1312,6 +1323,8 @@
Often, the encoded packet bitstream is not an integer number of bytes, and so
there is unused space in the last byte of a packet.
+%r: I think the generality here is necessary to be consistent with our assertions
+%r: elsewhere about being independent of transport and byte width
When a Theora encoder produces packets for embedding in a byte-aligned container,
Unused space in the last byte of a packet is always zeroed during the encoding
process.
@@ -1367,10 +1380,14 @@
The identification header is type 0x80, the comment header is type 0x81, and
the setup header is type 0x82.
These packets must occur in the order: identification, comment, setup.
+%r: I clarified the initial-bit scheme here
All header packets have the most significant bit of the type
field, which is the initial bit in the packet, set.
This distinguishes them from video data packets in which the first bit
is unset.
+% extra header packets are a feature Dan argued for way back when for
+% backward-compatible extensions (and icc colourspace for example)
+% I think it's reasonable
Packets with other header types (0x83--0xFF) are reserved and must be
ignored.
@@ -1504,6 +1521,7 @@
belongs in a separate logical stream that provides greater structure and
machine parseability.
+%r: I tried to morph this a little more in the direction of our application space
The comment field is meant to be used much like someone jotting a quick note on
the label of a video.
It should be a little information to remember the disc or tape by and explain it to
--- >8 ----