[Theora-dev] 16 bits, cast on idct function

Wed May 31 10:07:28 PDT 2006

On 5/31/06, Timothy B. Terriberry <tterribe at vt.edu> wrote:
> Remembering to CC: the list this time.

:-)
my mistake

>
> Felipe Portavales Goldstein wrote:
> > On 5/31/06, Timothy B. Terriberry <tterribe at vt.edu> wrote:
> >
> >> Felipe Portavales Goldstein wrote:
> >> > My question is:
> >> >
> >> > Can the result of (_Gd + _Cd) be a number with more than 16 bits?
> >> > (Yes, it can, because they are int32, but the algorithm could
> >> > guarantee something about that... I don't know...)
> >>
> >> With normal input, certainly this would never occur. However, due to
> >> quantization error, rounding error, etc., it is theoretically possible
> >> to generate a number with more than 16 bits here.
> >
> >
> > Good :-)
> >
> >>
> >> > If it can, the cast (ogg_int16_t) will truncate the number to the 16 least
> >> > significant bits, and will give a wrong result...
> >> >
> >> > ip[0] is 32 bits, so why truncate to 16 bits?
> >>
> >> The main answer is, "To make SIMD/hardware implementations easier."
> >> These will generally use 16-bit registers, and so will automatically
> >> have done the truncation.
> >
> >
> > > You're right, it's better to use 16-bit registers. And using 16-bit
> > > adders and multipliers we can get shorter critical paths, allowing a
> > > higher clock rate.
> > >
> > > Then I have another question:
> > >
> > > If the result is truncated to 16 bits, why was IntermediateData
> > > declared as 32 bits?
> >
> >  ogg_int32_t IntermediateData[64];
> >  ogg_int32_t * ip = IntermediateData;
> >
> > > I think this is because the dequant_slow result is 32 bits, and is
> > > stored in IntermediateData.
> > >
> > > But this dequant result is multiplied by a 16-bit cosine
> > > factor, and the new result is shifted right 16 bits and stored in
> > > IntermediateData.
> > >
> > > I'm wondering whether I could use a 16-bit IntermediateData array.
> > >
> > > The dequant specification says:
> > > Output parameters:
> > > DQC - integer array - size = 14 bits
> > >
> > > I think that I can use IntermediateData as a 16-bit integer array.
> > > What do you think?
>
> Yes, you certainly can. On modern 32-bit CPUs, 16-bit instructions are
> very, very slow, so we avoid them when we can. The only real reason to
> use 16-bit operands on a 32-bit CPU is to save memory bandwidth, which
> is the primary bottleneck in video processing. Since IntermediateData is
> local, and likely to be entirely in cache, there's no reason to make it
> 16 bits.
>
> If you are implementing the iDCT for a different instruction
> set/architecture, I highly suggest working from Section 7.9.3 of the
> spec directly. The spec can be obtained from:
> http://www.theora.org/doc/Theora_I_spec.pdf

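The width argument above can be sketched in C. This is only an illustration, assuming (as described in the thread) that the cosine factor is an unsigned 16-bit constant and that the product is shifted right by 16 bits. The names mul_cos16 and fits_int16 are hypothetical, not libtheora's, and 46341 (round(cos(pi/4) * 65536)) is used purely as an example value.

```c
#include <stdint.h>

/* Hypothetical helper: multiply x by a 16-bit fixed-point factor c
   (0 <= c < 65536, representing c / 65536) and drop the 16 fraction
   bits, rounding toward negative infinity. */
static int32_t mul_cos16(int32_t x, uint16_t c) {
    /* 64-bit product, so no overflow for any int32 input. */
    int64_t p = (int64_t)x * (int64_t)c;
    /* Right-shifting a negative value is implementation-defined in C,
       so the floor is computed on the magnitude instead. */
    if (p >= 0) return (int32_t)(p >> 16);
    return (int32_t)(-((-p + 65535) >> 16));
}

/* Returns 1 if v is representable as a signed 16-bit integer. */
static int fits_int16(int32_t v) {
    return v >= INT16_MIN && v <= INT16_MAX;
}
```

Because the factor is always below 65536, the result is never larger in magnitude than the input: mul_cos16(32767, 46341) is 23169 and mul_cos16(-32768, 46341) is -23171. So if the input already fits in 16 bits, this step alone does not need a wider result. The additions that follow in the iDCT are a separate matter.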
I'm working on a Theora decoder on an FPGA. I'm writing the hardware
directly in VHDL.

I'm preparing to put the VHDL files in SVN and to post a description of
this work to this list as soon as possible.

Yes, I'm reading the spec.
But sometimes the libtheora software can help.

>
> >> The important thing is not that the iDCT gives you valid values that
> >> make sense in such situations, but that it gives you the _same_ values
> >> across all implementations, even when the input is invalid. If that were
> >> not the case, then the decoded frame would not be the same as what the
> >> encoder _thought_ the decoded frame was going to be, and so the next
> >> subsequent frame would also be wrong, etc., all the way until the next
> >> keyframe.
> >>
> >> Think of it this way: you can never generate a _wrong_ result so long as
> >> you follow the specification. The specification tells you what result
> >> you're going to get for any input. If the encoder chose an input that
> >> caused overflow, well, that's the encoder's problem, not the decoder's.
> >>
> >> > But I'm really confused by the >> 0.
> >> > Does this shift right by zero do anything, or did someone just forget to
> >> > delete it?
> >>
> >> I assume the original author was playing around with dividing up the >>4
> >> in the op[] stage between the two. It doesn't matter; any compiler worth
> >> its salt will optimize the useless operation away.
>

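To make the truncation discussed above concrete, here is a minimal sketch of what keeping only the 16 least significant bits does to a 32-bit sum. The helper name wrap16 is hypothetical. It is written with explicit masking, rather than a plain (ogg_int16_t) cast, only because converting an out-of-range value to a signed 16-bit type is implementation-defined in C. On the usual two's-complement targets both should give the same result.

```c
#include <stdint.h>

/* Hypothetical helper: keep the 16 least significant bits of x and
   reinterpret them as a signed two's-complement 16-bit value. */
static int16_t wrap16(int32_t x) {
    /* Low 16 bits as an unsigned value in [0, 65535]. */
    uint32_t low = (uint32_t)x & 0xFFFFu;
    /* Flip the sign bit and subtract 0x8000: this sign-extends bit 15,
       giving a value in [-32768, 32767]. */
    return (int16_t)((int32_t)(low ^ 0x8000u) - 0x8000);
}
```

For example, wrap16(20000 + 20000) gives -25536 rather than 40000, while an in-range value such as 1234 passes through unchanged. The wrapped value does not "make sense" as a sum, but it is the same for every implementation that keeps 16 bits, which is the property the decoder needs.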
-- 
________________________________________
Felipe Portavales <portavales at gmail.com>
Undergraduate Student - IC-UNICAMP
Computer Systems Laboratory
http://www.lsc.ic.unicamp.br