[Theora-dev] Questions about efficiency.
linesoft at plotinka.ru
Wed Jun 22 01:31:06 PDT 2005
Thanks for the previous answers. Now I have a question about efficiency. I use the following formula for the YUV12->RGB conversion:
float r = nY + 1.371f * ( nV - 128 );
float g = nY - 0.698f * ( nV - 128 ) - 0.336f * ( nU - 128 );
float b = nY + 1.732f * ( nU - 128 );
frame[index + 0] = ClampFloatToByte( r );
frame[index + 1] = ClampFloatToByte( g );
frame[index + 2] = ClampFloatToByte( b );
frame[index + 3] = 255;
But this conversion takes on the order of 10-15 milliseconds on my computer, which is very slow.
After I moved all the calculations to integer arithmetic, I ended up with this code:
/* nU and nV are assumed to be raw 0..255 samples, as in the float version, so 128 is subtracted before scaling */
register signed short r = (signed short)nY + (signed short)( ( 175 * ( nV - 128 ) ) >> 7);
register signed short g = (signed short)nY - (signed short)( ( 89 * ( nV - 128 ) + 43 * ( nU - 128 ) ) >> 7);
register signed short b = (signed short)nY + (signed short)( ( 222 * ( nU - 128 ) ) >> 7);
frame[index + 0] = ClampShortToByte(r);
frame[index + 1] = ClampShortToByte(g);
frame[index + 2] = ClampShortToByte(b);
frame[index + 3] = 255;
This code runs in about 5-7 milliseconds, which is much better but still not fast enough.
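For reference, the integer approach can be written as a small self-contained function. This is only a sketch under assumptions: the name yuv_to_rgba_fixed and the helper clamp_to_byte are hypothetical (the post does not show ClampShortToByte), the inputs are assumed to be raw 0..255 samples, and the coefficients are the float ones scaled by 128 (175/128 ~ 1.371, 89/128 ~ 0.698, 43/128 ~ 0.336, 222/128 ~ 1.732).

```c
#include <stdint.h>

/* Hypothetical helper: clamp an int to the 0..255 range. */
static uint8_t clamp_to_byte(int v)
{
    if (v < 0)   return 0;
    if (v > 255) return 255;
    return (uint8_t)v;
}

/* Convert one pixel using 7-bit fixed-point coefficients.
   y, u and v are assumed to be raw 0..255 samples. */
static void yuv_to_rgba_fixed(uint8_t y, uint8_t u, uint8_t v, uint8_t out[4])
{
    int cu = (int)u - 128;   /* centre the chroma samples around zero */
    int cv = (int)v - 128;

    /* Dividing by 128 rather than shifting by 7 keeps the result
       well defined in C when the product is negative. */
    out[0] = clamp_to_byte(y + (175 * cv) / 128);
    out[1] = clamp_to_byte(y - (89 * cv + 43 * cu) / 128);
    out[2] = clamp_to_byte(y + (222 * cu) / 128);
    out[3] = 255;
}
```

Working in int instead of short avoids any overflow concern in the intermediate products and is usually no slower on 32-bit processors.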
The next optimization I see is to use MMX, because the data is integer and needs to be clamped. However, it
seems to me that I am "reinventing the wheel" and following a path that many people have already taken before me.
Could you please tell me whether there are any available examples of the most efficient possible implementation of
the YUV12->RGB conversion, and where I can read about it?
P.S. My video card doesn't support pixel shaders, so I cannot use it to complete this task.
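One commonly used approach that needs neither MMX nor shaders is to precompute the chroma terms and the clamp in lookup tables, and to reuse each chroma pair for the pixels that share it. The following is a minimal sketch, not a tuned implementation. It assumes a planar 4:2:0 layout with even width and height and tightly packed planes (no row padding, which real decoder buffers may have); all names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

static int16_t tab_rv[256], tab_gv[256], tab_gu[256], tab_bu[256];
static uint8_t tab_clamp[1024];   /* index = value + 384, covers -384..639 */

/* Fill the tables once before converting any frame. */
static void init_tables(void)
{
    int i;
    for (i = 0; i < 256; i++) {
        int c = i - 128;                        /* centred chroma value */
        tab_rv[i] = (int16_t)((175 * c) / 128); /* ~ 1.371 * c */
        tab_gv[i] = (int16_t)((89 * c) / 128);  /* ~ 0.698 * c */
        tab_gu[i] = (int16_t)((43 * c) / 128);  /* ~ 0.336 * c */
        tab_bu[i] = (int16_t)((222 * c) / 128); /* ~ 1.732 * c */
    }
    for (i = 0; i < 1024; i++) {
        int v = i - 384;
        tab_clamp[i] = (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
    }
}

/* Convert a planar 4:2:0 frame to RGBA (assumed layout, see above). */
static void yuv420_to_rgba(const uint8_t *yp, const uint8_t *up,
                           const uint8_t *vp, int width, int height,
                           uint8_t *rgba)
{
    int x, y, k;
    for (y = 0; y < height; y++) {
        const uint8_t *yrow = yp + (size_t)y * width;
        const uint8_t *urow = up + (size_t)(y / 2) * (width / 2);
        const uint8_t *vrow = vp + (size_t)(y / 2) * (width / 2);
        uint8_t *out = rgba + (size_t)y * width * 4;

        for (x = 0; x < width; x += 2) {
            /* One chroma pair serves two horizontally adjacent pixels. */
            int rv = tab_rv[vrow[x / 2]];
            int g  = tab_gv[vrow[x / 2]] + tab_gu[urow[x / 2]];
            int bu = tab_bu[urow[x / 2]];

            for (k = 0; k < 2; k++) {
                int luma = yrow[x + k] + 384;   /* bias into clamp table */
                out[0] = tab_clamp[luma + rv];
                out[1] = tab_clamp[luma - g];
                out[2] = tab_clamp[luma + bu];
                out[3] = 255;
                out += 4;
            }
        }
    }
}
```

The inner loop then contains only table lookups and additions, with no multiplications or branches for clamping. Because the two green terms are truncated separately, the green channel may differ by one from a version that sums them before dividing.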