[theora-dev] Beginner Hurdles
salsaman
salsaman at gmail.com
Mon Aug 16 13:27:18 PDT 2010
Hi Zack,
I think I can help with some of this, since I wrote the ogg/theora
decoder for LiVES (lives.sf.net) and more recently reworked the
ogg/theora demuxer for VLC.
Converting from yuv420 to rgb:
take a look at the code here
http://lives.svn.sourceforge.net/viewvc/lives/trunk/src/colourspace.c
convert_yuv420p_to_rgb_frame()
for Theora you should use clamped (video-range) YUV: 16-235 for Y,
16-240 for U and V. The function uses lookup arrays for the conversion
and does chroma upsampling with an averaging array.
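As a minimal illustration of the clamped-range math, here is plain per-pixel BT.601 fixed-point arithmetic with nearest-neighbour chroma, not the table-driven code in colourspace.c (function and parameter names are just for this sketch):

```c
#include <stdint.h>

static uint8_t clamp255(int v) { return v < 0 ? 0 : (v > 255 ? 255 : v); }

/* Clamped-range BT.601: Y in [16,235], U/V in [16,240].
   Fixed-point version of R = 1.164*(Y-16) + 1.596*(V-128), etc.
   Assumes even width/height; chroma is picked nearest-neighbour
   (one sample per 2x2 luma block), not averaged. */
static void yuv420p_to_rgb(const uint8_t *y, const uint8_t *u,
                           const uint8_t *v, int width, int height,
                           uint8_t *rgb /* packed RGB, 3 bytes per pixel */)
{
    for (int row = 0; row < height; row++) {
        for (int col = 0; col < width; col++) {
            int c = y[row * width + col] - 16;
            int d = u[(row / 2) * (width / 2) + col / 2] - 128;
            int e = v[(row / 2) * (width / 2) + col / 2] - 128;
            uint8_t *px = rgb + (row * width + col) * 3;
            px[0] = clamp255((298 * c + 409 * e + 128) >> 8);           /* R */
            px[1] = clamp255((298 * c - 100 * d - 208 * e + 128) >> 8); /* G */
            px[2] = clamp255((298 * c + 516 * d + 128) >> 8);           /* B */
        }
    }
}
```

A quick sanity check: peak luma (Y=235) with neutral chroma (U=V=128) should come out pure white and Y=16 pure black; if it doesn't, the clamped ranges are being mishandled somewhere.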
Seeking back to the beginning:
you can seek to byte 0 and skip the bos packets, or you can store the
position of data_start which follows the bos packets and seek back to
that. You must remember to call ogg_stream_reset() for each logical
stream after doing any seek operation.
Regards,
Salsaman.
http://lives.sourceforge.net
https://www.ohloh.net/accounts/salsaman
On Sun, Aug 15, 2010 at 5:30 PM, Zack Morris <zmorris at gmail.com> wrote:
> Hey everyone, I just got Theora running on my Mac and ran across
> several hurdles that I was wondering if someone could help me with.
> I do a lot of tech support at work, and get the same questions over
> and over, so I tried skimming the archives but couldn't find the
> answers. Maybe these could go in a FAQ of some sort? These are
> fairly unavoidable issues that should probably be better documented in
> the example at http://svn.xiph.org/trunk/theora/examples/player_example.c
> since I think most people are looking for something a little higher
> level. Here they are:
>
> 1. Many of us just want to render each frame to RGB. I decided to skip
> SDL for now and render the RGB in software so that I understand how it
> works. I will probably make a shader to render the YUV directly at
> some point, but speed is not the issue for me right now.
>
> I'm rendering a 4:2:0 movie by hand right now, stretching the UV to
> double size like the Y. If you follow the example on the line that
> says "} else if (px_fmt==TH_PF_444){" to convert YUV to RGB, you will
> end up with reds that are salmon colored, and the blues are too navy
> colored. I've posted an example here:
>
> http://www.postimage.org/image.php?v=TsBWFfr
>
> The top 3 frames are variations where I'm trying to align the UV
> channels, but the important thing is that VLC does a very good job
> matching colors, and even though MPlayer falls down on the rendering,
> they both do much better than my code. I'm guessing that the example
> code suffers from some problem with gamma on the Mac, so I'm wondering
> if someone has the proper coefficients to render:
>
> r = (1904000*(*py)+2609823*(*pv)-363703744)/1635200;
> g = (3827562*(*py)-1287801*(*pu)-2672387*(*pv)+447306710)/3287200;
> b = (952000*(*py)+1649289*(*pu)-225932192)/817600;
>
> Or if they have the conversion that will stretch this RGB to the right
> color space for the Mac. I'm mainly concerned with the red and blue
> channels. It's OK if the linearity is incorrect, because I'm mostly
> concerned with the max and min values.
>
> If nobody has a solution, then I will just sample my colors and come
> up with the factor to scale them to VLC, but this makes me a little
> uncomfortable.
>
> This could all be solved by including a th_decode_rgb_out() function.
> I realize this would be a convenience function, but if there's ever
> been a clearer need for convenience, I certainly can't think of one!
>
> 2. There is no example code to linearly interpolate the UV up to the
> Y. I found on page 29 of the Theora Specification at http://theora.org/doc/Theora.pdf
> that UV are centered on the 2x2 blocks of the Y. I can come up with
> something that matches my conception of what's happening, but I worry
> that, while videos are being encoded, there is a standard for
> pixel-centered or corner-centered sampling, which could introduce
> 1/2-pixel alignment errors. I think that player_example.c should have a code
> snippet that shows how to properly align the channels.
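A simple averaging upsampler for one chroma row looks like the sketch below. It is only an illustration: it ignores the half-sample siting question raised above, so a spec-exact version would also need the 1/2-pixel phase shift.

```c
#include <stdint.h>

/* Upsample one chroma row to twice its width: even output pixels copy
   the source sample, odd ones average it with the next sample (the
   last pixel just repeats, since there is no right neighbour). */
static void upsample_row_2x(const uint8_t *src, int n, uint8_t *dst)
{
    for (int i = 0; i < n; i++) {
        dst[2 * i] = src[i];
        dst[2 * i + 1] = (i + 1 < n)
            ? (uint8_t)((src[i] + src[i + 1] + 1) / 2)
            : src[i];
    }
}
```

Run once per chroma row and then once per column (or with a matching vertical pass) to bring 4:2:0 chroma up to the luma resolution.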
>
> 3. player_example.c doesn't describe how to rewind the various state
> variables back to the beginning of the movie. Seeking is probably
> overkill, but it's important to be able to rewind so that games can
> use movies that loop.
>
> 4. Timing is not well documented. It took me a while to realize that
> th_granule_time() actually returns the granule time PLUS the duration
> of the current frame, i.e. the "stale" (end) time. This is documented at http://www.theora.org/doc/libtheora-1.0/group__basefuncs.html#g707e1e281de788af0df39ef00f3fb432
> but also deserves a comment on the line that says
> "videobuf_time=th_granule_time(td,videobuf_granulepos);". Also, there
> is no documentation that I can find that describes how Vorbis does
> timing, so I came up with the following code, where I calculate my own
> timing based on the current byte position in the audio stream:
>
> *audioStartTime = audiobuf_time; // the buffer time from the last call,
>                                  // which was the start of the buffer
> audiobuf_time = ((double) audiobuf_granulepos)/vi.rate;
> audiobuf_granulepos += (*numBytes/2)/2; // a granule is a left/right stereo
>                                         // pair: size in bytes / 2 bytes per
>                                         // 16-bit sample / 2 channels
> *audioStaleTime = audiobuf_time;
>
> As far as I can tell, this results in proper syncing but I could be
> off a frame and not know it...
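That bookkeeping can be checked in isolation. A minimal sketch of the same arithmetic, assuming 16-bit stereo PCM (the function name is made up for this example):

```c
#include <stdint.h>

/* One Vorbis granule is one PCM sample per channel, so a buffer of
   nbytes of 16-bit stereo PCM advances the granule position by
   nbytes / 2 (bytes per sample) / 2 (channels). The stream time is
   simply the granule position divided by the sample rate. */
static double advance_audio_clock(int64_t *granulepos, int nbytes, int rate)
{
    *granulepos += (nbytes / 2) / 2;
    return (double)*granulepos / rate;
}
```

For example, 176400 bytes of 44.1 kHz 16-bit stereo is exactly one second, so it advances the granule position by 44100.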
>
> 5. player_example.c is much too complicated. Now that my code is
> finished, this is my draw loop. Under the hood, I'm using an RGBA
> texture in OpenGL and a buffered sound channel class I made on top of
> OpenAL. I believe the code largely speaks for itself, but I included
> comments on the confusing parts:
>
> ////////////////////////////////////////
>
> Video video;                // my wrapper class for Ogg Theora
> ImageBuffer videoBuffer;    // my wrapper class for OpenGL texture
> BufferSoundChannel bufChan; // my wrapper class for OpenAL
>
> if( video.Load( "~/Desktop/ogg-theora-tests/320x240.ogg" ) != noErr )
>     ErrorDialog( "Couldn't load file" );
>
> double startTime = video.HasAudio() ? 0 : bufChan.GetTime();
>
> double audioStartTime = 0, audioStaleTime = 0,
>        videoStartTime = 0, videoStaleTime = 0;
> int audioStatus = kVideoAudioBuffering, videoStatus = kVideoVideoBuffering;
> Boolean audioPlaying = false;
>
> while( !KeyDown( escape_key ) )
> {
>     short *buffer;
>     uint size;
>     double theTime;
>
>     //if( video.Done() ) video.Reset(); // I don't have rewind working yet
>
>     video.Idle(); // buffer data even if no audio or video frames to display
>
>     theTime = bufChan.GetTime() - startTime;
>
>     // see if the channel is ready to accept more audio, and if so, get the
>     // next temporary buffer in which to write the samples
>     if( audioStatus < 0 && bufChan.IsReady( (void**) &buffer, &size ) )
>         // returns the frame number of the audio, or a negative value to
>         // indicate wrong state or an error
>         audioStatus = video.GetAudio( buffer, &size, &audioStartTime, &audioStaleTime );
>
>     if( audioStatus >= 0 )
>     {
>         theTime = bufChan.GetTime() - startTime;
>
>         if( theTime >= audioStartTime )
>         {
>             // start the sync at the exact moment we play the first buffer
>             if( !startTime ) startTime = bufChan.GetTime();
>
>             audioPlaying = true;
>
>             bufChan.Play( buffer, size ); // queue the next audio buffer
>
>             // indicate that we are ready to read the next audio frame
>             audioStatus = kVideoAudioBuffering;
>         }
>     }
>
>     theTime = bufChan.GetTime() - startTime;
>
>     // wait for the first audio buffer to play, because video can play faster
>     // to catch up, but audio is locked into the sample rate
>     if( videoStatus < 0 && (audioPlaying || !video.HasAudio()) )
>         // returns the frame number of the video, or a negative value to
>         // indicate wrong state or an error
>         videoStatus = video.GetVideo( videoBuffer, &viewRect, &videoStartTime, &videoStaleTime );
>
>     if( videoStatus >= 0 )
>     {
>         theTime = bufChan.GetTime() - startTime;
>
>         if( theTime >= videoStartTime )
>         {
>             // can just flush the texture buffer to the screen, because the
>             // video frame was written to videoBuffer when the frame was ready
>             FlushToScreen();
>
>             // indicate that we are ready to read the next video frame
>             videoStatus = kVideoVideoBuffering;
>         }
>     }
> }
>
> ////////////////////////////////////////
>
> Conceptually, the important parts are that I can grab the next audio
> or video frame whenever I need it, and then I just wait until my buffered
> sound channel's timestamp reaches the audio and video buffers' start
> times. Then I either queue the audio buffer samples or flush the
> video buffer to the screen. I'm not calibrating my audio clock, but
> that's ok for the short duration videos we plan to use for now.
>
> IMHO, this is the example that people are looking for, and I don't see
> Theora gaining popularity in games as quickly if there is no high-
> level usage example like this. I don't mean to sound harsh, it's just
> that I have a 4 year engineering degree and 20 years of programming
> experience and still found player_example.c to be quite cryptic.
> Conceptually it's very well written, and I appreciate things like
> select on the audio stream, but unfortunately with Mac and Windows
> operating systems, a lot of programmers today don't understand the
> simplicity of streams or even how to access them through the OS.
>
> Sometimes we take our experience for granted and expect others to
> either know what we are talking about or to read the entire manual and
> understand the fundamentals before using a library. But in the real
> world, it's almost impossible for that to happen when there are
> divisions of labor and constant deadlines.
>
> If I can get the RGB gamma stuff working properly, I might consider
> releasing this code as an alternate player example that uses OpenGL
> and OpenAL. It might take me a while, because my OpenGL and OpenAL
> stuff is tied pretty solidly into our engine, but maybe I could rig a
> barebones example with GLUT. But I would like to see it happen :)
>
> Well I guess that's it, sorry for the long email, but I think these
> are issues that everyone is going to hit, so they need to be remedied.
>
> Thanks for writing Ogg Theora; I think it's a wonderful library that
> just needs some better documentation and examples. I tend to work on the
> back end of things and realize that getting someone to write concise
> example code is like herding cats.
>
> Zack Morris
> zmorris at zsculpt.com
> _______________________________________________
> theora-dev mailing list
> theora-dev at xiph.org
> http://lists.xiph.org/mailman/listinfo/theora-dev
>