[Theora-dev] ogg and keyframes
Ralph Giles
giles at xiph.org
Wed Sep 29 16:30:48 PDT 2004
On Wed, Sep 29, 2004 at 08:31:46PM +0200, Thomas Vander Stichele wrote:
> When thinking more about this problem I was convinced that it cannot be
> guaranteed by the server to start from a keyframe. Because if you're
> sending out an ogg stream with *two* theora streams, you can't make sure
> both of them start at a keyframe.
Well, yes you can. That's why I was curious about how your fix worked.
The two things I've thought of are:
You can do a special re-encode of everything up to the first keyframe
for each new client that connects. Could be quite expensive, but you can
degrade by just repeating frames if you run out of resources. Is a
jerky/static picture with sound better than a 'buffering...' spinner
with sound?
You can just adjust the sync between the various clients so each new
connection starts streaming from the previous keyframe. Cheaper, but
increases latency and server-side buffering. You're right about multiple
theora streams in this case; I hadn't thought of that. Treating it as
'burst on connect' might work once Moore's law makes the stream less
expensive.
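For a single video stream, the 'previous keyframe' variant might look
roughly like the sketch below. None of this is real streaming-server
code; page_is_video_keyframe(), send_to_all_clients() and
send_to_client() are placeholders for the server's own Ogg demuxing and
socket handling, and a real server would also have to cache the
BOS/header pages and send those to a new client before the backlog.

  #include <stdlib.h>
  #include <string.h>
  #include <ogg/ogg.h>

  /* Placeholders for the server's own code -- they don't exist anywhere. */
  extern int  page_is_video_keyframe(ogg_page *og);
  extern void send_to_all_clients(ogg_page *og);
  extern void send_to_client(int sock, unsigned char *buf, long len);

  typedef struct {
    unsigned char *data;              /* header + body, concatenated */
    long           len;
  } buffered_page;

  static buffered_page *backlog = NULL;
  static int backlog_len = 0;

  static void backlog_clear(void)
  {
    int i;
    for (i = 0; i < backlog_len; i++) free(backlog[i].data);
    backlog_len = 0;
  }

  static void backlog_append(ogg_page *og)
  {
    buffered_page p;
    p.len  = og->header_len + og->body_len;
    p.data = malloc(p.len);
    memcpy(p.data, og->header, og->header_len);
    memcpy(p.data + og->header_len, og->body, og->body_len);
    backlog = realloc(backlog, (backlog_len + 1) * sizeof(*backlog));
    backlog[backlog_len++] = p;
  }

  /* Called for every page the encoder produces. */
  void relay_page(ogg_page *og)
  {
    if (page_is_video_keyframe(og))
      backlog_clear();                /* a keyframe starts a fresh backlog */
    backlog_append(og);
    send_to_all_clients(og);
  }

  /* Called when a new listener connects: burst everything back to the
     last keyframe so its decoder has something it can actually decode. */
  void new_client(int sock)
  {
    int i;
    for (i = 0; i < backlog_len; i++)
      send_to_client(sock, backlog[i].data, backlog[i].len);
  }

Your two-theora-streams problem is visible here too: a single backlog
like this can only be anchored to one stream's keyframes.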
> Whether it should be in the lib or not is a different matter.
> Personally, I'd feel the lib should do it, since really, what is it
> decoding and then sending on to the app if it didn't have a keyframe
> yet ?
All of which is fairly convincing: regardless of what kind of job
the server does, the client should handle this case. It needs to worry
about many of the same issues after seek anyway.
However, having libtheora itself 'just' do this is problematic. Waiting
for a keyframe after decoder initialization would handle the streaming
case, but another common place you'd want this is after seek. We could
look at the packetno passed in as part of the ogg_packet structure, and
set the wait flag whenever there was a gap, but that feels unwise to me.
It comes out naturally from an Ogg stream, but it may be inconvenient to
set this meaningfully from another packet source, and having decode
never produce any output because you didn't increment the field
certainly violates least surprise. Also, for example, it's likely better
to keep decoding if only one or two inter-frame packets were lost to
dropout than to blank completely until the next keyframe, but the
playback engine is in a better position to set policy on that than the
codec.
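To make that concrete: the packetno bookkeeping and the loss policy can
live entirely in the player, with no change to libtheora at all. A
rough sketch (the threshold and names are invented; the keyframe test
uses the Theora bitstream convention that data packets have the top bit
of the first byte clear, and the next bit clear for intra frames, with
zero-length packets being duplicate frames):

  #include <ogg/ogg.h>

  /* Hypothetical per-stream player state. */
  typedef struct {
    ogg_int64_t last_packetno;
    int         waiting_for_keyframe;
  } video_state;

  static int packet_is_keyframe(ogg_packet *op)
  {
    /* data packets have bit 0x80 clear; bit 0x40 clear means intra */
    return op->bytes > 0 && !(op->packet[0] & 0x80) && !(op->packet[0] & 0x40);
  }

  /* Returns 1 if the packet should be handed to theora_decode_packetin(). */
  int video_accept_packet(video_state *vs, ogg_packet *op)
  {
    ogg_int64_t gap = op->packetno - vs->last_packetno - 1;
    vs->last_packetno = op->packetno;

    /* Player policy (arbitrary threshold): ride out a frame or two of
       dropout, but resync on a keyframe after anything bigger.  A seek
       would simply set waiting_for_keyframe directly. */
    if (gap > 2)
      vs->waiting_for_keyframe = 1;

    if (vs->waiting_for_keyframe) {
      if (!packet_is_keyframe(op))
        return 0;                     /* decoder has no valid reference yet */
      vs->waiting_for_keyframe = 0;
    }
    return 1;
  }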
Which brings me to the idea of an explicit call to set the wait flag.
Something like theora_decode_ctl(ctx, SET_KEYFRAME_WAIT); You'd just
call that after decoder init, and after seek, and things would take
care of themselves.
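In use that would be just two extra lines, remembering that
theora_decode_ctl() and SET_KEYFRAME_WAIT are the proposal here and
don't exist in the library yet:

  theora_info  ti;
  theora_state td;

  /* ...header packets parsed into ti... */
  theora_decode_init(&td, &ti);
  theora_decode_ctl(&td, SET_KEYFRAME_WAIT);  /* fresh decoder, no reference yet */

  /* ...and again whenever the demuxer seeks, since the references go stale... */
  theora_decode_ctl(&td, SET_KEYFRAME_WAIT);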
Having the decoder just eat non-keyframes when the wait flag is set
is the best option for performance, but this requires keeping track of
the flag in the playback engine as well; otherwise the recommended
greedy algorithm will starve audio playback waiting for the first video
frame.
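Roughly, the player's prebuffering loop would then need its own copy of
the flag, something like the sketch below. Again SET_KEYFRAME_WAIT is
only the proposed call, the player_* helpers are invented placeholders,
and packet_is_keyframe() is the same test as in the earlier sketch:

  #include <theora/theora.h>

  /* Invented placeholders for the application's own demux/audio code. */
  extern int  player_next_video_packet(ogg_packet *op);
  extern int  player_audio_buffered(void);
  extern int  packet_is_keyframe(ogg_packet *op);

  void prebuffer(theora_state *td)
  {
    ogg_packet op;
    int kf_waiting = 1;     /* player's mirror of the decoder's wait flag */
    int have_frame = 0;

    theora_decode_ctl(td, SET_KEYFRAME_WAIT);       /* proposed call */

    /* Keep filling until audio is ready and we either have a frame or
       know the decoder is still (legitimately) waiting for a keyframe. */
    while (!player_audio_buffered() || (!have_frame && !kf_waiting)) {
      if (player_next_video_packet(&op)) {
        if (kf_waiting && packet_is_keyframe(&op))
          kf_waiting = 0;
        theora_decode_packetin(td, &op);
        if (!kf_waiting)
          have_frame = 1;
      }
      /* ...audio packets handled in the same loop, omitted... */
    }
    /* Start playback: audio is buffered, and we didn't hold it hostage
       to a video frame that can't exist before the next keyframe. */
  }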
I had thought of generalizing the YUV buffer into something that had a
'type' field. The main idea would be to allow some convenient
colorspace conversions (like to chunked sRGB) to be included in the
library without an extra filter step, but a 'null' frame type could be
added to carry the frame skip information back to the playback engine.
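Purely to illustrate the shape of that idea (none of these names exist;
yuv_buffer is what the decoder hands back today):

  /* Hypothetical generalization of yuv_buffer; everything here is invented. */
  typedef enum {
    TH_FRAME_YUV420,   /* what yuv_buffer carries now */
    TH_FRAME_RGB24,    /* example of a conversion done inside the library */
    TH_FRAME_NULL      /* no picture: packet was eaten while waiting for a
                          keyframe, so the player just keeps the last frame */
  } th_frame_type;

  typedef struct {
    th_frame_type  type;
    int            width, height, stride;
    unsigned char *data;    /* layout depends on type; NULL for TH_FRAME_NULL */
  } th_frame;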
Thanks for the feedback. Hope you have time for more. :)
-r