[theora] Indexing Ogg files for faster seeking

Chris Pearce chris at pearce.org.nz
Tue Oct 13 13:46:09 PDT 2009


On 10/11/2009 10:55 PM, Silvia Pfeiffer wrote:
> Since you have implemented both, a one-index and a multi-index format
> - what were your experiences? Which is better?

The multi-track index was significantly easier to implement, as I didn't 
have to merge multiple lists of keyframes into a single keyframe index. 
I had a few subtle bugs in that merging code the first time.

> I suppose for
> multi-index files, when you seek, you have to actually look at all the
> indexes and find the minimum packet (byte location) across all the
> tracks to which to seek.

Yup, a simple bisection search in each track's index will suffice. Then 
just seek the file to the minimum of all the streams' keypoints' offsets.
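
A minimal sketch of that lookup in Python, for illustration only. The 
in-memory layout (a sorted list of (time_ms, byte_offset) pairs per track) 
and the names find_keypoint and seek_offset are assumptions made for this 
example, not the on-disk index format.

```python
import math
from bisect import bisect_right

def find_keypoint(index, target_ms):
    """Return the last (time_ms, byte_offset) keypoint at or before target_ms.

    `index` is a list of (time_ms, byte_offset) tuples sorted by time.
    This layout is hypothetical, chosen only for the example.
    """
    # (target_ms, inf) sorts after every keypoint whose time is <= target_ms,
    # so bisect_right lands just past the keypoint we want, in O(log n).
    i = bisect_right(index, (target_ms, math.inf)) - 1
    # If the target precedes the first keypoint, fall back to the first one.
    return index[max(i, 0)]

def seek_offset(track_indexes, target_ms):
    """Byte offset to seek to: the minimum keypoint offset over all tracks."""
    return min(find_keypoint(idx, target_ms)[1]
               for idx in track_indexes.values())

# Two tracks with made-up keypoints.
tracks = {
    "video": [(0, 100), (2000, 5000), (4000, 9000)],
    "audio": [(0, 200), (1000, 3000), (3000, 7000)],
}
print(seek_offset(tracks, 2500))
```

Taking the minimum offset means the decoder starts reading early enough to 
see the relevant keypoint of every track before the seek target.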

> Did you do any comparisons whether that is
> faster in the end? I can see the overhead is substantial, but not
> really that big if it's about 200bytes for a 10min video.

10mins of video takes about 11KB to index, depending on the video. A 
bisection search is O(log n), so finding a key point for any track is 
fast. Any time spent doing these searches is dwarfed by the cost of 
network latency.

Just a thought, anyone who's worried about the index being too large a 
blob at the start of a large file served as HTML5 video can always 
include a poster image to display while the video's downloading.

> I personally think the multi-index is better

I think we now all agree on this at least. :)

> Also, I still believe it would make sense to include the Index packets
> into Ogg Skeleton (as you may have gathered from the emails I just
> sent on the other thread).

I accept that adding the index to the skeleton would be a carrot to 
encourage players to use the skeleton, just so they can get the index. 
It's not clear to me how a player is otherwise advantaged by reading the 
skeleton track, hence my reluctance to include the index in the skeleton.

> Ogg Skeleton has been built to be extensible and to include all "meta
> data" about the content tracks, so IMO it is still a natural fit. Ogg
> Skeleton is really a wrapper for every other content track in Ogg, so
> an Index should really go into it. People who don't want to use the
> Index can just use a previous version of Ogg Skeleton. Or
> alternatively we can make the new Ogg Skeleton version flexible to
> include or not include an index per track. Splicing in won't be more
> difficult if we follow Conrad's suggestion of adding an index packet
> per track rather than merging it with the existing packet per track.

What software out there currently reads the skeleton?

I'm also concerned that any players which actually read the skeleton 
will be crashy. I'll mock up some Oggs with indexes in the skeleton, and 
see how players handle them.

> Further on the argument for re-using Skeleton: it already has a field
> in its first header called "presentation time" which is indeed what
> you are repeating in the Ogg Index header as "playback start time". An
> addition of "playback end time" is possible - I assume you are
> recommending this as a replacement for the HTTP X-Content-Duration
> header?

Yeah, I want a mechanism which reports the duration which doesn't 
require server support. Configuring servers is a hassle.

It may be wise to allow the X-Content-Duration header to override the 
duration from the index, in case the resource is a chain.
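
As a rough sketch of that precedence rule in Python: the function name and 
parameters are hypothetical, and it assumes the header value is given in 
seconds while the index times are in milliseconds.

```python
def effective_duration_ms(index_start_ms, index_end_ms, header_duration_s=None):
    """Pick the duration to report for a resource.

    Assumptions for this example only: index times are in milliseconds and
    the X-Content-Duration value, if the server sent one, is in seconds.
    """
    if header_duration_s is not None:
        # The header wins, e.g. for a chain where the first segment's index
        # only covers that segment.
        return round(header_duration_s * 1000)
    # No header: fall back to the span covered by the index.
    return index_end_ms - index_start_ms

print(effective_duration_ms(0, 600000))
print(effective_duration_ms(0, 600000, header_duration_s=1200.5))
```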

> Also, is the "length" basically the difference between the
> playback start and end time, but in bytes? How does that help?

Yup, it's the length in bytes. I included that so that you can jump 
between segments easily in chained files. That way if you have a chain 
of multiple ogg files ("sequentially multiplexed bitstreams" as you 
refer to them in the Skeleton RFC), you can determine the offset of 
the start of the next segment/file. When seeking, if the seek target is 
outside of the bounds of your segment's index's [start time, end time], 
you know to skip to the start of the next segment to try seeking in there.
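
A small Python sketch of that chain walk. The SegmentIndex fields and the 
locate_segment helper are hypothetical names for this example, and it 
assumes the segments' start and end times sit on one shared timeline. A 
real player would read each segment's index from the file at the computed 
offset; a plain list stands in for that here.

```python
from dataclasses import dataclass

@dataclass
class SegmentIndex:
    # Hypothetical in-memory view of one segment's index header.
    start_ms: int      # playback start time of the segment
    end_ms: int        # playback end time of the segment
    length_bytes: int  # length of the segment in bytes

def locate_segment(segments, target_ms):
    """Return (byte offset of segment start, index) for the segment whose
    [start time, end time] contains target_ms, or None if no segment does."""
    offset = 0
    for index in segments:
        if index.start_ms <= target_ms <= index.end_ms:
            return offset, index
        # Target is outside this segment: skip its whole length to reach
        # the start of the next segment in the chain.
        offset += index.length_bytes
    return None

chain = [
    SegmentIndex(start_ms=0, end_ms=10000, length_bytes=50000),
    SegmentIndex(start_ms=10001, end_ms=25000, length_bytes=80000),
]
print(locate_segment(chain, 12000))
```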

> Anyway - I love the work and I want it moved forward. Let's sort out
> the details.

Yes, let's! What is the process for getting this "blessed" by the community 
and made an official Ogg standard?

FYI: I'm going to start work on a patch to ffmpeg2theora soonish.

All the best,
Chris P.

