[theora] Indexing Ogg files for faster seeking
silviapfeiffer1 at gmail.com
Sun Oct 11 02:55:10 PDT 2009
I have some questions.
Since you have implemented both a one-index and a multi-index format,
what were your experiences? Which is better? I suppose for
multi-index files, when you seek, you have to look at all the
indexes and find the minimum packet (byte location) across all the
tracks to which to seek. Did you do any comparisons whether that is
faster in the end? I can see the overhead is substantial, but not
really that big if it's about 200 bytes for a 10 min video.
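To make the multi-index seek concrete, here is a minimal Python sketch of what I mean. It is only an illustration: the name seek_offset, the dictionary layout, and the assumption that key point times increase along with offsets are all mine, not part of the proposed spec.

```python
from bisect import bisect_right


def seek_offset(indexes, target_ms):
    """Return the byte offset from which every track can reach target_ms.

    indexes maps serialno -> list of (offset, time_ms) key points, sorted by
    offset. Assumption (not in the spec): times increase with offsets.
    """
    best = None
    for keypoints in indexes.values():
        times = [t for _, t in keypoints]
        # Index of the last key point at or before the target time.
        i = bisect_right(times, target_ms) - 1
        if i < 0:
            # Hypothetical fallback: this track has no usable key point.
            return None
        offset = keypoints[i][0]
        best = offset if best is None else min(best, offset)
    return best


example = {
    1: [(0, 0), (5000, 2000), (9000, 4000)],   # e.g. a video track
    2: [(100, 0), (4000, 1500), (8000, 3500)], # e.g. an audio track
}
result = seek_offset(example, 3000)
```

The cost is one binary search per track, so the lookup itself should be cheap next to any network round trip.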
I personally think the multi-index is better - in particular if media
fragment URIs in future will allow you to address just a subpart of
all the tracks of a video and thus the server can just cut and paste
the tracks out rather than having to re-calculate an index.
Also, I still believe it would make sense to include the Index packets
into Ogg Skeleton (as you may have gathered from the emails I just
sent on the other thread).
Ogg Skeleton has been built to be extensible and to include all "meta
data" about the content tracks, so IMO it is still a natural fit. Ogg
Skeleton is really a wrapper for every other content track in Ogg, so
an Index should really go into it. People who don't want to use the
Index can just use a previous version of Ogg Skeleton. Or
alternatively we can make the new Ogg Skeleton version flexible to
include or not include an index per track. Splicing in won't be more
difficult if we follow Conrad's suggestion of adding an index packet
per track rather than merging it with the existing packet per track.
Further on the argument for re-using Skeleton: it already has a field
in its first header called "presentation time" which is indeed what
you are repeating in the Ogg Index header as "playback start time". An
addition of "playback end time" is possible - I assume you are
recommending this as a replacement for the HTTP X-Content-Duration
header? Also, is the "length" basically the difference between the
playback start and end time, but in bytes? How does that help?
Anyway - I love the work and I want it moved forward. Let's sort out
On Thu, Oct 8, 2009 at 11:48 AM, Chris Pearce <chris at pearce.org.nz> wrote:
> Below is another version of the index track spec with one index packet
> per stream.
> The index format is still quite simple, though not as compact as the
> previous "one merged index per file" approach. I estimate that if you
> index two tracks, assuming one key point every two seconds from both
> tracks, that in practice it will take approximately 70KB per hour of
> video (11.6KB per 10 minutes) to index two-track video. That's about 20
> bytes of index per second of video.
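The estimate above can be checked with a little arithmetic, using the key point layout from the spec further down (8 byte offset, 4 byte checksum, 8 byte time). This is a rough sketch: the function name index_bytes is made up, and it ignores the fixed per-packet fields and Ogg page overhead.

```python
# One key point: 8 byte offset + 4 byte checksum + 8 byte presentation time.
KEY_POINT_BYTES = 8 + 4 + 8


def index_bytes(duration_s, tracks=2, interval_s=2.0):
    """Approximate index payload size, ignoring packet and page overhead."""
    key_points_per_track = duration_s / interval_s
    return int(tracks * key_points_per_track * KEY_POINT_BYTES)


per_second = index_bytes(1)      # two tracks, one key point every 2 s
per_10min = index_bytes(600)
per_hour = index_bytes(3600)
```

With these assumptions the payload comes to 20 bytes per second, 12,000 bytes per 10 minutes and 72,000 bytes per hour, which is consistent with the figures quoted.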
> With the original "one merged index per file" approach it's about half
> that, but I think the added size is an acceptable trade off. I imagine
> the majority of video out there on the internet is under 10 minutes long
> anyway (requiring a 12KB index...), and when playing files over a
> network, most reasonable quality videos will require about 100KB/s of
> bandwidth to play back smoothly. If you've got a connection fast
> enough for streaming video, you won't notice downloading an index.
> You can tweak the index-keyframe interval to reduce the index size as
> well, though that erodes the benefit of the index for network playback.
> I've implemented this in my indexer on a new branch on my GitHub account:
> New spec here:
> Firefox builds which can handle new index format here:
> Demo here:
> New Proposed Index Track Format:
> An Ogg index track starts with an identifier header packet which
> contains the following data, in the following order:
> * The identifier "index\0".
> * The index version format number, as a 1 byte unsigned integer. This
> specification describes version 1, so this field should have the
> value 0x01.
> * The playback start time, in milliseconds, as an 8 byte unsigned
> integer. This is the presentation time of the first frame.
> * The playback end time, in milliseconds, as an 8 byte unsigned
> integer. This is the end time of the last frame.
> * The length of the indexed segment, in bytes, as an 8 byte unsigned
> integer.
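For reference, the identifier header described above could be packed and parsed roughly as follows. This is a sketch under assumptions: the spec text does not state a byte order, so little-endian (as used elsewhere in Ogg) is my guess, and the function names are hypothetical.

```python
import struct

# "index\0" (6 bytes), version (1 byte), then three 8 byte unsigned integers.
# Assumption: little-endian; the quoted spec does not state the byte order.
HEADER = struct.Struct("<6sBQQQ")
MAGIC = b"index\0"


def pack_index_header(start_ms, end_ms, length_bytes, version=1):
    """Build the identifier header packet payload."""
    return HEADER.pack(MAGIC, version, start_ms, end_ms, length_bytes)


def unpack_index_header(data):
    """Parse the header, returning (version, start_ms, end_ms, length_bytes)."""
    magic, version, start_ms, end_ms, length_bytes = HEADER.unpack_from(data, 0)
    if magic != MAGIC:
        raise ValueError("not an index header packet")
    return version, start_ms, end_ms, length_bytes


header = pack_index_header(start_ms=0, end_ms=600_000, length_bytes=50_000_000)
```

Under these assumptions the header payload is 31 bytes long.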
> The track then contains secondary header packets, which contain the
> actual indexes. These are the "index packets", and each must begin on a
> new page, but they may span multiple pages. There is one index packet
> for each content stream in the Ogg segment, and they appear in
> increasing order of the streams' serialno. Each index packet contains
> the following:
> * The serialno of the stream as a 4 byte field.
> * The number of key points in the index packet, 'n', as a 4 byte
> unsigned integer.
> * 'n' key points, each of which contain, in the following order:
> - the page's byte offset as an 8 byte unsigned integer, followed by
> - the checksum of the page found at the offset, as a 4 byte field,
> followed by
> - the presentation time in milliseconds of the key point, as an 8
> byte unsigned integer.
> The key points are stored in increasing order by offset. The
> presentation time of the key point is calculated from the granulepos.
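The index packet layout above maps onto a fixed 8 byte head followed by 20 bytes per key point. A minimal sketch of writing and reading one, again assuming little-endian fields (not stated in the spec) and using made-up function names:

```python
import struct

# Assumption: little-endian fields; the quoted spec does not say.
PACKET_HEAD = struct.Struct("<II")  # serialno, number of key points
KEY_POINT = struct.Struct("<QIQ")   # page offset, page checksum, time in ms


def pack_index_packet(serialno, key_points):
    """key_points is an iterable of (offset, checksum, time_ms) tuples."""
    ordered = sorted(key_points)  # the spec stores them by increasing offset
    parts = [PACKET_HEAD.pack(serialno, len(ordered))]
    parts += [KEY_POINT.pack(o, c, t) for o, c, t in ordered]
    return b"".join(parts)


def unpack_index_packet(data):
    """Return (serialno, [(offset, checksum, time_ms), ...])."""
    serialno, n = PACKET_HEAD.unpack_from(data, 0)
    points = [
        KEY_POINT.unpack_from(data, PACKET_HEAD.size + i * KEY_POINT.size)
        for i in range(n)
    ]
    return serialno, points


packet = pack_index_packet(1234, [(9000, 0xBEEF, 4000), (5000, 0xCAFE, 2000)])
```

The checksum values here are placeholders; in a real file they would be the CRC of the Ogg page found at each offset.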
> The last packet in the track is an empty EOS packet, which must start on
> a new page.
> Note that this format can be encoded in one pass. If you know the
> duration of the media, you can decide the keyframe interval (say one
> every 2 seconds, which is roughly ffmpeg2theora's default for theora
> anyway) and then allocate the required space in the index packets and
> come back and fill it in once you've encoded the media.
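The reserve-then-fill idea could look something like this in Python. It is a sketch only: the function names, the little-endian layout, and rounding the key point count up are my assumptions, and a real encoder would also have to cope with finding more or fewer keyframes than it reserved slots for.

```python
import math
import struct

# Assumption: little-endian fields, as in the rest of Ogg.
HEAD = struct.Struct("<II")        # serialno, number of key points
KEY_POINT = struct.Struct("<QIQ")  # offset, checksum, time in ms


def reserve_index_packet(serialno, duration_ms, interval_ms=2000):
    """Allocate a zero-filled index packet sized for the expected key points."""
    n = math.ceil(duration_ms / interval_ms)
    buf = bytearray(HEAD.size + n * KEY_POINT.size)
    HEAD.pack_into(buf, 0, serialno, n)
    return buf


def fill_key_point(buf, i, offset, checksum, time_ms):
    """Write key point number i into a previously reserved packet."""
    KEY_POINT.pack_into(buf, HEAD.size + i * KEY_POINT.size,
                        offset, checksum, time_ms)


buf = reserve_index_packet(serialno=1, duration_ms=600_000)
fill_key_point(buf, 0, offset=4096, checksum=0, time_ms=0)
```

For a 10 minute track at one key point every 2 seconds this reserves 300 slots, i.e. 6008 bytes of packet payload.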
> Comments? Questions etc?
> Chris P.