[ogg-dev] [PATCH] oggz: inefficient seeking

Sean Young sean at mess.org
Sun May 3 18:05:10 PDT 2009


I have a 1.1G Ogg file containing Vorbis and Theora streams.
oggz_seek_units() takes 14 seconds to find a position towards the end of
the file. The function guess() in oggz_seek.c first guesses a position at
about 1.5G, and the seek then slowly searches backwards until it finds the
end of the file (continuously seeking beyond the end of the file and then
calling read, which returns 0). After that it does a linear scan from the
beginning of the file, reading every page!

The 14 seconds is with the file not in the page cache; with the file in
the page cache it takes 5 seconds.

The function guess() seems to me to be a stab-in-the-dark since it does
not know how long the file is in actual units. Here are the changes:

1) If guess() returns a position beyond the end of the file, clamp it to
   the end of the file.
2) Before guessing, determine the length of the file in units.
3) Fix the output when DEBUG is defined.
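
To illustrate the idea behind changes 1) and 2), here is a minimal
standalone sketch, not the liboggz code itself. The function name
interpolate_offset() and its parameters are hypothetical; it assumes the
total length in units is already known and simply interpolates a byte
offset, clamping the result so it never lands beyond the end of the file.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of liboggz): estimate the byte offset of
 * unit_target by linear interpolation over the known unit range
 * [unit_begin, unit_end], clamped to [offset_begin, offset_end].
 */
static int64_t
interpolate_offset (int64_t unit_target,
                    int64_t unit_begin, int64_t unit_end,
                    int64_t offset_begin, int64_t offset_end)
{
  double ratio;
  int64_t guess;

  assert (offset_end >= offset_begin);

  /* If the length in units is unknown (e.g. unit_end == -1) or empty,
   * there is nothing to interpolate over; fall back to the start. */
  if (unit_end <= unit_begin) return offset_begin;

  /* Targets outside the unit range map to the ends of the file. */
  if (unit_target <= unit_begin) return offset_begin;
  if (unit_target >= unit_end) return offset_end;

  /* Fraction of the way through the file, measured in units. */
  ratio = (double) (unit_target - unit_begin) /
          (double) (unit_end - unit_begin);

  guess = offset_begin + (int64_t) (ratio * (double) (offset_end - offset_begin));

  /* Clamp, so rounding can never push the guess past either bound. */
  if (guess > offset_end) guess = offset_end;
  if (guess < offset_begin) guess = offset_begin;

  return guess;
}
```

For example, with units running from 0 to 1000 and a file of 1000000
bytes, a target of 900 gives an offset of 900000, and any target past
1000 is clamped to 1000000 instead of seeking beyond the end of the file.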

Now it takes 0.0023 seconds to perform the seek. The patch is against
git commit 718a73d31011a7985365f4473e735b96791c7156.

Thanks,
Sean
--- 
diff --git a/configure.ac b/configure.ac
index 6a2bb05..92075a3 100644
--- a/configure.ac
+++ b/configure.ac
@@ -314,7 +314,7 @@ if test $SIZEOF_OGGZ_OFF_T = 4 ; then
     OGGZ_OFF_MAX="0x7FFFFFFF"
     PRI_OGGZ_OFF_T="l"
 elif test $SIZEOF_OGGZ_OFF_T = 8 ; then
-    PRI_OGGZ_OFF_T=PRId64
+    PRI_OGGZ_OFF_T="ll"
 fi
 
 dnl The following configured variables are written into the public header
diff --git a/src/liboggz/oggz_seek.c b/src/liboggz/oggz_seek.c
index c46f0ab..a8c1476 100644
--- a/src/liboggz/oggz_seek.c
+++ b/src/liboggz/oggz_seek.c
@@ -506,9 +506,15 @@ guess (ogg_int64_t unit_at, ogg_int64_t unit_target,
 
   if (unit_at == unit_begin) return offset_begin;
 
-  guess_ratio =
-    GUESS_MULTIPLIER * (unit_target - unit_begin) /
-    (unit_at - unit_begin);
+  if (unit_end != -1) {
+    guess_ratio =
+      GUESS_MULTIPLIER * (unit_target - unit_begin) /
+      (unit_end - unit_begin);
+  } else {
+    guess_ratio =
+      GUESS_MULTIPLIER * (unit_target - unit_begin) /
+      (unit_at - unit_begin);
+  }
 
 #ifdef DEBUG
   printf ("oggz_seek::guess: guess_ratio %lld = (%lld - %lld) / (%lld - %lld)\n",
@@ -657,7 +663,16 @@ oggz_seek_set (OGGZ * oggz, ogg_int64_t unit_target)
 
   unit_at = reader->current_unit;
   unit_begin = 0;
-  unit_end = -1;
+
+  og = &oggz->current_page;
+
+  if (oggz_seek_raw (oggz, 0, SEEK_END) >= 0) {
+    ogg_int64_t granulepos;
+
+    if (oggz_get_prev_start_page (oggz, og, &granulepos, &serialno) >= 0) {
+      unit_end = oggz_get_unit (oggz, serialno, granulepos);
+    }
+  }
 
   og = &oggz->current_page;
 
@@ -682,6 +697,10 @@ oggz_seek_set (OGGZ * oggz, ogg_int64_t unit_target)
       break;
     }
 
+    if (offset_guess > offset_end) {
+      offset_guess = offset_end;
+    }
+
     offset_at = oggz_seek_raw (oggz, offset_guess, SEEK_SET);
     if (offset_at == -1) {
       goto notfound;
diff --git a/src/tools/oggz-dump.c b/src/tools/oggz-dump.c
index fc7e3ba..56c5eb6 100644
--- a/src/tools/oggz-dump.c
+++ b/src/tools/oggz-dump.c
@@ -361,7 +361,7 @@ revert_packet (OGGZ * oggz, ogg_packet * op, long serialno, int flush)
 
 #ifdef DEBUG
   printf ("feeding packet (%010lu) %ld bytes %s, %s\n",
-          current_serialno, op->bytes,
+          serialno, op->bytes,
           op->b_o_s ? "bos" : "not bos",
           op->e_o_s ? "eos" : "not eos");
 #endif

