[Icecast-dev] Icecast WebM Support Patch Second Edition

Fri Feb 10 15:10:01 PST 2012

On 8 February 2012 20:53, Krad Radio <kradradio at gmail.com> wrote:

> The 'mini-library' doesn't actually 'parse'
> the stream at all, all it does is look for the four byte sequence indicating
> a cluster boundary and informs the format functions appropriately.

Neat, thanks for explaining what the code's doing.

Matroska is always described as a hierarchical container format, so it
never occurred to me to just look for sync points the way you would in
Ogg.

> I am a
> fraudulent mathematician at best, but I calculate that there is a 1 in 4.2
> billion chance of this happening for any given 4 bytes, and 1 in 4294 per
> megabyte, and likely once per 4.2 gigabytes.

That's true if the bytes are randomly distributed, which the
compressed data should be, but it's definitely not true of the
headers. Also, one error every 4 GB of data is pretty frequent where
video bitrates are concerned, so I expect someone will hit this
eventually.
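
To put a rough number on that, here is a back-of-the-envelope sketch in
Python. It assumes uniformly random bytes (which, as noted, does not hold
for headers) and a hypothetical 1 Mbit/s stream; the function names are
made up for illustration.

```python
# Back-of-the-envelope estimate, assuming every byte is uniformly random.
CLUSTER_ID_LEN = 4                       # the cluster ID is four bytes long
P_MATCH = 1 / 256 ** CLUSTER_ID_LEN      # chance a given offset matches: 2**-32


def expected_false_matches(num_bytes):
    """Expected number of accidental ID matches in num_bytes of random data."""
    positions = max(num_bytes - CLUSTER_ID_LEN + 1, 0)
    return positions * P_MATCH


def hours_per_false_match(bitrate_bps):
    """Average streaming time, in hours, between accidental matches."""
    bytes_per_second = bitrate_bps / 8
    return (1 / P_MATCH) / bytes_per_second / 3600


if __name__ == "__main__":
    print(expected_false_matches(2 ** 32))     # about one match per 4 GiB
    print(hours_per_false_match(1_000_000))    # hypothetical 1 Mbit/s stream
```

At that assumed bitrate a single stream would see one accidental match
roughly every nine and a half hours, before counting the non-random
header bytes.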

Fortunately, this is easy to check: just run a bunch of WebM data
through your cluster boundary detector and through a real parser, and
count how many false positives you get.
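
A minimal sketch of that comparison in Python, not taken from the patch:
scan_for_cluster_ids() stands in for the byte-sequence detector, and
true_offsets is assumed to be a list of cluster offsets obtained from
whatever real Matroska parser you trust. Both names are hypothetical.

```python
# Matroska Cluster element ID, as it appears on the wire.
CLUSTER_ID = bytes([0x1F, 0x43, 0xB6, 0x75])


def scan_for_cluster_ids(data):
    """Return every offset where the four ID bytes occur, with no parsing."""
    offsets = []
    pos = data.find(CLUSTER_ID)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(CLUSTER_ID, pos + 1)
    return offsets


def false_positives(data, true_offsets):
    """Offsets the naive scan reports that the real parser did not."""
    known = set(true_offsets)
    return [off for off in scan_for_cluster_ids(data) if off not in known]


if __name__ == "__main__":
    # Synthetic buffer: a real ID at offset 0 and a decoy inside the payload.
    data = CLUSTER_ID + b"\x88" + b"ab" + CLUSTER_ID + b"cd"
    print(false_positives(data, true_offsets=[0]))
```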

The other thing you can do is read the cluster size and check whether
there's another cluster (or other appropriate) element ID at that
offset. That reduces the probability of a false positive by quite a bit
(but you have to buffer more).
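
Roughly what I have in mind, as a Python sketch rather than anything
from the patch. The function names are made up, and for brevity it only
accepts another cluster ID at the target offset; a real check would
probably allow other element IDs there too. Note that a cluster written
with an unknown size, which EBML allows, cannot be verified this way.

```python
CLUSTER_ID = bytes([0x1F, 0x43, 0xB6, 0x75])


def read_element_size(data, pos):
    """Decode an EBML variable-length size at data[pos].

    Returns (size, length), where size is None for the reserved
    'unknown size' value, or None if the bytes cannot be decoded yet.
    """
    if pos >= len(data) or data[pos] == 0:
        return None
    first = data[pos]
    length = 9 - first.bit_length()      # leading zero bits + 1
    if pos + length > len(data):
        return None                      # size field not fully buffered
    value = first & ((1 << (8 - length)) - 1)   # drop the length marker bit
    for byte in data[pos + 1:pos + length]:
        value = (value << 8) | byte
    if value == (1 << (7 * length)) - 1:
        return None, length              # all ones means unknown size
    return value, length


def looks_like_cluster(data, pos):
    """True/False if the candidate can be judged, None if undecided."""
    if data[pos:pos + 4] != CLUSTER_ID:
        return False
    decoded = read_element_size(data, pos + 4)
    if decoded is None:
        return None                      # need more data to read the size
    size, length = decoded
    if size is None:
        return None                      # unknown size: nothing to check
    next_pos = pos + 4 + length + size
    if next_pos + 4 > len(data):
        return None                      # next ID is not buffered yet
    return data[next_pos:next_pos + 4] == CLUSTER_ID
```

The None result is where the extra buffering comes in: the caller has to
hold on to the candidate until the bytes at the computed offset arrive.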

Reading element sizes shouldn't add much to the code size.

 -r