[theora-dev] My issues with ogg and directshow...

illiminable ogg at illiminable.com
Mon May 10 18:20:59 PDT 2004

----- Original Message -----
From: "Timothy B. Terriberry" <tterribe at vt.edu>
To: <theora-dev at xiph.org>
Sent: Tuesday, May 11, 2004 3:32 AM
Subject: Re: [theora-dev] My issues with ogg and directshow...

> > Which is how it currently works.
> >
> > amFactory.cpp
> Good, you had me worried, but I hadn't read the code.
> > But it still makes every codec to some extent dependant on ogg.
> Yes, this dependency is somehow mandated. Every codec that wishes to be
> embedded in Ogg must define a mapping for how that is done. It must
> define what format its headers take, how things are split into packets,
> what the granule position means, etc. You can't have a codec that is
> unaware of Ogg embedded in it.

The muxer must know all this, but the demuxer doesn't necessarily; it just
needs answers to a few simple questions, which the muxer can offer in a
standard way. That means the demuxer doesn't have to figure them out for
itself every time the file is played.

> > But when you say a "codec api" this will be out of the realm of
> > interfaces. Assuming directshows model of a decoder as a filter, how can
> > demux know what filter to ask if it can't determine a unique GUID to
> > identify the media type. It could try to ask all 3000+ installed filters
> > they support a particular "ogg codecs" interface, then when they find
> > ask them if they know this codec, it would work but this is not
> > This is why the media type GUID is important as it narrows the number of
> > filters it has to try to a much smaller number.
> ASKING all the codecs if they support an Ogg interface may be
> unnecessarily burdensome, but there's no reason they can't put their
> GUID in a list in the registry somewhere, specifically for codecs that
> can be embedded in Ogg.

True, which is what will end up happening. But since the codec identifier
is of variable length, when one id is a prefix of another the result
depends on which one is checked first.

In my previous example, if one id is \001vorbis and another is \001vorbis2
and the shorter one is checked first, the vorbis filter will incorrectly
claim the vorbis2 stream, blindly parse the header as if it were its own,
accept a connection it can't process, and either crash or spit out random
garbage. In directshow it is a fatal error for any filter to accept a
connection it can't process, because automatic graph building stops as soon
as it thinks it has found a match. It's not until playback that everything
falls apart.

You could order the registry keys so that longer identifiers are checked
first to avoid this, but what happens to a client that is older and doesn't
even know there is such a thing as \001vorbis2? It will match it as
\001vorbis.
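
A minimal sketch of the prefix problem described above, in Python. The
function name first_match and the \001vorbis2 identifier are hypothetical,
used only for illustration; this is not how any existing demux is written.

```python
from typing import Optional

def first_match(packet: bytes, identifiers: list) -> Optional[bytes]:
    """Return the first registered identifier that prefixes the packet."""
    for ident in identifiers:
        if packet.startswith(ident):
            return ident
    return None

# First header packet of the hypothetical newer codec.
packet = b"\x01vorbis2" + b"\x00" * 16

# When one id is a prefix of another, check order decides the result.
short_first = first_match(packet, [b"\x01vorbis", b"\x01vorbis2"])
long_first = first_match(
    packet, sorted([b"\x01vorbis", b"\x01vorbis2"], key=len, reverse=True)
)

# An older client has never heard of the longer id, so ordering cannot help.
old_client = first_match(packet, [b"\x01vorbis"])
```

Here short_first and old_client both pick the wrong codec; only the client
that knows both ids and checks the longer one first gets it right.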

Also, as i mention in a different response, if you get no match you've got
nothing: no way to determine which codec you are really looking for. With a
standard guid mapping, however, if you can create a fully formed guid
systematically, then even if the codec is not on your system, or was created
after your version of the demux, directshow can identify, locate,
download and install it.
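
As an illustration of what a systematic mapping looks like, directshow's
existing FOURCC convention puts the four-character code into the first 32
bits of a fixed base GUID. A rough sketch in Python (the function name
fourcc_to_guid is my own, and this only covers the FOURCC case, not a
variable-length ogg identifier):

```python
import uuid

# Fixed tail of the base GUID used for FOURCC-derived media subtypes.
FOURCC_BASE_TAIL = "-0000-0010-8000-00aa00389b71"

def fourcc_to_guid(fourcc: str) -> uuid.UUID:
    """Build the media subtype GUID for a four-character code."""
    raw = fourcc.encode("ascii")
    if len(raw) != 4:
        raise ValueError("a FOURCC must be exactly four ASCII characters")
    # The four bytes are read as a little-endian 32-bit value.
    value = int.from_bytes(raw, "little")
    return uuid.UUID(f"{value:08x}" + FOURCC_BASE_TAIL)

yuy2 = fourcc_to_guid("YUY2")
```

Because the GUID is derived purely from the identifier, a client can form it
for a codec it has never seen and still have something to look up.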

> As for querying each one if it knows a particular codec, while this
> sounds inefficient on the face of it, a simple MRU scheme will cut out
> most of the overhead in practice. The number of codecs in general use is
> small, and the number embeddable in Ogg even smaller. And I give an even
> better suggestion below.
> And finally, as Aaron said, that GUID DOESN'T need to refer to the same
> codec .dll that does the full decoding. A much smaller one that does
> only the minimal amount needed by the demuxer is possible, to reduce
> loading time.

Yes this is the alternative i will end up using... but see my response here
and in previous emails about how this stops a codec being automatically
identified and downloaded.

> > So the other alternative is to have "out of directshow" calls to dll's.
> > other words require codecs to provide not only directshow filter and pin
> > interfaces, but also some other api external to directshow. Which again
> > possible and looks like the route i will end up going, but it puts
> > requirement on codec developers to implement another API which is not
> > part of directshow, which kind of breaks the whole idea of an
> > buildable filter graph in directshow.
> This is precisely what I was suggesting. I don't see how this breaks
> automatically buildable filter graphs. You can automatically query
> codecs to see if they support this extended API (or even better, have
> them register somewhere like I suggested above).

As above, ideally you want to at least be able to recognise codecs that you
can't process so you can locate them. With this scheme you either have the
codec or you don't; there is no way to automatically locate and install it.

> > And it also make integrating existing codecs which know nothing of ogg a
> > problem as they are already implemented, they already know how to work
> I don't see how you're going to get around this problem. If a codec
> knows nothing about Ogg, then there is no Ogg-layer mapping defined for
> it, and it can't be embedded in Ogg.

Also, only the muxer needs to know this, as is evidenced by the fact that i
implemented all ffdshow video codecs with a single ogm header. If they add
10 new codecs tomorrow, my demux won't change, nor will i need helper
libraries to identify the codecs. My demux has no idea what's in a divx
header, but it can still create a valid divx stream. All it knows is that
this codec is divx, its data rate is x, and its frame size is y.

Only the muxer needs to know how divx (or any other video codec) actually
stores that in its native header. The demux just needs to know the ogm
header, which it uses to set up a connection and then strips out of the
stream, delivering to the codec its native header with absolutely no
understanding of what it contains.
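
To make the division of labour concrete, here is a small sketch in Python.
The fixed header layout below (FOURCC, width, height, bitrate) is invented
for illustration and is NOT the real ogm header layout; the names
FIXED_HEADER, build_header and split_header are likewise hypothetical.

```python
import struct
from typing import NamedTuple

# Hypothetical fixed header: 4-byte FOURCC, then width, height and bitrate
# as little-endian unsigned 32-bit integers.
FIXED_HEADER = struct.Struct("<4sIII")

class StreamInfo(NamedTuple):
    fourcc: bytes
    width: int
    height: int
    bitrate: int
    native_header: bytes  # opaque to the demux, handed to the codec as-is

def build_header(fourcc, width, height, bitrate, native_header) -> bytes:
    """Muxer side: prepend the fixed header to the codec's native header."""
    return FIXED_HEADER.pack(fourcc, width, height, bitrate) + native_header

def split_header(packet: bytes) -> StreamInfo:
    """Demux side: read the fixed fields, pass the rest through untouched."""
    fourcc, width, height, bitrate = FIXED_HEADER.unpack_from(packet)
    return StreamInfo(fourcc, width, height, bitrate,
                      packet[FIXED_HEADER.size:])

packet = build_header(b"DIVX", 640, 480, 900_000, b"\xde\xad\xbe\xef")
info = split_header(packet)
```

The demux side never looks inside native_header; adding a new codec only
means the muxer writes a different FOURCC and a different opaque payload.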

> > The solution that every other format i can find that is implemented on
> > directshow has chosen, is to make a blind mapping of some identifier to
> > guid... be it a numeric identifier or a fourcc code. Basicly what is
> > done is
> This doesn't really solve the problem either... someone still has to
> have defined what identifier is used for a particular codec in a
> particular container. The FOURCC route, while it looks simple on the

That's true, but that's also true of ogg and anything really.

> surface, is inadequate. For example, consider MJPEG embedded in an AVI.
> There are currently two decoder filters available which use the same
> FOURCC, but exactly reverse the order of any interlaced fields (there
> are APP1 markers stored in the JPEGs to indicate which field is which,
> but these two codecs swap the interpretation of what those markers
> mean). How do you know which decoder should be used for a file? Looking
> at the FOURCC alone, you can't. I won't even go into all the subtle

How would that be any different than if i created theora files that produce
upside down video but still identify themselves as theora? This is a generic

> differences between the various MPEG4 codecs, which all have different
> bugs, different interpretations of the spec, which change from version

Ideally specs shouldn't be open to that kind of interpretation, and where
they are, they should be modified to bring everything back into alignment.
Bugs, on the other hand, are always going to be a fact of life.

> to version of the same encoder, and all of which use each other's FOURCC
> codes all the time.
> All current Ogg codecs DO contain a codec identifier, but the location
> and length of the identifier varies from codec to codec. It's a simple
> matter to extend the registry list described above with an identifier
> string, offset, and length to narrow down the codecs queried even more.

But that still doesn't solve the problem of codecs unknown to the client. It
only solves the problem where conflicting identifiers are both known to the
client, which as you say is easily solved. It doesn't help in the case where
the client believes it has a solid match, but a new codec created since the
client was written has invalidated that assumption.
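
A rough sketch of the registry list quoted above (identifier string, offset,
length), in Python. RegistryEntry, candidates and the "newcodec" identifier
are made up for illustration; only the vorbis and theora identifiers are
real.

```python
from typing import NamedTuple

class RegistryEntry(NamedTuple):
    codec: str    # name of the filter to query (hypothetical)
    offset: int   # where the identifier starts in the first packet
    ident: bytes  # identifier bytes to compare; its length is len(ident)

def candidates(packet: bytes, registry: list) -> list:
    """Return every registered codec whose identifier matches the packet."""
    return [
        entry.codec
        for entry in registry
        if packet[entry.offset:entry.offset + len(entry.ident)] == entry.ident
    ]

REGISTRY = [
    RegistryEntry("vorbis", 1, b"vorbis"),
    RegistryEntry("theora", 1, b"theora"),
]

# A codec the client knows about narrows down to one filter to query.
known = candidates(b"\x80theora" + b"\x00" * 8, REGISTRY)

# A codec created after the client was written matches nothing at all.
unknown = candidates(b"\x01newcodec" + b"\x00" * 8, REGISTRY)
```

The empty result for the unknown codec is the gap i'm talking about: the
client learns only that it has no match, not what it should go and fetch.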

> Simply pass the initial packet to every codec which passes this simple
> memcmp() test. It's even future proof against codecs which can't be
> identified in such a simple manner, since they can always register a
> length-0 identifier (thus forcing the muxer to always query that codec).
> This scheme is efficient where efficiency is possible, and flexible
> where it is not.
> As for the fixed header formats... imposing those leads to unnecessary
> restrictions on the codecs embedded in the container. Look at all the
> problems AVIs have with B frames, their inability to store variable
> frame-rate video, or even variable bit-rate audio, etc.

But the fixed header is not there to replace the variable header, merely to
supplement it.

> "For every complex problem there is an answer that is clear, simple, and
> wrong."

Very true. :)


--- >8 ----
List archives:  http://www.xiph.org/archives/
Ogg project homepage: http://www.xiph.org/ogg/
To unsubscribe from this list, send a message to 'theora-dev-request at xiph.org'
containing only the word 'unsubscribe' in the body.  No subject is needed.
Unsubscribe messages sent to the list will be ignored/filtered.
