[Theora-dev] Problems with Theora DirectShow filters

Wed Sep 15 01:40:40 PDT 2004

----- Original Message ----- 
From: "Robert Huitl" <theora-dev at huitl.de>
To: <theora-dev at xiph.org>
Sent: Wednesday, September 15, 2004 12:30 AM
Subject: [Theora-dev] Problems with Theora DirectShow filters
>Hi,
>
>I'm trying to capture video and audio from a webcam in order to encode it
>with the Theora and Speex DirectShow filters. For audio, this filter graph
>plays fine: http://www.huitl.de/mic-speexenc-speexdec-out.png.

It does... but I don't see the point of encoding it if you are going to 
decode it straight away. The encoders aren't designed to be chained back 
into their matching decoders.

>The video part is problematic. Please have a look at
>http://www.huitl.de/cam-theoraenc-theoradec-out.png.

Same goes here... linking an encoder back to a decoder won't always work 
directly (particularly with video, as the video decoder generally expects 
the video renderer to provide the custom allocators so it can draw straight 
onto a DirectDraw surface).

You should build a graph something like this if you want to capture...

Audio Input Source --> Speex Encoder  --> }
                                          } --> Ogg Mux Filter
Video Input Source --> Theora Encoder --> }

>1. The webcam provides the color formats RGB24, I420 and IYUV. The Theora
>encode filter only accepts YV12, so I can't just connect the webcam capture
>pin with theora's yv12-in.

Yes... this is a problem at the moment. In the next release I'll probably 
include a conversion from RGB types to YUV types.
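
For the I420 and IYUV formats in particular, no real colour conversion is 
needed. Both are planar 4:2:0 layouts holding the same three planes as YV12, 
only with the two chroma planes in the opposite order (Y, U, V rather than 
Y, V, U). A minimal sketch of that plane swap follows, in Python purely for 
illustration. It is not code from the filters: the function name is 
hypothetical, and it assumes a tightly packed buffer with even width and 
height and no row padding.

```python
def i420_to_yv12(frame: bytes, width: int, height: int) -> bytes:
    """Reorder a tightly packed I420/IYUV frame (Y, U, V) into YV12 (Y, V, U)."""
    if width % 2 or height % 2:
        raise ValueError("width and height must be even for 4:2:0 data")
    y_size = width * height                 # one luma byte per pixel
    c_size = (width // 2) * (height // 2)   # each chroma plane is quarter size
    if len(frame) != y_size + 2 * c_size:
        raise ValueError("unexpected buffer size for the given dimensions")
    y = frame[:y_size]
    u = frame[y_size:y_size + c_size]
    v = frame[y_size + c_size:]
    # YV12 stores the V plane before the U plane.
    return y + v + u
```

Because the swap is symmetric, applying the same function to a YV12 frame 
gives back the I420 ordering.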

>2. The "Color Space Converter" can't do the conversion to YV12.

Yeah... the Color Space Converter is pretty useless for this!

>3. I found that the ffdshow raw video filter should be able to do this
>conversion (although I'd like to have no dependencies other than Theora and
>Speex).

At the moment that is probably the only option... until I include the colour 
space conversion in the Theora filter.

>4. When I render the Theora decode filter's output pin, DirectShow attaches
>an AVI decompressor, a color space converter and finally the video renderer.

One thing... you should never "Render" encoding graphs... the automatic 
graph building is for decoding graphs only.

>5. I cannot play the filter graph. When trying to do so, GraphBuild shows:
>"This graph can't play", return code 0x80070000.

It's because the encoder and the decoder shouldn't be directly chained. 
There is no need to do this (except perhaps for testing).

>6. I tried to replace the AVI decompressor and the color space converter
>with another ffdshow raw video filter. When I connect the ffdshow-filter's
>output with the video renderer, GraphBuild crashes:
>http://www.huitl.de/cam-theoraenc-theoradec-ff-out.png

Hmmm... I'll have to have a look into this. Quite possibly it's because 
the ffdshow filter expects the downstream filter to create the sample 
allocator, but the encoder expects the upstream filter to do it, as the 
upstream filter is the one that knows what sort of samples it has, what 
size they are and how many buffers it should create. The exception is 
generally anything that could connect to the video renderer, which expects 
the renderer to provide the allocators.

>I'd appreciate any help doing this:
>1. Capture video from the webcam and encode with Theora.

For the moment... you'll need to find a colour space converter that can do 
the conversion properly. In the next release I'll include the RGB to YUV 
conversions. I know there are some proprietary filters out there... I 
haven't had much luck finding a good free colour space converter though.
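
To give an idea of what such an RGB to YV12 conversion involves, here is a 
rough sketch in Python (for illustration only, it is not the code planned 
for the filter). It uses a commonly quoted integer approximation of the 
BT.601 studio-range equations and averages the chroma over each 2x2 block. 
Whether the filter will use these exact coefficients is an assumption, as 
are the function names and the input layout: a flat, top-down list of 
(r, g, b) tuples. Capture sources commonly deliver RGB24 as bottom-up BGR 
bytes, so a real filter would also have to reorder the data.

```python
def rgb_to_yuv(r: int, g: int, b: int) -> tuple:
    """Convert one 8-bit RGB pixel to studio-range BT.601 (Y, U, V)."""
    y = ((66 * r + 129 * g + 25 * b + 128) >> 8) + 16
    u = ((-38 * r - 74 * g + 112 * b + 128) >> 8) + 128
    v = ((112 * r - 94 * g - 18 * b + 128) >> 8) + 128
    return y, u, v


def rgb_to_yv12(pixels, width: int, height: int) -> bytes:
    """Pack a top-down list of (r, g, b) tuples into a YV12 frame (Y, V, U)."""
    if width % 2 or height % 2:
        raise ValueError("width and height must be even for 4:2:0 data")
    if len(pixels) != width * height:
        raise ValueError("pixel count does not match the dimensions")
    yuv = [rgb_to_yuv(*p) for p in pixels]
    y_plane = bytearray(c[0] for c in yuv)   # full-resolution luma
    u_plane = bytearray()
    v_plane = bytearray()
    for row in range(0, height, 2):
        for col in range(0, width, 2):
            # Average the chroma of the four pixels in this 2x2 block.
            block = [yuv[(row + dy) * width + col + dx]
                     for dy in (0, 1) for dx in (0, 1)]
            u_plane.append((sum(c[1] for c in block) + 2) // 4)
            v_plane.append((sum(c[2] for c in block) + 2) // 4)
    # YV12 plane order: Y, then V, then U.
    return bytes(y_plane + v_plane + u_plane)
```

A width x height frame comes out as width * height * 3 / 2 bytes, half the 
size of the RGB24 input.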

>2. Access the encoded video data in order to send it over a network.

At the moment... none of the filters do network output. That would 
typically be done by an application. I do have plans to provide a callback 
interface on the muxer so that your application can grab the pages and do 
whatever it wants with them.
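
As a rough illustration of what working at page level means for the 
application, the sketch below (Python, illustration only) splits a buffer 
of muxed Ogg data into whole pages using the page header layout from the 
Ogg specification: the "OggS" capture pattern, a 27-byte fixed header and a 
segment table giving the body length. The function name is hypothetical, 
the checksum is not verified, and how the bytes reach the application (the 
planned callback interface does not exist yet) is left open.

```python
import struct


def split_ogg_pages(data: bytes):
    """Yield (serial, sequence, page_bytes) for each complete Ogg page."""
    pos = 0
    while pos + 27 <= len(data):
        if data[pos:pos + 4] != b"OggS":
            raise ValueError("lost sync: no OggS capture pattern at %d" % pos)
        # version, header flags, granule position, stream serial number,
        # page sequence number, CRC (not checked here), segment count
        (_version, _flags, _granule, serial, sequence, _crc,
         nsegs) = struct.unpack_from("<BBqIIIB", data, pos + 4)
        table_end = pos + 27 + nsegs
        if table_end > len(data):
            break                      # segment table not fully received yet
        body_len = sum(data[pos + 27:table_end])
        page_end = table_end + body_len
        if page_end > len(data):
            break                      # page body not fully received yet
        yield serial, sequence, data[pos:page_end]
        pos = page_end
```

Each yielded page is self-contained, so an application could send it over 
whatever transport it likes and keep any trailing partial page for the next 
call.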

If you want to feed packets... you will probably need your own custom filter 
at this stage, e.g. if you wanted to use RTP. At some point I may write a 
generic packet grabber to feed packets back to an application.

>3. Feed data received over a network to the Theora decoder, which in turn
>should pass it to a video renderer.

This should already work... if you open an HTTP resource (via its full URL) 
with the Ogg demux source and render it, it should work fine. But again, if 
you want to do packet-level rather than page-level network transport, with 
custom or non-HTTP protocols, you'll need to write a custom filter (or use 
something like the sample grabber filter) and do the protocol/network 
operations in the application code.

It sounds like you want to do some kind of video VoIP application... 
as a general rule, if RTP is not being used, it will probably be a custom 
protocol and should be done in application code or by writing an 
application-specific filter. At some point I will probably write some RTP 
stuff... but for now I'm focussing on HTTP streaming rather than trying to 
send data back out over the network. I will get to it at some point, 
it's just not the highest priority at the moment.

So I'm more focussed at the moment on getting the core of the code working 
for the majority of the uses... specifics for particular styles of 
application should probably be written by the application developer, as they 
will be application specific. I don't really want to spend time at this 
stage writing custom network protocol filters for individual applications.

Cheers,

Zen.