[theora-dev] Extension to Skeleton for multi-track media

Benjamin M. Schwartz bmschwar at fas.harvard.edu
Tue Mar 23 18:16:22 PDT 2010


Silvia Pfeiffer wrote:
>>> A further part of the wiki page is the proposal to impose an implicit
>>> order on the tracks through the order in which their BOS pages are
>>> given. This is nothing semantic, but only a convenience so we can
>>> ascertain that different Web browsers will address the same track by
>>> the same index number through JavaScript.
>> I reiterate my preference for associative arrays, indexed by the Ogg track
>> ID and name.  The BOS ordering is unstable, and provides no benefit that I
>> can see over unique stream identifiers.
> 
> I can see where you're coming from, but building an associative array
> is something that the application has to do. It will create an array
> saying that serialno x matches to position i on the index array.

I don't agree with this definition of an associative array.  In
JavaScript, the associative array would have keys that are track names and
values that are MediaTrack objects.  No positional index is ever defined.
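
To illustrate what I mean, here is a minimal sketch, written in Python
rather than JavaScript only to keep it short and runnable; a plain
JavaScript object would have the same shape.  The MediaTrack class, its
fields, and the serial numbers, names and roles below are all
hypothetical stand-ins, not part of any existing API.

```python
from dataclasses import dataclass


@dataclass
class MediaTrack:
    """Hypothetical stand-in for whatever object a player exposes per track."""
    serialno: int  # Ogg stream serial number (made-up values below)
    name: str      # track name, assumed unique within the file
    role: str      # illustrative role string


def build_track_map(tracks_in_bos_order):
    """Key each track by its name; BOS order plays no part in lookup."""
    return {t.name: t for t in tracks_in_bos_order}


main = MediaTrack(0x1A2B, "main", "video/main")
sign = MediaTrack(0x3C4D, "signing", "video/sign")

# The same two tracks, seen in two different BOS orders (say, after remuxing).
first = build_track_map([main, sign])
second = build_track_map([sign, main])

# Lookup by name gives the same track either way; no index is involved.
assert first["signing"] is second["signing"]
assert first == second
```

A script that asks for the track named "signing" gets the same track in
every player, whatever order the BOS pages arrived in.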

> However, the order is still not specified by this. We have to create
> an order that can be maintained between applications.

Why do you have to create an order? I cannot think of any programming task
that requires such an ordering.

>>> Finally there are two rendering related fields that we propose
>>> introducing: Display-hint and Altitude (their names could of course
>>> still be changed).
>> Altitude seems fine.  I have more problems with Display-hint:
>>
>> 1. pip:
>> Specifying that a track can be shown as PIP might be a good thing.  This
>> mechanism seems very rigid, though.  Television sets that provide PIP
>> usually let the user control the positioning, because they may want to see
>> different parts of the underlying frame.  I'm not convinced that
>> specifying a position or size along with the PIP hint is necessary at all.
>>  If it is, the text should say "may be displayed" instead of "should be
>> displayed" to indicate that the player should give the user control.
>> Content producers who want exact control of overlay positioning should use
>> Altitude and video/alpha.
> 
> It's a display HINT, therefore it's always just a suggestion to the
> player.

Sure.  I guess I'm just nitpicking as to whether the location is a very
useful hint.  Are there other systems that provide such hints?

>> 2. mask:
>> Ogg files are self-contained.  This proposal breaks that in a huge way,
>> and I think it's terrible.  The right way to do this is in CSS in the
>> webpage, a la
>> http://labs.silverorange.com/files/video-demo/ambient.xhtml
>> http://webkit.org/blog/181/css-masks/
>>
>> Please remove mask from the draft.
> 
> Yes, that is another train of thought. We indeed do not need the
> functionality for the Web. But what about media players? Other media
> formats allow for inclusion of such a mask inside the media resource to
> allow masking the video display.

They do?  Can you point me to a media player other than a web browser that
can play statically-masked video?

> This is an attempt at introducing
> this functionality into Ogg. I won't fight for it if the general
> consensus is: we don't need it. But I have had this discussion that
> e.g. Flash and MPEG are capable of this and Ogg isn't. This would be a
> relatively simple way to introduce it.

A static binary mask image cannot reproduce the full-motion alpha behavior
of VP6a (Flash) and H.264 (MPEG).
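
A toy sketch of the difference, under heavy simplification: frames are
one-dimensional rows of four pixels, alpha is 0 or 1, and every value
below is made up for illustration.  The subject (pixel value 9) moves one
pixel to the right per frame over a background of 5s.

```python
def apply_alpha(frame, alpha):
    """Multiply each pixel by its alpha (0 = transparent, 1 = opaque)."""
    return [p * a for p, a in zip(frame, alpha)]


# Three frames; the subject (9) moves right, the background is 5.
frames = [
    [9, 5, 5, 5],
    [5, 9, 5, 5],
    [5, 5, 9, 5],
]

# A static mask is one image reused for every frame.
static_mask = [1, 0, 0, 0]
static_out = [apply_alpha(f, static_mask) for f in frames]

# A full-motion alpha track supplies a fresh mask per frame.
alpha_track = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
]
moving_out = [apply_alpha(f, a) for f, a in zip(frames, alpha_track)]

# The static mask keeps the subject only in the first frame ...
assert static_out == [[9, 0, 0, 0], [5, 0, 0, 0], [5, 0, 0, 0]]
# ... while the per-frame alpha follows it.
assert moving_out == [[9, 0, 0, 0], [0, 9, 0, 0], [0, 0, 9, 0]]
```

After the first frame the static mask shows background where the subject
used to be, which is why it cannot stand in for per-frame alpha.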

>> 3. transparentcolor.
>> This will not work. 

> Now, how would you cut out a person from the video? Would you need to
> create a new track (the "video/alpha" video track) that provides the
> continuing mask over the person and makes everything around that mask
> transparent? Since we don't have alpha channels in Ogg, this would be
> a means to introduce alpha channels.

Yes, that's what I imagined.  I assumed this was the purpose for which you
created the video/alpha Role.

>> For video/alpha, this is still insufficient, because masking a video and
>> an overlay before compositing them is not the same as masking after
>> compositing.  To permit masking after compositing, video/alpha tracks
>> should optionally have one or more Altitudes.  For each Altitude held by a
>> video/alpha track, it applies to the composited result of all visible
>> higher tracks.
> 
> Yes, I agree - the "video/alpha" approach is a hack and not a feature.
> Is this even the best way to go about it? Would it make more sense to
> change Theora to include the possibility of an alpha channel?

This is a tough question.  The "video/alpha" approach has the advantage
that current non-transparency-aware players have a chance of falling back
to playing the main video without transparency.  Adding an alpha channel
to Theora would be possible, but the resulting tracks wouldn't play at all
in any current player; it would essentially be a different codec,
requiring new encoders and decoders.  As a result, I favor the
"video/alpha" approach, even though it's messier and probably less
efficient.  I'd like to hear more opinions on that topic.
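
For concreteness, here is a rough sketch of how I picture a player
handling a "video/alpha" track, assuming 8-bit samples and that the alpha
track's decoded luma is used directly as per-pixel opacity.  The function
names and the render() fallback logic are my own illustration, not
anything in the draft.

```python
def composite(fg, alpha, bg):
    """Per-pixel 'over' blend: alpha 255 keeps fg, alpha 0 shows bg."""
    return [(f * a + b * (255 - a)) // 255 for f, a, b in zip(fg, alpha, bg)]


def render(main_frame, alpha_frame, background):
    """Hypothetical player logic for one frame.

    A player that does not understand video/alpha never decodes an alpha
    frame (modelled here as None) and simply shows the main video.
    """
    if alpha_frame is None:
        return list(main_frame)
    return composite(main_frame, alpha_frame, background)


main_frame = [200, 200, 200]   # decoded main video samples
alpha_frame = [255, 0, 128]    # luma of the matching video/alpha frame
background = [0, 100, 100]     # whatever lies beneath the video

# Transparency-aware player: opaque, fully transparent, and half-blended.
assert render(main_frame, alpha_frame, background) == [200, 100, 150]

# Legacy player: falls back to the unmasked main video.
assert render(main_frame, None, background) == [200, 200, 200]
```

The second case is the fallback I am arguing for: the main track is still
an ordinary Theora stream, so an older player shows it unmasked rather
than failing.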

--Ben
