[cvs-annodex] commit (/annodex):
standards/draft-pfeiffer-cmml-current.xml
silvia
nobody at lists.annodex.net
Tue Feb 15 04:48:47 EST 2005
Update of /annodex (new revision 897)
Modified files:
standards/draft-pfeiffer-cmml-current.xml
Log Message:
Fixed up the encoding side of CMML into Annodex. Yay!
Still missing:
- decoding Annodex to CMML
- additional new tags for CMML 2.1 (encoding hints)
- moving back from v3 to v2 for I-D submission
Modified: standards/draft-pfeiffer-cmml-current.xml
===================================================================
--- standards/draft-pfeiffer-cmml-current.xml 2005-02-14 16:37:18 UTC (rev 896)
+++ standards/draft-pfeiffer-cmml-current.xml 2005-02-14 17:48:46 UTC (rev 897)
@@ -660,7 +660,8 @@
time of 350 seconds is to be included 50 seconds into the
Annodex bitstream. If no basetime (or no stream tag) is given,
the basetime defaults to 0 npt. The basetime can be given as a
- SMPTE or NPT time, but not as a utc time.
+ SMPTE or NPT time, or as a rational number as in 5/1300, but
+ not as a utc time.
</t>
<t>The "utc" attribute associates a calendar date and a
@@ -1300,13 +1301,28 @@
<t>CMML is serialised by having some initial header pages that
set up the CMML decoding environment, and contain header type
information. The content of a CMML bitstream then consists of
- "clip" tags.
+ "clip" tags. The "stream" tag is not represented in the CMML
+ bitstream as it controls the authoring of the bitstream that is
+ created by interleaving the CMML with the media streams listed
+ in the "stream" tag. Its information is meant to be stored in the
+ encapsulation format.
</t>
+ <t>All of the CMML bitstream information is text. As it gets
+ encoded into a binary bitstream, an encoding format has to be
+ specified. To simplify things, UTF-8 is defined as the mandatory
+ encoding format for all data in a CMML binary bitstream. Also,
+ the encoding process MUST ensure that newline characters are
+ represented as LF (or "\n" in C) only and replace any new line
+ representations that come as CR LF combinations (or "\r\n" in C)
+ with LF only.
+ </t>
+
<section title="The format of the CMML ident header packet">
- <t>The ident header packet of a logical bitstream contains all
- information required to set up a CMML decoder. It has the
+ <t>The first header packet of a CMML logical bitstream is the
+ CMML ident header. It contains all information required to identify
+ the CMML bitstream and to set up a CMML decoder. It has the
following format:
</t>
@@ -1321,23 +1337,11 @@
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Version major | Version minor | 8-11
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- | Granulerate numerator | 12-15
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- | | 16-19
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- | Granulerate denominator | 20-23
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- | | 24-27
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- | Granuleshift | 28
- +-+-+-+-+-+-+-+-+
+ | ...
]]></artwork>
</figure>
- <t>Fields with more than one byte length are encoded LSB
- (least significant byte) first.
- </t>
<t>The fields in an CMML ident header packet have the following
meaning:
@@ -1365,44 +1369,33 @@
minor version number of the CMML format
bitstream.
</t>
- <t>Granule rate numerator & denominater: 8 Byte integer
- number each. They represent the temporal resolution of the
- logical bitstream in Hz given as a rational number in the
- same way as the fishead basetime field above.
- </t>
- <t>Granuleshift: a 1 Byte integer number describing whether to
- partition the granule_position into two for that logical
- bitstream, and how many of the lower bits to use for the
- partitioning. The upper bits then still signify a
- time-continuous granule positions for a directly decodable
- and presentable data granule. The lower bits allow for
- specification of a finer resolution such that for example
- predicted frames of a video can be addressed as well, though
- not decoded without tracing back to the last fully decodable
- data granule. This is e.g. the case with Ogg theora.</t>
</list>
+ <t>When encapsulating a CMML bitstream, more fields may be added
+ to this header as required by the encapsulation or exchange format.
+ </t>
+
</section>
<section title="The format of the CMML secondary headers">
<t>The CMML secondary headers are a sequence of
- two packets that contain the CMML "setup" information and
- are getting mapped into (at least) two Ogg pages:
+ two packets that contain the CMML and XML "setup" information:
<list typs="symbols">
- <t>one packet with the CMML xml preamble.</t>
+ <t>one packet with the CMML xml preamble and "cmml" tag.</t>
<t>one packet with the CMML "head" tag.</t>
</list>
- These packets contain textual, not binary information. All
- characters MUST be encoded in UTF-8 as transport format.
+ These packets contain textual, not binary information.
</t>
<t>The CMML preamble tags are all single-line tags, such as the
xml processing instruction (<![CDATA[<?xml...>]]>) and the
document type declaration (<![CDATA[<!DOCTYPE...>]]>).
- The only CMML tag that is not already serialized from a
+ </t>
+
+ <t>The only CMML tag that is not already serialized from a
CMML file is the "cmml" tag, as it encloses all the other
- content tags. To include it into the Ogg stream, the "cmml"
+ content tags. To serialise it, the "cmml"
start tag is transformed into a processing instruction,
retaining all its attributes (<![CDATA[<?cmml ...>]]>), and
the "cmml" end tag is deleted.
@@ -1430,8 +1423,9 @@
]]></artwork>
</figure>
- <t>The second CMML secondary header packet has the following
- format.
+ <t>The second CMML secondary header packet contains the
+ CMML head element with all its attributes and
+ contained elements and has the following format.
</t>
<figure>
@@ -1439,7 +1433,7 @@
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- | <head> ... | 0-
+ | <head ... | 0-
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| ... |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
@@ -1451,6 +1445,49 @@
</section>
+ <section title="The format of the CMML data packets">
+
+ <t>The data packets of the CMML bitstream contain the
+ CMML clip elements. Their "start" and "end" attributes
+ however only exist for authoring purposes and are not
+ copied into the bitstream, but are rather represented
+ through the time mapping of the encapsulation format that
+ interleaves CMML data with data from other time-continuous
+ bitstreams. This avoids contradictory doubly represented
+ timing information. Generally the time mapping is done through
+ some timestamp representation and through the position in
+ the stream.
+ </t>
+
+ <t>A "clip" tag is encoded with all tags (except for the
+ "start" and "end" attributes) as a string printed into a
+ clip packet. The "clip" tag's "start" attribute tells the
+ encapsulator at what time to insert the clip packet into
+ the bitstream. If an "end" attribute is present, it leads to
+ the creation of another clip packet, unless another clip packet
+ starts on the same track beforehand. This clip packet contains
+ an empty "clip" tag, i.e. a "clip" tag without "meta", "a",
+ "img" or "desc" elements and no attribute values except for a
+ copy of the "track" attribute from the original "clip" tag.
+ </t>
+
+ <figure>
+ <artwork><![CDATA[
+ 0 1 2 3
+ 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | <clip ... | 0-
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | ... |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | </clip> |
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+ ]]></artwork>
+ </figure>
+
+ </section>
+
</section>
@@ -1461,59 +1498,231 @@
<section title="Media mapping for a CMML logical bitstream inside Ogg">
- <t>As CMML is an authoring format for Annodex bitstreams, there
- is a simple way to map the annotations and meta information
- contained in a CMML instance document to the annotation
- bitstream and header fields of an Annodex format bitstream.
- Please be aware that some of the encoding rules given here are a MUST,
- and others a SHOULD. As the binary header format for the annotation
- and media bitstreams provide for an extensible list of message
- header fields, an encoder MAY however add some or all of the
- non-used tags in there and even add others. For this section a
- detailed understanding of the <xref target="ANX">Annodex format
- bitstream</xref> is necessary.
+ <t>When mapping a CMML logical bitstream into Ogg, the
+ serialisation as described in the previous section is used as
+ a logical bitstream. The ident packet is extended by a few
+ fields that are necessary for handling the time stamping of
+ the content packets (i.e. the clips) for Ogg. Here is its format:
</t>
- <t>The "head" and "clip" tags of a CMML document are mapped as
- codec data into the annotation bitstream of an Annodex bitstream,
- where the "head" tag is regarded as a secondary header.
+ <figure>
+ <artwork><![CDATA[
+ 0 1 2 3
+ 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1| Byte
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | Identifier 'CMML\0\0\0\0' | 0-3
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | | 4-7
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | Version major | Version minor | 8-11
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | Granulerate numerator | 12-15
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | | 16-19
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | Granulerate denominator | 20-23
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | | 24-27
+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ | Granuleshift | 28
+ +-+-+-+-+-+-+-+-+
+ ]]></artwork>
+ </figure>
- Thus,
- the rest of the information in a CMML file, i.e. the "stream" tag,
- the "cmml" tag and the preamble information, MUST be handled as
- binary header type information. Header type information in Annodex
- is generally regarded as non-human readable information, therefore
- by default language and directionality information will not be
- encoded. The character set used in the Annodex header fields is
- UTF-8, but the mandatory header fields are all covered by US-ASCII
- code points and for the optional ones it is recommended to do the
- same as much as possible. User defined optional message header
- fields MUST follow the naming standard given in RFC2822.
- </t>
-<!--
-<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
-<!DOCTYPE cmml SYSTEM "cmml.dtd">
+ <t>Fields with more than one byte length are encoded LSB
+ (least significant byte) first.
+ </t>
-<cmml lang="en">
--->
+ <t>The additional fields in a CMML ident header packet for Ogg
+ have the following meaning:
+ </t>
+ <list style="numbers">
+ <t>Granule rate numerator & denominator: 8 Byte integer
+ number each. They represent the temporal resolution of the
+ logical bitstream in Hz given as a rational number in the
+ same way as the fishead basetime field above.
+ </t>
+ <t>Granuleshift: a 1 Byte integer number describing whether to
+ partition the granule_position into two for the CMML logical
+ bitstream, and how many of the lower bits to use for the
+ partitioning. The upper bits then still signify a
+ time-continuous granule position for a directly decodable
+ and presentable data granule. The lower bits allow for
+ specification of the granule position of a previous CMML
+ data packet (i.e. "clip" element), which helps to identify
+ how much backwards seeking is necessary to get to the last
+ and still active "clip" element (of the given track).
+ </t>
+ </list>
- <section title="Encoding the 'stream' tag">
-
- <t>A CMML instance document contains in its "stream" tag
- information that is relevant to the authoring process of Annodex
- format bitstreams.
+ <t>A default granule rate for CMML is: 1/1000. The default
+ granule shift used is 32, which halves the granule position to
+ allow for the backwards pointer to be public.
+ </t>
+
+ <t>The ident header packet is encapsulated into the bos page of
+ the CMML logical bitstream in Ogg. The other header packets are
+ included as secondary header packets. The content packets are
+ also included into Ogg by encapsulating them into Ogg pages and
+ providing them with the accurate offset time.
+ </t>
+
+ </section>
+
+ <section title="Using CMML to author Annodex bitstreams">
+
+ <t>As CMML contains authoring information for Annodex bitstreams,
+ a CMML instance document contains more than just the annotation
+ information necessary for the CMML logical bitstream. It also
+ contains control information to create the control section of an
+ Annodex bitstream, i.e. the skeleton bitstream with its secondary
+ header packets describing each of the contained logical bitstreams.
+ Note that we only describe the creation of Annodex Version 3.0
+ bitstreams here.
+ </t>
+
+ <t>The authoring information stems in particular from the "stream" tag
+ plus some specific information from the "cmml" tag. Generally,
+ the "stream" tag's attributes contribute to the skeleton fishead
+ packet, the "import" tag's attributes to the skeleton fisbone
+ packets of each logical bitstream, and the "cmml" tag's attributes
+ to the fisbone of the CMML logical bitstream. While the "cmml" tag
+ is represented in full as a processing instruction in the secondary
+ header packets of the CMML logical bitstream (see above), this is
+ not the case for the "stream" tag. Therefore, this section also
+ contains a description of which attributes of the "stream" tag are not
+ used inside an Annodex bitstream.
+ </t>
+
+ <section title="Creating the skeleton ident packet">
+
+ <t>The skeleton ident packet receives the "basetime" and the
+ "utc" field information from the "stream" tag.
</t>
- <t>The "stream" tag itself finds no representation in the
- Annodex bitstream. Rather, it contains both, information on
- the complete Annodex bitstream, and information on the
- different input documents. This is information that finds
- a representation in the Skeleton logical bitstream of an
- Annodex bitstream. The second information is also used
- during the encoding process of each media bitstream.
+ <t>"Basetime numerator & denominator": if the "basetime"
+ attribute is given in a CMML instance document, it MUST be
+ represented in the skeleton ident header in the fields
+ "Basetime numerator" and "Basetime denominator". It is converted
+ from a possible NPT or SMPTE representation to a rational number
+ to be stored in these fishead fields.
</t>
+ <t>"Presentationtime numerator & denominator": if the "basetime"
+ attribute is given in a CMML instance document, it also
+ determines the presentation time of the interleaved bitstream and
+ the "Basetime numerator" and "Basetime denominator" MUST be
+ copied to the "Presentationtime numerator" and "Presentationtime
+ denominator" fields of the skeleton ident header.
+ </t>
+
+ <t>"UTC": if the "utc" attribute is given in a CMML instance document,
+ it MUST be represented in the skeleton ident header in the "UTC" field.
+ </t>
+
+ </section>
+
+ <section title="Creating the skeleton fisbone packets">
+
+ <t>A fisbone packet for a logical bitstream is created through
+ the authoring information of an "import" tag in a CMML instance
+ document's "stream" tag. One "import" tag contains information
+ on one particular logical bitstream in the interleaved bitstream
+ and thus creates one particular skeleton fisbone packet.
+ </t>
+
+ <t>"Granulerate numerator & denominator": if the "granulerate"
+ attribute is present in the "import" tag, it MUST be represented
+ in the fisbone header for the respective media bitstream in the
+ fields "Granulerate numerator" and "Granulerate denominator".
+ The encoder MUST however ascertain that the values are sensible,
+ and if it knows the accurate granule rate for a logical bitstream,
+ override the user input with the one that was used during creation
+ of the interleaved bitstream.
+ </t>
+
+ <t>"Content-type" message header field: this attribute MUST be
+ represented in the respective skeleton fisbone packet as a message header
+ field with name "Content-type", as it signifies the MIME type
+ of the media bitstream, providing for a decoding hint. If the user
+ does not specify the "contenttype" attribute, the encoder
+ MUST provide it during the interleaving process.
+ </t>
+
+ <t>"ID" message header field: if an "id" attribute is specified
+ for an "import" tag, it SHOULD be represented in the skeleton
+ fisbone header for the respective media bitstream as a message
+ header field with name "ID", as it signifies a short identifying
+ machine-readable string for the import media bitstream.
+ </t>
+
+ <t>User specified message header fields: if "name" and "value"
+ attributes are specified in the "param" tags of the "import" tag,
+ these MAY be represented in the skeleton fisbone packet of the respective
+ media bitstream as a message header field with the given name-value pair.
+ These fields are highly dependent on the type of media bitstream
+ handled and it therefore depends on the encoding tool to make
+ a selection of the parameters acquired. For example, an
+ audio bitstream that contains speech in a specific language may
+ be identified during CMML authoring through a param element with
+ "Content-Language" name, and acquired into the media bitstream
+ message header field of the same name.
+ </t>
+
+ </section>
+
+ <section title="The CMML fisbone packet fields">
+
+ <t>A CMML instance document that specifies annotations in "head"
+ and "clip" elements does not get to use the "stream" tag to
+ provide encoding hints for its CMML logical bitstream. Its
+ encoding hints come from the "cmml" tag and the "encoding"
+ attribute of the xml processing directive.
+ </t>
+
+ <t>"Number of header packets": this field has a fixed value of 3
+ for the CMML specification given in this document. It counts the
+ CMML ident packet, the XML preamble packet and the head tag packet.
+ </t>
+
+ <t>"Granulerate numerator & denominator": if the "granulerate"
+ attribute is present in the "cmml" tag, it MUST be represented
+ in the fisbone header in the fields "Granulerate numerator" and
+ "Granulerate denominator". The encoder MUST however ascertain
+ that the values are sensible. The value defaults to "1/1000" if
+ it is not specified by the user.
+ </t>
+
+ <t>"Content-type" message header field: the content type for
+ the fisbone packet that describes the CMML logical bitstream is
+ fixed at "text/x-cmml" (or "text/cmml" after IANA registration
+ of the MIME type).
+ </t>
+
+ <t>"charset": if the xml processing directive contains an "encoding"
+ attribute, this MUST be represented in the CMML fisbone packet as an
+ addendum to the message header field "Content-type" as a charset. For
+ example: "Content-type: text/x-cmml; charset=UTF-8".
+ </t>
+
+ <t>"ID" message header field: if an "id" attribute is specified
+ for the "cmml" tag, it SHOULD be represented in the skeleton
+ fisbone header for CMML as a message
+ header field with name "ID", as it signifies a short identifying
+ machine-readable string for the CMML logical bitstream.
+ </t>
+
+ <t>"Content-Language" and "Content-Dir" message header fields: if
+ the "lang" and "dir" attributes are given in a "cmml" tag, they
+ MUST be represented in the fisbone packet of the CMML bitstream
+ as message header fields with name "Content-Language" and "Content-Dir".
+ </t>
+
+ </section>
+
+ <section title="Usage of the 'stream' tag">
+
<t>Here is a list of the attribute values of the
"stream" tag and how they are being used:
<list>
@@ -1523,41 +1732,36 @@
therefore be lost on encoding.
</t>
- <t>basetime: this attribute MUST be represented in the Skeleton
- ident header in the fields "Basetime numerator" and "Basetime
+ <t>basetime: this attribute maps to the skeleton
+ ident header fields "Basetime numerator" and "Basetime
denominator".
</t>
- <t>utc: this attribute MUST be represented in the Skeleton ident
- header in the field "utc".</t>
+ <t>utc: this attribute maps to the skeleton ident
+ header field "UTC".</t>
</list>
</t>
<t>Here is a list of the attribute values of the
"import" tag and how they are being used:
<list>
- <t>id: this attribute SHOULD be represented in the Skeleton
- secondary header for the respecitve media bitstream as a message
- header field with name "ID", as it signifies a short identifying
- machine-readable string for the import media bitstream.
+ <t>id: this attribute may be represented as a message header field
+ in the respective skeleton fisbone packet.
</t>
<t>lang, dir: not used, as these attributes signify the language
and directionality of the human readable texts in the stream tag
which are not acquired into the Annodex bitstream.</t>
- <t>granulerate: this attribute MUST be represented in the Skeleton
- secondary header for the respective media bitstream in the
- fields "Granule rate numerator" and "Granule
- rate denominator". The encoder MUST however ascertain that
- the values are corrected with the exact granule rate that was
- used during creation of the Annodex bitstream.
+ <t>granulerate: this attribute is used in the skeleton
+ fisbone header fields "Granule rate numerator" and "Granule
+ rate denominator" as well as for the "Presentationtime numerator"
+ and "Presentationtime denominator".
</t>
- <t>contenttype: this attribute MUST be represented in the
- respective Skeleton secondary header packet as a message header
- field with name "Content-type", as it signifies the MIME type
- of the media bitstream, providing for a decoding hint.
+ <t>contenttype: this attribute is represented in the
+ respective skeleton fisbone packet as a message header
+ field with name "Content-type".
</t>
<t>src: not used, as this attribute only points to the location
@@ -1582,126 +1786,14 @@
therefore be lost on encoding.
</t>
- <t>name, value: these attributes MAY be represented in the
- Skeleton secondary header packet of the respective media bitstream
+ <t>name, value: these attributes may be represented in the
+ skeleton fisbone packet of the respective media bitstream
as a message header field with the given name-value pair.
- These are highly dependent on the type of media bitstream
- handled and it therefore depends on the encoding tool to make
- a selection of the parameters acquired. E.g. lets regard an
- audio bitstream containing speech in a specific language.
- This language MAY be identified during CMML authoring as a
- param element with "Content-Language" name, and acquired into
- the media bitstream message header field of the same name.
</t>
</list>
</t>
</section>
- <section title="Encoding the preamble and the 'cmml' tag">
-
- <t>While the "stream" tag contained meta data on the different
- input media bitstreams, the preamble and the "cmml" tag contain
- meta data on the annotation bitstream and therefore end up in the
- Skeleton secondary header packet of the cmml bitstream.</t>
-
- <t>Here is a list of the attribute values of the preamble and
- how they are being acquired:
- <list>
- <t>xml version: without loss of generality, for simplicity
- this is fixed to version "1.0" for the current versions of
- CMML 2.0 and Annodex 2.0. Therefore, this attribute
- does not get represented in the Annodex bitstream and MUST be
- auto recreated during ripping of annotations out of the
- Annodex bitstream.</t>
-
- <t>xml encoding: this attribute MUST be represented in the
- CMML fisbone packet as a message header field with name
- "Content-type" and the encoding format being the charset
- value following "text/x-cmml;" (or "text/cmml;" after IANA
- registration of the MIME type).</t>
-
- <t>xml standalone: this is fixed to "yes" for the current versions
- of CMML 2.0 and Annodex 2.0. There is a need to explore how
- to include data of general xml documents that conform to a
- different DTD into CMML and ultimately Annodex. Until then,
- standalone is fixed to "yes" and does not get represented in
- the Annodex bitstream, but MUST be auto recreated during
- ripping of annotations out of it.</t>
-
- <t>DOCTYPE declaration: this is fixed to
- <![CDATA[<!DOCTYPE cmml SYSTEM "cmml.dtd">]]> and thus
- again does not get represented in the Annodex bitstream
- but MUST be auto recreated during ripping.</t>
- </list>
- </t>
-
- <t>Here is a list of the attribute values of the "cmml" tag and
- how they are being acquired:
- <list>
- <t>id: this attribute SHOULD be represented in the fisbone packet
- of the annotation bitstream as a message header field with
- name "ID", as it signifies a short identifying
- machine-readable string for the annotation bitstream (in
- analogy to the id field of the import tags).
- </t>
-
- <t>lang, dir: these attributes MUST be represented in the
- fishbone packet of the annotation bitstream as message header
- fields with name "Content-Language" and "Content-Dir".
- </t>
-
- <t>xmlns: this attribute is fixed to "http://www.annodex.net/cmml"
- and thus does not get represented in the Annodex bitstream
- but must be auto recreated during ripping.
- </t>
- </list>
- </t>
- </section>
-
- <section title="Encoding the 'head' tag">
- <t>The CMML "head" tag is printed as a string into the first
- secondary header packet of the annotation bitstream. Thus,
- the value of the field named "number of header packets"
- in the fisbone page for the annotation bitstream will be 1, unless
- the "head" tag turns out to be too big for one Ogg page (i.e.
- larger than about 64K).
- </t>
-
- <t>Note that the encoding process must ensure that newline
- characters are represented as LF (or "\n" in C) only. As some
- systems represent the new line as CR LF combinations (or
- "\r\n" in C), the encoding process MAY need to strip out
- the CR character.
- </t>
- </section>
-
- <section title="Encoding the 'clip' tags">
- <t>The "clip" tags are the real content of an annotation
- bitstream. Their "start" and "end" attributes only exist for
- authoring purposes and are not copied into the annotation
- bitstream to avoid contradictory doubly represented information as
- their position in the stream already represents this timing information.
- </t>
-
- <t>A "clip" tag is encoded with all tags (except for the
- "start" and "end" attributes) as a string printed into a
- clip packet in the annotation bitstream. The "clip"
- tag's "start" attribute tells the Annodex encoder at what
- time to insert the clip packet into the bitstream. Its "end"
- attribute (if present) leads to the creation of another
- clip packet at the given end time in the Annodex bitstream,
- unless another clip packet starts on the same track beforehand.
- This clip packet contains an empty "clip" tag, i.e. a "clip"
- tag without "meta", "a", "img" or "desc" elements and no
- attribute values except for a copy of the "track" attribute
- from the original "clip" tag.
- </t>
-
- <t>Again, the encoding process must ensure that newline
- characters are represented as LF (or "\n" in C) only.
- </t>
- </section>
-
</section>
<!--**************************-->
@@ -1726,7 +1818,7 @@
</t>
<t>If the Annodex bitstream has a non-zero basetime or a non-null
- utc time in the Skeleton ident header, a "stream" tag MUST be
+ utc time in the skeleton ident header, a "stream" tag MUST be
created with these attribute values. That "stream" tag is empty
by default. A ripping application MAY however extract all the data
bitstreams out of the Annodex bitstream into files, and then reference
@@ -1737,13 +1829,13 @@
the logical bitstreams:
<list style="symbols">
<t>the "contenttype" attribute from the "Content-type" Message
- header field of the respecitve Skeleton secondary header packet,</t>
+ header field of the respective skeleton secondary header packet,</t>
<t>the "granulerate" attribute from the Granulerate fields of
- the respecitive Skeleton secondary header packet,</t>
+ the respective skeleton secondary header packet,</t>
<t>the "id" attribute from a Message header field called "ID"
if available,</t>
<t>and "param" elements from all the remaining Message header fields
- of the respective Skeleton secondary header packet, where the field
+ of the respective skeleton secondary header packet, where the field
name gets stored in the "name" attribute and the value in the
"value" attribute.</t>
</list>
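
For anyone who wants to check the byte layout, the Ogg ident header added in this revision (bytes 0-28, multi-byte fields LSB first) and the UTF-8 / LF-only rule can be sketched in Python roughly as follows. This is only an illustration and not code from the Annodex tools: the function names are made up, the two version fields are assumed to be unsigned 16-bit values, the granule rate fields are assumed to be signed 64-bit integers, and the defaults simply repeat the 1/1000 and 32 given in the draft text.

```python
import struct

# Layout read off the Ogg ident header figure in the diff:
#   bytes 0-7   identifier 'CMML\0\0\0\0'
#   bytes 8-11  version major, version minor (assumed 16 bits each, unsigned)
#   bytes 12-19 granulerate numerator   (8 bytes, signedness is an assumption)
#   bytes 20-27 granulerate denominator (8 bytes, signedness is an assumption)
#   byte  28    granuleshift
# "<" selects little-endian (LSB first) packing without padding.
CMML_IDENT_FORMAT = "<8sHHqqB"
CMML_IDENT_MAGIC = b"CMML\0\0\0\0"


def normalise_cmml_text(text: str) -> bytes:
    """Replace CR LF pairs with LF and encode the result as UTF-8.

    Hypothetical helper: only the CR LF case named in the draft is handled.
    """
    return text.replace("\r\n", "\n").encode("utf-8")


def pack_cmml_ident(version_major: int, version_minor: int,
                    rate_numerator: int = 1, rate_denominator: int = 1000,
                    granuleshift: int = 32) -> bytes:
    """Build the 29-byte CMML ident header packet for Ogg (hypothetical helper)."""
    return struct.pack(CMML_IDENT_FORMAT, CMML_IDENT_MAGIC,
                       version_major, version_minor,
                       rate_numerator, rate_denominator, granuleshift)


def unpack_cmml_ident(packet: bytes) -> dict:
    """Parse an ident header packet back into its fields."""
    magic, major, minor, num, den, shift = struct.unpack(CMML_IDENT_FORMAT, packet)
    if magic != CMML_IDENT_MAGIC:
        raise ValueError("not a CMML ident header")
    return {"version": (major, minor), "granulerate": (num, den),
            "granuleshift": shift}
```

Calling pack_cmml_ident(2, 0) with placeholder version numbers gives a 29-byte packet whose last byte is the granuleshift; unpack_cmml_ident reverses it. The fields that an encapsulation format other than Ogg might append are not covered.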
--
silvia