It is defined by atomArrays, each with a list of elementTypes and their counts (default 1). All other information in the atomArray is ignored. formula elements are nestable, so that aggregates (e.g. hydrates, salts) can be described. CML does not require that formula information be consistent with (say) crystallographic information; this allows for experimental variance.
An alternative, briefer representation is also available through the conciseForm. This must include whitespace around every element symbol and its count, and every count must be explicit.
<cml>
<molecule id="sulfuric acid">
<formula concise="H 2 S 1 O 4"/>
</molecule>
<molecule id="">
<formula title="[Cu(NH3)4]2+ SO42-">
<formula formalCharge="+2">
<atomArray elementType="Cu"/>
<formula count="4">
<atomArray elementType="N H" count="1 3"/>
</formula>
</formula>
<formula formalCharge="-2">
<atomArray elementType="S O" count="1 4"/>
</formula>
</formula>
</molecule>
</cml>
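As an illustration of how nested formula elements combine, the following minimal Python sketch totals the element counts of the copper complex above. It assumes that a count on a formula multiplies everything nested inside it, that a missing count means 1, and that counts may be non-integral; the helper name atom_totals is hypothetical and not part of CML.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# The nested formula from the example above, without the enclosing molecule.
FORMULA_XML = """
<formula>
  <formula formalCharge="+2">
    <atomArray elementType="Cu"/>
    <formula count="4">
      <atomArray elementType="N H" count="1 3"/>
    </formula>
  </formula>
  <formula formalCharge="-2">
    <atomArray elementType="S O" count="1 4"/>
  </formula>
</formula>
"""


def atom_totals(formula, multiplier=1.0):
    """Sum element counts over a formula and its nested formula children."""
    # Assumption: a missing count attribute on a formula means 1.
    multiplier *= float(formula.get("count", "1"))
    totals = Counter()
    for array in formula.findall("atomArray"):
        elements = array.get("elementType", "").split()
        counts = array.get("count")
        # A missing count attribute means every listed element counts once.
        values = [float(c) for c in counts.split()] if counts else [1.0] * len(elements)
        for element, value in zip(elements, values):
            totals[element] += value * multiplier
    for child in formula.findall("formula"):
        # Counter.update adds the child's counts to the running totals.
        totals.update(atom_totals(child, multiplier))
    return totals


totals = atom_totals(ET.fromstring(FORMULA_XML))
```

Only elementType and count are read from each atomArray, in line with the statement that all other information in the atomArray is ignored.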
This is not formally of type ID (an XML NAME which must start with a letter and contain only letters, digits and .-_:). It is recommended that IDs start with a letter, and contain no punctuation or whitespace. The function generate-id() in XSLT will generate semantically void unique IDs.
It is difficult to ensure uniqueness when documents are merged. We suggest namespacing IDs, perhaps using the IDs of the containing elements as the base. Thus mol3:a1 could be a useful unique ID. However, this is still experimental.
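One way to picture the suggested namespacing is the Python sketch below, which prefixes every descendant id with the id of the containing molecule. The function name namespace_ids and the sample document are hypothetical; the sketch only illustrates the mol3:a1 pattern and is not a prescribed procedure.

```python
import xml.etree.ElementTree as ET


def namespace_ids(molecule):
    """Prefix each descendant id with the id of the containing molecule."""
    base = molecule.get("id")
    for element in molecule.iter():
        if element is molecule:
            continue  # the molecule keeps its own id unchanged
        local_id = element.get("id")
        if local_id is not None:
            element.set("id", f"{base}:{local_id}")
    return molecule


mol = ET.fromstring(
    '<molecule id="mol3"><atomArray><atom id="a1"/><atom id="a2"/></atomArray></molecule>'
)
namespace_ids(mol)
ids = [atom.get("id") for atom in mol.iter("atom")]
```

Any attributes that refer to these ids would have to be rewritten in the same pass; the sketch leaves that out.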
<action title="turn on heat" start="T09:00:00" convention="xsd"/>
The namespace is optional but recommended where possible.
Note: this convention is only used within STMML and related languages; it is NOT a generic URI.
<list>
<!-- dictRef is of namespaceRefType -->
<scalar dictRef="chem:mpt">123</scalar>
<!-- error -->
<scalar dictRef="mpt23">123</scalar>
</list>
There is no controlled vocabulary for conventions, but the author must ensure that the semantics are openly available and that there are mechanisms for implementation. The convention is inherited by all subelements, so that a convention for molecule would by default extend to its bond and atom children. This can be overridden if necessary by an explicit convention.
It may be useful to create conventions with namespaces (e.g. iupac:name). Use of convention will normally require non-STMML semantics, and should be used with caution. We would expect that conventions prefixed with "ISO" would be useful, such as ISO8601 for dateTimes.
There is no default, but the conventions of STMML or the related language (e.g. CML) will be assumed.
<bond convention="fooChem" order="-5" xmlns:fooChem="http://www.fooChem/conventions"/>
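The inheritance rule for convention can be sketched in Python as a walk down the element tree that carries the nearest explicit value with it. The function name effective_conventions and the convention name barChem are hypothetical; None stands for "no explicit convention", where the conventions of STMML or the related language would be assumed.

```python
import xml.etree.ElementTree as ET


def effective_conventions(root):
    """Map every element to the convention it carries or inherits."""
    resolved = {}

    def walk(element, inherited):
        # An explicit convention attribute overrides the inherited one.
        convention = element.get("convention", inherited)
        resolved[element] = convention
        for child in element:
            walk(child, convention)

    walk(root, None)
    return resolved


doc = ET.fromstring(
    '<molecule id="m1" convention="fooChem">'
    '<atom id="a1"/>'
    '<bond id="b1" convention="barChem"/>'
    '</molecule>'
)
by_id = {el.get("id"): conv for el, conv in effective_conventions(doc).items()}
```

Here the atom inherits fooChem from its parent molecule, while the bond keeps its own explicit value.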
A reference to a dictionary entry.
Elements in data instances such as scalar may have a dictRef attribute to point to an entry in a dictionary. To avoid excessive use of (mutable) filenames and URIs we recommend a namespace prefix, mapped to a namespace URI in the normal manner. In this case, of course, the namespace URI must point to a real XML document containing entry elements and validated against the STMML Schema.
Where there is concern about the dictionary becoming separated from the document, the dictionary entries can be physically included as part of the data instance and the normal XPointer addressing mechanism can be used.
This attribute can also be used on dictionary elements to define the namespace prefix.
<scalar dataType="xsd:float" title="surfaceArea"
dictRef="cmlPhys:surfArea"
xmlns:cmlPhys="http://www.xml-cml.org/dict/physical"
units="units:cm2">50</scalar>
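To show how a prefixed dictRef might be mapped to its namespace URI, here is a small Python sketch using the namespace events of xml.etree.ElementTree.iterparse. The function name resolve_dict_refs is hypothetical, and the sketch stops at resolving the prefix: fetching the dictionary document and locating the entry are not covered.

```python
import io
import xml.etree.ElementTree as ET


def resolve_dict_refs(xml_text):
    """Return (dictRef, namespace URI or None, local name) for each prefixed dictRef."""
    scopes = []   # (prefix, uri) declarations currently in scope
    results = []
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, payload in ET.iterparse(source, events=("start-ns", "start", "end-ns")):
        if event == "start-ns":
            scopes.append(payload)       # reported before the declaring element starts
        elif event == "end-ns":
            scopes.pop()                 # the declaration has gone out of scope
        else:
            ref = payload.get("dictRef")
            if ref is None or ":" not in ref:
                continue                 # no dictRef, or no prefix to resolve
            prefix, local = ref.split(":", 1)
            # Search the innermost declarations first.
            uri = next((u for p, u in reversed(scopes) if p == prefix), None)
            results.append((ref, uri, local))
    return results


SCALAR = (
    '<scalar dataType="xsd:float" title="surfaceArea" '
    'dictRef="cmlPhys:surfArea" '
    'xmlns:cmlPhys="http://www.xml-cml.org/dict/physical" '
    'units="units:cm2">50</scalar>'
)
resolved = resolve_dict_refs(SCALAR)
```

A prefix with no declaration in scope resolves to None, which a caller could treat as an unresolvable reference.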
<stm:list xmlns:stm="http://www.xml-cml.org/schema/stmml">
<stm:observation>
<p>We observed <object count="3" dictRef="#p1"/>
constructing dwellings of different material</p>
</stm:observation>
<stm:entry id="p1" term="pig">
<stm:definition>A domesticated animal.</stm:definition>
<stm:description>Predators include wolves</stm:description>
<stm:description class="scientificName">Sus scrofa</stm:description>
</stm:entry>
</stm:list>
The formal charge is normally calculated from the formal charges of the atoms. If the formalCharge attribute is given it overrides this information completely. This allows (say) metal complexes to be represented when it is difficult to apportion the charges to atoms.
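The override rule can be sketched in Python as follows. The sketch assumes that atoms carry their own formalCharge attribute, that a missing atom charge counts as zero, and that charges are integers; the function name molecule_charge and both sample molecules are hypothetical.

```python
import xml.etree.ElementTree as ET


def molecule_charge(molecule):
    """Return the explicit formalCharge if present, else the sum over the atoms."""
    explicit = molecule.get("formalCharge")
    if explicit is not None:
        # An explicit value overrides the atom charges completely.
        return int(explicit)
    # Assumption: an atom without a formalCharge attribute contributes zero.
    return sum(int(atom.get("formalCharge", "0")) for atom in molecule.iter("atom"))


summed = ET.fromstring(
    '<molecule><atom id="a1" formalCharge="-1"/><atom id="a2"/></molecule>'
)
overridden = ET.fromstring(
    '<molecule formalCharge="+2"><atom id="a1"/><atom id="a2" formalCharge="-1"/></molecule>'
)
charges = (molecule_charge(summed), molecule_charge(overridden))
```

In the second molecule the atom charges are never consulted, which is what allows a metal complex to carry an overall charge without apportioning it to individual atoms.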
This MUST adhere to a whitespaced syntax so that it is trivially machine-parsable. Each element is followed by its count, and the string is optionally ended by a formal charge. NO brackets or other nesting is allowed.
<stm:list xmlns:stm="http://www.xml-cml.org/schema/stmml">
<formula id="methane" concise="C 1 H 4"/>
<formula id="chloroacetate" concise="Cl 1 H 2 C 2 O 2 -1"/>
<formula id="sodiumSulfate">
<formula concise="H 2 O 1" count="10"/>
<formula concise="Na 1 +1" count="2"/>
<formula concise="S 1 O 4 -2"/>
</formula>
</stm:list>
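Because the concise syntax is whitespace-separated, a parser needs little more than a split. The Python sketch below is one possible reading of the rules above: element and count alternate, and a single trailing token is taken as the formal charge. The function name parse_concise is hypothetical, and treating counts as floating-point numbers is an assumption.

```python
def parse_concise(concise):
    """Split a concise formula into ({element: count}, formal charge)."""
    tokens = concise.split()
    charge = 0
    # An odd number of tokens means the string ends with a formal charge.
    if len(tokens) % 2 == 1:
        charge = int(tokens.pop())
    counts = {}
    for element, count in zip(tokens[0::2], tokens[1::2]):
        if not element.isalpha():
            # Brackets or other nesting are not allowed in the concise form.
            raise ValueError(f"not an element symbol: {element!r}")
        counts[element] = counts.get(element, 0.0) + float(count)
    return counts, charge


chloroacetate = parse_concise("Cl 1 H 2 C 2 O 2 -1")
sodium = parse_concise("Na 1 +1")
```

A string such as "C H 4" has an odd token count, so its last token is taken as a charge and the remaining pair fails the numeric conversion with a ValueError; this is why every count must be explicit.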