formula[el.formula]
The stoichiometry of the molecule

It is defined by atomArray elements, each with a list of elementTypes and their counts (default 1). All other information in the atomArray is ignored. formula elements can be nested so that aggregates (e.g. hydrates, salts) can be described. CML does not require that formula information be consistent with (say) crystallographic information; this allows for experimental variance.

An alternative, briefer representation is available through the concise attribute. Whitespace must separate all elements and their counts, and every count must be explicit.

example

<cml>
  <molecule id="sulfuricAcid">
    <formula concise="H 2 S 1 O 4"/>
  </molecule>
  <molecule id="">
    <formula title="[Cu(NH3)4]2+ [SO4]2-">
      <formula formalCharge="+2">
        <atomArray elementType="Cu"/>
        <formula count="4">
          <atomArray elementType="N H" count="1 3"/>
        </formula>
      </formula>
      <formula formalCharge="-2">
        <atomArray elementType="S O" count="1 4"/>
      </formula>
    </formula>    
  </molecule>
</cml>
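
The nesting rules above can be sketched in Python. This is a minimal illustration, not part of CML: the function name total_atoms is hypothetical, element names are assumed to be unprefixed as in the example, and a nested formula's count is assumed to multiply everything inside it.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def total_atoms(formula):
    """Return {elementType: count} for a formula element, including nested formulas."""
    totals = Counter()
    for child in formula:
        if child.tag == "atomArray":
            elements = child.get("elementType", "").split()
            raw_counts = child.get("count")
            # A missing count attribute means one of each listed element.
            if raw_counts:
                counts = [float(c) for c in raw_counts.split()]
            else:
                counts = [1.0] * len(elements)
            for element, n in zip(elements, counts):
                totals[element] += n
        elif child.tag == "formula":
            # Counter.update adds the nested totals to the running totals.
            totals.update(total_atoms(child))
    # Assumption: the count on a formula scales all of its contents.
    multiplier = float(formula.get("count", "1"))
    return {element: n * multiplier for element, n in totals.items()}


COMPLEX = """
<formula>
  <formula formalCharge="+2">
    <atomArray elementType="Cu"/>
    <formula count="4">
      <atomArray elementType="N H" count="1 3"/>
    </formula>
  </formula>
  <formula formalCharge="-2">
    <atomArray elementType="S O" count="1 4"/>
  </formula>
</formula>
"""

totals = total_atoms(ET.fromstring(COMPLEX))
print(totals)
```

Counts are held as floats because the count attribute allows fractional components.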

       
Content Model
((formula|atomArray)*)
id[att.id]
A unique ID for an element

This is not formally of type ID (an XML NAME which must start with a letter and contain only letters, digits and .-_:). It is recommended that IDs start with a letter, and contain no punctuation or whitespace. The function generate-id() in XSLT will generate semantically void unique IDs.

It is difficult to ensure uniqueness when documents are merged. We suggest namespacing IDs, perhaps using the containing elements as the base. Thus mol3:a1 could be a useful unique ID. However, this is still experimental.

[xsd:string]
Pattern: [A-Za-z0-9_-]+(:[A-Za-z0-9_-]+)?
An attribute providing a unique ID for an element
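
The pattern above can be checked with a short Python sketch. The helper name is_valid_id is an assumption for illustration; fullmatch is used because an XSD pattern must match the whole value.

```python
import re

# Pattern copied from the id attribute definition.
ID_PATTERN = re.compile(r"[A-Za-z0-9_-]+(:[A-Za-z0-9_-]+)?")


def is_valid_id(value):
    """Return True if value matches the id pattern in full."""
    return ID_PATTERN.fullmatch(value) is not None


for candidate in ["mol3:a1", "a1", "sulfuric acid", "", "a:b:c"]:
    print(repr(candidate), is_valid_id(candidate))
```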
title[att.title]
A title on an element.
No controlled value.

example

<action title="turn on heat" start="T09:00:00" convention="xsd"/>
convention[att.convention]
A string referencing a dictionary, units, convention or other metadata.

The namespace prefix is optional but recommended where possible.

Note: this convention is only used within STMML and related languages; it is NOT a generic URI.

example

<list>
<!-- dictRef is of namespaceRefType -->
  <scalar dictRef="chem:mpt">123</scalar>  
<!-- no prefix: matches the pattern, but a namespace prefix is recommended -->
  <scalar dictRef="mpt23">123</scalar>  
</list>

        
[xsd:string]
Pattern: [A-Za-z][A-Za-z0-9_]*(:[A-Za-z][A-Za-z0-9_]*)?
A reference to a convention

There is no controlled vocabulary for conventions, but the author must ensure that the semantics are openly available and that there are mechanisms for implementation. The convention is inherited by all subelements, so that a convention for molecule would by default extend to its bond and atom children. This can be overridden if necessary by an explicit convention.

It may be useful to create conventions with namespaces (e.g. iupac:name). A convention will normally require non-STMML semantics, so the attribute should be used with caution. We would expect conventions prefixed with "ISO" to be useful, such as ISO8601 for dateTimes.

There is no default, but the conventions of STMML or the related language (e.g. CML) will be assumed.

example

<bond convention="fooChem" order="-5"
   xmlns:fooChem="http://www.fooChem/conventions"/>
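
The inheritance rule can be sketched as a tree walk in Python. This is only an illustration: the function name effective_conventions and the convention value barChem are hypothetical, and the molecule is simplified so that atom and bond are direct children.

```python
import xml.etree.ElementTree as ET


def effective_conventions(element, inherited=None):
    """Yield (element, convention) pairs, applying inheritance from ancestors."""
    # An explicit convention attribute overrides the inherited one.
    convention = element.get("convention", inherited)
    yield element, convention
    for child in element:
        yield from effective_conventions(child, convention)


DOC = """
<molecule id="m1" convention="fooChem">
  <atom id="a1"/>
  <bond id="b1" convention="barChem"/>
</molecule>
"""

resolved = {
    element.get("id"): convention
    for element, convention in effective_conventions(ET.fromstring(DOC))
}
print(resolved)
```

An element with no convention on itself or any ancestor resolves to None here, which a caller could treat as the STMML or CML defaults.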
dictRef[att.dictRef]
A string referencing a dictionary, units, convention or other metadata.

The namespace prefix is optional but recommended where possible.

Note: this convention is only used within STMML and related languages; it is NOT a generic URI.

example

<list>
<!-- dictRef is of namespaceRefType -->
  <scalar dictRef="chem:mpt">123</scalar>  
<!-- no prefix: matches the pattern, but a namespace prefix is recommended -->
  <scalar dictRef="mpt23">123</scalar>  
</list>

        
[xsd:string]
Pattern: [A-Za-z][A-Za-z0-9_]*(:[A-Za-z][A-Za-z0-9_]*)?

A reference to a dictionary entry.

Elements in data instances such as scalar may have a dictRef attribute to point to an entry in a dictionary. To avoid excessive use of (mutable) filenames and URIs, we recommend a namespace prefix, mapped to a namespace URI in the normal manner. In this case, of course, the namespace URI must point to a real XML document containing entry elements and validated against the STMML Schema.

Where there is concern about the dictionary becoming separated from the document, the dictionary entries can be physically included as part of the data instance and the normal XPointer addressing mechanism can be used.

This attribute can also be used on dictionary elements to define the namespace prefix.

example

<scalar dataType="xsd:float" title="surfaceArea" 
  dictRef="cmlPhys:surfArea" 
  xmlns:cmlPhys="http://www.xml-cml.org/dict/physical"
  units="units:cm2">50</scalar>
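
Mapping a dictRef prefix to its namespace URI might look like the following Python sketch. The function name resolve_dict_refs is hypothetical, and the code simplifies by collecting all prefix declarations into one table, ignoring XML namespace scoping. It does not fetch or validate the dictionary document.

```python
import io
import re
import xml.etree.ElementTree as ET

# Pattern copied from the dictRef attribute definition.
NAMESPACE_REF = re.compile(r"[A-Za-z][A-Za-z0-9_]*(:[A-Za-z][A-Za-z0-9_]*)?")


def resolve_dict_refs(xml_text):
    """Return (dictRef, namespace URI or None, local name) for each matching dictRef."""
    prefixes = {}
    resolved = []
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, payload in ET.iterparse(source, events=("start-ns", "start")):
        if event == "start-ns":
            # Declarations are reported before the element that carries them.
            prefix, uri = payload
            prefixes[prefix] = uri
            continue
        ref = payload.get("dictRef")
        if ref is None or not NAMESPACE_REF.fullmatch(ref):
            continue
        prefix, _, local = ref.rpartition(":")
        resolved.append((ref, prefixes.get(prefix) if prefix else None, local))
    return resolved


SCALAR = """
<scalar dataType="xsd:float" title="surfaceArea"
  dictRef="cmlPhys:surfArea"
  xmlns:cmlPhys="http://www.xml-cml.org/dict/physical"
  units="units:cm2">50</scalar>
"""

print(resolve_dict_refs(SCALAR))
```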

           

example

<stm:list xmlns:stm="http://www.xml-cml.org/schema/stmml">
  <stm:observation>
    <p>We observed <object count="3" dictRef="#p1"/> 
      constructing dwellings of different material</p>
  </stm:observation>
  <stm:entry id="p1" term="pig">
    <stm:definition>A domesticated animal.</stm:definition>
    <stm:description>Predators include wolves</stm:description>
    <stm:description class="scientificName">Sus scrofa</stm:description>
  </stm:entry>
</stm:list>
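
A lookup of an embedded entry could be sketched in Python as follows. The function name find_local_entry is hypothetical, and only the bare "#id" form is handled, not XPointer in general.

```python
import xml.etree.ElementTree as ET

STM = "http://www.xml-cml.org/schema/stmml"


def find_local_entry(root, dict_ref):
    """Return the stm:entry whose id matches a '#id' reference, or None."""
    if not dict_ref.startswith("#"):
        return None
    target_id = dict_ref[1:]
    # ElementTree names namespaced elements as {uri}localname.
    for entry in root.iter(f"{{{STM}}}entry"):
        if entry.get("id") == target_id:
            return entry
    return None


DOC = """
<stm:list xmlns:stm="http://www.xml-cml.org/schema/stmml">
  <stm:observation>
    <p>We observed <object count="3" dictRef="#p1"/>
      constructing dwellings of different material</p>
  </stm:observation>
  <stm:entry id="p1" term="pig">
    <stm:definition>A domesticated animal.</stm:definition>
  </stm:entry>
</stm:list>
"""

root = ET.fromstring(DOC)
entry = find_local_entry(root, root.find(".//object").get("dictRef"))
print(entry.get("term"))
```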

           
count[]
A multiplier for the formula
No formal default, but a value of 1 is assumed. Fractional values are allowed, so fractional components can be described.
formalCharge[]

The formal charge is normally calculated from the formal charges of the atoms. If the formalCharge attribute is given it overrides this information completely. This allows (say) metal complexes to be represented when it is difficult to apportion the charges to atoms.

concise[]
A concise representation for a molecular formula

This MUST adhere to a whitespace-separated syntax so that it is trivially machine-parsable. Each element is followed by its count, and the string may optionally end with a formal charge. NO brackets or other nesting are allowed.

example

<stm:list xmlns:stm="http://www.xml-cml.org/schema/stmml">
  <formula id="methane" concise="C 1 H 4"/>
  <formula id="chloroacetate" concise="Cl 1 H 2 C 2 O 2 -1"/>
  <formula id="sodiumSulfate">
    <formula concise="H 2 O 1" count="10"/>
    <formula concise="Na 1 +1" count="2"/>
    <formula concise="S 1 O 4 -2"/>
  </formula>
</stm:list>

       
[xsd:string]
Pattern: \s*([A-Z][a-z]?\s+[1-9][0-9]*)(\s+[A-Z][a-z]?\s+[1-9][0-9]*)*(\s+[-|+]?[0-9]+)?\s*
A concise string representing an (unstructured) formula
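
A parser for the concise syntax can be sketched in a few lines of Python. The function name parse_concise is an assumption for illustration; the regular expression is copied verbatim from the pattern above, and fullmatch is used because an XSD pattern must match the whole value.

```python
import re

# Pattern copied verbatim from the concise attribute definition.
CONCISE = re.compile(
    r"\s*([A-Z][a-z]?\s+[1-9][0-9]*)(\s+[A-Z][a-z]?\s+[1-9][0-9]*)*(\s+[-|+]?[0-9]+)?\s*"
)


def parse_concise(text):
    """Return ({element: count}, formal_charge) for a concise formula string."""
    if not CONCISE.fullmatch(text):
        raise ValueError(f"not a valid concise formula: {text!r}")
    tokens = text.split()
    charge = 0
    # Element/count tokens come in pairs, so an odd token out is the charge.
    if len(tokens) % 2 == 1:
        charge = int(tokens.pop())
    counts = {}
    for element, n in zip(tokens[0::2], tokens[1::2]):
        counts[element] = counts.get(element, 0) + int(n)
    return counts, charge


print(parse_concise("H 2 S 1 O 4"))
print(parse_concise("Cl 1 H 2 C 2 O 2 -1"))
```

A string without whitespace, such as "H2SO4", fails the pattern and raises ValueError.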