MOL
The content model of a MOL (molecule) allows for considerable flexibility
in storage (see below).
Although many of the XVARs, XLISTs, etc.
could also be held in a TecML file without MOL.DTD,
containment within a molecule is very well suited to molecular
databases (e.g. crystallography) where all data are "attached" to a molecule.
NOTE: The use of the term 'molecule' is not meant to imply anything
about the bonding model or physical nature of the thing in question.
MOL
can be used to hold data on extended solids (such as NaCl) or van der Waals
complexes. The bonding model is kept simple to emphasise that for many
molecules there need to be additional semantics to specify it adequately.
The simple model may be refined over time.
The primary use of MOL is to provide at least one way of accurately
conveying the precise nature and identity of the substance. This way may not
always be the best or most efficient, or the one that you are used to.
The present constraints of MOL are:
- Only one molecule can be stored as ATOMS per MOL. (It is possible to store
disjoint molecules, such as complexes or salts with simple ratios, simply
by describing them in the connection table, e.g. 'dot-disconnected molecules'
in SMILES.) Mixtures
would be best described by defining two or more molecules and using links
(A) embedded in hypertext.
- There are limited descriptors for generic molecules, such as substructures,
Markush, search queries, etc. For a completely generic approach I expect
we shall need a grammar. See TYPE...
- It cannot deal with reactions. These can be partially dealt with by
hypertext and references, but this needs to be developed. (C.REACT is a
placeholder at present).
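The 'dot-disconnected' convention mentioned in the first constraint can be illustrated with a short Python sketch. It assumes plain SMILES strings, in which '.' appears only as the separator between disconnected components. The function name split_components is hypothetical and is not part of MOL or of any SMILES toolkit.

```python
from typing import List


def split_components(smiles: str) -> List[str]:
    """Split a dot-disconnected SMILES string into its components.

    Assumption: the input is a plain SMILES string, where '.' is used
    only to separate disconnected fragments.
    """
    return [part for part in smiles.split(".") if part]


# Sodium chloride written as two disconnected ions.
components = split_components("[Na+].[Cl-]")
print(components)
```

Each component could then be treated as one part of a salt or complex held in a single MOL, while a true mixture would still be better served by separate MOL elements, as noted above.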
Content Model
MOL can handle the following molecular properties and data
(in any order, and repeatable, although repetition is not always meaningful):
- XVARs to handle keywords, IDs, etc. (For other data it may be
tidier to enclose them in XLISTs.)
- ARRAYs, though these may be neater in XLISTs
- A description or other chunks of hypertext (XHTML)
- Molecular formula and/or connection table (FORMULA) (Repeatable).
- Molecular symmetry (SYMMETRY).
- Crystallographic data (including cell dimensions, spacegroup and
experimental data) (CRYST).
- Macromolecular sequence (SEQUENCE) (Repeatable)
- Macromolecular features (FEATURE) (Repeatable)
- Atoms and their attributes (ATOMS)
- Bond information (BONDS)
- Bibliography (BIB).
- Data blocks (XLIST).
- Figures (FIGURE).
- Free text or foreign files (NOTATION).
- Relations between objects (RELATION).
- Other Molecules (MOL)
The more examples I have explored, the fewer constraints can be put
on what MOL can contain.
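As a rough illustration of this loose content model, the Python sketch below assembles a MOL element whose children appear in arbitrary order and with repeats. Only the element names come from the list above. The absence of attributes, the placeholder text content, and the helper name build_mol are assumptions made for illustration; the real DTD may require attributes or content not shown here.

```python
import xml.etree.ElementTree as ET


def build_mol() -> ET.Element:
    """Build a hypothetical MOL element with children in arbitrary order."""
    mol = ET.Element("MOL")
    # Child elements may appear in any order and may repeat.
    ET.SubElement(mol, "FORMULA").text = "NaCl"  # placeholder content
    ET.SubElement(mol, "ATOMS")
    ET.SubElement(mol, "BONDS")
    ET.SubElement(mol, "FORMULA").text = "[Na+].[Cl-]"  # a second, repeated FORMULA
    # A MOL may itself contain other MOL elements.
    ET.SubElement(mol, "MOL")
    return mol


mol = build_mol()
print(ET.tostring(mol, encoding="unicode"))
```

The sketch only shows nesting and repetition; it does not validate the result against MOL.DTD.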
Content
- admin -- Administrivia.
- array -- A very flexible matrix/array/geometry container.
- atoms -- A generic container for atomic coordinates and properties
- bib -- A bibliographic entry.
- bonds -- A generic container for bonds and their properties
- cryst -- Crystallographic data, especially unit cell and symmetry.
- feature -- Features of macromolecules (e.g. SITE, MUTATION).
- figure -- A figure, possibly in encoded binary.
- formula -- Chemical formula.
- mol -- Toplevel container for molecular information.
- relation -- Describes relationship between objects, including hyperlinks.
Experimental at present.
- sequence -- Represents a macromolecular sequence.
- symmetry -- Molecular symmetry.
- xhtml -- A hypertext container for use in TecML and CML.
- xlist -- A very flexible generic list/tree/table container.
- xnotation
- xvar -- A generic, flexible, container for scalar information.
ATTRIBUTES
CONTENT DECLARATION
- Tag Minimization
  Open Tag: REQUIRED
  Close Tag: REQUIRED
Parent Elements
- cml -- A toplevel DTD encompassing HTML 2.0, TecML and MOL.
- mol -- Toplevel container for molecular information.