This document is available on-line as http://www.ch.ic.ac.uk/chemime/iupac.html

If you have any comments, please mail any of the principal authors of this document (rzepa@ic.ac.uk, p.murray-rust@mail.cryst.bbk.ac.uk, benw@chemistry.leeds.ac.uk) or send a comment to the discussion forum (chemime@ic.ac.uk).


The Chemical MIME Project

Henry S. Rzepa (a), Peter Murray-Rust (b) and Benjamin Whitaker (c)

August, 1996

(a) Department of Chemistry, Imperial College, London.
(b) Department of Pharmacology, University of Nottingham.
(c) School of Chemistry, University of Leeds.

Contents.

  1. Background and History to the Project
  2. Why do we Need Chemical Internet Standards?
  3. Chemical MIME Types included in the May - October 1995 IETF draft
  4. New types Proposed since the Original IETF Draft
  5. The Chemime Discussion list archives
  6. Uptake of Chemical MIME Usage (Alta Vista Statistics)
  7. A List of Projects Utilising Chemical MIME
  8. Software which supports chemical MIME media types directly
  9. Background articles and other information about chemical MIME

1. Background and History to the Project

Prior to 1992, the "Internet" was essentially a matrix of computer networks bound by a common "network" protocol and used predominantly for file transfer and electronic mail. Standards were in place, but they tended to be generic ones dealing with technical issues. No explicit chemical standards were in place. Around 1993, two new mechanisms were introduced on the Internet.

(a) Electronic mail evolved from purely text-based communication to systems where "attachments" to the message could be included. For the first time, it became possible to include attachments which could have chemical content.
(b) A mechanism for document delivery called the World-Wide Web was introduced. Here too, a document could be associated with chemical content via a device known as a "hyperlink".

It became obvious during 1993 that the enormous potential for exchange of structured information that the Internet now offered would have to be matched by globally accepted standards for such information. At this stage, we considered that "chemical" information represented a potentially definable class of "media type" that had certain unique characteristics that would require particular handling by the recipient of such information. We started a project in January 1994 which we called "Chemical MIME". This was first announced during the Chemistry workshop at the First WWW International Conference, held at CERN in May 1994.

Our intention was to establish a set of standard "headers" that would unambiguously identify "chemical content" in Internet Electronic Mail message bodies and World-Wide Web documents. We originally identified a small number of relatively standard file types containing chemical information, which together with the addition of a chemical MIME header, would enable the content to be sensibly processed by the recipient of the information. Essentially, this was an addition to the MIME standard which had been proposed and ratified via a body called the IETF (Internet Engineering Task Force) in 1993. We initially approached the IETF with our proposal, via a discussion document called an Internet Draft. The first version of this was published during May-October 1994, and a second revised version during April-September 1995. These two proposals each expired after six months of discussion. In July 1995, we presented our case in person at the IETF meeting in Stockholm. Out of this meeting there emerged several conclusions.

Meanwhile, the original chemical MIME proposals were widely disseminated throughout the chemical communities, and have been widely adopted via various electronic forums. It is the proposal of the current discussion document to define a set of chemical MIME standards for endorsement by the IUPAC committees.

2. Why do we Need Chemical Internet Standards?

Effective information exchange takes place when everyone uses the same tools. There are a number of ways this can happen, varying from careful planning over several years to the adoption of a de facto approach that everyone uses. When organisations try to develop informatics tools without the general knowledge and consent of the community, great tensions usually result, and it is our intention to try to facilitate progress with as little conflict as possible.

Communities are usually suspicious of organisations that 'go it alone' in developing informatics tools, and this often results in competing systems developed under a veil of secrecy; there is a built-in disadvantage to those outside the developer's organisation. This is evidenced by the flame-wars that are common on many public newsgroups and discussion groups at present.

Chemistry has inherited a large number of legacy approaches to information and whilst these are useful for some subsets of the discipline, we feel strongly that the tools of the future will only come through public debate and cooperation. We also need a variety of ideas and approaches, however, so it is valuable to see which ones 'evolve' as well as which are planned. If a subcommunity finds a useful de facto standard, that may well be worthy of recognition as such; but it may also need careful tailoring so that it interfaces well with other areas. This can only come through public activity.

Many tools developed by single organisations in a competitive situation are not future-proof; i.e. they may not be interpretable in a few years' time and the information may be effectively lost. This is particularly likely for binary files, but may also happen when numbers or abbreviations are used. Examples of this are common, and it would be presumptuous to guess which products will still be supported in the future.

Terms are often given different semantics or used with default units. It is therefore important to agree with the rest of the community how a term is to be interpreted, and ideally there should be algorithms to convert to related terms.

Guidelines

We propose that those developing informatics standards commit to the following guidelines:

General

Chemical/* MIME

A proposal for classifying and regulating the types of chemical document has been submitted to the IETF. A number of existing file types were proposed which have met with wide acceptance in the molecular community. Until the IETF or other body ratifies the proposal, the following guidelines for the use of MIME types are proposed:

3. Chemical MIME Types included in the May - October 1995 draft as part of a "standards track" process.

The following list of chemical MIME types forms the main body of the IETF Internet draft valid during the period May - October 1995.
Type                             Filename extension
chemical/x-cxf                   cxf
chemical/x-mif                   mif
chemical/x-pdb                   pdb
chemical/x-cif                   cif
chemical/x-mdl-molfile           mol
chemical/x-mdl-sdf               sdf
chemical/x-mdl-rdf               rdf
chemical/x-mdl-rxn               rxn
chemical/x-embl-dl-nucleotide    emb, embl
chemical/x-genbank               gen
chemical/ncbi-asn1-binary        val
chemical/x-gcg8-sequence         gcg
chemical/x-daylight-smiles       smi
chemical/x-rosdal                ros
chemical/x-macromodel-input      mmd, mmod
chemical/x-mopac-input           mop
chemical/x-gaussian-input        gau
chemical/x-jcamp-dx              jdx
chemical/x-kinemage              kin
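
As an illustration of how a table like the one above might be put to use, the following minimal sketch uses Python's standard mimetypes and email modules to register a few of the listed types by filename extension and then label a mail attachment with the matching chemical media type. It is not part of the IETF draft. The helper names make_registry and attach_chemical_file, the file name example.mol and the placeholder file contents are all assumptions made for the example.

```python
import mimetypes
from email.message import EmailMessage

# A few extension -> type pairs copied from the table above.
CHEMICAL_TYPES = {
    ".pdb": "chemical/x-pdb",
    ".cif": "chemical/x-cif",
    ".mol": "chemical/x-mdl-molfile",
    ".smi": "chemical/x-daylight-smiles",
    ".jdx": "chemical/x-jcamp-dx",
}


def make_registry():
    """Return a MimeTypes registry that knows the chemical types above.

    Hypothetical helper: a private registry is used so the example does
    not depend on whatever mime.types files the local system provides.
    """
    registry = mimetypes.MimeTypes()
    for ext, mime in CHEMICAL_TYPES.items():
        registry.add_type(mime, ext)
    return registry


def attach_chemical_file(msg, filename, data, registry):
    """Attach data to msg, labelled with the media type guessed from filename.

    Hypothetical helper: falls back to application/octet-stream when the
    extension is not in the registry.
    """
    mime, _encoding = registry.guess_type(filename)
    if mime is None:
        mime = "application/octet-stream"
    maintype, subtype = mime.split("/", 1)
    msg.add_attachment(data, maintype=maintype, subtype=subtype,
                       filename=filename)
    return mime


registry = make_registry()

msg = EmailMessage()
msg["Subject"] = "Structure file"
msg.set_content("Structure attached.")

# Placeholder bytes; a real message would carry an actual molfile.
label = attach_chemical_file(msg, "example.mol",
                             b"placeholder molfile bytes\n", registry)

# The attachment now carries the chemical Content-Type header.
part = next(msg.iter_attachments())
print(part.get_content_type())
```

A recipient's mail reader or Web browser can use the same kind of lookup in the other direction, dispatching on the Content-Type header to choose a suitable viewer for the attachment.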

4. New types Proposed since the Original IETF Draft

Type                 File extension  Description and Formal Description  Originator
chemical/x-chemdraw  chm             ChemDraw Format                     CambridgeSoft
chemical/x-chem3d    c3d             Chem3D Format                       CambridgeSoft
chemical/x-mdl-tgf   tgf             Transportable Graphics Format       MDL Information Systems
chemical/x-csml      csml            Chemical Structure Markup Language  P. Murray-Rust, R. Sayle, H. S. Rzepa and B. J. Whitaker, J. Chem. Soc., Perkin Trans 2, 1995, 7.
chemical/x-vmd       vmd             VMD - Visual Molecular Dynamics     Andrew Dalke, see http://www.ks.uiuc.edu/Research/vmd/
chemical/x-cml       cml             Chemical Markup Language            Peter Murray-Rust. Developers version 0.7

5. The Chemime Discussion list archives

From November 1994 to the present, a discussion list has been active for people to discuss various aspects of the proposals. Users can subscribe by sending a message to listserver@ic.ac.uk with the content
subscribe chemime your name
The discussions of this forum are archived under
http://www.ch.ic.ac.uk/hypermail/chemime/

6. Uptake of Chemical MIME Usage (Alta Vista Statistics)

The following URL fragments were used in Alta Vista (http://www.altavista.digital.com/) searches, entered in the Advanced Query feature after the keyword "link:" (e.g. link:www.ch.ic.ac.uk). Such a search reports an estimated or actual count of the number of pages that contain a hyperlink to the given URL. This search was performed on July 23, 1996.
URL Used for Alta Vista Search           Number of other documents with a Hyperlink to this page  Comment on page
"www.ch.ic.ac.uk/chemical_mime.html"     550                                                      One of the two original project pages used to illustrate the use of chemical MIME-types.
"chem.leeds.ac.uk/Project/MIME.html"     400                                                      The second MIME project page
"www.ch.ic.ac.uk/chemime/chemime2.html"  250                                                      The original IETF Chemical MIME-types standards document
"www.ncbi.nlm.nih.gov"                   8000                                                     NCBI
"www.pdb.bnl.gov"                        3000                                                     Brookhaven
"www.prosci.uci.edu"                     2000                                                     The Electronic Journal Protein Science
"structbio.nature.com"                   900                                                      Nature

7. A List of Projects Utilising Chemical MIME

  1. Original Examples at Imperial and Leeds Universities.
  2. NCBI Project at NIH
  3. Brookhaven Protein Databank
  4. Molecules R Us facility at the NIH
  5. Protein Science E-Journal
  6. Journal of Molecular Modelling
  7. Nature Science Journal
  8. Electronic Conferences in Trends in Organic Chemistry: ECTOC-1 and ECHET96
  9. Klotho Project at WUSTL.
  10. Project CORINA at Erlangen University
  11. ChemFinder Project by CambridgeSoft
  12. Demos by Daylight Software.
  13. Molecule-of-the-Month Collections
  14. Chemical and Drug Structure Display at the NIH

8. Software which supports chemical MIME media types directly

  1. Chemscape Chime by MDLI: A Netscape plug-in.

9. Background articles and other information about chemical MIME.

  1. Antony N. Davies, "Internet Chemical MIME", Spectroscopy Europe, 1996, 8(1), 42.
  2. H. S. Rzepa, B. J. Whitaker and M. J. Winter, J. Chem. Soc., Chem. Commun., 1994, 1907.
  3. O. Casher, G. Chandramohan, M. Hargreaves, C. Leach, P. Murray-Rust, R. Sayle, H. S. Rzepa and B. J. Whitaker, J. Chem. Soc., Perkin Trans 2, 1995, 7.
  4. S. M. Bachrach, P. Murray-Rust, H. S. Rzepa and B. J. Whitaker, Network Science, March, 1996.
  5. Maryilyn Dunker, Indiana University, Chemical Information Viewers: A Collection of programs that can be used with chemical MIME datasets.
  6. Scott Nelson, Lawrence Livermore National Laboratory, A test page for checking your MIME Configurations