The Application of Chemical Multipurpose Internet Mail Extensions (Chemical MIME) Internet Standards to Electronic Mail and World-Wide Web information exchange

Henry S. Rzepa,^{* a} Peter Murray-Rust^b and Benjamin J. Whitaker^c

^aDepartment of Chemistry, Imperial College, London, SW7 2AY; E-mail: rzepa@ic.ac.uk

^bVirtual School of Molecular Science, Department of Pharmacy, University of Nottingham, Nottingham.

^cSchool of Chemistry, University of Leeds, Leeds, LS2 9JT, UK

Summary: The proposal and subsequent global use of an Internet standard based on chemical primary Multipurpose Internet Mail Extensions (chemical MIME) media type is reviewed. Examples of the configuration of this standard for use with Internet based electronic mail and World-Wide Web clients are shown. The long term objectives of the integration and inter-operability of chemical information across the boundaries of Internet-based electronic journals, conferences, virtual courses, databases, modelling and newly emerging information handling and modelling tools are set out. We believe that one way forward is by concentrating on more finely grained chemical information components, using generic tools based on XML (eXtensible markup language) and its support in chemistry via CML (chemical markup language).

Introduction

The development of Internet-based document and information delivery systems during the last four years has been rapid,¹ with a particular focus on the creation and delivery of chemically oriented World-Wide Web (WWW) based documents. This has in turn introduced many to concepts such as the use of structured and interlinked document collections specified by languages such as HTML (Hypertext Markup Language). This review will focus on one aspect of this revolution, the chemical application of an Internet standard known as MIME (Multipurpose Internet Mail Extensions) to the World-Wide Web and electronic mail (email) handling. We will discuss how a transparent integration of email and WWW based exchanges of chemical information with chemical modelling and information handling tools can be achieved using this mechanism, and present some ideas for how we believe development beyond the MIME mechanism should proceed.

Multipurpose Internet Mail Extensions (MIME)

Despite the attention given to the development of the World-Wide Web, email arguably remains the more frequently used mechanism for electronic information exchange by scientists. Email is often regarded however as a temporary and informal communication medium, not well suited for the precisely defined exchange of structured information in a subject area such as chemistry. Its increasing adoption by chemists over the last 15 years or so has concentrated on the exchange of loosely structured messages based on ASCII text, which rarely if ever contain any explicit markup (chemical or otherwise) or easily machine-parsable semantics. In 1992, Borenstein and Freed recognised that the absence of a structuring mechanism was a serious deficiency in email, and proposed a MIME protocol² for achieving this, which was subsequently adopted as an Internet standard by the Internet Engineering Task Force (IETF). MIME defined how a specified document could be associated with an email message body, and how it should be handled upon receipt by a suitable client program.

The MIME protocol comprises two components. The first defines how binary computer files must be encoded (so-called base-64 encoding) to achieve 7-bit transparency for compatibility with most text-based Internet mail routers, and is not discussed further here. The second component defines a standard mechanism whereby computer files can be associated with an email message via appropriate headers and delimiters, and allows the appropriate processing of such enclosures by mail handling programs in the possession of the email recipient. Borenstein and Freed envisaged a multi-component structure to an email message, in which the first component would comprise the informal and unstructured message body, whilst subsequent components could include structured and well defined data files which could be handled by programs other than the basic email client. These components were to be identified by media types, and in the original proposal a number of primary media types were defined, each sufficiently generic that default handling schemes could, at least in principle, be applied to their content. It is apparent, for example, that different processing and display mechanisms are required for the primary media types TEXT, IMAGE, AUDIO and VIDEO. The APPLICATION media type has less well defined boundaries, and tends to be used for the resolution of proprietary data types defined by the developers of software applications. A MULTIPART type was defined to allow the multi-component collection to be created. Most recently, the MODEL primary type has been added to allow the processing of numerical and symbolic data for three and higher dimensional models. The MIME protocol also defines a secondary media type header which allows the definition of more specific information on the expected content of a message attachment. For example, image/jpeg defines a bit-mapped image file in the specific standard format defined by the Joint Photographic Experts Group. The two-level mechanism also allows a separate name space to be defined for each primary media type.
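The two components described above can be illustrated with a short sketch in Python, using only the standard library. It is offered as an illustration under stated assumptions rather than as part of the MIME specification: the helper name split_media_type is hypothetical, and the parameter handling is deliberately simplified (quoted strings containing semicolons are not supported).

```python
import base64


def split_media_type(value):
    """Split a Content-Type value into (primary, secondary, parameters)."""
    head, *params = [part.strip() for part in value.split(";")]
    primary, _, secondary = head.partition("/")
    parsed = {}
    for param in params:
        key, _, val = param.partition("=")
        parsed[key.strip().lower()] = val.strip().strip('"')
    return primary.lower(), secondary.lower(), parsed


# Component 1: base-64 turns arbitrary bytes into 7-bit ASCII text and back.
raw = bytes([0, 200, 255]) + b"HETATM"
encoded = base64.b64encode(raw)
seven_bit_clean = all(byte < 128 for byte in encoded)
decoded = base64.b64decode(encoded)

# Component 2: the two-level media type carried in the headers.
jpeg = split_media_type("image/jpeg")
pdb = split_media_type('chemical/x-pdb; name="ferrocene.pdb"')
```

Here the primary type (image, chemical) selects a default handling scheme, while the secondary type and any parameters narrow down the expected content.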

In early 1994, we considered³ how the MIME mechanism could be used to allow the exchange of standard (ratified or de facto) chemical data types using either email mechanisms or the then emerging medium of the World-Wide Web. Whilst many of the so-called chemical legacy formats are not always fully documented and specified in the literature, and some such as the Brookhaven protein databank format have spawned a number of variants and mutations over the years, we nevertheless felt that the concept of "chemical" as a new primary MIME media type would have a number of distinct advantages. Firstly, it was apparent that none of the original or subsequently proposed primary media types would allow any sensible component of default handling of implicit chemical information contained in a data file. Secondly, the MIME mechanism operates by assigning three or four letter filename extensions to the data files, and hence each primary type must operate within a closely regulated name space convention. By assigning a primary type CHEMICAL, this name space could be delegated to the community that defines the media type, rather than the less manageable Internet community as a whole. Finally, the adoption of CHEMICAL as a primary media type was seen as the first step in achieving a closer integration between the exchange of chemical information via document server systems such as the World-Wide Web and the exchange of the same data types using electronic mail mechanisms.

Chemical MIME Types

In the four years or more that have elapsed since the original proposal for chemical MIME types, their use via the World-Wide Web has become common. Listed in Table 1 are the chemical MIME types which as far as we are aware have actually been used to a greater or lesser extent during this period,⁴ almost always in the context of the WWW rather than email, together with suggestions for appropriate programs capable of processing and/or displaying the associated chemical content. In many cases, such programs can also serve as the starting basis for molecular modelling applications and database queries,⁵ electronic journal⁶ and conference browsing,⁷ and numerous other applications. In this sense, chemical MIME has served as the infrastructure which has started to catalyse the development of a new generation of Internet-based chemistry tools.¹

Chemical MIME types can be applied in three different contexts.

The first context is that of MIME types which have been configured for WWW (HTTP) document servers operating on an Internet-wide scale, i.e. associated with publicly published documents. Such configuration is normally accomplished via a privileged account, and the use of standard types is essential so that different servers allow documents of the same type to be accessed by remote users in an identical manner. The precise manner in which any individual server is configured may differ, but a typical entry in a "mime.types" configuration file might appear as follows:
chemical/x-mdl-molfile mol
This simply serves as an instruction to the server that any document associated with a filename extension .mol is issued upon request with a document header containing a specification of the MIME type as chemical/x-mdl-molfile.
In an Intranet environment, i.e. one associated with documents which are only accessible in a controlled private environment, it is common to define additional non-standard MIME types for local use. The responsibility for coordinating the use of such private types lies entirely within the organisation. This is to be contrasted with the use of public types, for which articles such as this serve to co-ordinate globally.
The configuration of client-side software for MIME is accomplished quite differently from that for servers. A number of the MIME types listed in Table 1 in fact derive from so-called "plug-ins"⁸ which can be used to enhance the basic capability of a World-Wide Web browser and/or email software, and which remove much of the burden of installation of the MIME mechanism from the user. An alternative is for the user to pro-actively specify that a designated "helper" program be used to resolve the chemical document. In some cases, such as the Netscape Communicator program, the same software package can be used for handling both WWW documents and email messages, and the user's configuration for both is handled via a single plug-in installation process. For other programs, such as stand-alone email clients, the user will have to carry out the configuration process explicitly.
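The effect of a server-side "mime.types" entry such as the one for chemical/x-mdl-molfile can be mimicked with Python's standard mimetypes module. This is a minimal sketch, not a description of how any particular HTTP server is implemented; the function name content_type_for and the fallback type application/octet-stream are assumptions made for the illustration.

```python
import mimetypes

# A private registry, so the interpreter-wide table is left untouched.
registry = mimetypes.MimeTypes()
registry.add_type("chemical/x-mdl-molfile", ".mol")
registry.add_type("chemical/x-pdb", ".pdb")


def content_type_for(filename):
    """Return the media type a server would issue for this filename."""
    media_type, _encoding = registry.guess_type(filename)
    # Fall back to a generic binary type when the extension is unknown.
    return media_type or "application/octet-stream"


molfile_type = content_type_for("caffeine.mol")
pdb_type = content_type_for("atp.pdb")
```

The mapping runs from filename extension to media type, which is why a shared, standard set of types and extensions matters for public servers.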

Application of chemical MIME using Email and WWW-based Client Software

An overview of how MIME can be applied to the transport of specific chemical data types using the two principal Internet mechanisms of email and the WWW is illustrated in Scheme 1. Four distinct data storage areas can be identified on any individual user's computer file system. These include the general user file area, an area specified by the user for receipt of email attachments, a temporary area associated with the WWW-client cache if specified by the user, and finally a WWW document collection area if the user has specified a personal WWW-server or has access to a central WWW server. Chemical MIME at least in part provides one mechanism for achieving self-consistency in the handling of chemical files across these file areas. To illustrate this process, some specific examples of how MIME headers are added are shown below.

Examples of chemical MIME Headers

A WWW client which makes an HTTP (Hypertext Transfer Protocol) GET request to a WWW server configured to support chemical MIME types receives the following response:

GET /atp.pdb HTTP/1.0

HTTP/1.0 200 Document follows
Date: Mon, 30 Mar 1998 13:54:40 GMT
Server: NCSA/1.5.2
Last-modified: Fri, 19 Aug 1994 15:46:58 GMT
Content-type: chemical/x-pdb
Content-length: 2916

The received MIME type is resolved via a suitable internal look-up table available to the WWW client which maps the MIME types to an application program or plug-in capable of parsing, processing and/or displaying the chemical data, in this example a simple PDB format file.
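The look-up step performed by the WWW client can be sketched as follows. The header text is taken from the response shown earlier in this section (the status line is omitted because it is not a header), while the HANDLERS table, its descriptive entries and the function name choose_handler are hypothetical stand-ins for a real client's configuration.

```python
from email.parser import Parser

# Headers from the example response, without the status line.
RESPONSE_HEADERS = """\
Date: Mon, 30 Mar 1998 13:54:40 GMT
Server: NCSA/1.5.2
Last-modified: Fri, 19 Aug 1994 15:46:58 GMT
Content-type: chemical/x-pdb
Content-length: 2916
"""

# Hypothetical client-side look-up table: media type -> handler description.
HANDLERS = {
    "chemical/x-pdb": "3D coordinate viewer",
    "chemical/x-mdl-molfile": "structure viewer",
}


def choose_handler(header_text, table=HANDLERS):
    """Return (media type, handler) for a block of response headers."""
    headers = Parser().parsestr(header_text, headersonly=True)
    media_type = headers.get_content_type()
    # Unknown types are not resolved; the client would offer to save them.
    return media_type, table.get(media_type, "save to disk")


resolved = choose_handler(RESPONSE_HEADERS)
```

Header names are matched case-insensitively, so "Content-type" and "Content-Type" are treated alike.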

A message exchanged between an email client and an email relay using SMTP (Simple Mail Transfer Protocol) carries the following related set of headers:

Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="====_-1320854989==_===="
Date: Mon, 30 Mar 1998 15:18:23 +0100
To: recipient@somewhere
From: "Sender"
Subject: Illustration of chemical MIME headers
Status: O

--====_-1320854989==_====
Content-Type: text/plain; charset="us-ascii"

This message contains a chemical attachment

--====_-1320854989==_====
Content-Type: chemical/x-pdb; name="ferrocene.pdb"
Content-Disposition: attachment; filename="ferrocene.pdb"
Content-Transfer-Encoding: base64

Q09NUE5EICAgIGZlcnJvY2VuZS5...

The email program can be used to extract the appropriate component of the multipart message attachment (in this example separated by the unique string 1320854989), decoding it if necessary from the base-64 scheme adopted to ensure 7-bit transparency of the file, and to save the file to the user's filebase in a segregated area identified for such attachments. If the user wishes to view the contents of the attachment, a mapping between the MIME types and a suitable application program can be achieved either via a specific look-up table associated with the email client, or by invoking a WWW-client to perform this task.
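The extraction and decoding steps just described can be sketched with Python's standard email package. This is an illustration under stated assumptions, not the behaviour of any particular email program: the addresses in the reserved "example" domain and the short placeholder payload are invented for the sketch, and the library chooses its own boundary string.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage

# Placeholder payload; the bytes of a real PDB file would go here.
pdb_bytes = b"COMPND    ferrocene\nEND\n"

# Sending side: a text body plus one chemical/x-pdb attachment.
outgoing = EmailMessage()
outgoing["To"] = "recipient@somewhere.example"
outgoing["From"] = "sender@somewhere.example"
outgoing["Subject"] = "Illustration of chemical MIME headers"
outgoing.set_content("This message contains a chemical attachment\n")
outgoing.add_attachment(
    pdb_bytes, maintype="chemical", subtype="x-pdb", filename="ferrocene.pdb"
)
wire = outgoing.as_bytes()  # what travels through the mail relays

# Receiving side: parse, walk the attachments, undo the base-64 encoding.
incoming = message_from_bytes(wire, policy=policy.default)
attachments = {
    part.get_filename(): (part.get_content_type(), part.get_payload(decode=True))
    for part in incoming.iter_attachments()
}
```

The media type recovered from each part is what a client would then pass to its look-up table to select a display program.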

Application of chemical MIME to add value to Document Exchange

During the last four years, it has become increasingly commonplace to attach documents of various types to text email messages via the implicit use of MIME protocols. The most common types of attached documents tend to be either bit-mapped images (MIME type image/gif or image/jpeg) or 7-bit encodings of binary word-processor documents (e.g. MIME type application/msword). The use of such non-chemical MIME types almost certainly means that any chemical information contained in the documents will degrade. Recovering chemical data from such formats into an active and reusable form firstly requires knowledge that the document actually does contain chemical information, and secondly requires information about the likely structure of that information. It is precisely this missing information which the explicit use of the chemical MIME protocol will provide, via the primary and secondary types. To highlight the advantages of this still infrequently used method of attachment, we include here specific details of how the mechanism can be used for three typical email environments. These examples should also serve as prototypes for setting up other commonly used email handling systems that support MIME.

Example 1. Chemical MIME Handling using the Unix Pine email Client

This mechanism in fact constitutes the original Unix-based method developed by Borenstein and Freed² to test their MIME proposal. For outgoing email messages, the chemical MIME headers are added according to a look-up table called .mime.types in the user's home directory. A typical entry is as follows:

chemical/x-pdb 	 pdb

For incoming email messages, the association of a document MIME type with a program suitable for its resolution is accomplished using a look-up table called .mailcap in the user's home directory:

chemical/x-pdb; netscape %s
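How a mail program might use such a .mailcap entry is sketched below. The parser is deliberately simplified and is an assumption of this illustration (it ignores optional fields after the command and escaped semicolons); the function names parse_mailcap and command_for are hypothetical.

```python
import shlex


def parse_mailcap(text):
    """Map each media type to its command template (simplified parser)."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        media_type, _, rest = line.partition(";")
        command = rest.split(";")[0].strip()
        table[media_type.strip().lower()] = command
    return table


def command_for(table, media_type, filename):
    """Fill the %s placeholder with the (shell-quoted) attachment filename."""
    template = table.get(media_type.lower())
    if template is None:
        return None
    return template.replace("%s", shlex.quote(filename))


table = parse_mailcap("chemical/x-pdb; netscape %s\n")
command = command_for(table, "chemical/x-pdb", "ferrocene.pdb")
```

The %s placeholder stands for the temporary file to which the decoded attachment has been saved.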

Example 2. Chemical MIME Handling using the Eudora email Client.

Eudora is a stand-alone email client available for Windows and MacOS operating systems. This program allows hyperlink-style resolution of an enclosed message attachment by a program designated by the recipient. Unlike a WWW client such as Netscape, where the chemical MIME types are simply defined on all three major platforms by adding an appropriate plug-in (Table 1), the configuration of Eudora both for sending and receiving chemical attachments is operating system dependent. On MacOS, a chemical MIME plug-in¹⁰ is placed in the same folder as the Eudora application. To achieve the equivalent functionality on Windows 95/98/NT, the file Eudora.ini present in the application folder must have an entry of the following type added for each of the MIME types required;

both=pdb,pdb,TEXT,chemical,x-pdb

When receiving email messages which include a chemical MIME attachment, users will have to specify an appropriate program to resolve the attachment. This has to be done only once for each MIME type, for example by registering the filename extension appropriate for each type of MIME attachment in the Windows Registry, or by specifying the program within the email client itself.

Example 3. Chemical MIME Handling using Netscape Communicator illustrating Integration of WWW and email Clients.

Netscape Communicator (at the time of writing at version 4.05) represents, inter alia, an integrated WWW client (Navigator) and an email client (Messenger). Configuration of chemical MIME types can be accomplished in two generic ways. The simplest is via the Netscape plug-in mechanism. Several plug-ins offer support for chemical MIME types (Table 1) and their installation automatically configures both the WWW and email client components of Netscape with the MIME types supported by the plug-in. This automatic mechanism can also be over-ridden by a user configuration option which will allow additionally defined or redefined chemical MIME types to be associated with other specific programs for processing any individual data type.

In operation, the application of chemical MIME is almost entirely transparent to the user. Any chemical data set defined by the MIME types which is received by the WWW client Navigator will be displayed as either an in-lined model using an appropriate chemical plug-in or in an external window using a user specified program. We note here that all incoming data files can also be saved in the Netscape client local disk cache, where in principle the chemical MIME labelling could be used to create a persistently stored chemical database using suitable software. A chemical attachment received by the email client Messenger can be passed to the browser window for resolution with the MIME headers being processed internally between Messenger and Navigator, or externally via the file system and the filename extensions.

When Netscape Messenger is used to send a chemical email attachment to an email relay, the user selects the required file, and Messenger inserts the corresponding MIME headers by mapping the filename extension to a MIME type. This mapping is as before defined either via the plug-in support, or by explicit declaration in the configuration.

The Netscape implementation is the only one that works transparently across Unix, Windows and MacOS client-based operating systems. One test of operating system transparency is for the test originator to attach a simple chemical coordinate file⁹ to an email message and to send this to a remote recipient. The entire process is then reversed by the recipient retrieving the received file from their email attachments folder (Scheme 1) and sending it back to the original sender. The process will be regarded as successful if the test file received back is identical with the originally sent file, and can be suitably and automatically resolved by both parties via an appropriate 3D coordinate display program or plug-in using either email or WWW clients.
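The identity part of this test lends itself to automation. The following sketch simulates both legs of the journey in memory and compares digests of the original and returned files; it assumes Python's standard email and hashlib modules, and the function names send, receive and round_trip_ok are hypothetical. It cannot, of course, reproduce the behaviour of real mail relays or of the clients on each operating system.

```python
import hashlib
from email import message_from_bytes, policy
from email.message import EmailMessage


def send(data, filename):
    """Wrap data as a chemical/x-pdb attachment and serialise the message."""
    message = EmailMessage()
    message.set_content("Round-trip test\n")
    message.add_attachment(
        data, maintype="chemical", subtype="x-pdb", filename=filename
    )
    return message.as_bytes()


def receive(wire):
    """Return the decoded bytes of the first attachment."""
    message = message_from_bytes(wire, policy=policy.default)
    part = next(message.iter_attachments())
    return part.get_payload(decode=True)


def round_trip_ok(original):
    """Sender -> recipient -> sender, then compare SHA-256 digests."""
    at_recipient = receive(send(original, "molecule.pdb"))
    back_at_sender = receive(send(at_recipient, "molecule.pdb"))
    return (
        hashlib.sha256(back_at_sender).digest()
        == hashlib.sha256(original).digest()
    )


# Mixed line endings are a typical way for a file to change in transit.
ok = round_trip_ok(b"COMPND    test molecule\r\nEND\n")
```

Comparing digests rather than displayed structures catches silent changes, such as altered line endings, that a viewer program might tolerate.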

Example 4. Using chemical MIME handling to add value to an Internet Chemical Database.

The fourth example illustrates how chemical MIME can be applied outside an email environment. The application of chemical MIME in areas such as electronic conferences and journals has been amply documented elsewhere.¹ This example illustrates how MIME can be used to enhance a database of degradation schemes for atmospherically significant volatile organic compounds (VOCs).¹¹ In the Leeds Master Chemical Mechanism (MCM), simplified degradation schemes for 120 such VOCs have been constructed. The information contained in the database is however difficult to navigate because the degradation products and intermediates of one reaction may be reagents in another. In the Leeds scheme the organic component of the MCM contains in the region of 7000 reactions and 2500 chemical species. Such a complex web of information is best presented as a hypertext document, within which the chemical structures are represented using the SMILES notation (Table 1). The SMILES string can be interpreted by spawning an external viewer as a helper application (Table 1) using the chemical MIME mechanism, thus achieving a linkage between structural and kinetic information in the database. Furthermore the chemical structural information contained within the database remains active and reusable in other contexts, for example substructure searching of other databases.

Beyond chemical MIME: Chemical Information Components

Chemical MIME was an experiment in the sense of initiating activity and collecting data. In the spirit of the WWW, we had few preconceptions about what would happen, and our original choice of chemical MIME types was in part designed to stimulate development. A retrospective assessment is that the major use has been for distributing simple chemical information components, or what we have termed "legacy formats" deriving from older databases or program input files, rather than for multi-component documents such as envisaged in newer structured formats such as CIF, ASN.1 etc. Given the direction of the World-Wide Web community, this is entirely reasonable and has given an excellent platform for the next stage, which we outline here.

Most chemical information is a complex mixture of different components and disciplines. For example, a "compound data card" usually consists of:

a molecular connection table
citations/references/authorship
physicochemical properties
(possibly) graphics, spectra
links to other information.

Of these only the first is specifically "chemical", and the others are found in many other disciplines.

Starting in 1997, the W3C (World Wide Web Consortium)¹² has been developing a set of generic, discipline-independent protocols for the transmission of documents. These protocols allow for the description, encapsulation and inter-operability of "information objects" from specific disciplines. For example, in a cooperative international forum including support from the American Mathematical Society, a protocol for WWW-based transmission of "mathematics" has been developed (MathML). This is specifically designed so that it will interoperate with existing (HTML) and emerging (XML) protocols. In a similar fashion, a W3C group is devising a protocol for metadata (RDF, or Resource Description Framework, and DC, or Dublin Core) which will allow simple and complex descriptions of the role and content of documents. For example, the authorship, authenticity, location, ownership and related attributes of a document can be described in the RDF/DC framework. We believe that it should always be appropriate to examine these generic approaches before devising yet another proprietary format.

What the W3C has also made clear is that there are generic operations which apply to any sort of document and that these are often not trivial. They include:

authoring
editing
validation
parsing
rendering
merging
filtering
searching
transformation

To develop tools for each of these from scratch is expensive and error-prone. The approach taken by the W3C is resulting in generic tools which apply to all document types, and we shall outline the general principles.

A document (which is not limited to text and can contain or consist of non-textual objects) is composed of smaller components. It contains sequence information (the order of the components can matter) and structure (one component can contain another). Thus a book consists of chapters in sequence; the chapters can contain sections, which contain subsections. In a similar way we could describe a labNoteBook as containing compoundDataCards and text. The compoundDataCards can contain molecules, spectra, etc. The other important concept is the identification of the components themselves through markup. The markup shows the limits of a component, and gives a handle (tag) by which it can be identified or linked to semantic resources.
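The ideas of sequence, containment and markup can be made concrete with a small sketch using Python's standard xml.etree.ElementTree module. The element names labNoteBook and compoundDataCard follow the example in the text; the names text, molecule and spectrum, the attributes and the sample content are assumptions made purely for illustration.

```python
import xml.etree.ElementTree as ET

# Structure: a notebook contains text and a data card; the card in turn
# contains a molecule and a spectrum. Sequence: children keep their order.
notebook = ET.Element("labNoteBook")
ET.SubElement(notebook, "text").text = "Recrystallised from ethanol."
card = ET.SubElement(notebook, "compoundDataCard")
ET.SubElement(card, "molecule", {"title": "ferrocene"})
ET.SubElement(card, "spectrum", {"type": "NMR"})

# Markup: the tags delimit each component and give it a handle.
serialised = ET.tostring(notebook, encoding="unicode")
children = [child.tag for child in notebook]
contained = [child.tag for child in notebook.find("compoundDataCard")]
```

Because every component is delimited by its tag, generic tools can locate, extract or replace a compoundDataCard without any knowledge of chemistry.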

The W3C has now created a language (eXtensible Markup Language, XML)¹³ for managing general structured documents on the WWW. XML is formally a very simple subset of SGML, but for those unfamiliar with SGML, it may help to think of it as an abstraction of the HTML approach. Strictly, it is not itself a markup language for any one kind of document, but a tool for creating one's own language in such a way that it is compatible with existing and future WWW technology. At the time of writing (April 1998) it seems certain to be a ubiquitous component of all major manufacturers' WWW tools.

The XML effort is currently tackling the problem of how to manage documents with components from several domains. This has to address cases where the individual markup languages have been developed in ignorance of each other. An example would be a document containing text (HTML), maths (MathML) and chemistry (CML). The solution to this will initially be through the use of namespaces which guarantee that tags from the different domains will not clash.
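A sketch of how namespaces prevent such clashes is given below, using Python's standard xml.etree.ElementTree module. The namespace identifiers are those that came into use for XHTML, MathML and CML and should be read as assumptions in the context of this article; the use of a title element in all three vocabularies is a contrived example chosen only to provoke a clash.

```python
import xml.etree.ElementTree as ET

# Prefix -> namespace identifier (assumed values, see the note above).
NS = {
    "html": "http://www.w3.org/1999/xhtml",
    "m": "http://www.w3.org/1998/Math/MathML",
    "cml": "http://www.xml-cml.org/schema",
}

DOCUMENT = """\
<html:body xmlns:html="http://www.w3.org/1999/xhtml"
           xmlns:m="http://www.w3.org/1998/Math/MathML"
           xmlns:cml="http://www.xml-cml.org/schema">
  <html:title>Rate law</html:title>
  <m:title>a title in the mathematics vocabulary</m:title>
  <cml:title>a title in the chemistry vocabulary</cml:title>
</html:body>
"""

root = ET.fromstring(DOCUMENT)
# The same local name resolves to three distinct qualified names.
titles = {prefix: root.find(f"{prefix}:title", NS).text for prefix in NS}
qualified_names = {child.tag for child in root}
```

Each tag is identified by the pair of namespace and local name, so markup languages developed independently can share a document without coordination over tag names.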

It is clear, therefore, that a very important trend is towards multi-domain documents using XML syntax. We believe that much information in molecular science is ideally suited to such an approach, and suggest that additional means of identifying chemical information will be required. In some cases the existing chemical MIME types could serve this purpose (XML has a mechanism for encapsulating MIME), but identification will increasingly rely on other methods based on XML.

A survey of a number of technical publications suggests that there are a relatively small number of different abstractions of information components. For example, although the content of an image could vary widely, the basic technology to read and render it is domain-independent. The commonest components (with existing technology) are:

text (HTML)
images (GIF, JPEG)
structured graphics
tables [*]
graphs [*]
numeric quantities with units [*]
arrays and matrices [*]
citations (RDF, Dublin Core)
links (XLL)
mathematics (MathML)

For components labelled [*] there are existing solutions but they are fragmented and not yet developed for the WWW/XML. Note that many components which are apparently chemistry-specific (e.g. spectra, schemes) are simply concrete examples of these abstract types (e.g. a spectrum and an index of stock prices can use identical technology). The challenge is semantic: how do we attach meaning to the labels or other components? In a similar way, reaction schemes can be described as molecular components embedded in a general semantic network.

A key feature of XML is the separation of syntax from semantics. Thus XML guarantees that the document will be platform and software-independent. However, when it "arrives at the reader" there must be a mechanism for adding semantic information, e.g. what does MOL mean? In many cases this is supplied by style sheets (e.g. formatting the document such that its meaning becomes clearer) but for technical data additional processes are required. Tags and attributes can be linked to glossaries. Thus

<ITEM CONVENTION="mmCIF" TERM="_cell.length_a">23.4</ITEM>

can be linked to the International Union of Crystallography's mmCIF dictionary for macromolecules.¹⁴ This defines the quantity as the cell length (in Angstrom units by default). It will be a major advance if those with data dictionaries can make them available electronically. The Virtual Hyperglossary Technology¹⁵ has been developed as a prototype to provide XML-based technology for such linking. Alternatively it is possible to link the document to actions provided by software. One such method is with Java classes, which can be specifically linked to given tags as in the JUMBO browser.¹⁶
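The linking of a tagged quantity to a dictionary entry can be sketched as follows. The ITEM element is the one shown above; the DICTIONARIES table is a hand-written, hypothetical stand-in for an electronically published dictionary (it is not the mmCIF dictionary itself), and the function name interpret is likewise an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for dictionaries made available electronically.
DICTIONARIES = {
    "mmCIF": {
        "_cell.length_a": {"meaning": "unit cell length a", "units": "angstrom"},
    },
}


def interpret(xml_text, dictionaries=DICTIONARIES):
    """Attach dictionary semantics to the value carried by an ITEM element."""
    item = ET.fromstring(xml_text)
    entry = dictionaries[item.get("CONVENTION")][item.get("TERM")]
    return {"value": float(item.text), **entry}


result = interpret('<ITEM CONVENTION="mmCIF" TERM="_cell.length_a">23.4</ITEM>')
```

The document itself carries only the value and two labels; the meaning and units are supplied at the point of reading by whichever dictionary the CONVENTION attribute names.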

The most challenging types of information to distribute are those that describe the relation of one quantity to another. Sometimes this can be done by containment (e.g. a spectrum and a molecule can be linked by being contained in a compoundDataCard). But for more complex mapping we require either semantic networks (as in RDF) or hypermedia (as in XLL). These provide ways of connecting components in arbitrarily complex manners. Thus XLL could e.g. provide links from functional groups to peaks in a spectrum.

If we accept that XML and related standards (XLL, XSL stylesheets, RDF, DC and DOM or Document Object Model) will provide the generic capabilities then the task facing chemistry becomes more clearly defined: to provide extensible markup for chemistry-specific components and to analyse common relationships in chemistry.

To this end a prototype XML language has been developed termed Chemical Markup Language.¹⁷ This is provided as a starting point for the process of supporting chemistry in XML. Current elementTypes or tags have been deliberately kept simple:

ATOM
ATOMS (simplifies handling of large molecules)
BOND
BONDS
ELECTRON
FORMULA

These can be qualified with a small number of hard coded attributes (e.g. ELSYM, X2, Y2, etc. for ATOM; ORDER, STER, etc. for BOND). Different conventions can be set with a CONVENTION attribute. In this way most simple descriptions of molecules can be captured. The contents of these can be arbitrarily complex and could support, say, orbital components on atoms, quadrupoles, ¹³C shifts, etc. Since much chemistry is solid-state we include CRYST, and for macromolecules supply SEQUENCE and FEATURE. The key point is that XML has been developed in an open collaborative process (which has included many companies which compete vigorously in the markets). We believe this is appropriate for chemistry as well and offer CML as a starting point for such a process.
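To give a flavour of how such element types might be assembled, the following sketch builds a small molecule description with Python's standard xml.etree.ElementTree module. It should not be read as the CML syntax: the element names ATOMS, ATOM, BONDS and BOND and the attributes ELSYM, X2, Y2 and ORDER are taken from the text, whereas the enclosing MOL element, the ID and ATOMREFS attributes, the function name build_molecule and the coordinates are assumptions introduced only so that bonds can refer to atoms.

```python
import xml.etree.ElementTree as ET


def build_molecule(atoms, bonds):
    """Build a CML-like element tree (illustrative layout, not the CML spec)."""
    molecule = ET.Element("MOL")
    atoms_el = ET.SubElement(molecule, "ATOMS")
    for atom_id, (symbol, x, y) in atoms.items():
        ET.SubElement(
            atoms_el,
            "ATOM",
            # ID is a hypothetical attribute used here to label each atom.
            {"ID": atom_id, "ELSYM": symbol, "X2": str(x), "Y2": str(y)},
        )
    bonds_el = ET.SubElement(molecule, "BONDS")
    for first, second, order in bonds:
        ET.SubElement(
            bonds_el,
            "BOND",
            # ATOMREFS is a hypothetical attribute naming the bonded atoms.
            {"ATOMREFS": f"{first} {second}", "ORDER": str(order)},
        )
    return molecule


# Heavy atoms of a carbonyl group with illustrative 2D coordinates.
carbonyl = build_molecule(
    atoms={"a1": ("C", 0.0, 0.0), "a2": ("O", 1.2, 0.0)},
    bonds=[("a1", "a2", 2)],
)
markup = ET.tostring(carbonyl, encoding="unicode")
```

Grouping atoms and bonds under ATOMS and BONDS keeps the per-atom markup uniform, which is the simplification for large molecules noted in the list above.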

Conclusion

During the period 1970-1994, chemical applications of the Internet were largely based on a set of generic file and text transmission protocols, such as terminal (Telnet), file transfer (FTP), email transfer (SMTP) and document handling systems (HTTP), methods where little explicit labelling and structuring of the chemical content was available and where little inter-operability existed between either the chemical content itself or with non-chemical content in other disciplines.

During the period 1994-1998, mechanisms such as chemical MIME set the scene for a convergence both within the chemical community and with other scientific areas, bringing together applications such as electronic mail with database and modelling tools, electronic conferences, journals, books and taught courses. In the future, more finely grained mechanisms such as XML/CML will undoubtedly enable further convergence to the point that the Internet will become a far better and more powerful "resource discovery" tool for the chemical and scientific communities.

Acknowledgements

Funding from the UK JISC e-Lib programme for the CLIC electronic journal project, and the JISC JTAP programme for the VChemLab project is gratefully acknowledged.

Notes and References

1. H. S. Rzepa, P. Murray-Rust and B. J. Whitaker, Chem. Soc. Revs, 1997, 1-10.
2. N. Borenstein and N. Freed, "MIME (Multipurpose Internet Mail Extensions) Part One: Mechanisms for Specifying and Describing the Format of Internet Message Bodies", Internet RFC 1521, Bellcore, Innosoft, September 1993.
3. H. S. Rzepa, B. J. Whitaker and M. J. Winter, J. Chem. Soc., Chem. Commun., 1994, 1907; H. S. Rzepa, Comp. Networks and ISDN Systems, 1994, 27, 317-318; H. S. Rzepa, Chem. Design Auto. News, 1994, 9, 1; H. S. Rzepa, in "The Internet: A Guide for Chemists", Ed. S. Bachrach, American Chemical Society, 1995; M. J. Winter, H. S. Rzepa and B. J. Whitaker, Chem. Brit., 1995, 685; A. N. Davies, Spectroscopy Europe, 1996, 8, 42; H. S. Rzepa, Science Progress, 1996, 79, 97; B. J. Whitaker, H. S. Rzepa, Proc. Int. Chem. Inf. Conf. (Ed. H. Collier), 1995, 62-71; H. S. Rzepa, O. Casher and B. J. Whitaker, Proc. Int. Chem. Inf. Conf. (Ed. H. Collier), 1996, 141-148; H. S. Rzepa, W. Locke, P. Murray-Rust and B. J. Whitaker in Perspect. Protein Eng. '96, (Ed. M. J. Geisow), 1996, Paper No. 19; H. S. Rzepa, P. Murray-Rust and B. J. Whitaker, Chem. Intl., 1997, 19, 17.
4. A description of the definitive list will be published elsewhere; H. S. Rzepa, P. Murray-Rust and B. J. Whitaker, Pure & App Chemistry, in preparation. The latest information is available on-line at http://www.ch.ic.ac.uk/chemime/
5. H. S. Rzepa, "Internet-based Computational Chemistry Tools", in Encyclopaedia of Computational Chemistry, Wiley, 1998, in press.
6. An example of the use of chemical MIME to integrate a variety of chemical data types into the body of an electronic journal is the CLIC Electronic Journal Project; D. James, B. J. Whitaker, C. Hildyard, H. S. Rzepa, O. Casher, J. M. Goodman, D. Riddick and P. Murray-Rust, New. Rev. Information Networking, 1996, 61. For the project itself, see http://www.rsc.org/is/journals/current/chemcomm/cccenha.htm or the original site at http://chemcomm.clic.ac.uk/. For details of how a "chemically enhanced" article was prepared, see O. Casher and H. S. Rzepa, in Proc. E. Conf. Trends in Organomet. Chem.: ECTOC-3 (Eds H. S. Rzepa and C. Leach), Royal Society of Chemistry, 1998. ISBN (CD-ROM) 0-85404-889-8.
7. For examples of the application of chemical MIME to electronic conferencing, see C. Leach and H. S. Rzepa (Eds), ECTOC-1, Royal Society of Chemistry, 1996. Also ECHET96, 1997; ECTOC-3, 1998; ECHET98, 1998. The conferences are on-line at http://www.ch.ic.ac.uk/ectoc/.
8. T. Maffett and B. van Vliet, MDL Information systems, 1996. For further details of Chime, see http://www.mdli.com/chemscape/chime/. This plug-in derives from the Rasmol program written by R. Sayle and applied within the chemical MIME project as described in O. Casher, G. Chandramohan, M. Hargreaves, C. Leach, P. Murray-Rust, R. Sayle, H. S. Rzepa and B. J. Whitaker, J. Chem. Soc., Perkin Trans 2, 1995, 7.
9. A simple test molecule is available at http://www.ch.ic.ac.uk/rzepa/jcics/molecule.pdb. A site for testing an extended set of chemical MIME types is available at http://www-dsed.llnl.gov/documents/tests/chem.html
10. H. S. Rzepa, available as http://www.ch.ic.ac.uk/rzepa/jcics/chemical10.hqx
11. M. J. Pilling, S. Saunders, M. Jenkin and D. Derwent, "Tropospheric Chemistry", http://www.chem.leeds.ac.uk/Atmospheric/MCM/main.html.
12. For details of all W3C (World-Wide Web Consortium) recommendations, proposed recommendations, working drafts and notes, see http://www.w3.org/TR/
13. P. Murray-Rust, "Chemical Markup Language, A simple introduction to structured documents", in "XML, Principles, Tools and Techniques" (Ed. D. Connolly), O'Reilly, 1997, pp 135-149.
14. For specifications see P. E. Bourne, H. M. Berman, B. McMahon, K. D. Watenpaugh, J. Westbrook and P. M. D. Fitzgerald, Meth. Enzymol, 1997, 277, 571-590. See also http://www.iucr.org and http://www.iucr.org/cif/mm/index.html
15. P. Murray-Rust and L. West, ASLIB Managing Information, 1997, 4, 36-39. See also http://www.vhg.org.uk/
16. P. Murray-Rust in Proc. E. Conf. Trends in Organomet. Chem.: ECTOC-3 (Eds H. S. Rzepa and C. Leach), Royal Society of Chemistry, 1998. ISBN (CD-ROM) 0-85404-889-8. A fully working version of JUMBO is included on this CD-ROM. See also Ref 13, pp 197-207.
17. The CML project was first described in P. Murray-Rust, C. Leach and H. S. Rzepa, Abs. Papers Am. Chem. Soc., 1995, 210, pp.40-COMP (http://www.ch.ic.ac.uk/cml/) and in P. Murray-Rust and H. S. Rzepa, Abs. Papers Am. Chem. Soc., 1997, 214, pp.23-COMP. Details of CML itself are available in Refs 13 and 16. Further details will be published in a forthcoming paper.