Hierarchical display of Chemical Data in Web Browsers

Georgios V. Gkoutos, Henry S. Rzepa* and Michael Wright

Department of Chemistry, Imperial College of Science, Technology and Medicine, London, SW7 2AY.

Introduction

When first introduced in 1993, Web browsers were only capable of displaying the content of HTML documents expressed as text or 2D static images (GIF, JPEG). This method has some significant limitations, since it does not identify the content as chemical in an unambiguous manner, and chemical data expressed as an image cannot easily be converted to machine-processable information. Browser capability was enhanced with the introduction of Version 2 of the Netscape browser, which allowed embedded content to be displayed on a browser page using a plugin method. This is based on identifying the content of transcluded data files using MIME media type definitions.1 Since then, the incorporation of chemical datatypes into Web-based documents has followed two rather divergent directions. In this article we discuss these developments, and analyse the implementation of a new unified method for displaying chemical content in Web browser pages.

Chemical Applications of Browser Plug-ins

Support for chemical content within a Web page was first introduced with the Chime plugin, released by MDL in 1996 and enabling a number of chemical MIME types.1 Other chemical plugins have followed.2 The plugin method is also often used in a chemical context to display virtual reality chemical models.3 The syntax for invoking a chemical dataset using HTML commands is of the following type:
<embed src="chemical_file" width="200" height="200" ... optional_parameters="...">
Plugins require the user first to install the appropriate software. If the user does not wish to install it, or lacks permission to do so on their computer, then the HTML page invoking the plug-in will fail to display any chemical content, i.e. this mode of display is not "fail safe". The HTML can be made more robust by inserting JavaScript tests to determine in advance whether the appropriate plugin is installed, but because of the complexity of such tests, few authors made use of this (somewhat non-standard) feature.
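A JavaScript test of the kind mentioned above might look like the following minimal sketch. It assumes the browser exposes a Netscape-style navigator.mimeTypes table in which each entry carries an enabledPlugin property; the function name hasPluginFor is hypothetical, and the navigator object is passed in as a parameter so that the logic can also be exercised outside a browser.

```javascript
// Sketch of a pre-flight plugin check (an illustration, not the authors' code).
// `nav` stands in for the browser's `navigator` object.
function hasPluginFor(nav, mimeType) {
  if (!nav || !nav.mimeTypes) {
    return false; // no MIME-type table is exposed at all
  }
  const entry = nav.mimeTypes[mimeType];
  // A MIME type may be registered without an enabled plugin behind it,
  // so both the entry and its enabledPlugin property are checked.
  return Boolean(entry && entry.enabledPlugin);
}
```

Within a page, the test would be called as hasPluginFor(navigator, "chemical/x-xyz") before any <embed> markup is written, so that an alternative can be offered when it returns false.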

Examples of plug-in applications are shown in entries 1-4 (Table 1).4

Chemical Applications of Java Applets

In 1996, a further technology termed Java was introduced, which provides an alternative approach. This no longer requires the user to pre-install any software; rather, the requisite code (termed an applet) is downloaded from the server and executed on the user's machine in a "just-in-time" manner. This approach has a number of advantages and disadvantages compared to the plugin method. The syntax for invoking an applet is of the following type:

  <applet code="chemical.class" archive="chemical.jar" width="200" height="200">
  <param name="model" value="caffeine.xyz">
  </applet>
The introduction of Java-based applets has the consequence that the author of the web page must consider preparing two or more different versions: one for each plug-in, and one for each Java applet (Table 1). It is possible to create this effect dynamically using JavaScript embedded in the HTML document, or using server-side includes, but this again increases the complexity of the authoring. A good example of such a multi-mode approach is the "Molecules-of-the-Month" collection of pages,6 a project started in December 1995, in which three or four different presentational modes are normally prepared for each molecule discussed.
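The dynamic approach can be sketched as a function that returns one of several markup variants according to what the browser supports. This is an illustration only: the function name chooseMarkup, the shape of the caps and files arguments, and the order of preference (plugin, then applet, then static image) are all assumptions, and the file names chemical.class and chemical.jar are the placeholders used in the applet syntax above.

```javascript
// Sketch: pick one presentational mode for a molecule (hypothetical helper).
// caps  = { plugin: boolean, java: boolean }
// files = { coords: "...", image: "...", label: "..." }
function chooseMarkup(caps, files) {
  if (caps.plugin) {
    // Plugin mode: the <embed> syntax shown earlier in the text.
    return '<embed src="' + files.coords + '" width="200" height="200">';
  }
  if (caps.java) {
    // Applet mode: the <applet> syntax with a single model parameter.
    return (
      '<applet code="chemical.class" archive="chemical.jar" width="200" height="200">' +
      '<param name="model" value="' + files.coords + '">' +
      "</applet>"
    );
  }
  // Last resort: a static image with a text label.
  return '<img src="' + files.image + '" alt="' + files.label + '">';
}
```

In a page, the returned string would typically be written into the document at load time (for example with document.write). The author still has to prepare every data file in advance, which is the authoring burden described above.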

Some examples of the use of Java applets are shown in entries 5-9 (Table 1).

Table 1. Different Methods for Invoking Chemical Data in a Web Browser
Entry ID | Program System | HTML Markup (tag) | Type of data
1 | Chime (ref. 11)/Chem3D (ref. 2) | <embed> | 2D/3D Coordinates
2 | VRML (ref. 3) | <embed> | 3D Coordinates and Animation
3 | Quicktime | <embed> | 3D Orbital Surface
4 | SVG | <embed> | 2D Vector diagrams
5 | ChemSymphony (ref. 10) | <applet> | 3D Coordinates
6 | Chemapplet | <applet> | 3D Coordinates/Orbital
7 | JMol | <applet> | 3D Coordinates/Animation
8 | jSPEC (ref. 5) | <applet> | JCAMP IR, UV, MS spectral data
9 | Java Molecular Editor (ref. 9) | <applet> | 2D Structure data
10 | Java=>Chemical_plug=>VRML=>Image=>Text | <object> | 3D Coordinates/Animation
11 | Chemical_plug=>Java=>VRML=>Image=>Text | <object> | 3D Coordinates/Animation
12 | Chime (ref. 11)/Chem3D (ref. 2) | <iframe> | 3D Coordinates
13 | Browser | <img> | General
14 | Browser | <script> | General
15 | Browser | <a> | XML (eXtensible Markup)

The OBJECT Element

In December 1997, the HTML 4.0 specification introduced two alternative and more robust ways of invoking chemical content.4 In essence, both the <embed> and the <applet> elements are replaced by entirely new elements termed <object> and <iframe> (inline frame). The <object> element specification allows a so-called cascade of options to be specified, wherein two or more alternatives for the display of e.g. chemical content are made available. If the browser is capable of invoking the first, this is used and the others are ignored. If, however, the first option is unavailable (perhaps because the user has not installed a plugin on their computer), the second is attempted, and so on until the last option (which is normally a simple text display, and hence always possible). The code below shows how the various options are invoked (Entries 10-12, Table 1).
    <object id="jmol" classid="java:org.openscience.miniJmol.JmolApplet.class"
    codetype="application/java" archive="JmolApplet.jar" standby="Loading JMol"
    width="320" height="240" title="Display using Java">
      <param name="model" value="6a-rhf-vib.xyz" />
      <param name="format" value="XYZ" />
      <param name="bcolour" value="#FFFFFF" />
      <param name="animate" value="1" />
      <object id="chemical" data="6a-rhf-vib.xyz" type="chemical/x-xyz"
      width="320" height="240" title="Display using chemical Plugin"
      standby="Loading plugin">
        <param name="frank" value="no" />
        <param name="display3d" value="spacefill" />
        <param name="bgcolor" value="white" />
        <object id="vrml" data="vib.wrl" type="model/vrml" width="320"
        height="240" title="Display using VRML Plugin" standby="Loading VRML">
          <object id="image" type="image/jpeg" data="vib.jpg" width="418"
          height="325" title="Display using image only">
            <a id="RSC" name="RSC" target="search_main"
            href="http://www.rsc.org/suppdata/perkin2/1998/2695/index.html"
            title="Link to RSC Journals Supplemental data pages">Trapezoidal
            distortions in 2+2 Cycloaddition reactions</a>
          </object>
        </object>
      </object>
    </object> 
    
In the above definition, the first attempt to display a set of coordinates in XYZ format is made by invoking the JMol applet. If, for some reason, Java is disabled in the browser, an attempt is made to display an object using the defined MIME type chemical/x-xyz. Internally, the browser will attempt to resolve this MIME type to any plugin that may have been installed. Optional parameters for the plugin can also be defined (the ones shown above are relevant for the Chime plugin). If a specific viewer of chemical coordinates is not available, then a more generic virtual reality (VRML) viewer can be invoked. If this option also fails, then a simple image file will be loaded. If the user has disabled image viewing in the browser, then the final text field (here a hyperlink) will be displayed.
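The cascade logic described above can be modelled as a walk down a chain of nested fallbacks. The sketch below is an illustration of the intended behaviour, not code from any browser: the capability names (java, chemicalPlugin, vrmlPlugin, images) and the function resolveCascade are hypothetical, while the id values are those of the nested <object> elements in the markup.

```javascript
// Each node mirrors one element of the nested markup: its id, the capability
// it needs, and the fallback nested inside it (null for the innermost text).
const cascade = {
  id: "jmol", needs: "java",
  fallback: {
    id: "chemical", needs: "chemicalPlugin",
    fallback: {
      id: "vrml", needs: "vrmlPlugin",
      fallback: {
        id: "image", needs: "images",
        fallback: { id: "RSC", needs: null, fallback: null },
      },
    },
  },
};

// Return the id of the outermost element the browser can render.
// `supported` is a Set of capability names available in the browser.
function resolveCascade(node, supported) {
  let current = node;
  while (current) {
    if (current.needs === null || supported.has(current.needs)) {
      return current.id; // first usable option wins; the rest are ignored
    }
    current = current.fallback; // otherwise descend one level
  }
  return null; // nothing could be displayed
}
```

Calling resolveCascade(cascade, new Set(["chemicalPlugin", "vrmlPlugin", "images"])) models a browser with Java disabled, for which the chemical plugin is selected. Exactly one element is chosen in every case, which is the property a compliant browser is expected to show.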

The <iframe> specification is much simpler, and is not designed to provide alternative methods of presenting information. We include a short test of this mode (entry 12, Table 1) for completeness. We also note three other tags which can potentially invoke chemical content. The <img> tag (entry 13) is used to display bitmap-based diagrams which are not generally re-usable in a chemical context. It is possible however to embed text and more specifically chemical objects within the image byte code of image types such as GIF or PNG, and these can in principle be parsed, indexed and searched for. This could be used to provide more semantically rich content within the <img> tag. Another tag which can carry chemical information is the <script> tag, although it would be difficult to specifically identify chemical content within this syntax. Finally, we note the <a> tag, which of course is normally invoked only to link one document to another, and not to carry specific chemical information per se. We do note however that such links can be made to XML-based documents, which can be semantically very rich. We note one example in the table (entry 15) which is used to invoke a CML-based XML repository. This is fully described in a separate article.7
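As an illustration of how text embedded in an image might be recovered, the following sketch reads the tEXt chunks of a PNG file. It relies only on the standard PNG chunk layout (a 4-byte length, a 4-byte type, the data, and a 4-byte CRC, following an 8-byte file signature) and assumes a Node.js environment for the Buffer type. The function name readPngTextChunks is hypothetical, the CRC is not verified, and no particular keyword convention for chemical content is implied.

```javascript
// The fixed 8-byte signature that opens every PNG file.
const PNG_SIGNATURE = Buffer.from([137, 80, 78, 71, 13, 10, 26, 10]);

// Sketch: collect the keyword/text pairs stored in tEXt chunks of a PNG buffer.
function readPngTextChunks(buf) {
  if (buf.length < 8 || !buf.subarray(0, 8).equals(PNG_SIGNATURE)) {
    throw new Error("not a PNG file");
  }
  const entries = [];
  let offset = 8;
  while (offset + 12 <= buf.length) {
    const length = buf.readUInt32BE(offset);
    const type = buf.toString("latin1", offset + 4, offset + 8);
    const dataStart = offset + 8;
    const dataEnd = dataStart + length;
    if (dataEnd + 4 > buf.length) break; // truncated chunk, stop reading
    if (type === "tEXt") {
      // tEXt data is: keyword, a single zero byte, then the text itself.
      const data = buf.subarray(dataStart, dataEnd);
      const sep = data.indexOf(0);
      if (sep > 0) {
        entries.push({
          keyword: data.toString("latin1", 0, sep),
          text: data.toString("latin1", sep + 1),
        });
      }
    }
    if (type === "IEND") break; // final chunk of the file
    offset = dataEnd + 4; // skip the CRC, which this sketch does not check
  }
  return entries;
}
```

A robot indexer could apply such a function to each image referenced by an <img> tag and index any text it returns. Agreeing on a keyword that identifies the text as chemical would still be a matter of convention.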

Issues relating to the HTML 4 Specification

Browser support for <object> and <iframe>

The object display (entry 10, Table 1) was invoked on a number of recent-generation browsers, most of which purport to support HTML 4.0. The various test conditions were set up using configuration options within the browser itself, or by removing the plugins directory. The results (Table 2) indicate a surprising degree of non-compliance amongst these browsers. The best results were obtained with iCab, which correctly cascades down through the various objects. We note also that reversing e.g. the first two object declarations (entry 11) also produces correct behaviour in iCab.

Amaya, which serves as a test bed for the W3C standards process, does not support either Java or plugins, and hence is not a suitable browser for chemical applications. In theory, this browser should display the image file by default, but in fact it fails to do so. Internet Explorer version 5 from Microsoft probably represents the greatest development investment in any browser, but this software does not appear to support the <object> element correctly: it attempts to display all four embedded objects simultaneously, and not sequentially as intended. Netscape at the 4.7 revision appears to support only a single nesting of the <object> element, reverting to text display if the first declaration is unavailable. Early (alpha) versions of Version 6 appear to display standard behaviour towards the <object> element, although there seem to be some problems with plug-in display.

Overall, the track record of implementing HTML standards within Web browsers has been very variable. Almost two years after the HTML 4.0 standard was published, only one browser (iCab) fully implements the <object> and <iframe> elements, and one other (Netscape) almost does so. It is apparent that some time must yet pass before a properly structured implementation of chemical datatypes can be achieved within browser windows. The alternative option, of associating these types with external applications, continues to be available on all browsers, as it has been since 1993.

Table 2. Browser tests for <object> Cascading
Object ID displayed in each browser for the test document (Entry 10, Table 1)
Test | Internet Explorer 5 (Windows) | Internet Explorer 5 (MacOS) | Netscape 4.7 (Windows) | Netscape 4.7 (MacOS) | Netscape 6.0 (Windows) | Netscape 6.0 (MacOS) | Amaya 2.4 (Windows) | Opera 3.62 (Windows) | iCab 2.0 (MacOS)
Java enabled | attempts all objects, displays image | displays innermost object (image) (a) | java | java error (b) | java | java | attempts image (c) | text (a) | java
Java disabled | attempts all objects, displays image | displays innermost object (image) | text | text | chemical_plug (e) | chemical_plug (e) | attempts image (d) | text | chemical_plug
chemical_plug disabled | attempts all objects, displays image | displays innermost object (image) | text | text | (e) | (e) | attempts image (d) | text | vrml_plug
All plugins disabled | attempts all objects, displays image | displays innermost object (image) | text | text | image | image | attempts image | text | image
Images disabled | - | text | text | text | text | text | - | text | -
(a) Object element not supported.
(b) Netscape does not support appropriate Java class libraries.
(c) Java not supported.
(d) Plugins not supported.
(e) Version 2 of Chime does not display correctly with Version 6, preview 1 of Netscape.
(f) Appears not to support VRML plug-in.

Conclusions

In retrospect, the early adoption of two alternative forms of markup to express chemical content in HTML documents was most unfortunate. The increasing browser support for the unified <object> method, which allows a proper cascade to operate at the point of display, now allows more robust HTML pages to be deployed. This may be particularly important as new generations of mobile devices capable of accessing the Web become available, and chemical implementation for these becomes an issue. Furthermore, we believe that expressing chemical data as XML, and applying appropriate transforms for viewing as XHTML, is the appropriate method for the future.

References

  1. See http://www.mdli.com/ and H. S. Rzepa, P. Murray-Rust and B. J. Whitaker, J. Chem. Inf. Comp. Sci., 1998, 38, 976-982.
  2. See http://www.camsoft.com/. See also J. S. Brecher, Chimia, 1998, 52, 658-663.
  3. O. Casher, C. Leach, C. S. Page and H. S. Rzepa, Chem. in Brit., 1998, 34(9), 26.
  4. For a discussion of the use of the <object> element, see H. S. Rzepa, Chimia, 1998, 52, 653-657. For a specification of the HTML 4 standard, see http://www.w3.org/MarkUp/. The original HTML 4.0 standard was replaced by a slightly modified one (4.01) in late 1999, and has now been made XML compliant as XHTML 1.0 in January 2000. The document you are reading is XHTML 1.0 compliant. There is also a proposal for a modification of XHTML termed XHTML basic, which would be suitable for display on mobile wireless devices (along with a currently competing proposal for WML).
  5. A. P. Tonge, H. S. Rzepa and H. Yoshida, J. Chem. Inf. Comp. Sci., 1999, 39, 483-490.
  6. For a discussion of this project, see P. May, Molecules, 1998, 3, 16.
  7. We have developed a version of one robot indexer that does follow such objects; G. V. Gkoutos and H. S. Rzepa, to be published.
  8. For a discussion of the role of XML and stylesheet transformations, see P. Murray-Rust and H. S. Rzepa, J. Chem. Inf. Comp. Sci., 1999, 39, 928. For a working implementation of this concept, see P. Murray-Rust, H. S. Rzepa, M. Wright and S. Zara, Chem. Commun., submitted for publication.
  9. P. Ertl, Chimia, 1998, 52, 673-677.
  10. A. Krassavine, Chimia, 1998, 52, 668-672.
  11. R. M. Horton and R. J. S. Stone, Biotechniques, 1997, 22, 884.