The CLIC Consortium. A Flagship Chemistry Electronic Journal

Annual Report

May 1995 - July 1996

The consortium operates between the Royal Society of Chemistry and the chemistry departments at the Universities of Leeds and Cambridge and at Imperial College. Each partner has a well-defined role in the project. In this report we present an overview of the activities and progress of the project, with a summary of the contributions from the individual sites, including an analysis by each site of what it has learnt from the process of implementation, and interim evaluation results. The report concludes with a statement of our objectives for the next phase of the project. More detailed reports from each site are included as Annexes.

1. Activities and Progress.

The first stage of the project, which includes the first two designated milestones in our original proposal, had the following major activities.

1.1 The project started with the appointment of Omer Casher (Imperial), Christopher Hildyard (Leeds) and David Riddick (Cambridge) to the project. A dedicated SGI Unix workstation, together with disk sub-systems, and other communications technologies were purchased for the project. This also included six computer systems for libraries, and licenses for various software products.

1.2 To establish a distributed and scaleable infra-structure for handling an electronic journal on a regular basis. This involved developing SGML technologies for processing material derived from the existing printed journal and converting it in as flexible a manner as possible to HTML-based content for distribution through a Web server. (Leeds University, Annex 1).

1.3 We focused on developing server technologies for creating, indexing, hyperlinking and maintaining a scaleable document database, using Hyperwave software. In addition to these generic technologies, we developed chemically specific technologies for delivering molecular content as part of the journal, based on the chemical MIME standards we have promoted. (Imperial College, Annex 2).

1.4 To develop macros, style sheets, software and other training materials for authors at the RSC Cambridge offices, and to evaluate various commercial authoring packages already available. This component also involved liaising with commercial chemistry software vendors over software that would be of use in the journal. A further component is a formal SGML DTD specifically for chemistry called CML (Chemical Markup Language), the principal author of which is Peter Murray-Rust, who is affiliated with the CLIC project. (Cambridge University and Royal Society of Chemistry, Annex 3. See http://www-clic.ch.cam.ac.uk/CLIC/pr/year1.html).

1.5 To raise awareness in the community by means of three focusing activities: chemistry electronic conferences, an e-mail discussion list, and one-day discussion meetings for the chemical community. These activities also form part of the basis for our structured evaluation of user response to the delivery of chemical information in this form. (Leeds, Cambridge and Imperial).

1.6 To establish a centre in each of the three university libraries devoted to the CLIC project, as an evaluation mechanism for delivering the electronic journal to users. The structured evaluation will be co-ordinated by the CBL unit at Leeds University. (Leeds, Cambridge and Imperial).

1.7 Most recently, members of the CLIC consortium have become associated with activities to create the "Open Molecule Foundation" to promote the development of methods for enhancing the inter-operability of molecular information. This would include development of object oriented class libraries and programming languages such as Java in the area of molecular science. We expect this activity will directly result in software that can be used in conjunction with the CLIC project. (Imperial College, Annex 2).
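As an illustration of the chemical MIME approach mentioned in 1.3, the following is a minimal sketch of how a server might associate molecular file extensions with chemical media types so that a browser can hand them to a suitable viewer. The extension table and the function names are assumptions made for this example; they are not taken from the CLIC server configuration.

```python
import mimetypes

# Illustrative extension-to-type pairs in the style of the chemical MIME
# proposal. This table is an assumption for the example, not the actual
# configuration used by the CLIC server.
CHEMICAL_TYPES = {
    ".pdb": "chemical/x-pdb",
    ".mol": "chemical/x-mdl-molfile",
    ".xyz": "chemical/x-xyz",
}


def register_chemical_types(registry=None):
    """Return a MimeTypes registry that knows the chemical types above."""
    if registry is None:
        registry = mimetypes.MimeTypes()
    for extension, media_type in CHEMICAL_TYPES.items():
        registry.add_type(media_type, extension)
    return registry


def content_type_for(filename, registry):
    """Pick the Content-Type header value a server would send for a file."""
    media_type, _encoding = registry.guess_type(filename)
    # Fall back to a generic binary type when the extension is unknown.
    return media_type or "application/octet-stream"


if __name__ == "__main__":
    registry = register_chemical_types()
    for name in ("caffeine.pdb", "caffeine.mol", "article.html"):
        print(name, "->", content_type_for(name, registry))
```

A client that recognises the chemical type can then pass the file to a molecular viewer rather than displaying it as plain text.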

Outputs and Deliverables.

1.8 To use the infra-structure noted in 1.1 - 1.6 to deliver graphically enhanced contents pages for the journal, and in the first year to deliver full versions of the "keynote" articles that are a new regular feature of the journal in 1996. These constitute the formal milestones in our project. The CLIC server is available as http://chemcomm.clic.ac.uk/ (Leeds, Cambridge and Imperial).

1.9 The ECTOC-1 awareness raising conference, edited jointly by Imperial College and Cambridge University, achieved major international prominence, and appears to be regarded as a seminal event in electronic publishing. Of the 75 articles from 13 countries submitted to the conference, 66 were formally abstracted by Chemical Abstracts, and the CD-ROM produced of the proceedings has been commended as a "professional product". It is now on sale via the Royal Society of Chemistry. With this product, we achieved international prominence for an e-lib supported project.

1.10 The second conference, known as ECHET96, was even more successful than the first, attracting widespread support from the USA, Japan and Europe, with 120 submitted articles, including 12 from some of the most prominent chemists in the area of heterocyclic chemistry. As an awareness raising forum for chemistry electronic journals, both conferences must be considered outstanding successes.

1.11 Two "Webmasters" days were organised, in November 1995 and June 1996. Each attracted around 80 attendees, and both are regarded as very successful events. The CLIC project was presented as a talk on each occasion, and there was ample opportunity for demonstrations and informal interaction. We regard the people who attended as the key personnel who will be involved in raising awareness of the CLIC product in chemistry departments and the chemical industry. In conjunction with these meetings an e-mail discussion list has been established (chemweb@ic.ac.uk), and this now has 193 subscribers and a significant international following as a "high content, low volume forum", to quote one posting.

1.12 A number of scholarly articles and talks have been presented in which the CLIC project is discussed. These include:

(a) H. S. Rzepa, "The Future of Electronic Journals in Chemistry". Trends in Analytical Chemistry, 1995, 14, 464.

(b) B. J. Whitaker and H. S. Rzepa, "Chemical Publishing on the Internet", Conference on Chemical Information, Nimes, France, October, 1995.

(c) D. James, B. J. Whitaker, C. Hildyard, H. S. Rzepa, O. Casher, J. M. Goodman, D. Riddick, P. Murray-Rust, "The Case for Content Integrity in Electronic Chemistry Journals: The CLIC Project", New Review of Information Networking, 1996, 61-70.

(d) O. Casher and H. S. Rzepa, "The Molecular Object Toolkit: A New Generation of VRML Visualisation tools for use in Electronic Journals", Proceedings of the 14th UK Eurographics Conference, March, 1996.

(e) S. M. Bachrach, P. Murray-Rust, H. S. Rzepa, B. J. Whitaker, "Publishing Chemistry on the Internet", Network Science, 1996, 2 (3).

2. Learning from the Process of Implementation.

2.1 Leeds

2.2 Cambridge. A number of points have come to light during the implementation of the project plan:

The conclusions to be drawn from these points are:

2.3 Imperial. The prime focus here was on establishing a robust document handling technology that could be scaled up as required, and could form the basis for the implementation (exit) strategy in year 3. Our initial focus was on server solutions provided by Netscape, but we soon realised that "hyperlink maintenance" was in fact a major problem that needed to be addressed. Following initial experiments with the Harvest server, we settled on the Hyper-G (now Hyperwave) solution to this problem. We have experienced significant difficulty with support for this product from the University of Graz, where it originates. To better understand the development strategy of this product, three of us (DJ, HSR and OC) visited the Graz development team, and we are now in the process of joining the Hyper-G consortium so that we have access to the latest information. Whatever the future of Hyperwave as a Web solution, we believe that we have gained invaluable experience in maintaining and indexing a structured document collection.
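To make the "hyperlink maintenance" problem concrete, the following is a minimal sketch of a check for dangling links within a document collection. It is not a description of how Hyperwave itself works; the function and class names are hypothetical, and for simplicity link targets are compared directly against the document paths in the collection rather than being resolved as relative URLs.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every anchor tag in one HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def find_dangling_links(documents):
    """Return sorted (source, target) pairs whose target is missing.

    `documents` maps a document path to its HTML text. External links
    (anything containing "://"), mailto links and same-page fragments
    are skipped, since only links inside the collection are checked.
    """
    dangling = []
    for path, html in documents.items():
        collector = LinkCollector()
        collector.feed(html)
        for target in collector.links:
            if "://" in target or target.startswith(("#", "mailto:")):
                continue
            # Ignore any fragment part when looking up the target document.
            target_path = target.split("#", 1)[0]
            if target_path not in documents:
                dangling.append((path, target_path))
    return sorted(dangling)


if __name__ == "__main__":
    collection = {
        "index.html": '<a href="a1.html">Article 1</a> <a href="gone.html#s2">Moved</a>',
        "a1.html": '<a href="index.html">Contents</a>',
    }
    for source, target in find_dangling_links(collection):
        print(f"{source} links to missing document {target}")
```

A check of this kind has to be re-run whenever a document is moved or deleted, which is the maintenance burden that motivated the move to a server that manages links for the collection.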

Our second focus has been on developing new formats for integrating complex information into the journal format. This has focused on VRML as a 3D descriptor, and we have been closely involved both in the evolution towards VRML 2.0 and the integration of "Molecular Inventor". It appears that the VRML community is not working as closely with the W3C organisation (custodians, inter alia, of the HTML standard) as we would wish, and the standards process in this area is still unpredictable. In part to address such problems, we have identified object oriented "applets" written in the Java language as a major future growth area. This, we believe, will form a major thrust for the CLIC project in its second half, and to help focus on standards in this area, we are in the initial stages of setting up the "Open Molecule Foundation". We believe that this could become a significant activity for CLIC in the future.

Our final focus has been on organising and implementing electronic chemistry conferences. We view these very much as evaluation mechanisms for the technologies and styles that the electronic journal will eventually use. An enormous amount of experience has been gained in areas such as how authors submit articles in electronic form, how an on-line refereeing process works, how articles gain from a four-week discussion period (as an alternative paradigm to conventional refereeing), how Chemical Abstracts process electronic materials, the technology of CD-ROM production, and the problems of "Atlantic" bandwidth resulting in poor response for the end user. As a result, a major problem that the CLIC project needs to address in the second year is how to mirror (cache) the delivery of the electronic journal on a global basis.

3. Interim Evaluation Results.

3.1 Delivery of the Journal. Enhanced Graphical Contents pages have been on-line for about 6 months. (RSC to add comment here). In the last two months, full versions of Keynote articles have been available for inspection (RSC to add hits here). The responses have been largely favourable. Where the products attracted criticism, we have identified the causes as being

a) Inadequate initial documentation and training for readers on how to actually use the product

b) Inadequate quality of materials received from authors

c) Limitations of the HTML standard.

3.2 Mobilisation. We believe that we have successfully mobilised the community towards using the CLIC product via the awareness raising forums we have organised, including the two conferences, the two one-day meetings and the discussion list.

3.3 Cultural Change. Within a large community such as chemistry, it is difficult to assess what proportion of potential readers is in a receptive frame of mind to wish to evaluate an electronic journal. Our experience with authors, both of the journal and the associated conferences, is that some authors resolutely refuse to modify their standard method of preparing a manuscript, whilst others enthusiastically provide high quality materials for inclusion. Of the 120 articles submitted to the ECHET96 conference, some 45 were prepared in electronic form by the authors, and the authors of some 30 more were prepared to edit their contributions after initial processing by the editors. ECHET96 derived from a special interest group of the Royal Society of Chemistry, which would normally hold a conference with perhaps 20-30 discussion papers. By this criterion, ECHET96 succeeded in mobilising and inducing cultural change in this community on a very short time scale.

3.4 Cost Effectiveness/value-added. The CLIC e-journal is currently at the stage of establishing and testing various mechanisms which will form the basis for a sustainable process in the future. This is being done with a relatively small resource compared to that devoted to the printed version of the journal, and there is little doubt that it forms a cost effective supplement to the printed version. The most striking success is in the perceived value-added component. The inclusion of 3D models within the body of the CLIC journal has been possible because of a) our developments of MIME standards to accomplish this and b) our presenting the CLIC project at an early stage to MDLI, a commercial company that specialises in chemical databases. This has resulted in the production of a Netscape plug-in, based on the standard visualisation program RasMol. We have thus been able to achieve an early added value to the electronic journal that to a significant extent anticipates our stated deliverables for year 3 of the project. We are now focusing on other value-added aspects such as delivery of analytical data, mathematical markups, numerical information, and various forms of usage statistics and index searching.

3.5 Sustainability. The adoption of SGML derived technologies, together with the development of parsing and conversion tools and the use of scaleable servers such as Hyper-G, indicates that we have taken the problems of sustainability very seriously.

(RSC and Leeds to write something here please).

3.6 Demand/Performance and Future Scenarios. Although many publishers now claim to offer an electronic version of their journal, for the most part this comprises either index pages only, or exact replications of the printed form. Responses from users, via the conferences, the one-day meetings and the e-mail discussion lists, reinforce our belief that demand for such products will be generated largely by the value-added components present and by the simplest possible end-user installation requirements. As demand increases, delivery performance will become a large issue. We intend to address that by investigating mirroring solutions in the USA and elsewhere, and by looking at subject specific caching solutions based on the UK Hensa caching site. Finally, issues of end-user software installation via Java are classed as a high priority.

4. Future Development.

4.1 Main objectives.

4.1.1 We will move from developing the basic infra-structure to sustaining the on-going production of the electronic journal. This will involve extending the working RSC DTD to include the value-added molecular components we have already demonstrated in the keynote articles and the conferences. Our development work on the chemistry specific CML DTD will continue, as will work on a strategy for integrating the two threads for the final stage of the CLIC project.

4.1.2 We will concentrate on the complex issues of indexing not merely the text based content of the journal (an intrinsic feature of the Hyperwave server) but also the chemistry based content. We are actively investigating suitable technologies for the purpose.

4.1.3 We will concentrate on implementing Java and VRML based value-added components. A demonstration of such technology, to display spectral data, is already available, and we are actively developing VRML 2.0 solutions to the display of more complex visual data.

4.1.4 We will focus on analysing usage statistics and reader-profiles. The use of persistent client states (Cookie files) in this regard needs to be evaluated.

4.1.5 In the first year, the electronic version of the journal will be made available to any institute that already subscribes to the printed version. Various charging scenarios will be evaluated in Year 2, including Java based solutions.

4.1.6 Issues of security, in areas such as Java and persistent client states, need to be addressed in the context of operation by the Royal Society of Chemistry.
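As an illustration of the kind of analysis envisaged in 4.1.4, the following is a minimal sketch of how a persistent client state (cookie) could be used to separate total hits from distinct readers for each page. The cookie name `clic_reader`, the function names and the request data are all hypothetical; no such scheme has yet been evaluated for the journal.

```python
from collections import defaultdict
from http.cookies import SimpleCookie


def reader_id(cookie_header, name="clic_reader"):
    """Extract the reader identifier from a Cookie header, if present.

    The cookie name is a hypothetical choice for this sketch.
    """
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get(name)
    return morsel.value if morsel is not None else None


def summarise(requests):
    """Map each requested path to (total hits, distinct identified readers).

    `requests` is an iterable of (path, cookie_header) pairs, such as
    might be extracted from a server log. Requests without the cookie
    count as hits but not as identified readers.
    """
    hits = defaultdict(int)
    readers = defaultdict(set)
    for path, cookie_header in requests:
        hits[path] += 1
        rid = reader_id(cookie_header)
        if rid is not None:
            readers[path].add(rid)
    return {path: (hits[path], len(readers[path])) for path in hits}


if __name__ == "__main__":
    log = [
        ("/keynote1.html", "clic_reader=r1"),
        ("/keynote1.html", "clic_reader=r1"),
        ("/keynote1.html", "clic_reader=r2; other=x"),
        ("/contents.html", ""),
    ]
    for path, (total, distinct) in summarise(log).items():
        print(path, "hits:", total, "distinct readers:", distinct)
```

The gap between the two figures for a page gives a rough indication of repeat readership, which raw hit counts alone cannot show.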

4.2 Implementation/Exit Strategy.