The access logs record not only when a particular document was accessed but also the machine and the browser the user used to read it. The popularity of articles and facilities within the conference can be monitored and compared using this information. Various statistics have been compiled from the access logs to show how the conference was used.
Note that this analysis does not include accesses via the North American mirror site, nor does it include statistics for the 34 articles provided by authors from their own servers. For this reason, we estimate that the analysis below covers only around 60% of the conference totals.
A total of 120 articles and poster abstracts were initially submitted to the conference editors and accepted by the scientific committee. Of these, 17 are not present in these conference proceedings, either because (a) the authors requested that they not be included or (b) no full article following the abstract was received. The analysis below uses the access statistics derived from the articles present in June 1996 (rather than those present on this CD-ROM). Three contributions relating to post-conference analysis, of which this is one, were added after the conference.
|Conference Theme||Number of Articles|
|Reaction Mechanisms and Conformational Analysis||12|
|Molecular Modelling and Databases||8|
|New Synthetic Methodology||59|
The authors were allowed to submit their articles in various ways. Thirty-four authors chose to use their own World-Wide Web servers. This had the advantages that the authors could update their own articles easily and that the network load on the conference server was reduced, but it meant that we could not compile accurate access statistics for these papers. There was also the risk that an author's server was on an unreliable network.
Most of the authors (59% - 72/123) chose to use the forms mechanism to submit their abstract while registering for the conference as authors. The other popular method (18% - 22/123) was to send abstracts by e-mail in various formats, including Microsoft Word documents and HTML. When the full articles were submitted, e-mail was more popular (55% - 68/123) than the forms method (15% - 19/123). It appears that authors were more comfortable sending multiple files as e-mail attachments than relying on the forms system and on the extraction of files from uploaded archives.
The conference ran for four weeks, from 24 June 1996 to 22 July 1996. During this time there were 71,498 accesses by remote users, that is, users accessing from outside the Chemistry Department at Imperial College. Heavy testing of the conference by the conference editors, and the range of computer systems held within the department, would have skewed the access results, so accesses from within the department have been excluded. In fact, remote accesses during the conference were responsible for 97.3% of the total, so excluding local accesses does not significantly influence the final results. The number of bytes downloaded showed the same proportion between remote and local users: 246 Mbytes against 6.5 Mbytes.
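The split between local and remote accesses described above can be sketched as follows. This is a minimal illustration, not the script used for the conference: it assumes the server wrote Common Log Format lines, and the departmental domain suffix `.chem.example.ac.uk` is a hypothetical placeholder.

```python
import re

# Common Log Format: host ident user [time] "request" status bytes
CLF = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "([^"]*)" (\d{3}) (\S+)$')

# Hypothetical departmental domain; replace with the real local suffix.
LOCAL_SUFFIX = ".chem.example.ac.uk"


def split_local_remote(lines, local_suffix=LOCAL_SUFFIX):
    """Return {'local': [hits, bytes], 'remote': [hits, bytes]}."""
    totals = {"local": [0, 0], "remote": [0, 0]}
    for line in lines:
        match = CLF.match(line.strip())
        if match is None:
            continue  # skip malformed lines
        host, size = match.group(1), match.group(5)
        key = "local" if host.lower().endswith(local_suffix) else "remote"
        totals[key][0] += 1
        # A size of "-" means no body was sent (e.g. a 304 response).
        totals[key][1] += int(size) if size.isdigit() else 0
    return totals


def remote_share(totals, index):
    """Percentage of hits (index 0) or bytes (index 1) that were remote."""
    total = totals["local"][index] + totals["remote"][index]
    return 100.0 * totals["remote"][index] / total if total else 0.0
```

Calling `remote_share(totals, 0)` and `remote_share(totals, 1)` gives the remote percentage by hits and by bytes, the two proportions compared in the paragraph above.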
One way to classify the accesses to the conference is by the type of file retrieved: an HTML file, one of the several types of picture file, or molecule coordinates. The number of pictures in the conference was much greater than the number of HTML files, so the number of accesses to each type ought to reflect this.
|File type||Accesses||Percentage of accesses||Number of files||Percentage of files||Average accesses per file|
Most authors had inlined pictures in their articles, and these were often downloaded when the document was retrieved over the internet. The average accesses per file normalises for the difference between the number of pictures and the number of HTML documents. Unfortunately, some HTML documents had more pictures associated with them than others, which skewed the averages. The picture average was greater than the HTML average, indicating that the documents that were accessed had a larger proportion of associated pictures than the conference average.
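The average accesses per file can be sketched as below. The extension-to-type mapping is an assumption made for illustration (the conference may have used other file types). The sketch also counts only files that were requested at least once, whereas the table above is based on the number of files in the conference.

```python
import os
from collections import defaultdict

# Assumed mapping from file extension to file type (illustrative only).
FILE_TYPES = {
    ".html": "HTML", ".htm": "HTML",
    ".gif": "picture", ".jpg": "picture", ".jpeg": "picture",
    ".pdb": "molecule", ".mol": "molecule",
}


def classify(path):
    """Map a requested path to a file type using its extension."""
    ext = os.path.splitext(path.lower())[1]
    return FILE_TYPES.get(ext, "other")


def accesses_per_file(paths):
    """paths: iterable of requested paths, one entry per access.

    Returns {type: (accesses, distinct_files, average_accesses_per_file)}.
    """
    accesses = defaultdict(int)
    files = defaultdict(set)
    for path in paths:
        kind = classify(path)
        accesses[kind] += 1
        files[kind].add(path)
    return {
        kind: (accesses[kind], len(files[kind]), accesses[kind] / len(files[kind]))
        for kind in accesses
    }
```

Dividing by the number of files of each type is what makes the picture and HTML figures comparable, even though pictures greatly outnumber HTML documents.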
The number of local accesses to the conference pages was so small that they have been excluded from further discussion. There were three important peaks in the weekly accesses, at weeks 17, 22 and 25. Week 1 corresponds to the first week of January 1996.
|22||Full versions in|
|25||ECHET96 Conference Starts|
|33||Deadline for updates|
|56||End Changes to CD-ROM version|
The milestones show clearly that the increased activity by the authors in weeks 17 and 22 was due to the uploading and checking of their articles before the two major deadlines for the abstracts and the articles. The largest activity occurred in week 25, at the start of the conference. This activity was divided into three parts:
Cgi-bin requests to the server were very useful because they could not be cached on the participants' machines, so they provide an accurate measure of how the conference was used. Several programs were written for ECHET96 to administer various parts of the conference.
There was a very small peak in week 17, when the abstracts were submitted. Until week 22, the accesses to the papers reflected the number of complete articles available to the conference. As expected, the majority of the conference accesses were to the paper directories, since this was where the majority of the conference files were.
Activity has been somewhat elevated recently because of the deadlines for the final versions of articles for the CD-ROM version.
These pages contained information on how to write and submit articles and on how to participate in the conference. They were available from the start of the conference to encourage people to contribute and participate. The accesses were at the random noise level before and after the conference, and the graph showed the same shape as the previous graphs for weeks 17, 22 and 25.
Outside the conference period the accesses fluctuated in a manner similar to random noise. These accesses came from people who were either new to the web and wanted a quick browse through the site to see what was there, or the few who bothered to come back to see whether there were any updates. This conference is quite unusual in that the documents have not changed after the conference, so it is unlikely that people will come back to check whether there is anything new. It was therefore possible to see the life cycle of an unchanged document. Before the conference the documents were accessed by the participants and the authors as the conference was being built. At the start of the conference the readers assumed that the documents would no longer change, so the accesses started to fall as people finished reading the final versions. Seven weeks later the documents were of no interest except to random browsers on the internet.
The accesses during the conference were also examined by the hour. Figure 5 shows that during the first week of the conference there were up to 800 hits an hour. The weekends showed very little activity, as shown by the blue troughs. Each week the majority of the accesses occurred between 09:00 and 17:00 BST, indicating that most of the users were from Europe. The American accesses have not been included, as most of them occurred on the mirror site in America.
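An hourly breakdown of this kind can be sketched as follows, assuming Common Log Format timestamps that carry the server's own offset (for example `+0100` for BST). The function names are illustrative, and the month abbreviation parsing assumes an English locale.

```python
from collections import Counter
from datetime import datetime


def hits_by_hour(timestamps):
    """timestamps: strings such as '24/Jun/1996:09:15:02 +0100'.

    Returns a Counter mapping the hour of day (0-23), in the logged
    time zone, to the number of hits in that hour.
    """
    counts = Counter()
    for stamp in timestamps:
        when = datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")
        counts[when.hour] += 1
    return counts


def working_hours_share(counts, start=9, end=17):
    """Percentage of hits with start <= hour < end."""
    total = sum(counts.values())
    inside = sum(n for hour, n in counts.items() if start <= hour < end)
    return 100.0 * inside / total if total else 0.0
```

A high value from `working_hours_share` supports the reading that most users were in or near the server's time zone. It cannot distinguish European users from others who happened to be active during those hours.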
When participants accessed the conference they left behind in the logs information about their machine, operating system and type of browser. The absolute monthly statistics followed the same shape as the weekly statistics, peaking in June.
Since August, Windows 3.x/95 users have outnumbered Macintosh users. Windows 95 has only recently become the most popular operating system, as more people upgrade from Windows 3.x. The operating system affects what participants can view: the latest technologies usually come out for Windows 95 before the Macintosh and Windows 3.x. During the conference, the majority of participants were more likely to use a Macintosh than a Windows-based machine, indicating that chemists were more likely to use Macintoshes, whereas general computer usage on the internet is predominantly Windows 95.
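The operating system breakdown can be approximated by matching substrings in the browser identification string recorded in the logs. The rules below are a minimal sketch under the assumption that browsers report platform tokens such as `Win95`, `Win16` or `Macintosh`. They are illustrative and not the classification used for the figures above.

```python
from collections import Counter

# Ordered (substring, label) rules; the first match wins. Illustrative only.
PLATFORM_RULES = [
    ("win95", "Windows 95"),
    ("windows 95", "Windows 95"),
    ("win16", "Windows 3.x"),
    ("windows 3", "Windows 3.x"),
    ("mac", "Macintosh"),
    ("x11", "Unix/X11"),
]


def platform_of(user_agent):
    """Return a coarse platform label for one browser identification string."""
    lowered = user_agent.lower()
    for needle, label in PLATFORM_RULES:
        if needle in lowered:
            return label
    return "Other/unknown"


def platform_counts(user_agents):
    """Count accesses per platform label."""
    return Counter(platform_of(ua) for ua in user_agents)
```

Because these strings are free text supplied by the browser, any such rule list misclassifies some clients, so the resulting counts are best read as approximate.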
Various types of statistical analysis can be done on the conference, or on individual articles or files, to see how users use the conference. It should be possible to follow a user's route through a session, seeing what they visit first and how they reach other sections using the navigation systems set out in the conference. If multiple sessions are mapped out, it would be possible to work out the best routes through the information. This could be used to improve the flow of information through the rest of the conference. Unfortunately, caching makes it difficult to trace complete sessions, as not all accesses are made to the server; some are served from the local file cache.
It is very important to keep an eye on the information in the access logs, as it indicates how users are using the web site and can help to improve the workings of the site. The availability of browser information allows the use of advanced HTML techniques to be monitored, since not all users have access to the latest browsing technology.
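Following a user's route through a session could be sketched as below. Two assumptions are made that are not in the original analysis: each remote host is treated as a single user, and a gap of more than 30 minutes between requests starts a new session. Pages served from a local cache never reach the server, so the reconstructed routes are incomplete, as noted above.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def sessions_by_host(events, gap=timedelta(minutes=30)):
    """events: iterable of (host, datetime, path) tuples.

    Returns {host: [[path, ...], ...]} with one inner list per session,
    where a session ends after `gap` of inactivity (an assumed threshold).
    """
    by_host = defaultdict(list)
    for host, when, path in events:
        by_host[host].append((when, path))

    result = {}
    for host, visits in by_host.items():
        visits.sort(key=lambda visit: visit[0])  # chronological order
        sessions, last = [], None
        for when, path in visits:
            if last is None or when - last > gap:
                sessions.append([])  # start a new session
            sessions[-1].append(path)
            last = when
        result[host] = sessions
    return result
```

Comparing the ordered path lists across many sessions is one way to find which navigation routes are used most often.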