Instructions for Experiment 3: Chemical Informatics

These pages give detailed instructions for the techniques described. Some techniques are deemed sufficiently "intuitive" that no details are given here. In other cases, the information provided by the supplier is itself sufficient and is not replicated here. To access an individual information point, click on its icon. To find out more about the information provider, click on the blue hyperlink next to the icon. The strip means you can return to an overview of the "Information booth".

Claris Works





BIDS

This runs a terminal emulation program called Telnet. Log in with account iic02p (the password is available from members of staff) and select the ISI service. The menus are largely driven by typing appropriate characters from the keyboard. In this instance, they are case insensitive.





More detailed instructions are summarised in a small manual available from the chemistry library or the on-line office in the Lyon Playfair laboratory.


CAS On-line

This will enter a Telnet session. Type z as your first entry, followed by an account number and password (available on request), followed by 3 for the terminal type. You will next have to specify the database file you want by typing FILE CA (the main CAS bibliographic file), FILE REG (the Registry file of substances) or FILE LCA (the learning CA file). Entry to these files is only possible at certain times of day (normally AM). Only the LCA file has no associated cost; all other searches will accumulate a charge, which will have to be paid by the owner of the account number.


Silver Platter

This offers "samplers" of a number of commercial databases. The full versions of several of these are also available via dedicated software, which on a Macintosh is known as "MacSPIRS".





Libertas

Again the Telnet program is used to access the College catalogues;






Aldrich Chemical Catalogues

A major chemical supplier has made its directory of available chemicals accessible.



Fisher Chemical Catalogues

A major chemical supplier has made its directory of available chemicals accessible. One of the "value added" services they offer is hazard safety sheets for all their entries. You can take advantage of this by searching for information on any penicillins.



Daylight Information systems

This information provider was one of the first companies to offer a "Web" interface to their databases. Here you can see it in action on "sampler" datasets. A number of interfaces are offered in this service, including the World Drug Index, and the "Savant" system.

Start with the WDI database, entering a suitable search term;

The results of a keyword search are displayed on the screen.

Clicking on the "thumbnail" graphic of any molecule found will reveal further properties, one of which is the so-called SMILES string. This is a powerful and popular method of representing molecular structure as a sequence of simple characters. It is also one of the few methods for transferring molecular structure definitions between different programs. To illustrate how this works, select the SMILES string shown below the structure, and "copy" it to the clipboard using the "edit" menu.
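As a rough illustration of why a SMILES string is so easy to pass between programs, the Python sketch below counts the atoms in a SMILES string using nothing more than text matching. The function name count_atoms is an illustrative assumption and is not part of the Daylight service. The pattern only covers the common organic subset plus simple bracketed atoms, and hydrogens attached to other atoms are not counted.

```python
import re
from collections import Counter

# Two-letter symbols are tried before one-letter ones so "Cl" is not read as "C".
# Lower-case letters denote aromatic atoms; [...] holds an explicitly described
# atom, optionally preceded by an isotope number.
ATOM_PATTERN = re.compile(
    r"\[\d*([A-Z][a-z]?|[a-z]{1,2})[^\]]*\]"   # bracketed atom, e.g. [Na+] or [nH]
    r"|(Cl|Br|[BCNOPSFI]|[bcnops])"            # organic-subset atom
)


def count_atoms(smiles):
    """Count atoms by element symbol in a simple SMILES string."""
    counts = Counter()
    for match in ATOM_PATTERN.finditer(smiles):
        symbol = match.group(1) or match.group(2)
        counts[symbol.capitalize()] += 1
    return counts


if __name__ == "__main__":
    # A SMILES string for benzylpenicillin, written without stereochemistry.
    print(count_atoms("CC1(C)SC2C(NC(=O)Cc3ccccc3)C(=O)N2C1C(=O)O"))
```

Bond symbols, ring-closure digits and parentheses are simply skipped here; a real SMILES parser also uses them to rebuild the connectivity of the molecule.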

Now select the "back" button on the WWW browser, and enter instead the "Savant" database. The query SMILES string can be pasted into the keyword search field. Savant performs a "similarity" search which finds bibliographic references to recent literature relating to the synthesis of compounds either identical or similar to the SMILES search query.
Select a compound which may look interesting, and investigate one or two recent literature references to its synthesis;
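These instructions do not say how Savant scores similarity. One widely used measure is the Tanimoto coefficient, which divides the number of features two molecules share by the number of distinct features found in either. The Python sketch below applies it to overlapping two-character fragments of the SMILES text. This is a crude stand-in chosen only for illustration: real systems compare structural fingerprints, and the same molecule can be written as more than one SMILES string. All function names are hypothetical.

```python
def bigrams(text):
    """Return the set of overlapping two-character fragments in a string."""
    return {text[i:i + 2] for i in range(len(text) - 1)}


def tanimoto(a, b):
    """Tanimoto coefficient of two feature sets: shared / total distinct."""
    if not a and not b:
        return 1.0  # two empty feature sets are treated as identical
    return len(a & b) / len(a | b)


def rank_by_similarity(query, candidates):
    """Return (score, candidate) pairs, most similar to the query first."""
    query_features = bigrams(query)
    scored = [(tanimoto(query_features, bigrams(c)), c) for c in candidates]
    return sorted(scored, reverse=True)


if __name__ == "__main__":
    for score, smiles in rank_by_similarity("CCO", ["CCN", "c1ccccc1", "CCO"]):
        print(f"{score:.2f}  {smiles}")
```

A score of 1 means the two feature sets are identical and 0 means they share nothing, so a similarity search amounts to keeping the candidates whose score exceeds some chosen threshold.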



Cambridge Crystallographic Data Centre

This experiment should ideally be performed on the Silicon Graphics computers. If you use a Macintosh, a program called MacX will start up, and following prompts, you will need to enter your SGI account and its password. On the SGI, type in response to the prompts that appear;
>MENU

On a Macintosh computer, enter the line
quest penicillin
in the MacX window that appears, followed by
TERM X11
MENU
You will need to search the Cambridge crystal structure database for the penicillin and cephalosporin sub-structures. The first menu allows a molecule to be drawn. You will notice that there is (unfortunately) no common standard for drawing molecules on screen. With Quest, a click on the screen draws the atom selected (by default C, but in the example below currently S). Each further click adds another atom attached to the last one. If you do not want the next atom to be attached to the previous one, select MOVE first. To convert a single bond to a double bond, select DOUB, then MOVE, and click on the desired bond. To convert a carbon to another atom on the menu, select the type from the screen menu and then MOVE before clicking on the desired atom. If the atom is not present on the menu, click on OTHER. Errors can be corrected using DELATOM or DELBND, or in extremis CLEAR, which removes the entire molecule.



When you are happy with your structure, click on DEFINE STRUCTURE and then QUEST;

Your defined search operators are shown as T1 (T2 etc). In this case they are sub-structures, but many other definitions are possible, including author names, formulae, etc. It is possible to combine several of these using logical operators. In this case none of this is necessary. Just click on the T1 box to select this item, then START-SEARCH. As soon as a "hit" is found, it is displayed on the screen;
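Combining search terms with logical operators can be thought of as set arithmetic on the lists of hits each term would return on its own. The Python sketch below shows the idea. The entry codes and the three helper functions are invented for illustration and are not taken from Quest.

```python
# Hypothetical sets of entry codes returned by three separate search terms.
T1 = {"ENTRY01", "ENTRY02", "ENTRY03"}   # e.g. a sub-structure
T2 = {"ENTRY02", "ENTRY03", "ENTRY04"}   # e.g. an author name
T3 = {"ENTRY03"}                         # e.g. a formula


def hits_and(*terms):
    """Entries found by every term."""
    return set.intersection(*terms)


def hits_or(*terms):
    """Entries found by at least one term."""
    return set.union(*terms)


def hits_not(term, excluded):
    """Entries found by the first term but not by the second."""
    return term - excluded


if __name__ == "__main__":
    print(sorted(hits_and(T1, T2)))       # matches T1 AND T2
    print(sorted(hits_or(T1, T2, T3)))    # matches T1 OR T2 OR T3
    print(sorted(hits_not(T1, T3)))       # matches T1 but NOT T3
```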



The structure can be rotated using the four small arrows in the bottom right corner of the menu (you might wish to discuss in your writeup the pros and cons of this method compared with those found in other programs; again no common standard applies).

At this point, you have to either KEEP or REJECT the structure before the next one is displayed. If you KEEP the structure, its co-ordinates are written to a file on disk (called penicillin.dat in this case). Continue to the end, at which point the program will exit from the database. To perform a further search, type quest <name of search> from the console window of the workstation, as shown in the earlier diagram. At any stage you can also go to the alternative 3D projection, in which bond lengths, angles etc. can be displayed on the screen.
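The bond lengths and angles shown in the 3D projection follow directly from the atomic co-ordinates. The Python sketch below shows the arithmetic for Cartesian co-ordinates. The layout of penicillin.dat is not described here, so the sketch does not attempt to read that file, and the co-ordinates in the example are invented.

```python
import math


def bond_length(a, b):
    """Distance between two atoms given as (x, y, z) tuples."""
    return math.dist(a, b)


def bond_angle(a, b, c):
    """Angle a-b-c in degrees, with atom b at the vertex."""
    v1 = [p - q for p, q in zip(a, b)]
    v2 = [p - q for p, q in zip(c, b)]
    dot = sum(x * y for x, y in zip(v1, v2))
    cos_theta = dot / (math.hypot(*v1) * math.hypot(*v2))
    # Guard against rounding pushing the cosine just outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))


if __name__ == "__main__":
    # Invented co-ordinates for three atoms, for illustration only.
    s, c1, c2 = (1.8, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.5, 0.0)
    print(bond_length(s, c1), bond_angle(s, c1, c2))
```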




Brookhaven Protein Databank

This enables a simple keyword search of the PDB archives, with the result displayed in a window on your screen. Be aware that the PDB files can be quite large, and that response times for their retrieval are likely to be better in the morning than in the afternoon. A more sophisticated interface is available via the European Bioinformatics Institute.





Swiss-Prot






Beilstein Crossfire

When Beilstein Commander starts up, you will be confronted by a login-prompt. Use chsr as the account. The password is available from staff.
Due to a programming error in this program, the molecule window is empty. Click on a second time, and this will be replaced with a molecule query.
If you wish to edit or modify this structure, double click on the molecule, whereupon a structure editor window will open;
The important thing about this structure is that missing valencies are assumed to be hydrogen rather than generic substituents. To allow a search to proceed on the assumption that any substituent can occupy a free site, go to the capture tool (the dotted square icon), then highlight an atom (or several, by dragging the capture box) and from the Query menu, select "Free Sites". This places a star against the atom(s). If you do not do this, you may not find any structures. Other editing operations are (sort of) intuitive, and it is suggested you explore them. Once happy with the structure, click on the "BC" button along the top to return to Commander.
Now click on Start to commence the search. You will be told how many hits are found (for an unmodified structure query it should be between 100 and 200!). A new program called Display Hits starts up (by this time, your Macintosh might be suffering from a lack of memory, and you may need to shut down any non-essential programs). From the View menu, select Short display to get a preview of the structures found.
You can select individual entries by clicking on them. Go back to full display to obtain data such as optical rotation or melting point.
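The effect of marking "Free Sites" can be pictured with a toy model in Python. An atom drawn without free sites may only match a database atom carrying exactly the same number of non-hydrogen neighbours, because every open valency is taken to be hydrogen. An atom marked with free sites may also match atoms carrying extra substituents. The function below is an assumption made purely for illustration and does not describe how Crossfire itself performs the match.

```python
def atom_matches(query_neighbours, target_neighbours, free_sites=False):
    """Toy test of whether a drawn atom can match a database atom.

    Both counts are numbers of non-hydrogen neighbours.
    """
    if free_sites:
        # Open valencies may carry any substituent, so extra neighbours are fine.
        return target_neighbours >= query_neighbours
    # Open valencies are hydrogen, so the counts must agree exactly.
    return target_neighbours == query_neighbours


if __name__ == "__main__":
    # A drawn atom with 2 neighbours against a database atom with 3.
    print(atom_matches(2, 3))                   # rejected without free sites
    print(atom_matches(2, 3, free_sites=True))  # accepted with free sites
```

This is why an unstarred query can return few or no structures: substituted analogues of the drawn molecule are excluded unless the relevant atoms are marked.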



CambridgeSoft

This section will show how structural information from the Daylight WDI and Savant databases and the Cambridge CCDC searches can be transferred to a local computer using a simple program such as Telnet, and used to generate a local database.

Part 1:

There are several ways of transferring a file from a remote computer to the one you are using. The method that follows is the oldest, but in some ways the simplest and probably the fastest.

Log in to the Unix cluster using your own account and password, making use of the Telnet program invoked by clicking on the icon above. First, enable file transfer if it is not already enabled;


Type ftp, then a space, then select the IP number;

Select the Macintosh directory where the file will go;

Press the RETURN key, when the file will be transferred.
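For comparison, the same kind of transfer can be scripted. The Python sketch below uses the standard ftplib module and assumes, as the steps above suggest, that the transfer is started from the machine holding the file and pushes it to the receiving machine. The host, account, password and directory are placeholders to be supplied by the caller, and the function names are hypothetical.

```python
from ftplib import FTP


def send_file(ftp, local_path, remote_name, remote_dir=None):
    """Upload one file over an FTP connection that is already logged in."""
    if remote_dir:
        ftp.cwd(remote_dir)  # choose the directory where the file will go
    with open(local_path, "rb") as handle:
        ftp.storbinary(f"STOR {remote_name}", handle)


def transfer(host, user, password, local_path, remote_name, remote_dir=None):
    """Connect, log in, send the file and close the connection."""
    with FTP(host) as ftp:
        ftp.login(user, password)
        send_file(ftp, local_path, remote_name, remote_dir)
```

Keeping send_file separate from the connection step means the upload logic can be tried out against a stand-in object without a network connection.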

Part 2:

A file called "penicillin.dat" (or whatever name you used) should now be found in the directory you chose to put it in. Select this file, drag it over the icon called ChemFinder located on the desktop of the Mac you are using, and release the mouse button.

You will be asked to identify the type of file being processed;

If the conversion is successful, an entry will be created in a database called ChemFinder;

The analogy (or "metaphor") of this database is that each collection of chemical information is stored in a folder. In your case, when the folder is "double clicked" you get to see the entries in it, in this case one compound. Each compound has a triangle on the left, which if clicked points down, and various attributes of the molecule are revealed. In your case, you should see a ChemDraw 2D representation and a Chem3D picture, in addition to the formula already displayed;

Double clicking in the centre of either "cell" (to use a spreadsheet analogy) will enable you to edit that entry using either ChemDraw or Chem3D (again just like a spreadsheet). If you double click on the 3D diagram, you can create an animation of the molecule as shown below;

Also whilst in Chem3D, you may wish to select the molecule, copy to the clipboard, and paste it into your report, or print it directly on any printer attached to the Macintosh.


MDLi ISIS/Base

The action of clicking on the ISIS logo should activate a program called ISIS/Draw. This should already contain a reaction query defined for you.

You can if you wish define an entirely new reaction. To do so, proceed as follows. Build the reactant using ring templates, and suitable bond tools. To insert heteroatoms, select the "A" tool, click on the atom, when a small box should appear, and type the atom symbol from the keyboard.

Draw the product in the same manner. Then go to the "box" tool, and "select all" from the edit menu. Now from the "Chem" menu, select "reaction".

You now have to define the relationship between the two molecules.

A reaction is defined. Now you have to launch ISIS/Base. Because this requires a lot of memory, you will have to close any other applications that might be running (including Netscape!). Launch the program from the Apple menu in the "chemistry programs" sub-menu. Once open, select "database" from the "file" menu. Open RXN browser. You will need to provide a user number and password. Ask a member of staff for this.

Once Base is up and running (it may take about 2-3 minutes to achieve this state) return to the Draw program. Select ALL and then COPY from the Edit menu, return to Base, and PASTE the diagram in. The reaction should now appear in the Base window;

Some bonds will be highlighted to indicate a possible bond mapping between the two structures. This guess is likely to be wrong, so the safest course of action is to clear mapping. Now from the Search menu, select RSS (reaction sub-structure);

A list of "hits" will appear. Select the "su" button, and if you wish scroll down the list of hits with the "down" button. If you wish to copy an entire hit, select the "copy form" item from the edit menu, and then paste this into a word processor. Since you may have stopped all other programs to free memory, you may now have to start Claris up again.




VChemLib

This represents a virtual library of structural and chemical information, presented in the style of a textbook of "knowledge". It represents the final stage of the transition from data to knowledge. Identify whether any of the contents are relevant to penicillin.


Electronic Conferences

Unlike real conferences, electronic versions can be easily indexed. The ECTOC conference, hosted by Imperial College during June-July of 1995, was the first such conference to support index searching. Find out if penicillin was mentioned!
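Index searching of this kind usually rests on an inverted index: a table mapping each word to the documents that contain it, so that a keyword lookup does not have to re-read every contribution. The Python sketch below builds such an index. The titles and text are invented, and this is not a description of how the ECTOC index itself was built.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map each lower-cased word to the set of document titles containing it."""
    index = defaultdict(set)
    for title, text in documents.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(title)
    return index


def search(index, keyword):
    """Return the titles containing the keyword, in alphabetical order."""
    return sorted(index.get(keyword.lower(), set()))


if __name__ == "__main__":
    # Invented contributions, for illustration only.
    papers = {
        "Poster 1": "A new route to penicillin analogues",
        "Poster 2": "Asymmetric aldol reactions",
    }
    print(search(build_index(papers), "Penicillin"))
```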


Electronic Journals (CLIC)

Browse through Chemical Communications or Network Science to get a typical impression of these new media.