Protein Structure

CML is particularly useful for managing protein structures, because most 'flat' files, such as those from the Protein Data Bank (PDB), hold a complex mixture of information. This includes administrivia, citations, annotation, sequence, crystallography and 'small molecules' as well as the 3-D coordinates of the protein itself. Unfortunately most of this information is very rarely used because there is no useful way of managing it. (How many molecular modelling packages manage citations satisfactorily?)

This example includes two protein structures from the PDB, each with points of interest for the markup. Remember that JUMBO is not intended to be a full molecular viewing and manipulation program, so don't expect high-quality rendering! Note that the CML parser will only fully parse strict PDB files. (There are a large number of mutant 'PDB-like' files whose only commonality is that they contain 'ATOM' cards; JUMBO does what it can with these.)
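To make the 'ATOM' card idea concrete, here is a minimal sketch in Python of reading only the ATOM cards from a PDB-style file, using the fixed column positions of the PDB format. It is not JUMBO's parser: the names Atom and parse_atom_cards are hypothetical, and skipping malformed cards is just one assumed way of coping with 'PDB-like' files.

```python
from typing import Iterable, List, NamedTuple


class Atom(NamedTuple):
    """One ATOM card (hypothetical container, not a CML or JUMBO type)."""
    serial: int
    name: str
    res_name: str
    chain_id: str
    res_seq: int
    x: float
    y: float
    z: float


def parse_atom_cards(lines: Iterable[str]) -> List[Atom]:
    """Collect ATOM cards and ignore every other record type."""
    atoms = []
    for line in lines:
        if not line.startswith("ATOM  "):
            continue
        try:
            atoms.append(Atom(
                serial=int(line[6:11]),      # columns 7-11
                name=line[12:16].strip(),    # columns 13-16
                res_name=line[17:20].strip(),  # columns 18-20
                chain_id=line[21],           # column 22
                res_seq=int(line[22:26]),    # columns 23-26
                x=float(line[30:38]),        # columns 31-38
                y=float(line[38:46]),        # columns 39-46
                z=float(line[46:54]),        # columns 47-54
            ))
        except (ValueError, IndexError):
            # Assumption: a card that does not follow the fixed columns
            # is skipped rather than guessed at.
            continue
    return atoms
```

A stricter reader would report the cards it rejects instead of dropping them silently; which is better depends on how many 'PDB-like' files it has to tolerate.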


The PDB file 1insmini.pdb is a cut-down version (only a monomer is selected). The TOC shows the large variety of information collected and standardised for a typical PDB entry. If you are familiar with PDB files, you'll see the order is preserved as much as possible.
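As a rough illustration of how such a table of contents could be derived, the sketch below lists the record types of a PDB-style file in the order they first appear, with a count for each. It relies only on the PDB convention that the record name occupies the first six columns; the function name record_toc is hypothetical and this is not how JUMBO builds its TOC.

```python
from typing import Iterable, List, Tuple


def record_toc(lines: Iterable[str]) -> List[Tuple[str, int]]:
    """Return (record name, count) pairs in order of first appearance."""
    toc = {}  # dicts keep insertion order, so the file order is preserved
    for line in lines:
        record = line[:6].strip()  # record name lives in columns 1-6
        if record:
            toc[record] = toc.get(record, 0) + 1
    return list(toc.items())
```

Called on the lines of a PDB file, this gives one entry per record type (HEADER, REMARK, ATOM and so on), which is roughly the level of detail a TOC needs.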

Here is an expanded version of the citations (BIBLIST)

CML allows MOLs to contain other MOLs, which is valuable for macromolecular structures. In this case the two chains of the molecule are held separately, and it's up to the application how they are treated. (For this example JUMBO is displaying the chains in separate windows with different orientations and scales, but they could be combined and ganged together.)
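The idea of a MOL containing other MOLs can be sketched as a small tree in Python. This is only an illustration of the nesting, not the CML data model: the Mol class, split_by_chain and count_atoms are hypothetical names, and atoms are reduced to (chain, atom name) pairs for brevity.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Tuple


@dataclass
class Mol:
    """A molecule that may hold atoms, child molecules, or both."""
    title: str
    atoms: List[Tuple[str, str]] = field(default_factory=list)
    children: List["Mol"] = field(default_factory=list)


def split_by_chain(title: str, atoms: Iterable[Tuple[str, str]]) -> Mol:
    """Build a parent Mol with one child Mol per chain identifier."""
    parent = Mol(title)
    by_chain = {}
    for chain_id, atom_name in atoms:
        if chain_id not in by_chain:
            child = Mol(f"{title} chain {chain_id}")
            by_chain[chain_id] = child
            parent.children.append(child)
        by_chain[chain_id].atoms.append((chain_id, atom_name))
    return parent


def count_atoms(mol: Mol) -> int:
    """Count atoms in a Mol and, recursively, in all of its children."""
    return len(mol.atoms) + sum(count_atoms(c) for c in mol.children)
```

An application is then free to walk the children separately (one window per chain) or to treat the parent as a single molecule, for example by recursing as count_atoms does.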

The resulting CML file is 1ins.cml. It's no larger than the original PDB file and reads in much more quickly as it doesn't need parsing.


This is an example of a protein with a single chain, but with a small-molecule ligand. The (edited) PDB file is 4fxnmini.pdb, and a typical screenshot shows the ligand (FMN - middle right) and the annotated SEQUENCE (top window; sine curve = HELIX, bar = SHEET). The reader has just clicked on the bar under "VVVET...".
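A text-only analogue of the annotated SEQUENCE display can be sketched as follows: given a sequence and the residue ranges of its helices and sheets, build a track of symbols that lines up under the sequence. The function annotation_track and the symbols '~' (HELIX) and '=' (SHEET) are assumptions made for this sketch; JUMBO draws sine curves and bars instead, and the ranges here are supplied by hand rather than read from a PDB file.

```python
from typing import Iterable, Tuple

Range = Tuple[int, int]  # 1-based, inclusive residue numbers


def annotation_track(sequence: str,
                     helices: Iterable[Range],
                     sheets: Iterable[Range]) -> str:
    """Return a string the same length as sequence marking helix and sheet."""
    track = ["-"] * len(sequence)
    for symbol, ranges in (("~", helices), ("=", sheets)):
        for start, end in ranges:
            # Clip each range to the sequence so bad input cannot overflow.
            for i in range(max(start, 1), min(end, len(sequence)) + 1):
                track[i - 1] = symbol
    return "".join(track)
```

Printing the sequence on one line and the track on the next gives a crude version of the top window; a click on a bar would then map back to the residue range that produced it.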

© Peter Murray-Rust, 1996, 1997