Archive for the ‘Chemical IT’ Category

A new way of exploring the directing influence of (electron donating) substituents on benzene.

Friday, April 17th, 2015

The knowledge that substituents on a benzene ring direct an electrophile engaged in a ring substitution reaction according to whether they withdraw or donate electrons is very old.[1] Introductory organic chemistry tells us that electron donating substituents promote the ortho and para positions over the meta. Here I try to recover some of this information by searching crystal structures.

(more…)

References

  1. H.E. Armstrong, "XXVIII.—An explanation of the laws which govern substitution in the case of benzenoid compounds", J. Chem. Soc., Trans., vol. 51, pp. 258-268, 1887. https://doi.org/10.1039/ct8875100258

Goldilocks Data.

Wednesday, April 8th, 2015

Last August, I wrote about data galore, the archival of data for 133,885 (134 kilo) molecules into a repository, together with an associated data descriptor[1] published in the new journal Scientific Data. Since six months is a long time in the rapidly evolving field of RDM, or research data management, I offer an update in the form of some new observations.

(more…)

References

  1. R. Ramakrishnan, P.O. Dral, M. Rupp, and O.A. von Lilienfeld, "Quantum chemistry structures and properties of 134 kilo molecules", Scientific Data, vol. 1, 2014. https://doi.org/10.1038/sdata.2014.22

How-open-is-it?

Thursday, February 12th, 2015

The title of this post refers to the site http://howopenisit.org/, which is in effect a license scraper for journal articles. In the past 2-3 years in the UK, we have been able to use grants to our university to pay publishers to convert our publications into Open Access (also called GOLD). I thought I might check out a few of my recent publications to see what http://howopenisit.org/ makes of them.

(more…)

A convincing example of the need for data repositories. FAIR Data.

Thursday, January 15th, 2015

Derek Lowe, in his In the Pipeline blog, is famed for spotting unusual claims in the literature and subjecting them to analysis. One such post is entitled Odd Structures, Subjected to Powerful Computations. He looks at the image below and suspects that the structures represented there might be a mistake, based on his considerable experience of these kinds of molecules. I expect he had a gut feeling within seconds of seeing the diagram.

(more…)

Data discoverability

Wednesday, December 17th, 2014

I have written earlier about the Amsterdam Manifesto. That arose out of a conference on the theme of “beyond the PDF”, with one simple question at its heart: what can be done to liberate data from containers it was not designed to be in? The latest meeting on this topic will happen in January 2015 as FORCE2015.

(more…)

Blasts from the past. A personal Web presence: 1993-1996.

Saturday, November 1st, 2014

Egon Willighagen recently gave a presentation at the RSC entitled “The Web – what is the issue”, in which he laments how little uptake of web technologies as a “channel for communication of scientific knowledge and data” there has been in chemistry after twenty years or more. It caused me to ponder what we were doing with the web twenty years ago. Our HTTP server started in August 1993, and to my knowledge very little content there has been deleted (it’s mostly now just hidden). So here are some ancient pages which, whilst certainly not examples of how it should be done nowadays, give an interesting historical perspective. In truth, there is not much stuff out there that is older!

(more…)

More simple experiments with crystal data. The pyramidalisation of nitrogen.

Saturday, November 1st, 2014

We are approaching 1 million recorded crystal structures (actually, around 716,000 in the CCDC and just over 300,000 in COD). One delight of having this wealth of information is the simple little explorations that can take just a minute or so to do. This one was sparked by my helping a colleague update a set of interactive lecture demos dealing with stereochemistry. Three of the examples included molecules where chirality originates in stereogenic centres with just three attached groups. An example might be a sulfoxide, for which the priority rule is to treat the lone pair as a substituent with atomic number zero. The issue then arises as to whether this centre is configurationally stable, i.e. does it invert in an umbrella motion slowly or quickly? My initial intention was to see if crystal structures could cast any light at all on this aspect.
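As a rough illustration of what "pyramidalisation" can mean numerically, here is a minimal sketch that sums the three bond angles around a three-coordinate centre. The angle-sum measure and the function names `angle_deg` and `angle_sum` are assumptions chosen for this example, not necessarily the quantity searched for in the post: a sum of 360° indicates a planar centre, and smaller values indicate increasing pyramidalisation.

```python
import math


def angle_deg(centre, a, b):
    """Return the angle a-centre-b in degrees for 3D Cartesian points."""
    va = [p - c for p, c in zip(a, centre)]
    vb = [p - c for p, c in zip(b, centre)]
    dot = sum(x * y for x, y in zip(va, vb))
    norm_a = math.sqrt(sum(x * x for x in va))
    norm_b = math.sqrt(sum(x * x for x in vb))
    # Clamp to [-1, 1] to guard against rounding just outside the acos domain.
    cos_theta = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.degrees(math.acos(cos_theta))


def angle_sum(centre, n1, n2, n3):
    """Sum of the three angles subtended at a three-coordinate centre."""
    return (
        angle_deg(centre, n1, n2)
        + angle_deg(centre, n2, n3)
        + angle_deg(centre, n1, n3)
    )


# Idealised planar centre: three neighbours at 120 degrees in the xy plane.
h = math.sqrt(3) / 2
planar = angle_sum((0, 0, 0), (1, 0, 0), (-0.5, h, 0), (-0.5, -h, 0))

# Idealised pyramidal centre: three mutually perpendicular bonds.
pyramidal = angle_sum((0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1))
```

With these idealised geometries, `planar` comes out at 360° and `pyramidal` at 270°; real crystal coordinates would fall somewhere in between.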

(more…)

Electronic notebooks: a peek into the future?

Tuesday, September 16th, 2014

ELNs (electronic laboratory notebooks) have been around for a long time in chemistry, largely, of course, due to the needs of the pharmaceutical industries. We did our first extensive evaluation probably at least 15 years ago, and nowadays there are many on the commercial market, with a few more coming from open-source communities. Here I thought I would bring to your attention the potential of an interesting new entrant from the open community.

(more…)

One molecule, one identifier: Viewing molecular files from a digital repository using metadata standards.

Monday, September 8th, 2014

In the beginning (taken here as prior to ~1980) libraries held five-year printed consolidated indices of molecules, organised by formula or name (Chemical Abstracts). These could occupy about 2 m of shelf space for each five years, and there was an equivalent set of printed volumes from the Beilstein collection. Those of us who needed to track down information about molecules prior to ~1980 spent many an afternoon (or indeed a whole day) in the libraries thumbing through these weighty volumes. Fast forward to the present, when (closed) commercial databases such as SciFinder, Reaxys and CCDC offer information online for around 100 million molecules (CAS indicates it has 89,506,154 today, for example). These have been joined by many open databases (e.g. PubChem). All these sources of molecular information have their own way of accessing individual entries, and the wonderful program Jmol (nowadays JSmol) has several of these custom interfaces programmed in. Here I describe some work we have recently done[1] on how one might generalise access to an individual molecule held in what is now called a digital data repository.

(more…)

References

  1. M.J. Harvey, N.J. Mason, and H.S. Rzepa, "Digital Data Repositories in Chemistry and Their Integration with Journals and Electronic Notebooks", Journal of Chemical Information and Modeling, vol. 54, pp. 2627-2635, 2014. https://doi.org/10.1021/ci500302p

Data galore! 134 kilomolecules.

Wednesday, August 6th, 2014

I do go on a lot about the importance of having modern access to data. And so the appearance of this article[1] immediately struck me as important. It is, appropriately enough, in the new journal Scientific Data. The data contain computed properties at the B3LYP/6-31G(2df,p) level for 133,885 species with up to nine heavy atoms, and the entire data set has its own DOI[2]. The data are generated by subjecting a molecule set to a number of validation protocols, including obtaining relaxed (optimised) geometries at the B3LYP/6-31G(2df,p) level. It would be good to replicate this set with a functional that also includes dispersion, and of course making all the coordinates available in this manner greatly facilitates this. The collection also includes data for e.g. 6095 constitutional isomers of C7H10O2, which reminds me of an early, delightfully entitled, article adopting such an approach in quantum chemistry[3]. Such collections are an important part of the process of validating computational methods[4]. This way of publishing data does raise some interesting discussion points.
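To make the "up to nine heavy atoms" criterion concrete, here is a minimal sketch that counts the non-hydrogen atoms in a simple molecular formula. The helper name `heavy_atom_count` is hypothetical, and the parser assumes a flat formula such as C7H10O2 with no parentheses, charges or isotope labels; it is not taken from the article or its data set.

```python
import re


def heavy_atom_count(formula):
    """Count non-hydrogen atoms in a flat molecular formula such as 'C7H10O2'.

    Assumes element symbols followed by optional integer counts, with no
    parentheses, charges or isotope labels.
    """
    count = 0
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        n = int(number) if number else 1  # a missing count means one atom
        if symbol != "H":
            count += n
    return count


# The constitutional isomers mentioned above sit at the nine-heavy-atom limit.
isomer_heavy_atoms = heavy_atom_count("C7H10O2")  # 7 carbons + 2 oxygens = 9
within_limit = isomer_heavy_atoms <= 9
```

A filter of this kind is only a first sanity check on a formula; the data set itself is defined by the authors' own enumeration and validation protocols.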

(more…)

References

  1. R. Ramakrishnan, P.O. Dral, M. Rupp, and O.A. von Lilienfeld, "Quantum chemistry structures and properties of 134 kilo molecules", Scientific Data, vol. 1, 2014. https://doi.org/10.1038/sdata.2014.22
  2. R. Ramakrishnan, P.O. Dral, M. Rupp, and O.A. von Lilienfeld, "Quantum chemistry structures and properties of 134 kilo molecules", 2014. https://doi.org/10.6084/m9.figshare.978904
  3. P.P. Bera, K.W. Sattelmeyer, M. Saunders, H.F. Schaefer, and P.V.R. Schleyer, "Mindless Chemistry", The Journal of Physical Chemistry A, vol. 110, pp. 4287-4290, 2006. https://doi.org/10.1021/jp057107z
  4. P. Murray-Rust, H.S. Rzepa, J.J.P. Stewart, and Y. Zhang, "A global resource for computational chemistry", Journal of Molecular Modeling, vol. 11, pp. 532-541, 2005. https://doi.org/10.1007/s00894-005-0278-1