The Houk-List Transition states for organocatalytic mechanism revisited

Alan Armstrong, Roberto Boto, Paul Dingwall, Julia Contreras-García, Matt J. Harvey, Nicholas Mason and Henry S. Rzepa*

Department of Chemistry, Imperial College London, South Kensington Campus, London, SW7 2AZ, UK.

Abstract. The ten year old Houk-List model for rationalising stereoselective organocatalysis of the intermolecular aldol condensation is revisited using a variety of computational techniques which have been introduced or improved since the original study. Even for such a relatively small system, the role of dispersion interactions is shown to be crucial, along with the use of basis sets where the superposition errors are low. An NCI (non-covalent-analysis) of the transition states shows good correlation with the relative computed free energies, which provides a new analytical technique to be used in modelling stereoselectivity. Alternative mechanisms, such as proton-relays involving a water molecule or the Hajos-Parrish alternative, are shown to be higher in energy.

Introduction

The promotion of a wide range of organic synthetic reactions using new generations of so-called organocatalysts (metal free systems) is highly topical. So too is the increasing adoption of computational modelling to chart the most probable mechanistic pathway and, in particular, to establish the factors responsible for reactivity and selectivity, particularly stereoselectivity.1 In this regard, one particular report in 20032 of computational investigation of the proline-mediated asymmetric intermolecular aldol reactions (Scheme 1) has achieved the rare distinction of having the key transition state model being named after the two principle authors; Houk and List. This computational model has informally become known as the gold standard for such investigations, since it set out the procedures for comparing computational modelling with experimental results for organocatalysed reactions, and outlines a methodology for predicting the stereoselectivity for new organocatalysed reactions.

Scheme for Houk-List transition state
Scheme 1. The mechanism for intermolecular aldol condensation with proline as a chiral auxilliary (green), and key stereocentres created by bond formation between the C=C and the carbonyl (red bond).

A one-proline mechanism based on enamine activation, the Houk-List transition state model involves the stereogenic carbon-carbon bond formation step of the catalytic cycle, when the catalytically active enamine attacks an electrophile. The carboxylic acid of proline plays a central role in this model, directing the electrophile to the Re face of the enamine. The enamine can be anti or syn, relative to the acid, and the electrophile can offer two prochiral faces for attack, Re or Si, resulting in four different stereochemical outcomes (Scheme 2).


Scheme 2. The stereochemical possibilities for the asymmetric aldol condensation. Cahn-Ingold-Prelog conventions are shown for R=Ph.

Through extensive computational studies,2-6 Houk and Bahmanyar suggest that the energy differences between these transition state, and so the origin and degree of stereoselectivity displayed by the reaction, depends on two critical structural elements: the relative degree to which each transition state can adopt a planar enamine, and the degree of electrostatic stabilisation provided to the forming alkoxide. A planar enamine allows for the greatest possible nucleophilicity of the terminal olefin while also reducing the geometric distortion experienced by the forming iminium. The proton transfer, from the carboxylic acid to the forming alkoxide, was determined to provide the majority of the electrostatic stabilisation and is key to the Houk-List transition state.2,7 A smaller, yet important, stabilising contribution also results from NCHᵟ+∙∙∙Oᵟ- interactions from the pyrrolidine ring.8

The stereoisomer for which these key elements were most favourable was calculated as that noted (S,R), for R = Ph, iPr. This prediction was then verified by experimental study, where the (S,R) stereoisomer was observed formed in 60% enantiomeric excess (R=Ph) or 96% (R=iPr). The challenge of computational methods then is to establish a transition state model in which this behaviour can be rationalised and so used to predict the outcome of further reaction variations. Any success of such models can then be used build confidence for diversification to new reactions and mechanisms.

Indeed, the Houk-List model has enjoyed a high degree of success in application to the prediction of the stereochemical outcomes of a number of proline mediated reactions: from the inter-2 and intramolecular6 aldol, Mannich,4 α-alkylation,9 and Michael reactions,10 among several others,1 to the correct prediction of the outcome of reactions mediated by analogues of proline.7,8,11-13 However, despite the apparently excellent comparison made between computational and experimental selectivities, for which this research became widely known, two significant issues have come to light regarding the details of the proline-cyclohexanone-benzaldehyde system. Houk has since disclosed that calculations carried out subsequent to publication, in which a greater volume of conformational space was investigated, suggest that lower energy structures of the transition states exist;1,14 potentially altering the predicted selectivity of the reaction. In addition, repetition of the experimental results by List found product selectivities to vary, with the reaction found to be extremely sensitive to both water content and temperature. As such, despite the success of the Houk-List model as a concept, the reportedly excellent agreement between computational and experimental results appears to be coincidental.

Advances in Computational Methods

The ten years that have elapsed since the original report has also seen many incremental and sometimes dramatic improvements to the computational methods used, and so in the present article we re-evaluate the Houk-List model by taking advantages of these; the gold standard itself must evolve. A more accurate computational model in turn allows the experimental findings to be subjected to improved scrutiny, and potentially this can itself result in suggestions for further experimental tests.

Because fully substituted reacting systems can have > 20 atoms and the organocatalysts themselves are potentially large molecules, the level of theory adopted by the Houk-List model in 2003 was (as always) a pragmatic compromise between the quality of the theory and the requirement to be able to compute the model in a reasonable time (a four day batch run is a typical resource available to most). The level of theory originally selected by Houk and co-workers was B3LYP/6-31G*, and geometries, activation enthalpies and free energies (ΔG298) based on calculating the normal vibrational modes at this level were obtained for a gas phase model. To these energies (and at these geometries) various further corrections could be added, including an estimation of the solvation energy computed for a polarizable continuum model. This procedure was then applied to an exploration of the conformational space available to the system. Because this has many degrees of freedom, only the most likely conformations were explored; no exhaustive procedure was applied. So how can this approach be improved upon in 2013? We have focused on six aspects set out below.

  1. The density functional used, B3LYP, has proved to be a very effective one for the study of reaction mechanisms over a twenty year period, and it has been subjected to extensive testing and scrutiny during that period. As a result, one particular and general problem was identified with this functional; it captures only the short range dynamic correlation energy and fails to capture the longer range correlations. This term can be identified as the dispersion interactions between non-bonded atoms (also referred to as the non-covalent interactions or NCI). It is now recognised that the most effective way to correct a DFT method for dispersion terms is to add empirical corrections to the nuclear-nuclear repulsion terms, and three or more generations of methods have been developed over the last decade to achieve this. Grimme15 in particular has pioneered this approach and we have adopted his third generation procedure, known as D3, to correct the B3LYP method for these energy terms. Most modern functionals also incorporate such terms, and the ωB97XD method, which incorporates a version of Grimme's D2 correction, was developed five years ago16 to specifically include both dispersion and the ability to reproduce reaction barriers.
  2. The basis set quality is another feature that can be steadily improved as computers increase in speed, aspiring of course to a complete-basis-set limit (CBS). In 2003, a practical limit using conventional computational resources to the total number of basis functions that could be used for a modelling study was about 650 functions.17 For a 6-31G* basis set as used by Houk (the modern notation would be 6-31G(d), indicating only d-polarisation functions on non-hydrogen atoms) that would normally mean a limit of around 100 atoms, or perhaps up to about 120 if only the reacting core was specified at this level and having the sterically large substituents defined at a much smaller basis set level, say STO-3G.17 At this size, using a two or four parallel processing computer of the period, the second derivatives of the energy with respect to geometry (the Hessian, required to locate and characterise transition states) could be computed in about 170-200 hours of elapsed time. This would naturally limit the degree of conformational exploration that could be undertaken. The technological advances over a decade, the more common availability of 12/16-processor systems with much larger memory (improved from ~4 Gbyte to ~94 Gbytes) and faster algorithms for analytically computing the Hessian have reduced the elapsed time for such a calculation by a factor of about 100. Consequently, a much better level of basis set can now be deployed on molecules of such size. In the present study, we have used the TZVP basis, a triple-ζ quality set with polarisation functions on both hydrogen as well as non-hydrogen atoms. This basis was selected since the Grimme-D3 dispersion corrections have been specifically parametrised for it. This now gives an opportunity to evaluate the errors due to basis-set-superposition that might have been incurred using the 6-31G* basis. For the system R=Ph (Scheme 1), this results in 556 basis functions for the calculation, compared with 376 for the 6-31G* basis and allows selected intrinsic reaction coordinates (IRC) to be computed in a reasonable time. The IRC procedure requires the DFT-based Hessian to be evaluated 20-50 times, each taking about 1 hour using the TZVP basis, to map a full IRC. This requirement imposes a current practical limit of ~800 basis functions for such a study. It is worth noting that this size of basis precludes the use of other Hamiltonians based on perturbative expansions such as MP2 (an alternative procedure for evaluating dynamic correlation). Even on a very large memory 64-processor system, it was not possible to locate a transition state for this basis (estimated ~1-2 months).
  3. It still proved possible using the larger basis to perform a more systematic exploration of conformation space (Scheme 2).
  4. The implementation of continuum solvation models has also improved considerably.18 Previously, optimisation of molecular geometries with such models was unreliable, since the first and second derivatives of the energy in a solvent continuum often resulted in discontinuities. York and Karplus18 revolutionised this with the introduction of a smoothed solvent cavity. Since about 2010, it proved possible to fully optimise geometries within such a model.19 This means that a more self-consistent solvation approach can be undertaken whereby the normal mode frequencies required to correct for thermal and entropy terms can be computed for geometries obtained using the solvation model at the relaxed solvation geometry, rather than simply being a correction applied to a gas phase model. The reaction can develop charge separation during its course (to form a zwitterion), and solvation of charge-separated species can in turn impact upon their geometries and consequent properties.
  5. The Houk-List model recognised the effect of solvent by correction for solvation free energies, but it did not also include the effects of discrete additional molecules in the model. For example, it is likely in any reaction that produces water as a product that at least one explicit water molecule could actively participate in the mechanism, acting as a proton relay. Indeed, mechanisms involving as many as four discrete waters in which a proton relay can occur can be investigated; in fact the free energy of the reaction can decrease by incorporating such an explicit additional solvent. Of course, such a super-molecule can itself be treated by a continuum solvent model layered on top of the explicit model. Here were report an exploration of such a model.
  6. The last decade has seen considerable improvements in the digital archiving of data from computational experiments. In 2003, optimised geometrical coordinates, but no other derived properties (the full computed wavefunction, an IRC pathway, etc) were inserted into paginated PDF documents as text and submitted as supporting information. Re-using this data by extraction from the PDF document is non-trivial, as we ourselves experienced in re-using the original Houk-List data. The introduction of digital chemical data repositories in 200520 has revolutionised this aspect of data curation and citation,21 and accordingly, all the results in the present study are presented in this manner.

Results and Discussion

1. The Core Transition State

The core mechanism, as shown in Scheme 1, results in the creation of two stereogenic centres by the formation of a new C-C bond, accompanied by a key proton transfer from the carboxyl group to the oxygen of the carbonyl substrate. Our base-level theory is specified as B3LYP+D3/TZVP/SCRF=DMSO, and geometries at this level for R=Ph and R=iPr (Figure 1) reveal subtle differences in the degree of C-C bond formation and accompanying proton transfer. Larger variation is induced by a change of functional (to ωB97XD), but the latter values are similar to that obtained using the MP2/6-31G(d,p)/SCRF=DMSO approach. We note that the smaller 6-31G(d,p) basis set is the largest for which a practical MP2 calculation can be run (on a 64-processor, ultra large memory system).

The effect of including the D3-dispersion term on the B3LYP-computed geometries is not entirely insignificant; the correction changes the C-C length of the forming bond at the transition state from 2.251 to 2.273Å and the proton transfer geometry from 1.242/1.166 to 1.247/1.162Å (R=Ph). The corresponding values for R=iPr are 2.159 to 2.115 for C-C and 1.154/1.262 to 1.151/1.269Å for the proton transfer.

<# some text #>
Figure 1. Computed selected bond lengths at B3LYP+D3/TZVP/SCRF=DMSO for (a) R=iPr in Å (original Houk-List values) (b) R=Ph (original Houk-List values) [MP2(FC)//6-31G(d,p)/SCRF=DMSO], {ωB97XD/TZVP/SCRF=CPCM}

A B3LYP+D3/TZVP/SCRF=DMSO intrinsic reaction coordinate for R=Ph and R=iPr (Figure 2) shows C-C bond formation is synchronous in both cases with proton transfer. There is no sign of what Cremer22 has referred to as a hidden intermediate, the frustrated formation of an intermediate sandwiched between C-C bond formation and proton transfer.

IRC R=PhIRC R=iPr
Figure 2. Computed intrinsic reaction coordinate at B3LYP/TZVP+D3/SCRF=DMSO, for (a) R=Ph, (b) R=iPr. The corresponding digital repository identifiers are DOI: n46 and n45

2. Basis Set Superposition Error

Before undertaking a stereochemical and conformational exploration of the transition state described above, we first addressed the effect that basis set quality could have on the difference in total calculated energy between two stereoisomers. The BSSE (basis set superposition error) between the anti and syn transition state geometries was estimated for three basis sets; the original Houk-List 6-31G(d), the commonly used rather larger 6-311G(d,p) basis and TZVP (using the Boys-Bernadi counterpoise method, as implemented in ORCA 2.9.1)23

These results clearly show that errors due to this effect can be of the order of 1 kcal/mol at the original 6-31G(d) level selected in 2003, are still relatively large (~0.8 kcal/mol) for the larger 6-311G(d,p) basis and only become insignificant for the TZVP basis (~0.2 kcal/mol). From these results we suggest that the TZVP is the minimum basis set quality appropriate for fidelity in stereochemical prediction; it is used for all the calculations reported here.


Table 1.

3. The Stereochemical Outcome

We next explored the relative free energies of the four stereochemical outcomes possible for the reaction (Scheme 2), for each of which there are four plausible conformational possibilities (Scheme 3). The results for R=Ph are shown in Table 2 and those for R=iPr in Table 3.


Scheme 3. The four considered conformational possibilities for the asymmetric aldol condensation.


The results are presented in the form of two tables. Table 2 contains a replication of the original Houk-List study2 using the same methods are previously reported and Table 3 an updated version using the improved methodologies discussed above. ......... tba


Table 2.

The results for R=Ph (Table 3) confirm the Houk-List result that the (S,R) stereoisomer is the dominant isomer formed; the nearest alternative stereoisomer is the (S,S) form, which emerges as 4.59 kcal/mol higher in free energy using B3LYP/TZVP/SCRF=DMSO, a value reduced to 2.97 kcal/mol if the dispersion energy corrected is included in the procedure. Overall, we see that inclusion of such corrections for this system decreases the predicted enantioselectivity to 98.6% ee from effectively 100%. The is nevertheless still at odds with the reported experimental value of ~60% ee. As we note in the introduction, the experimental value may however be extremely sensitive to both water content and temperature. Similar conclusions can be drawn for R=iPr (Table 3).

The maximum difference in energy between the lowest conformer of the (S,R) form and the highest energy conformer of the least stable stereoisomer (R,S) is 7.88 kcal/mol using the B3LYP+D3 functional, 7.72 kcal/mol at the ωB97XD level (which includes a 2nd generation dispersion correction) and 8.11 kcal/mol at the MP2 level using a 6-31G(d,p) basis set (it proved impractical to use the larger TZVP basis for this calculation).


Table 3.
Table 4.

4. Non-Covalent Interactions

Stereoselectivity is controlled by a combination of stereoelectronically-induced bond orientations and formations in the transition state, influenced by non-bonded weak interactions (NCI) within the framework. Although the balance of these latter interactions can be difficult to compute from total energies alone, it has been recently shown24 that the the computed reduced (electron) density gradients (RDG), filtered to remove the density due to covalent bonds from the analysis to leave only the weaker non-covalent interactions, can be much more effective at revealing the non-bonded interactions present and also a more realistic analysis than pure topological indices such as QTAIM.25 Such an NCI index can be obtained by contouring the RDG as isosurfaces, and colour-coding these using the appropriate eigenvalue (sign(λ2)ρ) of the density Hessian to reveal both stabilizing regions (blue for stronger hydrogen-bonded types, green for weaker van der Waals types) or destabilizing regions (strong steric clashes map to red, and weaker ones to yellow). This represents a semi-quantitative and highly visual way of analysing what is often merely described in a non-quantitative (and often intuitive) manner merely as transition state steric clashes and hydrogen bond/electrostatic attractions. Since the weaker van der Waals type dispersion attractions are very much part of such an analysis, we consider it crucial to also compute the geometry by inclusion of such terms in the Hamiltonian, as has been done here via the D3- dispersion correction.

The NCI surface (B3LYP+D3/TZVP/SCRF=DMSO, Figure 3a) for the transition state stereoismer shown in Figure 1, representing that of lowest energy (Table 3), reveals multiple stabilising regions coloured cyan (light blue) or green (van der Waals). In contrast, the highest energy isomer (Figure 3b) shows a much sparser stabilising NCI distribution.

Figure 3. A NCI analysis of the weaker interactions present in the B3LYP+D3/TZVP/SCRF=DMSO computed transition state for (a) the lowest energy isomer (Table 3) and (b) the highest energy isomer (Table 3). The surface colour code indicates red = strongly destabilizing through to blue= strongly stabilizing.

The NCI analysis can be extended further by creating a spline of the reduced density gradients for two stereoisomers, and integrating based on the comparative values of the value of sign(λ2)ρ. The results (R=Ph) ...

5. Kinetic Isotope Effects

The utility of having a computed force constant model for the transition state is that it can yield kinetic isotope effects for the reaction.26 For example, Meyer, Houk, and co-workers experimentally and computationally investigated 13C KIEs for the Hajos-Parish-Eder-Sauer-Wiechert reaction, suggesting that the rate determining step of the proline-mediated intramolecular aldol reaction occurs prior to C-C bond formation.27 In the proline-mediated intermolecular aldol reaction, between acetone and o-chlorobenzaldehyde, Armstrong, Blackmond, and co-workers reported a normal isotope effect of kH/kD = 1.93-2.26 when using acetone-d6.28 Previous kinetic investigations of the reaction had led the researchers to expect an inverse isotope effect. The observation of a small, normal isotope effect led the researchers to postulate that the final KIE was the result of a balance between normal and inverse isotope effects. Computational modelling, at the B3LYP/6-31G(d)/SCRF=DMSO level of theory, predicted a KIE of 1.97; thus offering further support for this hypothesis. This evidence allowed the inference that nucleophilic addition of the enamine to the electrophile, suggested as the rate determining step of the catalytic cycle by kinetic investigations, may be aided by protonation of the carbonyl oxygen by the carboxyl proton, supporting the Houk-List mechanism and the central role of the carboxylic acid.

Here, we report KIEs of a similar magnitude for the proline-mediated intermolecular aldol reaction between cyclohexanone and benzaldehyde. Although not experimentally verified, they are included here for future verification. The calculations have been performed using the lowest energy transition states, as established in the previous section. The predicted isotope effects reflect the synchronous nature of the reaction revealed by the IRC for R=Ph in exhibiting normal deuterium and carbon effects, and a small inverse oxygen effect.


Table 5.

6. Passive Explicit Solvation

Patil and Sunoj have computationally investigated the role of explicit solvent molecules in the modelling of the proline-mediated conjugate addition reaction in polar protic solvents, at the mPW1PW91/6-31G(d) and B3LYP/6-31G(d) levels of theory.29 The researchers proposed, that only via the explicit inclusion of two methanol molecules in a cooperative hydrogen-bonding network between the carboxylic acid proton and the oxygen atoms of a nitro group, could the correct stereochemical outcome of the reaction be predicted. This study however is based on the use of total calculated energies for transition states, rather than the more correct thermal and entropy-corrected relative free energies for the reaction profile. In addition to molecules of solvent, additives such as triethylamine30 and DBU31have been shown to play an important role in determining the selectivity of proline-mediated reactions through their explicit inclusion in transition state calculations.

Here, we investigate not the role of additive, but that of water in the Houk-List transition state. The first step of the catalytic cycle, which involves the formation of an enamine from the carbonyl compound and proline, liberates one water molecule. Potentially, this water could then participate in the subsequent reaction. Here, a single water molecule is introduced into the Houk-List model (R=Ph) as a passive solvent for one of the three oxygens via a hydrogen bond, augmenting the continuum solvation already applied. The effect this has on the relative energies of the anti and syn diastereomeric transition states, as well as providing a benchmark for comparison with proton relay mechanisms is shown in Table 5. Addition of just one explicit water molecule to the model was computed to induce little change in the computed free energy difference between (S,R) and (S,S) stereoselectivity, changing it from 2.97 to 3.15 kcal/mol.


Table 6.

7. Proton Relay Mechanisms

Beyond a passive role, it is possible that a single molecule of water could participate directly in the Houk-List mechanism, acting as an active proton transfer relay between the proline carboxylic acid group and the incoming aldehyde (R=Ph) (Table 6). This results in the expansion of the ring size of the cyclic model by two atoms, thus allowing some stereochemical models previously excluded on the basis of strain to be included. Proton relay mechanisms in organocatalysis have precedent. One of the initial mechanistic proposals for the intermolecular aldol reaction, the HPESW reaction, involved a second molecule of proline acting as a proton transfer agent during carbon-carbon bond formation, based on the apparent observation of a non-linear effect.32 This proposal was later discredited through experimental studies confirming the absence of non-linear effects in the proline-mediated aldol reactions.33,34 Patil and Sunoj have also investigated the role of a proton-relay mechanism in hemiacetal-, iminium, and enamine formation for a model system mimicking the first step in the catalytic cycle of proline catalysis, at the mPW1PW91/6-31G(d) level of theory.35 It was found that protic additives, such as methanol, allow for significantly lower activation energies of these pathways. The reported energies did not include the effects of entropy, which would be expected to result in significantly higher free energies of activation compared to those based on total energies alone. Here we present computed free energy differences for the proton relay mechanism, isomeric with the passive mechanism described in the previous section.


Table 7.

Four features are noteworthy.

  1. The (S,S) diastereomer is now predicted as slightly lower in free energy than the observed (S,R) products.
  2. The lowest free-energy for a proton-relay transition state is nevertheless 9.1 kcal/mol higher than that the passively solvated isomer. Because proton transfer reactions in particular can depend on the nature of the density functional used, we also evaluated this energy difference at the ωB97XD/TZVP level. The free energy difference of 6.4 kcal/mol is smaller than using B3LYP+D3, but still clearly excludes the involvement of water in the proton-relay. This model would clearly predict the incorrect stereochemical outcome; it may not always be so, of course, for other mechanistic variations.
  3. The intrinsic reaction coordinate (B3LYP+D3/TZVP/SCRF=DMSO, Figure 4) indicates that the presence of the proton relay delays the C-C bond formation; its length at the transition state is now a much shorter 1.920Å (compared to 2.273Å without such intervention). Only after this transition state is passed does the proton transfer commence. At a value of the IRC indicated with an arrow (Figure 4b), C-C bond formation is largely complete (1.676Å) and a so-called hidden intermediate is revealed in which a strong symmetric hydrogen bond between the proton relay and the carbonyl group manifests. The significance of detecting such intermediates is that the electronic influences at this geometry could be potentially targetted in order to induce the system to form a real intermediate; the relative timing of e.g. C-C bond formation and proton-relays can therefore be seen as a function of structure and substituents. This feature of a potential energy surface is relatively insensitive to e.g. the density functional procedure employed. Shown in Figure 4c is the same reaction charted using the ωB97XD functional; the hidden intermediate occurs at the position in the IRC, but its profile is stronger (the gradient norm approaches zero more closely than with the B3LYP functional).
  4. The kinetic isotope effects computed for this mechanism (Table) are very different from the one involving no intervention by water.


Figure 4. The computed intrinsic reaction coordinate for the proton-relay mechanism for R=Ph @B3LYP+D3/TZVP/SCRF=DMSO, showing (a) the relative energy and (b) the gradient norm. The red arrow indicates the location of the hidden intermediate. The calculation is archived at DOI:n44. (c) The IRC recomputed at @ωB97XD/TZVP/SCRF=DMSO showing the gradient norm and the more prominent hidden intermediate. The calculation is archived at DOI:n43

8. Hajos-Parrish mechanism

This mechanistic variation proposed by Hajos and Parrish36 for the intramolecular aldol reaction involves an initial proton transfer from the carboxyl group to the nitrogen of the enamine. This species can then react to form a C-C bond directly via a 6-membered ring transition state, accompanied by proton transfer from N to the carbonyl oxygen (Figure 5). The computed reaction IRC reveals an entirely synchronous process, with C-C bond formation coincident with proton transfer. The free energy however (Table 2) is 20.5 kcal/mol higher than the isomeric Houk-List model. The augmentation of the model with a proton-relay increases the overall activation energy even further (40.4 kcal/mol). This mechanistic variation reveals the incursion of two hiden intermediates after the transition state. As with the proton-relay variation, we can conclude that the Hajos-Parrish mechanism is also not viable for this reaction.




Figure 5. Hajos-Parrish mechanism, R=Ph (see Table 2) showing (a) the computed B3LYP+D3/TZVP/SCRF=DMSO geometry, (b) the IRC energy profile and (c) the IRC gradient norm, doi: n5d (d) the IRC energy profile for the Hajos-Parrish mechanism, R=Ph with inclusion of one water molecule acting as a proton relay and (e) the IRC gradient norm for the proton-relay mechanism, with hidden intermediates indicated with an arrow, doi: n5n

9. Substituted variations.

Three models (Figure 6) were devised to evaluate the impact of structural changes upon predictions made using Houk-List model.

(a) 4-Piperidinone-Benzaldehyde-Proline: This model aims to maximise the π-facial non-covalent interaction, observed in the syn transition state for system 1, between the proton at the 4-position of the cyclohexanone-derived enamine and the face of the aryl ring of the aldehyde.
(b) Cyclohexanone-Pyrrole-2-carboxaldehyde-Proline. This model aims to maximise the hydrogen bonding interaction, observed in the syn transition state for system 1, between the proton at the 2-position of the aryl ring of the aldehyde and the oxygen of the carboxylic acid of the bound catalyst.
(c) Acetone-Pyrrole-2-carboxaldehyde-Proline: This model aims to maximise the hydrogen bonding interaction, observed in the syn transition state for system 1, between the proton at the 2-position of the aryl ring of the aldehyde and the oxygen of the carboxylic acid of the bound catalyst and uses acetone instead of cyclohexanone as a less sterically demanding environment.


Figure 6. (a) 4-Piperidinone-Benzaldehyde-Proline, (b) Cyclohexanone-Pyrrole-2-carboxaldehyde-Proline, (c) Acetone-Pyrrole-2-carboxaldehyde-Proline. Bond lengths in Å computed at the B3LYP+D3/TZVP/SCRF=DMSO level.

The striking feature of all three variations is the difference in the predicted outcome that addition of a dispersion correction has Table 7). By model (c), with reduced steric congestion, the predicted ratio of the (S,R)/(S,S) diastereomers has inverted, but only when the dispersion correction is included. This result provides a compelling reason for including such corrections.


Table 8.

Computational procedures

The Gaussian 09 program, revision D.01 was used for all calculations excepting the BSSE evaluation, where ORCA 2.9.1 was employed, and thermal corrections were applied based on computed normal vibrational frequencies. Kinetic isotope effects were computed from the difference in thermally corrected free energies obtained from the calculated Hessian (force constant) matrix and appropriate atomic masses for the isotopes. Intrinsic reaction coordinate (IRC) evaluations were obtained using the following keyword in the Gaussian program: irc(calcfc,recalc=10,maxpoints=200,maxcycle=40,tight,cartesian,stepsize=7,lqa) (the outcome of such calculations can be very sensitive to the imposed parameters). All calculations were routinely archived in both of two digital repositories (DSpace-SPECTRa and Figshare) and assigned a digital-orbject-identified (DOI). Some DOIs were contracted to a shorter form using the resource http://shortdoi.org/ NCI (non-covalent-interactions) were computed using the methodology previously describednci A Gaussian cube of the computed electron density at fine resolution was computed and converted to an NCI analysis using the resource at DOI: n5b, which is based on using the Jmol applet. The output of this process comprises a coordinate file (.xyz) and a compressed isosurface (.jvxl), which are visualised using the Jmol applet embedded in the interactive version of e.g. Table 2. This table has a DOI: ???

Conclusions

Computational modelling of the stereochemical outcome of catalysed solution-phase reactions has a stringent criterion for accuracy, defined in a sense by the equation ΔΔG = -RT ln K. The predicted relative transition state energy ΔΔG has to be sufficiently accurate to result in reliable values for K, the ratio of the concentrations of two stereoisomers. In practice, this means achieving a computed accuracy for ΔΔG of < 1 kcal/mol or less for values of ΔΔG of between 2-5 kcal/mol. The lower limit here being typical of moderately stereoselective reactions and the latter being towards the top end of the most highly stereospecific reactions known. This will require a reasonable exploration of conformational space, and mechanistic variations such as implicit passive solvation, and active proton-relay interventions, and this all to be achieved using methodology which can be implemented with reasonable time-turnover on modern computational resources. We have tested an improved variation on that used to establish the Houk-List model of the intermolecular aldol condensation catalysed by proline. A triple-ζ quality basis set is required to remove significant basis-set-superposition-errors, coupled with a dispersion energy correction to the B3LYP functional. Alternatively, more modern functionals which already include such terms can be used. The solvation treatment now possible is both more accurate and more self-consistent, since thermal corrections to the free energy are evaluated from frequencies determined within the solvation model. It is possible to explore more conformational space for the basic reaction, and to augment the mechanism with either passive or explicit intervention of small molecules such as water.

Whilst routinely achieving accuracies in ΔΔG of < 1 kcal/mol is perhaps not entirely achievable at present, it surely will be in just a few years time, as the density functionals and other methodologies continue to improve. Here progress may be hindered by the relative lack of accurate experimental data determined under standard kinetic conditions. Achieving a stochastic exploration of the mechanism using molecular dynamics is perhaps rather further off for these relatively complex mechanisms.

References

  1. P. H. Cheong, C. Y. Legault, J. M. Um, N. Çelebi-Ölçüm, and K. N. Houk,Chem. Rev., 2011, 111, 5042-5137. DOI: 10.1021/cr100212h
  2. S. Bahmanyar, K. N. Houk, H. J. Martin and B. List, J. Am. Chem. Soc., 2003, 125, 2475-2479. DOI: 10.1021/ja028812d
  3. S Bahmanyar, K. N. Houk, H. J. Martin, B. List, B. J. Am. Chem. Soc., 2003, 125, 2475.
  4. L. Hoang, S. Bahmanyar, K. N. Houk, B List, B. J. Am. Chem. Soc., 2003, 125, 16.
  5. S. Bahmanyar, K. N. Houk, Org. Lett., 2003, 5, 1249.
  6. S. Bahmanyar, K. N. Houk, K. N. J. Am. Chem. Soc., 2001, 123, 11273.
  7. S. Bahmanyar, K. N. Houk, J. Am. Chem. Soc., 2001, 123, 12911.
  8. A. Armstrong, Y. Bhonoah, A. J. P. White, J. Org. Chem., 2009, 74, 5041.
  9. A. Fu, B. List, W. Thiel, J. Org. Chem., 2006, 71, 320.
  10. M. P. Patil, R. B. Sunoj, Chem. Eur. J., 2008, 14, 10472.
  11. K. N. Houk, P. H. Y. Cheong, J. S. Warrier, S. Hanessian, Adv. Synth. Catal., 2004, 346, 1111.
  12. C. Shinisha, R. B. Sunoj, Org. Biomol. Chem., 2007, 5, 1287.
  13. S. Mitsumori, H. Zhang, P. Ha-Yeon Cheong, K. N. Houk, F. Tanaka, C. F. Barbas 3rd J. Am. Chem. Soc., 2006, 128, 1040.
  14. S. M. Bachrach, Computational Organic Chemistry; Wiley-Interscience, 2007.
  15. S. Grimme, J. Antony, S. Ehrlich, and H. Krieg, J. Chem. Phys., 2010, 132, 154104, DOI:10.1063/1.3382344
  16. J. -D. Chai, M. Head-Gordon, M., Phys. Chem. Chem. Phys, 2008, 6615.
  17. V. C. Gibson, E. L. Marshall, H. S. Rzepa, J. Am. Chem. Soc., 2005, 127, 6048-6051. DOI: 10.1021/ja043819b
  18. G. Scalmani and M. J. Frisch,J. Chem. Phys, 2010, 132, 114110− 114115; DOI: 10.1063/1.3359469; York, D. M. and Karplus, M. J. Phys. Chem. A, 1999, 103, 11060− 11079; DOI: 10.1021/jp992097l
  19. J. Kong, P. v. R. Schleyer and H. S. Rzepa, J. Org. Chem., 2010, 75, pp 5164-5169. DOI: 110.1021/jo100920e
  20. J. A. Townsend, J. Downing, M. J. Harvey, P. B. Morgan, P. Murray-Rust, H. S. Rzepa, D. C. Stewart and A. P. Tonge,"SPECTRa-T: Machine-based data extraction and semantic searching of chemistry e-theses", J. Chem. Inf. Mod., 2010, 50, 251-261. DOI: 10.1021/ci9003688
  21. H. S. Rzepa, "Emancipating data", Chemistry World, 2013
  22. D. Cremer, E. Kraka, Acc. Chem. Research, 2010, 43, 591-601.
  23. F. Neese, "The ORCA Program System", Wiley Interdiscip. Rev.: Comput. Mol. Sci, 2012, 2, 73-78. DOI: 10.1002/wcms.81 and http://cec.mpg.de/forum/
  24. E.R. Johnson, S. Keinan, P. Mori-Sánchez, J. Contreras-García, A.J. Cohen, and W. Yang, J. Am. Chem. Soc., 2010, 132, 6498-6506, DOI: 10.1021/ja100936w
  25. J. R. Lane, J. Contreras-García, J.-P. Piquemal, B. J. Miller, and H. G. Kjaergaard, J. Chem. Theory Comput., 2013, 9, 3263-3266. DOI: 10.1021/ct400420r
  26. See for example the use of KIE in resolving the mechanism of the benzidine rearrangement, W. Subotkowski, L. Kupczyk-Subotkowska, and H.J. Shine, J. Amer. Chem. Soc., 1993, 115, 5073-5076, DOI: 10.1021/ja00065a018 and similarly for the epoxidation of ethene by peracid; T. Koerner, H. Slebocka-Tilk, and R.S. Brown, J. Org. Chem, 1999, 64, 196-201, 10.1021/jo981652x
  27. H. Zhu, F. R. Clemente, K. N. Houk, M. P. Meyer, J. Am. Chem. Soc., 2009, 131, 1632.
  28. N. Zotova, L. J. Broadbelt, A. Armstrong, D. G. Blackmond, Bioorg. Med. Chem. Lett. 2009, 19, 3934.
  29. M. P. Patil, R. B. Sunoj,Chem. Eur J., 2008, 14, 10472.
  30. A. Fu, B. List, W. Thiel, J. Org. Chem. 2006, 71, 320.
  31. J. E. Hein, J. Bures, Y. H. Lam, M. Hughes, K. N. Houk, A. Armstrong, D. G. Blackmond, Org. Lett., 2011.
  32. C. Puchot, O. Samuel, E. Dunach, S. Zhao, C. Agami, H. B. Kagan, J. Am. Chem. Soc. 1986, 108, 2353; C. Agami, J. Levisalles, C. J. Puchot, J. Chem. Soc. Chem. Comm., 1985, 441.
  33. L. Hoang, S. Bahmanyar, K. N. Houk, B. List, J. Am. Chem. Soc., 2003, 125, 16.
  34. M. Klussmann, S. R. Mathew, H. Iwamura, D. H. Wells, A. Armstrong, D. G. Blackmond, Angew. Chem. Int. Ed., 2006, 45, 7989; M. Klussmann, H. Iwamura, S. P. Mathew, D. H. Wells, U. Pandya, A. Armstrong, D. G. Blackmond, Nature, 2006, 441, 621.
  35. M. P. Patil, R. B Sunoj, R. J. Org. Chem., 2007, 72, 8202.
  36. Z. G. Hajos, D. R. Parrish J. Org. Chem., 1974, 39, 1615-1621. DOI: 10.1021/jo00925a003 DOI 10.1021/jo00925a003 and discussed in section 2.1 of Ref 1.