Sunday, May 24, 2015

Energy Minimization: Finding and Connecting Stationary Points



Let’s start by considering a simple example with one coordinate (for example a distance), $R$, where the energy is given by
$$ E= \frac{1}{2}k(R-R_e)^2$$ (1)
The corresponding potential energy surface, known as a quadratic PES, is shown in Figure 1.

Figure 1. Plot of the (quadratic) potential energy surface given by Equation 1.

Here $R_e$ is the value of $R$ at which the energy is lowest (this is known as the equilibrium geometry) and this is what we’d like to find.  We start by taking a guess at $R$, $R_g$.  We already know how to check whether this is an energy minimum: we need to evaluate the gradient, which is
$$\left(\frac{\partial E}{\partial R}\right)_{R_g}=k(R_g-R_e)$$ (2)
It’s clear that the gradient is non-zero for $R_g \ne R_e$.  However, by rearranging this equation we see that the gradient also tells us how to change $R$ to reach the energy minimum
$$R_e=R_g-\frac{1}{k}\left(\frac{\partial E}{\partial R}\right)_{R_g}=R_g+\frac{1}{k}F_g$$ (3)
where $F$ is the force, which is the negative gradient.  If we know $k$ we can find $R_e$ in one step starting from $R_g$ (Figure 2).

Figure 2. The equilibrium geometry can be found in one step on a quadratic PES

If we don’t know $k$ then it is safest to take many small steps, i.e. scale the gradient by some small constant ($c$) and repeat until the gradient falls below some threshold (Figure 3)
$$R_{n+1}=R_n-c\left(\frac{\partial E}{\partial R}\right)_{R_n}$$ (4)

Figure 3. Energy minimization by steepest descent.

This general approach is known as steepest descent, since the negative gradient points in the direction in which the energy decreases (descends) most steeply. Notice that this means that minimizing the energy will find the minimum closest to the starting geometry.  When steepest descent is used to find equilibrium geometries it is often combined with a so-called “line search”, which finds the lowest energy along the direction of the current gradient, but the general idea is still the same.
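As an illustration, here is a minimal steepest descent loop in Python for the one-dimensional quadratic PES above (the force constant, step-scaling constant, starting guess, and convergence threshold are arbitrary illustrative choices, not values from the text):

```python
# Minimal steepest descent sketch for the quadratic PES E = 0.5*k*(R - Re)**2.
# k, Re, c, the starting guess, and the threshold are arbitrary illustrative values.
k, Re = 2.0, 1.5           # force constant and (unknown to the optimizer) minimum
c, threshold = 0.1, 1e-6   # step-scaling constant and gradient convergence threshold

def gradient(R):
    """dE/dR for the quadratic PES."""
    return k * (R - Re)

R = 3.0                    # starting guess R_g
while abs(gradient(R)) > threshold:
    R = R - c * gradient(R)   # Eq 4: small step along the negative gradient

print(R)                   # converges towards Re = 1.5
```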

Another use of steepest descent is to connect a transition state with its two closest minima, since this tells you which two minima the transition state connects.  Here the guess structure is a transition state structure displaced along the normal mode corresponding to the imaginary frequency (the transition state structure itself cannot be used because its gradient is zero).  This path is the minimum energy path (MEP) between reactants and products and the resulting collection of structures is known as the intrinsic reaction coordinate (IRC).  An IRC is usually depicted as a plot of the potential energy vs the mass-weighted root-mean-square displacement of the Cartesian coordinates relative to some reference geometry (usually the transition state), usually denoted $s$ (Figure 4).  When we draw a potential energy surface for a reaction we usually mean an IRC.

Figure 4. Depiction of an IRC or MEP: two steepest descent paths starting from the transition state
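For concreteness, here is one way the mass-weighted root-mean-square displacement $s$ of a geometry relative to the transition state could be evaluated in Python/NumPy; the three-atom geometry and masses below are made-up illustrative data, and the function simply implements the definition given in the text:

```python
import numpy as np

def mass_weighted_rms_displacement(coords, ref_coords, masses):
    """Mass-weighted RMS displacement of a geometry relative to a
    reference geometry (e.g. the transition state), in the units of coords."""
    diff = np.asarray(coords) - np.asarray(ref_coords)   # shape (N_atoms, 3)
    masses = np.asarray(masses)
    return np.sqrt(masses @ (diff**2).sum(axis=1) / masses.sum())

# Made-up example: a three-atom geometry slightly displaced from a reference
ref = np.zeros((3, 3))
geom = ref + np.array([[0.02, 0.0, 0.0],
                       [0.0, -0.01, 0.0],
                       [0.0, 0.0, 0.01]])
masses = [16.0, 1.0, 1.0]   # e.g. O, H, H in amu
print(mass_weighted_rms_displacement(geom, ref, masses))
```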

Clearly, the more steps we take, the higher the computational cost, and in general we want to take as few steps as possible.  As I mentioned above, if we knew $k$ we could find the energy minimum in one step.  For a quadratic surface $k$ is the second derivative of $E$ with respect to $R$ (or the first derivative of the gradient) so Eq 3 becomes
$$ R_e=R_g-\left(\frac{\partial^2 E}{\partial R^2}\right)^{-1}_{R_g}\left(\frac{\partial E}{\partial R}\right)_{R_g}$$ (5)
Such a step is known as a Newton-Raphson step or quadratic step.  This works fine if the surface is quadratic, which is a good approximation near the minimum but not far away.  For really bad guesses, where the PES tends to be flat, the quadratic step can be too large (Figure 5).

Figure 5. Quadratic steps (a) close to and (b) far from the minimum.

Most algorithms will scale back quadratic steps that are considered unreasonably large, and even with quadratic energy minimizers many steps are usually needed:
$$R_{n+1}=R_n-\left(\frac{\partial^2 E}{\partial R^2}\right)^{-1}_{R_n}\left(\frac{\partial E}{\partial R}\right)_{R_n}$$ (6)
$$ \mathbf{q}_{n+1}=\mathbf{q}_{n}-c\mathbf{H}^{-1}_n\mathbf{g}_n$$ (7)
Eq. 7 is the same equation as Eq. 6 but in many dimensions: $\mathbf{q}_n$ are the current coordinates (for example Cartesian coordinates) for step number $n$ ($n = 0$ for the guess coordinates), $\mathbf{q}_{n+1}$ are the new coordinates, and $\mathbf{g}_n$ and $\mathbf{H}_n$ are the gradient and the Hessian, both evaluated at the current coordinates. Note that the Hessian should, in principle, be recomputed at each step.
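As a bare-bones illustration of Eq 7, here is a Python/NumPy sketch on a two-dimensional quadratic test surface (the matrix $\mathbf{A}$, the minimum $\mathbf{q}_e$, the starting point, and the step-scaling constant $c$ are all made up for the example):

```python
import numpy as np

# Quadratic test surface E = 0.5 (q - qe)^T A (q - qe); A and qe are made up.
A = np.array([[4.0, 1.0],
              [1.0, 2.0]])
qe = np.array([1.0, -0.5])

def gradient(q):
    return A @ (q - qe)

def hessian(q):
    return A                      # constant Hessian for a quadratic surface

c = 1.0                           # step-scaling constant (c < 1 damps large steps)
q = np.array([5.0, 5.0])          # guess coordinates (n = 0)
for n in range(50):
    g = gradient(q)
    if np.linalg.norm(g) < 1e-8:  # converged: gradient norm below threshold
        break
    # Eq 7: solve H dq = g rather than forming the inverse Hessian explicitly
    q = q - c * np.linalg.solve(hessian(q), g)

print(n, q)   # with c = 1 a quadratic surface is minimized in a single step
```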


Friday, May 1, 2015

Computational Chemistry Highlights: April issue

The April issue of Computational Chemistry Highlights is out.

CCH is an overlay journal that identifies the most important papers in computational and theoretical chemistry published in the last 1-2 years. CCH is not affiliated with any publisher: it is a free resource run by scientists for scientists. You can read more about it here.

The table of contents for this issue features contributions from CCH editors Mario Barbatti, Steven Bachrach, and Jan Jensen:

Big Data Meets Quantum Chemistry Approximations: The ∆-Machine Learning Approach

Wednesday, April 1, 2015

Sunday, March 1, 2015

Computational Chemistry Highlights: February issue

The February issue of Computational Chemistry Highlights is out.

CCH is an overlay journal that identifies the most important papers in computational and theoretical chemistry published in the last 1-2 years. CCH is not affiliated with any publisher: it is a free resource run by scientists for scientists. You can read more about it here.

The table of contents for this issue features contributions from CCH editors Dean Tantillo, Marcel Swart, Martin Korth, Johannes Hachmann, Steven Bachrach, and Jan Jensen:

Water–Water and Water–Solute Interactions in Microsolvated Organic Complexes








Sunday, February 1, 2015

Computational Chemistry Highlights: January issue

The January issue of Computational Chemistry Highlights is out.

CCH is an overlay journal that identifies the most important papers in computational and theoretical chemistry published in the last 1-2 years. CCH is not affiliated with any publisher: it is a free resource run by scientists for scientists. You can read more about it here.

The table of contents for this issue features contributions from CCH editors Alán Aspuru-Guzik, Andres Cisneros, Steven Bachrach, David Bowler, and Jan Jensen:

Construction of a Highly Distorted Benzene Ring in a Double Helicene

A computational drug design success story

Orbital-free density functional theory implementation with the projector augmented-wave method

Enhanced Helical Folding of ortho-Phenylenes through the Control of Aromatic Stacking Interactions

Saturday, January 24, 2015

The conformationally averaged free energy


The free energy averaged over $N_\mathrm{conf}$ conformations of molecule X is

$G^\circ=-RT \ln \left( \sum_I^{N_\mathrm{conf}} e^{-G^\circ_I/RT} \right)$  (1)

However, the way to compute an average property of the conformations $\left\langle x\right\rangle$ is

$\left\langle x\right\rangle=\sum_I^{N_\mathrm{conf}} p_Ix_I$  (2)

where

$p_I=\frac{e^{-G_I^\circ/RT}}{\sum_J^{N_\mathrm{conf}} e^{-G_J^\circ/RT}}$ (3)

But this equation cannot be used to compute the conformationally averaged free energy because

$G^\circ \ne \left\langle G^\circ \right\rangle$

Why not? First, let's see where the first equation comes from.

Obtaining equation (1)

The first equation comes from the statistical mechanical definition of the Helmholtz free energy found in any p-chem textbook

$A^\circ = -RT \ln  \left( \sum_i^{\mathrm{states}} e^{-\varepsilon_i/kT} \right)=-RT \ln  \left( q \right)$

The sum over microstates can be split into sums over the microstates of each conformation $I$

$A^\circ = -RT \ln  \left( \sum_I^{N_\mathrm{conf}} \sum_{i\in I}^{\mathrm{states}} e^{-\varepsilon_i/kT} \right)$

$= -RT \ln  \left( \sum_I^{N_\mathrm{conf}} q_I \right) = -RT \ln \left( \sum_I^{N_\mathrm{conf}} e^{-A_I^\circ/RT} \right)$

To get the corresponding expression for the Gibbs free energy [Eq (1)]:

$G^\circ = A^\circ +p^\circ V $

$=-RT \ln \left( \sum_I^{N_\mathrm{conf}} e^{-A^\circ_I/RT} \right)-RT \ln \left(  e^{-p^\circ V/RT} \right)$

$=-RT \ln \left( \sum_I^{N_\mathrm{conf}} e^{-G_I^\circ/RT} \right)$

The difference between Eq (1) and the average computed with Eqs (2) and (3) is the conformational entropy (noting that $\sum_I^{N_\mathrm{conf}} p_I=1$):

$G^\circ - \left\langle G^\circ \right\rangle=-RT \ln \left( \sum_J^{N_\mathrm{conf}} e^{-G_J^\circ/RT} \right)\sum_I^{N_\mathrm{conf}} p_I+RT\sum_I^{N_\mathrm{conf}} p_I\frac{-G_I^\circ}{RT}$

$=RT \sum_I^{N_\mathrm{conf}} p_I \left( \frac{-G_I^\circ}{RT}- \ln \left( \sum_J^{N_\mathrm{conf}} e^{-G_J^\circ/RT} \right) \right)$

$=RT \sum_I^{N_\mathrm{conf}} p_I \left( \ln \left(  e^{-G_I^\circ/RT} \right) - \ln \left( \sum_J^{N_\mathrm{conf}} e^{-G_J^\circ/RT} \right) \right)  $

$=RT \sum_I^{N_\mathrm{conf}} p_I \ln(p_I) =-TS_{\mathrm{conf}} $
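A small numerical check of this relation in Python, using three made-up conformer free energies (the $G_I^\circ$ values are arbitrary illustrative numbers):

```python
import numpy as np

R, T = 1.987204e-3, 298.15            # kcal/(mol K), K
RT = R * T

G = np.array([0.0, 0.5, 1.2])         # made-up conformer free energies, kcal/mol

# Eq (1): the conformationally averaged free energy
G_conf = -RT * np.log(np.sum(np.exp(-G / RT)))

# Eqs (2) and (3): Boltzmann weights and the simple average <G>
p = np.exp(-G / RT) / np.sum(np.exp(-G / RT))
G_mean = np.sum(p * G)

# Conformational entropy term
TS_conf = -RT * np.sum(p * np.log(p))

print(G_conf)                          # lower than the most stable conformer
print(G_mean - TS_conf)                # identical: G° = <G°> - T S_conf
```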


Wednesday, January 7, 2015

Predicting binding free energies with electronic structure theory: thermodynamic considerations - Part 3


2015.01.25:  I have summarized the discussion of this and related issues in this paper.

Predicting absolute binding free energies of biologically relevant molecules in aqueous solution at pH 7, using electronic structure theory without empirical adjustments, to within 1 kcal/mol is one of the holy grails of computational chemistry. Recent work by Grimme and co-workers (here and here) has come close, with mean absolute deviations of about 2 kcal/mol for host-guest complexes. This and other work has shed some light on why it is so difficult to predict binding free energies in aqueous solution, which I discuss here and in two previous posts (Part 1 and Part 2). In addition I'll talk about further improvements that could potentially increase the accuracy.

The approach is based on the following equation

$\Delta G^\circ = \Delta E_\mathrm{gas} + \Delta G^\circ_{\mathrm{gas,RRHO}}+\Delta \delta G^\circ_{\mathrm{solv}}$

where $\Delta E_\mathrm{gas}$ is the change in electronic energy and $\Delta G^\circ_{\mathrm{gas,RRHO}}$ is the change in the translational, rotational, and vibrational free energy computed using the rigid-rotor and harmonic-oscillator approximations, respectively - both evaluated in the gas phase. The vibrational frequencies are computed at a lower level of theory than that used to compute $\Delta E_{\mathrm{gas}}$. $\Delta \delta G^\circ_{\mathrm{solv}}$ is the change in solvation free energy computed using a continuum model of the solvent, such as COSMO.
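A trivial sketch of how the three terms are combined; all numbers below are made-up placeholders standing in for results of actual electronic structure, frequency, and continuum solvation calculations:

```python
def binding_free_energy(dE_gas, dG_gas_rrho, ddG_solv):
    """Delta G° = Delta E_gas + Delta G°_gas,RRHO + Delta delta G°_solv (kcal/mol)."""
    return dE_gas + dG_gas_rrho + ddG_solv

# Made-up placeholder values in kcal/mol
dE_gas      = -25.0   # change in gas phase electronic energy
dG_gas_rrho =  14.0   # change in translational/rotational/vibrational free energy (RRHO)
ddG_solv    =   6.0   # change in solvation free energy from a continuum model

print(binding_free_energy(dE_gas, dG_gas_rrho, ddG_solv))   # -5.0 kcal/mol
```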

Solvation Thermodynamics
Background. Most continuum models (CMs) of solvation compute the solvation free energy as the difference between the free energy in solution and the gas phase electronic energy

$\delta G^\circ\mathrm{_{solv}(X)}=G\mathrm{^{\circ,CM}_{soln}(X)}-E\mathrm{_{gas}(X)}$

$G\mathrm{^{\circ,CM}_{soln}(X)}$ typically contains energy terms describing the electrostatic interaction of the molecule with the continuum as well as the van der Waals interactions with the solvent and the free energy required to create the molecular cavity in the solvent (cavitation). The electrostatic interaction with the solvent alters the molecular wavefunction and is computed self-consistently.

Some software packages automatically compute $G\mathrm{^{\circ,CM}_{soln}(X)}$ and $E\mathrm{_{gas}(X)}$ in one run, while others only compute $G\mathrm{^{\circ,CM}_{soln}(X)}$. Also, some programs just compute the electrostatic component of $G\mathrm{^{\circ,CM}_{soln}(X)}$ by default.  However, the van der Waals and, especially, the cavitation component can make sizable contributions to the binding free energy and must be included for accurate results.  It is worth noting that any hydrophobic contribution to binding will derive primarily from the change in cavitation energy.

$G\mathrm{^{\circ,CM}_{soln}(X)}$ contains parameters (e.g. atomic radii) that are adjusted to reproduce experimentally measured solvation free energies

$\delta G\mathrm{^\circ_{solv}(X)}=G\mathrm{^{\circ,exp}_{soln}(X)}-G\mathrm{^{\circ,exp}_{gas}(X)}$

The standard state for both $G\mathrm{^{\circ,exp}_{soln}(X)}$ and $G\mathrm{^{\circ,exp}_{gas}(X)}$ is generally 1 M.  The latter is the reason why a 1 M reference state must also be used when computing $\Delta G^\circ_{\mathrm{RRHO}}$.
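One practical detail, not spelled out above: if the RRHO free energies come out of a standard frequency calculation referenced to 1 atm, they can be shifted to a 1 M reference state with an ideal gas correction. A minimal sketch, assuming ideal gas behaviour at 298.15 K:

```python
import math

R, T = 1.987204e-3, 298.15      # kcal/(mol K), K

# One mole of ideal gas at 1 atm and 298.15 K occupies about 24.46 L, so
# changing the standard state from 1 atm to 1 M (1 mol/L) costs, per species,
dG_1atm_to_1M = R * T * math.log(24.46)   # about +1.89 kcal/mol

# For a bimolecular binding reaction R + L -> RL the net correction to
# Delta G° is (1 - 2) * dG_1atm_to_1M, i.e. about -1.89 kcal/mol
print(dG_1atm_to_1M)
```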

Atomic radii. The solvation energy is computed using a set of atomic radii that define the solute-solvent boundary surface. These radii are usually obtained by fitting to experimentally measured solvation energies. Accurate solvation energies should not be expected from methods that use iso-electron density surfaces or van der Waals radii without additional empirical fitting. When using fitted radii one should use the same level of theory for the solute as was used in the parameterization.

Ions. For neutral molecules solvation free energies can be measured with an accuracy of roughly 0.2 kcal/mol and reproduced theoretically to within roughly 0.5 kcal/mol.  However, the solvation energies of ions cannot be measured directly and must be inferred indirectly relative to a standard (typically the solvation energy of the proton). The experimentally obtained solvation energies are typically accurate to within 3 kcal/mol and can be reproduced computationally with roughly the same accuracy.  The solvation energies of ions are therefore an especially likely source of error in binding free energies - especially if the ionic regions of the molecules become significantly desolvated due to binding.

Gas phase vs solution optimization.  The fitting of the radii described above is usually done using gas phase optimized structures only, i.e. any change in structure and the corresponding rotational and vibrational effects are "included" in the radii via the parameterization.  However, for ionic species gas phase optimization can lead to significantly distorted structures or even proton transfer, and in these cases solution phase optimizations and, hence, vibrational frequency calculations tend to be used. However, numerical instability in the continuum models can make it necessary to loosen (i.e. make less stringent) the geometry convergence criteria and can lead to more imaginary frequencies than in the gas phase. One option is to compute the vibrational contribution to $\Delta G^\circ_{\mathrm{RRHO}}$ using gas phase optimized structures.

When using solution phase geometries the gas phase single point energies needed to evaluate $\delta G\mathrm{^\circ_{solv}(X)}$ represent added computational expense and it can be tempting to use solution phase free energies to evaluate the binding free energies

$\Delta G^\circ = \Delta G\mathrm{^{\circ,CM}_{soln}}+ \Delta G\mathrm{^{\circ,CM,RRHO}_{soln}}$

One problem with this approach is that $\Delta G\mathrm{^{\circ,CM}_{soln}}$, unlike $\Delta E_{\mathrm{gas}}$, is not systematically improvable due to the empirical parameterization.

Cavities. The atomic radii and corresponding cavity-generation algorithms are parameterized for small molecules. For more complex molecules such as receptors this can lead to continuum solvation of regions, e.g. deep in the binding pocket, that are not accessible to the molecular solvent. Furthermore, any solvent molecule inside such a pocket is likely to be quite "un-bulk-like" and not well represented by the bulk solvent or fixed by the underlying parameterization.  How big an error this introduces into the binding free energy is not really known, but certain models for the cavitation energy have been shown to give unrealistically large contributions to the binding free energy.

Explicit water molecules. Adding explicit solvent molecules to the receptor and/or ligand can potentially lead to more accurate results. For example, including explicit water molecules around ionic sites reduces the strong dependence of the solvation energy on the corresponding atomic radii. Also, "un-bulk-like" water molecules are now treated more naturally and the risk of solvating non-solvent-accessible regions is reduced somewhat.  However, adding explicit solvent molecules increases the computational cost by increasing the CPU time needed to compute energies, perform conformational searches, and compute vibrational frequencies.

There are several approaches to include the effect of explicit solvent molecules in the binding free energy.  Bryantsev and co-workers suggest computing the solvation energy by

$\delta G^\circ_{\mathrm{solv},n}\mathrm{(X)}=\Delta G^\circ_{\mathrm{gas}}\mathrm{(X(H_2O)_n)}+\delta G^\circ_{\mathrm{solv}}\mathrm{(X(H_2O)_n)}-\delta G^{\circ,liq}_{\mathrm{solv}}\mathrm{((H_2O)_n)}$

where

$\Delta G^\circ_{\mathrm{gas}}\mathrm{(X(H_2O)_n})=G^\circ_{\mathrm{gas}}\mathrm{(X(H_2O)_n})-G^\circ_{\mathrm{gas}}\mathrm{(X})-G^\circ_{\mathrm{gas}}\mathrm{((H_2O)_n})$

and

$\delta G^{\circ,liq}_{\mathrm{solv}}\mathrm{((H_2O)_n)}=\delta G^\circ_{\mathrm{solv}}\mathrm{((H_2O)_n)}+RT\ln(\mathrm{[H_2O]/n})$

with "$\circ$" and "$liq$" referring to a standard state of 1 M and 55.34, respectively.  The term $RT\ln(\mathrm{[H_2O]/n})$ is the free energy required to change the standard state of (H$_2$O)$_n$ from 1 M to 55.34/n M.

Bryantsev et al. have shown that this water cluster approach leads to a smooth convergence of the solvation free energy with respect to the cluster size $n$.  The optimum choice of $n$ is the one where adding another water molecule changes the solvation energy by less than a certain user-defined amount.  One can thereby compute the optimum number of water molecules for the ligand ($n$), receptor ($m$), and receptor-ligand complex ($l$) and then compute the binding free energy as

$\Delta \delta G^\circ_{\mathrm{solv},x}=\delta G^\circ_{\mathrm{solv},l}\mathrm{(RL)}-\delta G^\circ_{\mathrm{solv},n}\mathrm{(L)}-\delta G^\circ_{\mathrm{solv},m}\mathrm{(R)}$

and computing $\Delta E_{\mathrm{gas}}$ and $\Delta G^\circ_{\mathrm{RRHO}}$ as before.  One can show that this corresponds to the free energy change for this reaction

$\mathrm{L(H_2O)_n(aq)+R(H_2O)_m(aq)+(H_2O)_l(liq) \rightleftharpoons RL(H_2O)_l(aq)+(H_2O)_n(liq)+(H_2O)_m(liq)}$ (1)

In principle the free energy change is zero for

$\mathrm{(H_2O)_l(liq) \rightleftharpoons (H_2O)_n(liq)+(H_2O)_m(liq)+sgn(d)(H_2O)_{|d|}(liq)}$

where $d=l-m-n$ and sgn(d) returns the sign of d. So the free energy change for Reaction (1) can also be computed as the free energy change for

$\mathrm{L(H_2O)_n(aq)+R(H_2O)_m(aq) \rightleftharpoons RL(H_2O)_l(aq)+sgn(d)(H_2O)_{|d|}}$ (2)

However, this is only approximately true in practice due to errors in the computed gas phase and solvation free energies. Furthermore, Reaction 2 does not really lead to any significant reduction in CPU time because the water cluster free energies only have to be computed once. However, if Reaction (2) is used then one must add an additional term correcting for the indistinguishability of water molecules

$G^\circ_{\mathrm{RRHO}}\mathrm{(X(H_2O)_n)} \rightarrow G^\circ_{\mathrm{RRHO}}\mathrm{(X(H_2O)_n)}-RT\ln (n!)$

and similarly for the water clusters.  Using Reaction (1) leads to a cancellation of this term and also maximizes error cancellation in the other energy terms. Similar considerations apply when using individual water molecules to balance the reaction instead of water clusters

$\mathrm{(H_2O)_l(liq) \rightleftharpoons (H_2O)_n(liq)+L(H_2O)_m(liq)+sgn(d)|d|H_2O}$
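As a side note, the $-RT\ln(n!)$ indistinguishability term mentioned above is easy to evaluate; a tiny sketch with an arbitrary example cluster size:

```python
import math

R, T = 1.987204e-3, 298.15     # kcal/(mol K), K
n = 4                          # example number of water molecules in the cluster

# Correction for the indistinguishability of the n water molecules
correction = -R * T * math.log(math.factorial(n))
print(correction)              # about -1.9 kcal/mol for n = 4
```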

When using many explicit water molecules the error in the continuum solvation energies can be reduced by ensuring that the continuum solvation energy of a single water molecule matches the experimental value of -6.32 kcal/mol at 298.15 K as closely as possible.


This work is licensed under a Creative Commons Attribution 4.0