Sunday, August 9, 2015

A very brief introduction to the electron correlation energy



RHF is often not accurate enough for predicting the change in energies due to a chemical reaction, no matter how big a basis set we use.  The reason is the error due to the molecular orbital approximation
$$ \Psi ({{\bf{r}}_1},{{\bf{r}}_2}, \ldots {{\bf{r}}_N}) \approx \left| {{\phi _1}(1){{\bar \phi }_1}(2) \ldots {\phi _{N/2}}(N - 1){{\bar \phi }_{N/2}}(N)} \right\rangle  \equiv \Phi $$
and the energy difference due to this approximation is known as the correlation energy.  Just like we improve the LCAO approximation by including more terms in an expansion, we can improve the orbital approximation by an expansion, in terms of Slater determinants
$$\Psi ({{\bf{r}}_1},{{\bf{r}}_2}, \ldots {{\bf{r}}_N}) \approx \sum\limits_{i = 1}^L {{C_i}{\Phi _i}({{\bf{r}}_1},{{\bf{r}}_2}, \ldots {{\bf{r}}_N})} $$
The “basis set” of Slater determinants $\{\Phi_i \}$ is generated by first computing an RHF wave function $\{\Phi_0 \}$ as usual, which also generates a lot of virtual orbitals, and then generating other determinants with these orbitals.  For example, for an atom or molecule with two electrons the RHF wave function is  $\left| {{\phi _1}{{\bar \phi }_1}} \right\rangle $ and we have $K-1$ virtual orbitals (${\phi _2}, \ldots ,{\phi _K}$ , where $K$ is the number of basis functions), which can be used to make other Slater determinants like $\Phi _1^2 = \left| {{\phi _1}{{\bar \phi }_2}} \right\rangle $ and $\Phi _{11}^{22} = \left| {{\phi _2}{{\bar \phi }_2}} \right\rangle $  (Figure 1).

Figure 1. Schematic representation of the electronic structure of some of the determinants used in Equation 3

Conceptually (in analogy to spectroscopy), an electron is excited from an occupied to a virtual orbital: $\left| {{\phi _1}{{\bar \phi }_2}} \right\rangle$ represents a single excitation and $\left| {{\phi _2}{{\bar \phi }_2}} \right\rangle $  a double excitation.  For systems with more than two electrons higher excitations (like triple and quadruple excitations) are also possible.  In general
$$\Psi  \approx {C_0}{\Phi _0} + \sum\limits_a {\sum\limits_r {C_a^r\Phi _a^r} }  + \sum\limits_a {\sum\limits_b {\sum\limits_r {\sum\limits_s {C_{ab}^{rs}\Phi _{ab}^{rs}} } } }  +  \ldots $$
The expansion coefficients can be found using the variational principle
$$\frac{{\partial E}}{{\partial {C_i}}} = 0 \ \textrm{for all} \ i$$
and this approach is called configuration interaction (CI).  The more excitations we include (i.e. increase L in Eq 2.12.1) the more accurate the expansion and the resulting energy becomes.  If the expansion includes all possible excitations (known as a full CI, FCI) then we have a numerically exact wave function for the particular basis set, and if we use a basis set where the HF limit is reached then we have a numerically exact solution to the electronic Schrödinger equation!  That’s the good news …

The bad news is that the FCI “basis set of determinants” is much, much larger than the LCAO basis set (i.e. $L >> K$),
$$L = \frac{{K!}}{{N!(K - N)!}}$$
where $N$ is the number of electrons.  Thus, an RHF/6-31G(d,p) calculation on water involves 24 basis functions and roughly $\tfrac{1}{8}K^4$ = 42,000 2-electron integrals but a corresponding FCI/6-31G(d) calculation involves nearly 2,000,000 Slater determinants.

Just like finding the LCAO coefficients involves the diagonalization of the Fock matrix, finding the CI coefficients (Ci) and the lowest energy also involves a matrix diagonalization.
$$\bf{E} = {{\bf{C}}^t}{\bf{HC}}$$
where $\bf{E}$ is a diagonal matrix whose smallest value ($E_0$) corresponds to the variational energy minimum.  While the Fock matrix is a $K \times K$ matrix, the CI Hamiltonian ($\bf{H}$) is an $L \times L$ matrix.  Just holding the 2 million by 2 million matrix for the water molecule using the 6-31G(d,p) basis set requires millions of gigabites! Clever programming and large computers actually makes a FCI/6-31G(d,p) calculation on $\ce{H2O}$ possible, but FCI is clearly not a routine molecular modeling tool. 

Using, for example, only single excitations (called CI singles, CIS)
$${\Psi ^{CIS}} = {C_0}{\Phi _0} + \sum\limits_a {\sum\limits_r {C_a^r\Phi _a^r} } $$
is feasible, however is doesn’t result in any improvement.  The CIS Hamiltonian has three kinds of contributions
$$\begin{split}
& \langle \Phi _0\left| {\hat H} \right| \Phi_0 \rangle = E_{RHF}\\
\langle\Phi^{CIS}\left| {\hat H} \right| \Phi^{CIS} \rangle  \rightarrow & \langle \Phi _0\left| {\hat H} \right| \Phi_a^r \rangle = F_{ar} = 0 \\
& \langle \Phi _a^r\left| {\hat H} \right| \Phi_a^r \rangle
\end{split}$$
which means that when this matrix is diagonalized $E_0=E_{RHF}$.  Thus CIS does not give us any correlation energy.  However, CIS is not completely useless.  The second lowest value of $\bf{E}$, $E_1$, represents the energy of the first excited state, at roughly an RHF quality.

Thus, we need at least single and double excitations (CISD)
$$\Psi  \approx {C_0}{\Phi _0} + \sum\limits_a {\sum\limits_r {C_a^r\Phi _a^r} }  + \sum\limits_a {\sum\limits_b {\sum\limits_r {\sum\limits_s {C_{ab}^{rs}\Phi _{ab}^{rs}} } } } $$
to get any correlation energy.  However, in general including doubles already results in an $\bf{H}$ matrix that is impractically large for a matrix diagonalization.  CI, i.e. finding the $C_i$ coefficients using the variational principle, is therefore rarely used to compute the correlation energy.

Perhaps the most popular means of finding the $C_i$’s is by perturbation theory, a standard mathematical technique in physics to compute corrections to a reference state (in this case RHF).  Perturbation theory using this reference is called Møller-Plesset pertubation theory, and there are several successively more accurate and more expensive variants: MP2 (which includes some double excitations), MP3 (more double excitations than MP2), and MP4 (single, double, triple, and some quadruple excitations).

Another approach is called coupled cluster which has a similar hierarchy of methods, such as CCSD (singles and doubles) and CCSD(T) (CCSD plus an estimate of the triples contributions).  In terms of accuracy vs expense, MP2 is the best choice of a cheap correlation method, followed by CCSD, and CCSD(T).  For example, MP4 is not too much cheaper than CCSD(T), but the latter is much more accurate.  In fact for many practical purposes it is rarely necessary to go beyond CCSD(T) in terms of accuracy, provided a triple-zeta or higher basis set it used.  However, CCSD(T) is usually too computationally demanding for molecules with more than 10 non-hydrogen atoms.  In general, the computational expense of these correlated methods scale much worse than RHF with respect to basis set size: MP2 ($K^5$), CCSD ($K^6$), and CCSD(T) ($K^7$).  These methods also require a significant amount of computer memory, compared to RHF, which is often the practical limitation of these post-HF methods.  Finally, it should be noted that all these calculations also imply an RHF calculation as the first step.

In conclusion we now have ways of systematically improving the wave function, and hence the energy, by increasing the number of basis functions ($K$) and the number of excitations ($L$) as shown in Figure 2.

Figure 2 Schematic representation of the increase in accuracy due to using better correlation methods and larger basis sets.

The most important implication of this is that in principle it is possible to check the accuracy of a given level of theory without comparison to experiment!  If going to a better correlation method or a bigger basis set does not change the answer appreciably, then we have a genuine prediction with only the charges and masses of the particles involved as empirical input.  These kinds of calculations are therefore known as ab initio or first principle calculations.  In practice, different properties will converge at different rates, so it is better to monitor the convergence of the property you are actually interested in, than the total energy.  For example, energy differences (e.g. between two conformers) converge earlier than the molecular energies. Furthermore, the molecular structure (bond lengths and angles) tends to converge faster than the energy difference.  So it is common to optimize the geometry at a low level of theory [e.g. RHF/6-31G(d)] followed by an energy computation (a single point energy) at a higher level of theory [e.g. MP2/6-311+G(2d,p)].  This level of theory would be denoted MP2/6-311+G(2d,p)//RHF/6-31G(d).

Finally, the correlation energy is not just a fine-tuning of the RHF result but introduces an important intermolecular force called the dispersion energy.  The dispersion energy (also known as the induced dipole-induced dipole interaction) is a result of the simultaneous excitation of at least two electrons and is not accounted for in the RHF energy.  For example, the stacked orientation of base pairs in DNA is largely a result of dispersion interactions and cannot be predicted using RHF.

No comments: