## Monday, June 15, 2015

### A brief introduction to basis sets

In order to compute the energy we need to define mathematical functions for the orbitals.  In the case of atoms we can simply use the solutions to the Schrödinger equation for the  $\ce{H}$ atom as a starting point and find the best exponent for each function using the variational principle.

But what functions should we use for molecular orbitals (MOs)?  The wave function of the $\ce{H2+}$ molecule provides a clue (Figure 1)

Figure 1. Schematic representation of the wave function of the $\ce{H2+}$  molecule.

It looks a bit like the sum of two 1$s$ functions centered at each nucleus (A and B),
$${\Psi ^{{\text{H}}_{\text{2}}^ + }} \approx \tfrac{1}{{\sqrt 2 }}\left( {\Psi _{1s}^{{{\text{H}}_{\text{A}}}} + \Psi _{1s}^{{{\text{H}}_{\text{B}}}}} \right)$$
Thus, one way of constructing MOs is as a linear combination of atomic orbitals (the LCAO approximation),
$${\phi _i}(1) = \sum\limits_{\mu = 1}^K {{C_{\mu i}}{\chi _\mu }(1)}$$
an approximation that becomes better and better as $K$ increases.  Here $\chi _\mu$  is a mathematical function that looks like an AO, and is called a basis function (a collection of basis functions for various atoms is called a basis set), and $C_{\mu i}$ is a number (sometimes called an MO coefficient) that indicates how much basis function $\chi _\mu$ contributes to MO $i$, and is determined for each system via the variational principle.   Note that every MO is expressed in terms of all basis function, and therefore extends over the entire molecule.

If we want to calculate the RHF energy of water, the basis set for the two $\ce{H}$ atoms would simply be the lowest energy solution to the Schrödinger equation for $\ce{H}$ atom
$${\chi _{{{\text{H}}_{\text{A}}}}}(1) = \Psi _{1s}^{\text{H}}({r_{1A}}) = \frac{1}{{\sqrt \pi }}{e^{ - \left| {{{\bf{r}}_1} - {{\bf{R}}_A}} \right|}}$$
For the O atom, the basis set is the AOs obtained from, say, an ROHF calculation on $\ce{O}$, i.e. $1s$, $2s$, $2p_x$, $2p_y$, and $2p_z$ functions from the solutions to the Schrödinger equation for the $\ce{H}$ atom, where the exponents ($\alpha_i$'s) have been variationally optimized for the $\ce{O}$ atom,
$$\Psi _{1s}^{\text{H}},\;\Psi _{2s}^{\text{H}},\;\Psi _{2p}^{\text{H}} \xrightarrow{\frac{\partial E}{\partial \alpha_i}=0} \phi _{1s}^{\text{O}},\;\phi _{2s}^{\text{O}},\;\phi _{2p}^{\text{O}} \equiv \;\left\{ {{\chi _O}} \right\}$$
Notice that this only has to be done once, i.e. we will use this oxygen basis set for all oxygen-containing molecules.  We then provide a guess at the water structure and the basis functions are placed at the coordinates of the respective atoms.  Then we find the best MO coefficients by variational minimization,
$$\frac{{\partial E}}{{\partial {C_{\mu i}}}} = 0$$
for all $i$ and $\mu$. Thus, for water we need a total of seven basis functions to describe the five doubly occupied water MOs ($K = 7$ and $i = 1-5$ in Eq 5).  This is an example of a minimal basis set, since it is the minimum number of basis functions per atoms that makes chemical sense.

One problem with the LCAO approximation is the number of 2-electron integrals it leads to, and the associated computational cost.  Let’s look at the part of the energy that comes from the Coulomb integrals
$$\begin{split} \sum\limits_{i = 1}^{N/2} {\sum\limits_{j = 1}^{N/2} {2{J_{ij}}} } &= \sum\limits_{i = 1}^{N/2} {\sum\limits_{j = 1}^{N/2} {2\left\langle {\left. {{\phi _i}(1){\phi _j}(2)} \right|\frac{1}{{{r_{12}}}}\left| {{\phi _i}(1){\phi _j}(2)} \right.} \right\rangle } } \\ &= \sum\limits_i^{N/2} {\sum\limits_j^{N/2} {2\left\langle {\left. {{\phi _i}(1){\phi _i}(1)} \right|\frac{1}{{{r_{12}}}}\left| {{\phi _j}(2){\phi _j}(2)} \right.} \right\rangle } } \\ &= \sum\limits_i^{N/2} {\sum\limits_j^{N/2} {2\left\langle {{{\phi _i}{\phi _i}}} \mathrel{\left | {\vphantom {{{\phi _i}{\phi _i}} {{\phi _j}{\phi _j}}}} \right. } {{{\phi _j}{\phi _j}}} \right\rangle } } \\ &= \sum\limits_\mu ^K {\sum\limits_\nu ^K {\sum\limits_\lambda ^K {\sum\limits_\sigma ^K {\sum\limits_i^{N/2} {\sum\limits_j^{N/2} {2{C_{\mu i}}{C_{\nu i}}{C_{\lambda j}}{C_{\lambda j}}\left\langle {{{\chi _\mu }{\chi _\nu }}} \mathrel{\left | {\vphantom {{{\chi _\mu }{\chi _\nu }} {{\chi _\lambda }{\chi _\sigma }}}} \right. } {{{\chi _\lambda }{\chi _\sigma }}} \right\rangle } } } } } } \\ &= \sum\limits_\mu ^K {\sum\limits_\nu ^K {\sum\limits_\lambda ^K {\sum\limits_\sigma ^K {\tfrac{1}{2}{P_{\mu \nu }}{P_{\lambda \sigma }}\left\langle {{{\chi _\mu }{\chi _\nu }}} \mathrel{\left | {\vphantom {{{\chi _\mu }{\chi _\nu }} {{\chi _\lambda }{\chi _\sigma }}}} \right. } {{{\chi _\lambda }{\chi _\sigma }}} \right\rangle } } } } \end{split}$$
We have roughly $(N/2)^2$ Coulomb integrals involving molecular orbitals but roughly $1/8K^4$ Coulomb integrals involving basis functions (the factor of 1/8 comes from the fact that some the integrals are identical and need only be computed once).  Using a minimal basis set $K$ = 80 for a small organic molecule like caffeine $\ce{(C8H10N4O2)}$ and results in ca 5,000,000 2-electron integrals involving basis functions! (That being said, you can perform an RHF energy calculation with 80 basis functions on your desktop computer in a few minutes.  The problem with $K^4$-scaling is that a corresponding calculation with 800 basis functions would take a few days on the same machine.  So you can forget about optimizing the structure on that machine.)  This is one of the reasons why modern computational quantum chemistry requires massive computers.  This is also the reason why the basis set size it a key consideration in a quantum chemistry project.

The 2-electron integrals also pose another problem: the basis functions defined so far are exponential functions (also known as Slater type orbitals or STOs).  2-electron integrals involving STOs placed on four different atoms do not have analytic solutions.  As a result, most quantum chemistry programs use Gaussian type orbitals (or simply Gaussians) instead of STOs, because the 2-electron integrals involving Gaussians have analytic solutions.  Obviously,
$${e^{ - \alpha {r_{1A}}}} \approx {e^{ - \beta r_{1A}^2}}$$
is a poor approximation, so a linear combination of Gaussians are used to model each STO basis function (Figure 2)
$${e^{ - \alpha {r_{1A}}}} \approx \sum\limits_i^X {{a_{i\mu }}{e^{ - {\beta _i}r_{1A}^2}}} = {\chi _\mu }$$
Figure 2. (a) An exponential function is not well represented by one Gaussian, but (b) can be well represented by a linear combination of three Gaussians.

Here the $a_{i\mu}$parameters (or contraction coefficients) as well as the Gaussian exponents are determined just once for a given STO basis function. $\chi_\mu$  is a contracted basis function and the $X$ individual Gaussian functions are called primitives.  Generally, three primitives are sufficient to represent an STO, and this basis set is known at the STO-3G basis set.  $p$- and $d$-type STOs are expanded in terms of $p$- and $d$-type primitive Gaussians [e.g. $({x_1} - {x_A}){e^{ - \beta r_{1A}^2}}$ and $({x_1} - {x_A})({y_1} - {y_A}){e^{ - \beta r_{1A}^2}}$].  An RHF calculation using the STO-3G basis set is denoted RHF/STO-3G.  Unless otherwise noted, this usually also implies that the geometry is computed (i.e. the minimum energy structure is found) at this level of theory.

Minimal basis sets are usually not sufficiently accurate to model reaction energies.  This is due to the fact that the atomic basis functions cannot change size to adjust to their bonding environment. However, this can be made possible by using some the contraction coefficients as variational parameters.  This will increase the basis set size (and hence the computational cost) so this must be done judiciously.  For example, we’ll get most improvement by worrying about the basis functions that describe the valence electrons that participate most in bonding.  Thus, for $\ce{O}$ atom we leave the 1$s$ core basis function alone, but “split” the valence 2$s$ basis function into linear combinations of two and one Gaussians respectively,
$$\begin{split} {\chi _{1s}} &= \sum\limits_i^3 {{a_{i1s}}{e^{ - {\beta _i}r_{1A}^2}}} \\ {\chi _{2{s_a}}} &= \sum\limits_i^2 {{a_{i2s}}{e^{ - {\beta _i}r_{1A}^2}}} \\ {\chi _{2{s_b}}} &= {e^{ - {\beta _{2{s_b}}}r_{1A}^2}} \\ \end{split}$$
and similarly for the 2$p$ basis functions. This is known as the 3-21G basis set (pronounced “three-two-one g” not “three-twenty one g”), which denotes that core basis functions are described by 3 contracted Gaussians, while the valence basis functions are split into two basis functions, described by 2 and 1 Gaussian each.  Thus, using the 3-21G basis set to describe water requires 13 basis functions: two basis functions on each $\ce{H}$ atom (1$s$ is the valence basis function of the H atom) and 9 basis functions on the $\ce{O}$ atom (one 1$s$ function and two each of 2$s$, 2$p_x$, 2$p_y$, and 2$p_z$).

The $\chi _{2{s_a}}$  basis function is smaller (i.e., the Gaussians have a larger exponent) than the   basis function.  Thus, one can make a function of any intermediate size by (variationally) mixing these two functions (Figure 3).  The 3-21G is an example of a split valence or double zeta basis set (zeta, ζ, is often used as the symbol for the exponent, but I find it hard to write and don’t use it in my lectures).   Similarly, one can make other double zeta basis sets such as 6-31G, or triple zeta basis sets such as 6-311G.

Figure 3. Sketch of two different sized $s$-type basis functions that can be used to make a basis function of intermediate size

As the number of basis functions ($K$ in Eq 2) increase the error associated with the LCAO approximation should decrease and the energy should converge to what is called the Hartree-Fock limit ($E_{\text{HF}}$) that is higher than the exact energy ($E_{\text{exact}}$) (Figure 4).  The difference is known as the correlation energy,
$$E_{\text{corr}}=E_{\text{HF}}-E_{\text{exact}}$$
and is the error introduced by the orbital approximation
$$\Psi ({{\bf{r}}_1},{{\bf{r}}_2}, \ldots {{\bf{r}}_N}) \approx \left| {{\phi _1}(1){{\bar \phi }_1}(2) \ldots {\phi _{N/2}}(N - 1){{\bar \phi }_{N/2}}(N)} \right\rangle$$

Figure 4. Plot of the energy as a function the number of basis functions.

However, in the case of a one-electron molecule like $\ce{H2+}$ we would expect the energy to converge to $E_{\text{exact}}$ since there is no orbital approximation.  However, if we try this with the basis sets discussed thus far we find that this is not the case (Figure 5)!

Figure 5. Plot of the energy of $\ce{H2+}$ computed using increasingly larger basis sets.

What’s going on? Again we get a clue by comparing the exact wave function to the LCAO-wave function (Figure 6).

Figure 6. Comparison of the exact wave function and one computed using the 6-311G basis set.

We find that compared to the exact result there is not “enough wave function” between the nuclei and too much at either end.  As we increase the basis set we only add $s$-type basis functions (of varying size) to the basis set.  Since they are spherical they cannot be used to shift electron from one side of the $\ce{H}$ atom to the other.  However, $p$-functions are perfect for this (Figure 7).

Figure 7. Sketch of the polarization of an s basis function by a p basis function

So basis set convergence is not a matter of simply increasing the number of basis functions, it is also important to have the right mix of basis function types.  Similarly, $d$-functions can be used to “bend” $p$-functions (Figure 8).

Figure 8. Sketch of the polarization of a p basis function by a d basis function

Such functions are known as polarization functions, and are denoted with the following notation. For example, 6-31G(d) denotes d polarization functions on all non-$\ce{H}$ atoms and can also be written as 6-31G*.  6-31G(d,p) is a 6-31G(d) basis set where p-functions have been added on all $\ce{H}$ atoms, and can also be written 6-31G**.  A RHF/6-31G(d,p) calculation on water involves 24 basis functions: 13 basis functions for the 6-31G part (just like for 3-21G) plus 3 $p$-type polarization functions on each H atom and 5 $d$-type polarization functions (some programs use 6 Cartesian d-functions instead of the usual 5).

Anions tend to have very diffuse electron distributions and very large basis functions (with very small exponents) are often needed for accurate results.  These diffuse functions are denoted with “+” signs: e.g. 6-31+G denotes one s-type and three $p$-type diffuse Gaussians on each non-$\ce{H}$ atom, and 6-31++G denotes the addition of a single diffuse $s$-type Gaussian on each $\ce{H}$-atom. Diffuse functions also tend to improve the accuracy of calculations on van der Waals complexes and other structures where the accurate representation of the outer part of the electron distribution is important.

Of course there are many other basis sets available, but in general they have the same kinds of attributes as described already.  For example, aug-cc-pVTZ is a more modern basis set: “aug” stands for “augmented” meaning “augmented with diffuse functions”, “pVTZ” means “polarized valence triple zeta”, i.e. it is of roughly the same quality as 6-311++G(d,p).  “cc” stands for “correlation consistent” meaning the parameters were optimized for correlated wave functions (like MP2, see below) rather than HF wave function like Pople basis sets [such as 6-31G(d)] described thus far.

Related blog posts
Complete basis set extrapolation