Wednesday, May 25, 2016

Automating calculations: pKa predictions

I want to compute the standard free energy difference for proton exchange between heliotridane and a molecule which I'll call compound 1 (comp1)

heliotridaneH+ + comp1 <=> heliotridane + comp1H+

To do this I need to geometry optimize in the gas phase and then do a single point calculation using a solvation model, all using GAMESS. I want to automate this so I can do it for comp2, comp3, ...

To do this I use bash and the python programs listed below

Step 1 make a file file called smiles.list which contains the name and SMILES strings, which defines both the molecule, the protonation state, and the conformation around the protonated N. I still need to automate this step.

Step 2  make so-called header files for the geometry optimization (pm6_opt.header) and solvation energy (pm6_sol.header) calculations.  These files contain the relevant keywords for the calculations.

Step 3 convert the SMILES strings to 3D coordinates (xyz files).

python smiles2coord.py smiles.list

This also creates som sdf files which I can now remove (rm *.sdf).


Step 4 convert the xyz files to GAMESS input files for geometry optimization
When naming the molecules in smiles.list I was carefull to label all the protonated forms with a "+".

for x in *.xyz; do babel -ixyz $x -ogamin ${x%.*}_opt.inp -xf pm6_opt.header; done

I could have modified smiles2coord_lis.py to generate these files directly, but I like to keep things modular.  At this point I can also get rid of the xyz files (xyz files).

Step 5 change the charge for the protonated forms.

sed -i 's/icharg=0/icharg=1/g' *+*.inp


Step 6 submit the GAMESS calculations.

for x in *_opt.inp; do submit_gamess $x; done

My GAMESS submit script is called "submit_gamess"

Step 7 wait til all the optimizations are done and that they all converged.

grep EQUILIBRIUM *.log

Step 8 transfer the optimized geometry to a new input file for the solvation energy calculations

for x in *_opt.log; do babel -igamout $x -ogamin ${x%.*}_sol.inp -xf pm6_sol.header; done

Step 9 change the charge for the protonated forms.

sed -i 's/icharg=0/icharg=1/g' *+*_sol.inp

Step 10 submit the GAMESS calculations.

for x in *_sol.inp; do submit_gamess $x; done

Step 11 wait til all the optimizations are done and that the calculations finished normally.

grep NORMALLY *_sol.log

Step 12 extract the last heat of formation in each file and put them in a file called energies

python readv2.py *._sol.log > energies

Step 13 find the lowest energy for each conformer and compute the relative energies and pKa values
python pka.py energies

Files referred to above

import sys
pKa_ref = 11.4
ref = "heliotridane"
filename = sys.argv[1]
file = open(filename, "r+")
energies = {}
log_files = {}
names = []
for line in file:
words = line.split()
log_file = words[0]
ion = words[0].split("_")[0]
name = ion[:-1]
energy = float(words[1])
if name not in names:
names.append(name)
if ion not in energies:
energies[ion] = energy
log_files[ion] = log_file
elif energy < energies[ion]:
energies[ion] = energy
log_files[ion] = log_file
delta_G_ref = delta_G = energies[ref+"+"] - energies[ref+"-"]
for name in names:
delta_G = energies[name+"+"] - energies[name+"-"]
pKa = pKa_ref + (delta_G_ref - delta_G)/1.36
print name, pKa, log_files[name+"+"], log_files[name+"-"]
view raw pka.py hosted with ❤ by GitHub
$contrl runtyp=optimize icharg=0 $end
$basis gbasis=pm6-d3h+ $end
$statpt nstep=500 $end
view raw pm6_opt.header hosted with ❤ by GitHub
$contrl icharg=0 $end
$basis gbasis=PM6-D3H+ $end
$pcm solvnt=WATER mxts=15000 smd=.T. $end
$tescav mthall=4 ntsall=60 $end
view raw pm6_sol.header hosted with ❤ by GitHub
"""
type "python readv2.py *.log" to use
This program finds the last heat of formation in all GAMESS log files
for semiempirical geometry optimizations
"""
import sys
for filename in sys.argv[1:]:
for line in reversed(open(filename,'r').readlines()):
words = line.split() # split line into words
if len(words) < 1:
continue
if words[0] == 'HEAT':
heat = words[-2]
print filename, heat
break
view raw readv2.py hosted with ❤ by GitHub
heliotridane-a C[C@H]1CC[N@]2CCC[C@@H]12
heliotridane-b C[C@H]1CC[N@@]2CCC[C@@H]12
heliotridane+a C[C@H]1CC[N@H+]2CCC[C@@H]12
heliotridane+b C[C@H]1CC[N@@H+]2CCC[C@@H]12
comp1-a C[C@H]1[C@@H]2[C@@H](C(N(C)C2=O)=O)C3[N@]1CCC3
comp1-b C[C@H]1[C@@H]2[C@@H](C(N(C)C2=O)=O)C3[N@@]1CCC3
comp1+a C[C@H]1[C@@H]2[C@@H](C(N(C)C2=O)=O)C3[N@H+]1CCC3
comp1+b C[C@H]1[C@@H]2[C@@H](C(N(C)C2=O)=O)C3[N@@H+]1CCC3
view raw smiles.list hosted with ❤ by GitHub
import subprocess,sys
filename = sys.argv[1]
file = open(filename, "r+")
babel = '/opt/openbabel/openbabel-master/bin/babel'
for line in file:
words = line.split()
name = words[0]
smiles = words[1]
smiles = smiles.replace("#","%23")
smiles = smiles.replace("[","%5B")
smiles = smiles.replace("]","%5D")
sdffile = name+".sdf"
xyzfile = name+".xyz"
print sdffile,xyzfile
url="https://cactus.nci.nih.gov/chemical/structure/"+smiles+"/sdf"
subprocess.call(['curl',url, '-o',sdffile])
subprocess.call([babel, '-isdf', sdffile, '-oxyz', xyzfile])
view raw smiles2coord.py hosted with ❤ by GitHub




This work is licensed under a Creative Commons Attribution 4.0

No comments: