This is a technical tutorial for the
calculation of the relative dissociation free energies (RDFE) for the
SAMPL9/WP6 host-guest systems on high performing architectures
using a nonequilibrium alchemical approach with the GROMACS code. The
theoretical background of the methodology can be
found
here . As an example, we shall calculate the the RDFE between the
compounds g01-g02 , two chemically
distant ligands, differing in charge, shape, volume and with a small Tanimoto index.
Here, each of the thirteen directories for the guests of the SAMPL9/WP6 challenge
(g01, g02, ..., g13
) contains three subdirectories
(b, u, g
) where we stored the PDB configurations of the WP6-bound
guest (200), of the guest in bulk solvent (125), and of the
gas-phase (200) guest. These decorrelated equilibrium conformations for the end-states can be obtained with
replicates of ordinary MD simulations or using Hamiltonian Replica
Exchange Method (HREM). For using HREM with GROMACS in ligand-protein or
host-guest systems, see the excellent PLUMED
tutorial at the PLUMED
official site or go through the tutorial
for absolute dissociation free energy calculations with vDSSB in GROMACS.
itp
files of the host and of the guest, and of OPC3 water.
bin
contains ancillary scripts for pre- and post- processing.
data
contains gromacs-computed RDFEs for the SAMPL9 batch.
workdata_example
contains example of work data.
doc
contains this tutorial.
bin
directory and issue the command
$ source SOURCE_THIS_FILE.bashThis will activate all the scripts in the directory. N.B.
gfortran
and GROMACS
must be installed.
$ make_rbfe_dir.bash g01 g02 b $ make_rbfe_dir.bash g01 g02 u
This will produce the directories b-g01-g02
and u-g01-g02
with topology, ghost
and coupled top/itp
files for the dual
g01-g02 topology and the starting configurations (PDB files
stored in the new b-g01-PDBS
and
u-g01-PDBS
subdirectories) with ghost and the ligand in the
bound and unbound states, respectively. Note that you do not need to
run an equilibrium simulation with the ghost and the fully
coupled ligand to produce the starting states for the NE alchemical
transitions. These initial states can be straightforwardly obtained by
combining equilibrium configurations of the gas-phase ghost with
equilibrium configuration of the fully coupled (bound or unbound)
ligand, by making coincident the centers of mass of the two ligands.
This is precisely what make_rbfe_dir.bash
does using the
conformations in the g??/[gub]
directories.
b-g01-g02
and issue the command
$ maketpr.bashIf everything is OK, you should see a list of ending quotes from GROMACS. This will produce 200 subdirectories with the corresponding
topol.tpr
files for the bound state using the
stored initial configurations in the
b-g01-PDBS
subdirectory.
u-g01-g02
directory and issue the command
$ maketpr.bash
This will produce 125 subdirectories with
the topol.tpr
files for the unbound state using the stored
initial configurations in the
u-g01-PDBS
subdirectory. Again you should see a list of 125 random GROMACS quotes on the standard output
while the subdirectories are generated.
bin
directory, we provide the
code make_submit_slr.bash
for generating
a SLURM
submission script for the
M100 HPC
system at CINECA.
b-g01-g02
directory issue the commands:
$ make_submit_slr.bash 50 $ sbatch submit.slrThis will generate and submit the submission script
submit.slr
requesting 50 nodes for
a total of 200 GPUs.
u-g01-g02
directory issue the commands
$ make_submit_slr.bash 31 $ sbatch submit.slrThis will generate and submit the submission script
submit.slr
requesting 31 nodes for
a total of 124 GPUs.
NE-RDFE
,
give the command:
$ RDFE.bash -b 1000 -u 500 g01 g02
The application script
RDFE.bash
(that you find in the bin
directory) can be used to evaluate the work from the
dhdl.xvg
files produced in the previous step, as well as to
compute the corresponding bidirectional estimates of the relative
dissociation free energy. The above command will produce the
following files in a subdirectory named Datarel
.
g01-g02_b_1000.wrk g01-g02_u_500.wrk g02-g01_b_1000.wrk g02-g01_u_500.wrk
The name of these files are indicative of the pair, the state (bound o unbound) and the duration of transition specified in the
b.mdp
and u.mdp
input files
(1000 ps for the bound state and 500 ps for the unbound state).
These files are all you need to produce your estimate for the RDFE between
the pair g01-g02 . From the main directory, re-issue the RDFE.bash
command with the option -f
:
$ RDFE.bash -b 1000 -u 500 -f g01 g02
Here, RDFE.bash
computes the RDFE for the transition of
g02 (initially fully coupled) to g01 (initially fully decoupled). For usage
of the RDFE.bash
script, just issue the command with no argumets or options.
You will get on the standard output a line with something like
Dir=Datarel Sign=1 Time_u=500 Time_b=1000 xs =3.3 g02->g01 DG_bar= -4.71 0.41 DG_ff_G= -5.4 0.8 DG_ff_J= -4.7 0.4 sig_AB_u= 0.80 sig_AB_b= 2.12 ADT_AB_u= 0.30 ADT_AB_b= 0.65 DG_rr_G= -1.8 1.2 DG_rr_J= -4.3 0.5 sig_BA_u= 1.23 sig_BA_b= 2.96 ADT_BA_u= 3.18 ADT_BA_b= 0.23 DG_fr_G= -5.85 0.65 DG_fr_J= -4.45 0.28 bias_fr= 0.0 DG_rf_G= -1.50 1.14 DG_rf_J= -5.53 0.55 bias_rf= 0.9 DG_BAR= -4.61 0.0 tb= 1000 tu= 500
The first entry, DG_bar= -4.71
, refers to
the
BAR computed RDFE in kcal/mol between g01 and g02. The negative value
indicates that g02 is a stronger binder than
g01 . There is no need for further correction to this RDFE, as
GROMACS automatically accounts in the dhdl
derivatives
for the volume-dependent contribution due to the charge change in the
alchemical transition (more specifically, this contribution is
evaluated via the function ewald_charge_correction
in
the ewald
module of the GROMACS code).
The output of the RDFE.bash
command contains a lot
of info and looks pretty awkward indeed. The meaning of the various
entries is described here . To get a
nicer output, save the output of the RDFE.bash
command
to a file (e.g. TABLE
), make a file
(e.g. fields
) with the entries (e.g. DG_bar DG_ff_J
DG_rr_J sig_AB_b sig_BA_b sig_AB_u sig_BA_u
) that you want to
tabulate and issue the command.
cat fields TABLE | make_table.awk
If you do so on the full table (TABLE_RDFE
that you find
in the data
directory) listing 21 RDFEs involving the
thirteen guests of the SAMPL9 challenge, you will get
this output . There, we have chosen to
print the BAR bidirectional estimate, the Jarzynski RDFE estimate for
the forward process (namely g02 into g01 ), the Jarzynski estimate
from the reverse process, the standard deviation forward and reverse
in the bound and unbound states. These latter quantities are related
to the dissipation of the processes and ultimately to the uncertainty
on the estimates.
NE-RDFE in a (unix) nutshell
In the following we provide the full sequence of the commands to compute a NE-RDFE on the M100 HPC system:unzip NE-RDFE.zip cd NE-RDFE cd bin # activate the scripts source SOURCE_THIS_FILE.bash # g02 to g01 transmutation in the b and u states cd ../ make_rbfe_dir.bash g01 g02 b make_rbfe_dir.bash g01 g02 u cd b-g01-g02 maketpr.bash make_submit_slr.bash 50 sbatch submit.slr cd ../u-g01-g02 maketpr.bash make_submit_slr.bash 31 sbatch submit.slr cd ../ # g01 to g02 transmutation in the b and u state make_rbfe_dir.bash g01 g02 b make_rbfe_dir.bash g01 g02 u cd b-g01-g02 maketpr.bash make_submit_slr.bash 50 sbatch submit.slr cd ../u-g01-g02 maketpr.bash make_submit_slr.bash 31 sbatch submit.slr cd ../... Wait for HPC Jobs to Finish ...# compute the RDFE RDFE.bash -b 1000 -u 500 -f g01 g02