This page was last updated on October 30th, 2019 at 10:36 pm
A simple re-docking tutorial
In this tutorial we illustrate how to re-dock a known ligand into its native receptor. We will use cyclin-dependent kinase 2, CDK2 (pdbid:4EK3), and one of its ligands (pdbid:4EK4).
In this tutorial you will learn:
- to generate a target file for a docking in a box defined by the known ligand
- to run ADFR to re-dock the ligand
- to understand the output of an ADFR docking run
Generate the target file containing the affinity maps.
Alternatively, you can use agfrgui and follow the “https://ccsb.scripps.edu/agfr/boxligand” tutorial
Details: the agfr command is used to position and size the docking box over the receptor pocket(s) into which we want to dock a ligand, and to compute affinity maps for a given list of AutoDock4 atom types. The maps are calculated by the AutoGrid4 program.
arguments:
-r : specifies the receptor file
-l : specifies the ligand file
-o : specifies the name for the target files
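As a rough illustration of how the three arguments fit together, the sketch below assembles an agfr argument list in Python without running it. Only the -r, -l and -o flags come from the list above. The helper name agfr_command, the receptor file name receptor.pdbqt and the output name ligPocket are assumptions made for this example, so substitute the files you actually have.

```python
import shlex


def agfr_command(receptor, ligand, output):
    """Return the argument list for an agfr run that boxes a known ligand.

    Hypothetical helper: it only assembles the -r/-l/-o arguments
    described in this tutorial and does not execute anything.
    """
    return ["agfr", "-r", receptor, "-l", ligand, "-o", output]


# File and output names are placeholders (assumptions), not tutorial data.
cmd = agfr_command("receptor.pdbqt", "data/4EK4_lig.pdbqt", "ligPocket")
print(shlex.join(cmd))
```

The resulting list could be handed to subprocess.run on a machine where the ADFR suite is installed.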
The receptor is specified using the -r/--receptor command line option. This option is required. The position and size of the box can be specified in a variety of ways using the -b/--boxMode option. In the example above, agfr creates the box as the bounding box of the given (known) ligand atoms (-l/--ligand). By default, a padding of 4 Å is added to every side of the bounding box. The padding value can be modified using the -P/--padding option.
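The box construction described here is simple geometry. The following minimal sketch computes an axis-aligned bounding box around a set of ligand atom coordinates and pads it by 4 Å on every side. The function name ligand_box and the toy coordinates are assumptions for illustration; this is not agfr's own code.

```python
def ligand_box(coords, padding=4.0):
    """Return (center, size) of the padded bounding box around coords.

    coords is a list of (x, y, z) tuples in Angstroms; padding is added
    on every side, so each box dimension grows by 2 * padding.
    """
    xs, ys, zs = zip(*coords)
    mins = [min(axis) - padding for axis in (xs, ys, zs)]
    maxs = [max(axis) + padding for axis in (xs, ys, zs)]
    center = tuple((lo + hi) / 2.0 for lo, hi in zip(mins, maxs))
    size = tuple(hi - lo for lo, hi in zip(mins, maxs))
    return center, size


# Toy coordinates, not taken from 4EK4.
atoms = [(0.0, 0.0, 0.0), (2.0, 4.0, 6.0), (1.0, 1.0, 1.0)]
center, size = ligand_box(atoms)
print(center)  # (1.0, 2.0, 3.0)
print(size)    # (10.0, 12.0, 14.0)
```

The ligand spans 2 x 4 x 6 Å, and the padding adds 8 Å to each dimension while leaving the center unchanged.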
The agfr command generates a target file with a .trg extension. This file will be saved under the name specified by the -o/--output command line option. When omitted, a unique and descriptive filename will be created automatically. The target file contains the calculated affinity maps, translational points for placing the ligand in sensible places in the docking box, and meta-data about the grid size, position, spacing, receptor atoms involved, affinity maps, etc. These files can be inspected using the about command.
The translational points are computed using the AutoSite program. This program analyzes the affinity maps and clusters high-affinity points to identify groups of points modeling potential binding pockets. The -p/--pocketMode option specifies how to handle multiple clusters of affinity points representing the pockets found in the docking box. Since this option is omitted here, all clusters are merged to create a single set of translational points.
By default, maps are computed for all AutoDock4 atom types. The list of atom types for which to compute affinity maps can be set using the -m/--mapTypes option. Generating affinity maps for fewer atom types produces smaller target files and takes less time to compute. However, such a target file cannot be used for docking ligands containing atom types for which the target file does not contain an affinity map.
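The constraint on map types amounts to a set comparison: every atom type in the ligand must have a map in the target. The sketch below shows that check under the assumption that both lists of type names are already known. The function name missing_map_types and the example type lists are hypothetical, and nothing here reads a real target file.

```python
def missing_map_types(ligand_atom_types, target_map_types):
    """Return the ligand atom types that have no affinity map in the target."""
    return sorted(set(ligand_atom_types) - set(target_map_types))


# Example AutoDock4 type names chosen for illustration only.
target_maps = {"C", "A", "N", "OA", "HD"}
ligand_types = ["C", "A", "N", "NA", "OA", "HD", "F"]

missing = missing_map_types(ligand_types, target_maps)
print(missing)  # ['F', 'NA']
```

An empty result means the target covers the ligand. Anything else names the maps that would have to be added.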
Running this command generates the following output (saved in ligPocket.log)
Display information about a target file
Details: the target file meta data is read and displayed.
The command produces the following output:
Dock the randomized ligand using the generated target file
Details: adfr detects the number of available cores and, by default, uses all of them. In this run it performs 8 independent searches (--nbRuns 8), each using up to 200,000 evaluations of the scoring function (--maxEvals 200000). By default adfr performs 50 searches, each allotted 2.5 million evaluations. Typically, more complex docking problems require more searches to increase the chance of finding the best possible docked pose (i.e. the global minimum of the scoring function). Here we set these parameters to lower values to perform a quick run that is sufficient to illustrate the docking principles.
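To get a feel for how much smaller this quick run is than a default one, the arithmetic can be written out. The sketch below multiplies the number of searches by the evaluations allotted to each, using the values quoted in the paragraph above. The function name total_evaluations is an assumption, and the totals are upper bounds because a search may stop before using its full allotment.

```python
def total_evaluations(nb_runs, max_evals):
    """Upper bound on scoring-function evaluations for a docking run."""
    return nb_runs * max_evals


quick = total_evaluations(8, 200_000)       # settings used in this tutorial
default = total_evaluations(50, 2_500_000)  # adfr defaults quoted above

print(quick)            # 1600000
print(default)          # 125000000
print(default / quick)  # 78.125
```

The quick run is therefore allowed roughly 78 times fewer evaluations than a default run.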
This calculation generates the following files:
- 4EK4_random_rigid_summary.dlg # the docking log file that captures most of what is displayed on the terminal
- 4EK4_random_rigid_out.pdbqt # the docking pose file containing the docking solutions
- 4EK4_random_rigid.dro # the docking object file that contains input, output, and meta-data for this docking run
NOTES:
- The output files are named using the ligand name followed by the job name (--jobName, if specified).
- ADFR’s search procedure is stochastic, meaning that docking the same ligand into the same target twice can produce different results if different random number generator seeds are used. However, the energy landscape for this receptor and ligand is the same in both runs. If both docking runs find the global minimum of this energy landscape, the solutions produced by both runs will be the same, independently of the paths taken by the search to get there. On the other hand, searches that get trapped in local minima yield docking poses that differ from each other. Specifying the seed used by the random number generator (--seed) allows reproducing a docking calculation, for a given version of the code.
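The effect of the seed can be shown with a toy problem. The sketch below runs a simple stochastic descent on a one-dimensional energy function with several local minima. The energy function, the search routine and all parameter values are assumptions made for illustration and bear no relation to ADFR's genetic algorithm. The point is only that the same seed reproduces the same result, while a different seed may end in a different minimum.

```python
import math
import random


def energy(x):
    """Toy 1-D energy landscape with several local minima."""
    return 0.1 * x * x + math.sin(3.0 * x)


def random_search(seed, steps=2000, step_size=0.3):
    """Greedy stochastic descent from a random start; returns (x, energy)."""
    rng = random.Random(seed)  # private generator, fully determined by seed
    best_x = rng.uniform(-5.0, 5.0)
    best_e = energy(best_x)
    for _ in range(steps):
        candidate = best_x + rng.gauss(0.0, step_size)
        e = energy(candidate)
        if e < best_e:  # accept only moves that lower the energy
            best_x, best_e = candidate, e
    return best_x, best_e


first = random_search(seed=1)
repeat = random_search(seed=1)
other = random_search(seed=2)

print(first == repeat)  # True: same seed, same trajectory, same answer
print(first, other)     # may or may not agree, depending on where each run ends
```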
The output of the command is listed below:
Here we describe line by line the messages output during the docking procedure.
- Hostname and platform architecture on which the program is running
- Date and time of execution
- Ligand docked
- Number of detected and used cores
- Target files used
- Reference ligand used
NOTES:
- Number of cores. By default ADFR uses all available cores to run the searches of a docking run in parallel. Use the “-c” command line option to limit the number of cores used.
- Reference ligand. It is not uncommon to start a docking project by re-docking a known ligand to gain confidence that the docking process is working for this complex. In such a case specifying a reference ligand allows estimation of success as the RMSD of the docked poses to the reference will be listed in the output.
By default, ADFR performs 50 independent searches, i.e. 50 evolutions of a population of 100 individuals using a Genetic Algorithm (GA). In this example we intentionally reduced this number to 8 very short runs.
Lines 1-4 display a progress bar indicating the percentage of these runs that completed.
The lines below provide statistics on the termination status of these searches. ADFR implements several termination criteria in its search method. In this example all searches terminated because they reached their maximum number of evaluations. The default number of evaluations is 2.5 million and is rarely reached because other termination criteria usually apply first: convergence of the population, meaning that there is no more diversity in the population and the chance of discovering new solutions has become small; or no improvement, meaning that the population still has diversity (i.e. it contains multiple competitive solutions) but none of these solutions has improved over a user-defined number of generations (default 5).
Typically, you want searches to end because the population converged or there was no improvement. A result like the one shown here is a clear indication that this docking problem needs more evaluations per search (i.e. a larger --maxEvals value).
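The three termination criteria can be written as a small decision function. This is a sketch under stated assumptions. Convergence is approximated here by the spread of scores in the population, which is a simplification rather than ADFR's actual diversity measure. The function name, the spread_tol threshold and the status strings are hypothetical. Scores are treated as energies, so lower is better.

```python
def termination_status(evals_used, max_evals, best_scores, population_scores,
                       patience=5, spread_tol=1e-3):
    """Return the reason a search should stop, or None to keep searching.

    best_scores holds the best score of each generation so far;
    population_scores holds the scores of the current population.
    """
    if evals_used >= max_evals:
        return "max evaluations reached"
    if max(population_scores) - min(population_scores) < spread_tol:
        return "population converged"  # no diversity left (assumed criterion)
    if (len(best_scores) > patience
            and min(best_scores[-patience:]) >= min(best_scores[:-patience])):
        return "no improvement"  # best score has stalled for `patience` generations
    return None


print(termination_status(200_000, 200_000, [-5.0], [-5.0, -3.0]))
print(termination_status(10, 100, [-5.0], [-5.0, -5.0]))
print(termination_status(10, 100, [-1, -2, -3, -3, -3, -3, -3, -3], [-3.0, -1.0]))
print(termination_status(10, 100, [-1, -2, -3, -4, -5, -6, -7], [-7.0, -1.0]))
```

The four calls exercise, in order, the evaluation limit, convergence, stalled improvement, and a search that should continue.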
The next section lists the results:
In this docking run, the 8 searches led to 3 distinct solutions, listed in the result table above. The solutions are sorted by descending predicted affinity. The top-ranking solution was identified by 4 of the 8 searches (clust. size column) and the pose with the best affinity was found by search number 4 (best run column). The second-best solution was found 3 times, and the best pose in this cluster of 3 search results has an RMSD of 5.0 Å from the top-ranking solution (clust. rmsd column). If a reference ligand pose had been specified (-r/referenceLigand option), the ref. rmsd column would list the RMSD between the docked pose and the reference structure instead of listing -1.
NOTE: RMSD values are calculated using all isomorphisms between the 2 molecules, thus matching symmetry-related atoms and providing a more accurate measure than that used in AutoDock4, Vina, and previous versions of AutoDockFR.
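One common way to turn a set of search results into a table of distinct solutions is greedy clustering by RMSD. The sketch below illustrates that idea and is not a description of how ADFR itself clusters. The 2.0 Å cutoff, the plain (symmetry-unaware) RMSD, the function names and the single-atom toy poses are all assumptions.

```python
import math


def rmsd(a, b):
    """Plain RMSD between two equally ordered lists of (x, y, z) tuples."""
    sq = sum((p - q) ** 2 for pa, pb in zip(a, b) for p, q in zip(pa, pb))
    return math.sqrt(sq / len(a))


def cluster_poses(results, cutoff=2.0):
    """Greedily cluster (run_id, energy, coords) results, best energy first."""
    clusters = []
    for run_id, e, coords in sorted(results, key=lambda r: r[1]):
        for c in clusters:
            if rmsd(coords, c["coords"]) <= cutoff:
                c["size"] += 1  # pose joins an existing cluster
                break
        else:
            # no cluster is close enough: this pose seeds a new one
            clusters.append({"best_run": run_id, "energy": e,
                             "coords": coords, "size": 1})
    return clusters


# Toy single-atom "poses"; energies and coordinates are made up.
results = [
    (1, -9.8, [(0.0, 0.0, 0.0)]),
    (2, -7.1, [(5.0, 0.0, 0.0)]),
    (3, -10.2, [(0.5, 0.0, 0.0)]),
    (4, -9.9, [(0.0, 0.5, 0.0)]),
]
clusters = cluster_poses(results)
print([(c["best_run"], c["size"]) for c in clusters])  # [(3, 3), (2, 1)]
```

Because poses are visited from best to worst energy, the first member of each cluster is its best pose, which corresponds to the best run and clust. size columns of the table.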
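A small example shows why matching symmetry-related atoms matters. In the sketch below, the RMSD is taken as the minimum over a list of atom mappings supplied by hand. A real implementation would enumerate these mappings from the molecular graph, and that step is not shown. The function names and the three-atom toy molecule are assumptions.

```python
import math


def rmsd_with_mapping(ref, pose, mapping):
    """RMSD where atom i of ref is paired with atom mapping[i] of pose."""
    sq = sum((a - b) ** 2
             for i, j in enumerate(mapping)
             for a, b in zip(ref[i], pose[j]))
    return math.sqrt(sq / len(ref))


def symmetric_rmsd(ref, pose, mappings):
    """Smallest RMSD over all supplied symmetry-equivalent atom mappings."""
    return min(rmsd_with_mapping(ref, pose, m) for m in mappings)


# Atoms 1 and 2 are chemically equivalent (think of two carboxylate oxygens).
ref = [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0), (1.0, -1.0, 0.0)]
pose = [(0.0, 0.0, 0.0), (1.0, -1.0, 0.0), (1.0, 1.0, 0.0)]  # same pose, labels swapped

identity = [0, 1, 2]
swapped = [0, 2, 1]

naive = rmsd_with_mapping(ref, pose, identity)
aware = symmetric_rmsd(ref, pose, [identity, swapped])
print(naive)  # about 1.63, although the two poses are physically identical
print(aware)  # 0.0
```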
Display information about a docking result
Details: the meta data about this docking run is displayed
The command
leads to 3 distinct solutions. Can you modify the command to reliably re-dock the randomized ligand into its crystallographic pose?
The crystal structure of the ligand is in data/4EK4_lig.pdbqt