This page was last updated on October 30th, 2019 at 11:03 pm

Docking a flexible ligand into a receptor with flexible side chains

In this tutorial we illustrate how to re-dock a known ligand into its native receptor. We will use cyclin-dependent kinase 2, CDK2 (pdbid:4EK3), and one of its ligands (pdbid:4EK4).

In this tutorial you will learn:

  • to compute affinity maps for a pocket with flexible receptor side chains
  • to run ADFR to re-dock the ligand into the receptor with 3 flexible side chains

Generate the target file containing the affinity maps.

Flexible residues can be specified on the command line using the -f/--flexRes option with a selection string, e.g. "A:ILE10,VAL32;B:SER48".
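To make the structure of the selection string concrete, here is a minimal Python sketch that splits such a string into chains and residues. It is only an illustration of the format shown above; the function name parse_flex_res is hypothetical, and the parser used inside the ADFR suite may accept other forms or behave differently.

```python
import re


def parse_flex_res(selection):
    """Split a string such as 'A:ILE10,VAL32;B:SER48' into a dict
    mapping chain id -> list of (residue name, residue number).

    Hypothetical helper for illustration; not part of the ADFR suite.
    """
    residues = {}
    # Chains are separated by semicolons, residues by commas.
    for chain_part in selection.split(";"):
        chain_part = chain_part.strip()
        if not chain_part:
            continue
        chain_id, _, res_list = chain_part.partition(":")
        chain_id = chain_id.strip()
        for token in res_list.split(","):
            # Assumes a three-letter residue name followed by its number.
            match = re.fullmatch(r"([A-Za-z]{3})(-?\d+)", token.strip())
            if match is None:
                raise ValueError(f"cannot parse residue {token!r}")
            name, number = match.group(1).upper(), int(match.group(2))
            residues.setdefault(chain_id, []).append((name, number))
    return residues


print(parse_flex_res("A:ILE10,VAL32;B:SER48"))
```

The sketch assumes three-letter residue names and tolerates spaces around the separators.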

Details: a target file for the pocket of PDB id 4EK3 binding the crystallographic ligand can be computed using:

However, agfr will fail to generate the target file in this case because the docking box, defined with the default padding of 4 Angstroms around the ligand, is too small to cover the flexible side chains.

To correct this we will increase the padding to 8.0 Angstroms. To reduce calculation time for the tutorial we also tell agfr to only compute maps for atom types present in the ligand (-m ligand) and to use the AutoSite 1.0 algorithm (-as).
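The effect of the padding can be pictured with a small Python sketch. It assumes the box is the axis-aligned bounding box of the ligand atoms grown by the padding on every side, which may not match exactly how agfr builds its box. The coordinates are made up for illustration and are not taken from 4EK3 or 4EK4.

```python
def padded_box(ligand_coords, padding):
    """Return (lower, upper) corners of the ligand bounding box plus padding."""
    xs, ys, zs = zip(*ligand_coords)
    lower = (min(xs) - padding, min(ys) - padding, min(zs) - padding)
    upper = (max(xs) + padding, max(ys) + padding, max(zs) + padding)
    return lower, upper


def atoms_outside(box, atoms):
    """List the atoms whose coordinates fall outside the box."""
    lower, upper = box
    return [
        atom for atom in atoms
        if not all(lo <= c <= hi for lo, c, hi in zip(lower, atom, upper))
    ]


# Hypothetical coordinates in Angstroms, for illustration only.
ligand = [(0.0, 0.0, 0.0), (2.0, 1.0, 0.0), (3.0, 3.0, 1.0)]
side_chain_atoms = [(8.5, 2.0, 0.0), (-6.0, 1.0, 0.5)]

for padding in (4.0, 8.0):
    missing = atoms_outside(padded_box(ligand, padding), side_chain_atoms)
    print(f"padding {padding}: {len(missing)} side-chain atom(s) outside the box")
```

With these toy coordinates both side-chain atoms fall outside the box at a padding of 4 and inside it at a padding of 8, which mirrors the failure and the fix described above.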

-r : specifies the receptor file

-l : specifies the ligand file

-f : indicates the receptor side chains to be made flexible. These atoms will not contribute to the calculation of the grids. To specify residues in different chains, separate the chains with a semicolon (";"), e.g. "A:PHE80 ; B:LYS89".

-m : specifies that maps should only be calculated for atom types present in the ligand

-o : specifies the name of the target file to create
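As a recap of the options listed above, the following Python sketch assembles an agfr argument list from them. The file names 4EK3_rec.pdbqt and 4EK4_lig.pdbqt and the output name 4EK3_flexRec are hypothetical placeholders, and the helper build_agfr_args is not part of the ADFR suite. The padding and AutoSite options are left out because the text above does not spell out their exact command-line form.

```python
import shlex


def build_agfr_args(receptor, ligand, flex_res, output, ligand_types_only=True):
    """Assemble an agfr argument list from the options described above.

    Hypothetical helper; it only builds the list and does not run agfr.
    """
    args = ["agfr", "-r", receptor, "-l", ligand, "-f", flex_res]
    if ligand_types_only:
        # Only compute maps for atom types present in the ligand.
        args += ["-m", "ligand"]
    args += ["-o", output]
    return args


# File names below are placeholders, not the tutorial's actual files.
args = build_agfr_args("4EK3_rec.pdbqt", "4EK4_lig.pdbqt", "A:ILE10,LYS33", "4EK3_flexRec")
print(shlex.join(args))
```

Building the command as a list rather than a single string avoids quoting problems if it is later passed to subprocess.run.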

The command produces the following output:

Display the meta-data from a target file

Details:

Perform the docking

Details: ADFR will dock the ligand into the receptor while treating the 2 side chains A:ILE10,LYS33 as flexible. adfr detects the number of cores available and by default uses them all. Here it performs 8 independent searches (--nbRuns 8), each using up to 20,000 evaluations of the scoring function (--maxEvals 2000). By default adfr performs 50 searches, each allotted 2.5 million evaluations. Typically, more complex docking problems require more searches to increase the chances of finding the best possible docked pose (i.e. the global minimum of the scoring function). Here we set these parameters to lower values to perform a quick run that is sufficient to illustrate the docking principles.

Running this command generates the following 3 files.

  • 4EK4_random_flexRec_summary.dlg : Docking log file; captures most of the messages printed to stdout and lists additional clustering information
  • 4EK4_random_flexRec_out.pdbqt : Multi-model pose file, listing the solutions
  • 4EK4_random_flexRec.dro : Docking Result Object file, containing the input, output and meta-data for this docking
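The naming pattern of these three files can be written down as a short Python sketch. It assumes that 4EK4_random is the ligand name and flexRec the job name, inferred from the file names listed above. The helper expected_output_files is hypothetical, and the naming used when no job name is given is not covered.

```python
def expected_output_files(ligand_name, job_name):
    """Return the three output file names for a ligand and a job name.

    Hypothetical helper that mirrors the pattern seen in this tutorial.
    """
    base = f"{ligand_name}_{job_name}"
    return [
        f"{base}_summary.dlg",  # docking log file
        f"{base}_out.pdbqt",    # multi-model pose file
        f"{base}.dro",          # Docking Result Object file
    ]


for name in expected_output_files("4EK4_random", "flexRec"):
    print(name)
```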

NOTES:

  • The output files are named using the ligand name followed by the job name (--jobName, if specified).
  • ADFR's search procedure is stochastic, meaning that docking the same ligand into the same target twice can produce different results if different random number generator seeds are used. However, the energy landscape for this receptor and ligand is the same in both runs. If both docking runs find the global minimum of this energy landscape, the solutions produced by both runs will be the same, independently of the paths taken by the search to get there. On the other hand, searches that get trapped in a local minimum yield docking poses that differ from each other. Specifying the seed used by the random number generator (--seed) allows reproducing a docking calculation, for a given version of the code.
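The role of the seed can be illustrated with a toy stochastic search in Python. This is not ADFR's search method: it is a simple random-walk minimizer on a made-up one-dimensional energy function, shown only to make the point that a fixed seed reproduces the same result while different seeds may end in different minima.

```python
import math
import random


def toy_energy(x):
    """Made-up 1D energy landscape with several local minima."""
    return math.sin(3.0 * x) + 0.1 * x * x


def stochastic_search(seed, max_evals=200):
    """Greedy random walk: keep a candidate only if it lowers the energy."""
    rng = random.Random(seed)  # private generator, fully determined by the seed
    best_x = rng.uniform(-5.0, 5.0)
    best_e = toy_energy(best_x)
    for _ in range(max_evals - 1):
        candidate = best_x + rng.gauss(0.0, 0.3)
        energy = toy_energy(candidate)
        if energy < best_e:
            best_x, best_e = candidate, energy
    return best_x, best_e


# The same seed gives an identical result; other seeds may land elsewhere.
print(stochastic_search(seed=1) == stochastic_search(seed=1))
for seed in (1, 2, 3):
    x, e = stochastic_search(seed)
    print(f"seed {seed}: x = {x:.3f}, energy = {e:.3f}")
```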

The output of the command is listed below:

Here we describe line by line the messages output during the docking procedure.

  1. Hostname and platform architecture on which the program is running
  2. Date and time of execution
  3. Ligand docked
  4. Number of detected and used cores
  5. Target files used

NOTES:

  • Number of cores. By default ADFR uses all available cores to run the searches of a docking run in parallel. Use the -c command line option to limit the number of cores used.

By default, ADFR performs 50 independent searches, i.e. 50 evolutions of a population of 100 individuals using a Genetic Algorithm (GA). In this example we intentionally reduced this number to 8 very short runs.

Lines 1-4 display a progress bar indicating the percentage of these runs that completed.

The lines below provide statistics on the termination status of these searches. ADFR implements several termination criteria in its search method. In this example all searches terminated because they reached their maximum number of evaluations. The default number of evaluations is 2.5 million, which is rarely reached because other termination criteria usually apply first: convergence of the population, meaning that there is no more diversity in the population and the chances of discovering new solutions have become small; or lack of improvement, meaning that the population still has diversity (i.e. it contains multiple competitive solutions) but none of these solutions has improved over a user-defined number of generations (default 5).

Typically, you want searches to end because the population converged or there was no improvement. A result like the one shown here is a clear indication that this docking problem needs more evaluations per search (i.e. an increased --maxEvals).
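The three termination criteria described above can be sketched as a small Python function. This is an illustration under stated assumptions, not ADFR's implementation: diversity is approximated here by the spread of scores in the population, lower scores are taken to be better, and the order in which the criteria are checked is arbitrary. All names are hypothetical.

```python
def termination_reason(n_evals, max_evals, scores, best_history,
                       patience=5, tol=1e-6):
    """Return why a search should stop, or None if it should continue.

    scores: current population scores (lower is better, an assumption).
    best_history: best score recorded at the end of each generation.
    """
    if n_evals >= max_evals:
        return "max_evals"
    # Assumed convergence test: all scores are essentially identical.
    if max(scores) - min(scores) <= tol:
        return "converged"
    # No improvement: the best score is no lower than `patience` generations ago.
    if len(best_history) > patience and best_history[-1] >= best_history[-1 - patience]:
        return "no_improvement"
    return None


print(termination_reason(2000, 2000, [-5.0, -4.0], [-5.0]))
print(termination_reason(100, 2000, [-5.0, -5.0, -5.0], [-5.0]))
print(termination_reason(100, 2000, [-5.0, -3.0], [-5.0] * 6))
print(termination_reason(100, 2000, [-5.0, -3.0], [-4.0, -4.5, -5.0]))
```

In a run like the one in this tutorial, every search would report the first reason, which is the signal that the evaluation budget was too small.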

The next section lists the results:

In this docking run, the 8 searches led to 4 distinct solutions, listed in the result table above. The solutions are sorted by descending predicted affinity. The top ranking solution was identified by 4 of the 8 searches (clust. size column) and the pose with the best affinity was found by search number 3 (best run column). The second best solution was found 1 time and has an RMSD of 2.8 Angstroms with the top ranking solution (clust. rmsd column). If a reference ligand pose had been specified (-r/--reference option), the ref. rmsd column would list the RMSD between the docked pose and the reference structure instead of listing -1.

NOTE: RMSD values are calculated using all isomorphisms between the 2 molecules, thus matching symmetry-related atoms and providing a more accurate measure than that used in AutoDock4, Vina, and previous versions of AutoDockFR.
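The idea behind an isomorphism-aware RMSD can be shown with a minimal Python sketch. It assumes the list of isomorphisms (atom index mappings) is already known and supplied by hand; a real tool would derive them from the molecular graph. The three-atom example, a carbon with two equivalent oxygens, uses made-up coordinates, and none of this is ADFR's actual code.

```python
import math


def rmsd(pose_a, pose_b, mapping):
    """RMSD where atom i of pose_a is compared with atom mapping[i] of pose_b."""
    total = sum(
        (pose_a[i][k] - pose_b[j][k]) ** 2
        for i, j in enumerate(mapping)
        for k in range(3)
    )
    return math.sqrt(total / len(mapping))


def symmetric_rmsd(pose_a, pose_b, isomorphisms):
    """Smallest RMSD over all supplied atom mappings."""
    return min(rmsd(pose_a, pose_b, mapping) for mapping in isomorphisms)


# Atom order: C, O1, O2. The two poses differ only by swapping the oxygens.
pose_a = [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0), (1.0, -1.0, 0.0)]
pose_b = [(0.0, 0.0, 0.0), (1.0, -1.0, 0.0), (1.0, 1.0, 0.0)]
isomorphisms = [[0, 1, 2], [0, 2, 1]]  # identity and oxygen swap

print("naive RMSD:    ", rmsd(pose_a, pose_b, [0, 1, 2]))
print("symmetric RMSD:", symmetric_rmsd(pose_a, pose_b, isomorphisms))
```

The naive RMSD reports a difference between two poses that are physically identical, while the minimum over isomorphisms correctly gives zero.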