DockVision Cookbook

For DockVision Version 1.0.3

  Copyright 1998. The University of Alberta.
Please email questions and comments to:
support@dockvision.com

Contents

1. Introduction
2. The Recipes
2.1 Generating Ligand Conformations
2.2 Single Ligand Docking
2.3 Single Ligand Docking from a Starting Position
2.4 Single Ligand Docking using a Conformer Database
2.5 Screening of a Ligand Database

1. Introduction

The DockVision package is an integrated docking package featuring the Research and Gamma docking programs, both developed at the University of Alberta by Trevor Hart, Steven Ness and Randy Read. This document is intended to help users get the most out of DockVision. DockVision was designed as an easy-to-use yet flexible interface that allows many different types of simulations to be performed. Below, you'll find descriptions of the more common types. If you don't find what you're looking for here, it may still be possible, so please don't hesitate to contact us at support@dockvision.com. And if you can't perform the calculation you want with DockVision now, please tell us what you need so we can incorporate it into a future version.

2. The Recipes

DockVision has been designed as a tool to handle many different kinds of docking and related problems. Among other things, DockVision can be used for

generating ligand conformations

single ligand docking

single ligand docking from a starting position

single ligand docking using a conformer database

screening of a ligand database

There is a considerable amount of repetition between these examples, as they are presented so as to be studied separately from one another.

2.1 Generating Ligand Conformations

DockVision can be run without explicitly specifying a target protein. In effect, this allows it to be used to generate a database of conformers for the ligand, which may be used for docking (see section 2.4) or for other applications.

The basic procedure will be to read in a ligand, make it flexible, turn off all target protein effects, and then perform Monte Carlo simulated annealing to search for minima of the ligand by itself (using only the intra-molecular energy). The setting of the multiple-start procedure will control the number of conformations that are generated. These will be output into a multiple-PDB file, which can be used as input for other docking calculations.

Here are the steps for generating a conformation database:

1. Prepare your ligand coordinate file. You don't need hydrogens to generate conformations, but if you want to do docking later, you will need to add hydrogens now. Use the addh program:

		addh -i ligand.pdb -o ligand_h.pdb
assuming your starting ligand coordinate file is "ligand.pdb". This program will add only polar hydrogens, and save the result in "ligand_h.pdb". (If your ligand already has full hydrogens, then this will work also. In general, there is only a modest performance loss in having full hydrogens instead of only polar hydrogens.)

2. Run DockVision. Make sure the mode pulldown is set to "Research" (the default). In the Setup screen, choose "Research Potential Function". Type a name for your run (this will be used for the name of the .out and .log files).

3. Go to the Ligand screen. Select the ligand file with the added hydrogens ("ligand_h.pdb" generated in step 1). Under "Parameters", you want "Generate Automatically".

4. Go to the Flexibility screen. Select "Flexible Ligand". Under "Topology" choose "Generate Automatically". Under "Conformers", select "Generate Automatically". To the right, there is an input for "Num Confs". This will determine the number of randomly generated conformers which will be the starting point for the conformation search procedure. A good rule of thumb is to select this number to be equal to the total number of conformations you expect. Thus, if you want a database of 30 conformations for a particular ligand, select "Num Confs" to be 30. (However, the actual number generated will be determined elsewhere.)

5. Go to the Target screen and select "no target".

6. Go to the Grid screen and select both "No Grid" and "No Near Grid".

7. Go to the Options screen. Under "Constraint", select "Build", then enter (X,Y,Z) = (0,0,0) and "Sphere Radius" = 1000. This generates a large constraint region. Save this to a file then select "back" to go back to the main Options screen.

8. Back in the Options screen. Now under "Schedule", select "Build". Here the annealing schedule is built to control how the Monte Carlo algorithm runs. For conformation searching, we want some high-temperature steps to randomize and search configurational space, followed by low-temperature steps to minimize the configuration. The best schedule will depend on the nature of the ligand, but generally the larger the ligand and the greater the number of rotatable bonds, the greater the number of steps that are required. Good numbers for "average" ligands would be:

   high T: Temperature=500, Steps=500,  Maximum Rotation=20e-6, Maximum Translation=0
   low  T: Temperature=1,   Steps=1000, Maximum Rotation=8e-6,  Maximum Translation=0
Enter these values, in this order, in the "Schedule Builder" screen, save in a file, and then hit "back". We select tiny values such as "20e-6" for the rotation because "Maximum Rotation" controls both the rotation of the ligand as a whole and the torsional degrees of freedom, and here we only want the torsional degrees of freedom to move. We correct for the small values in the next step.

9. Back in the Options screen. Select "Refine Mode" and "Conformational Energy" (so the "yellow lights" appear). Choose "torconv" to be the value "1.0e+6". This parameter controls how much the torsional degrees of freedom are changed relative to the overall rotation. In step 8, the maximum rotation was set to the values 20e-6 and 8e-6, both negligible numbers. However, by scaling the torsions by the large number 1.0e+6, the torsions are changed by 20 degrees in the first Monte Carlo stage and 8 degrees in the second, with virtually no change in the overall orientation (thus making it easier to compare different conformations).

10. Still in the Options screen. Select "ntrials" to be roughly the number of conformations you wish to get. This parameter, along with "ecut", will determine the number of output conformations that are generated. "Ntrials" will be the number of independent Monte Carlo runs to be initiated, while "ecut" is the final energy value for saving into the output file (runs generating energy values higher than this value will not be saved). Since most intra-molecular calculations with the Research Potential Function will result in positive energies, you should set "ecut" to a positive value. In general, a bit of experimentation will yield a good set of parameters. For an "average" ligand, a good starting set might be ntrials=100 and ecut=100.

11. Go to the Research screen. Select "Run". If no errors are encountered, the program may take a few minutes to complete. If your output file is empty, then select a larger (more positive) value for "ecut" and select "Run" again.

When the run has been completed, select the Tools mode. The RMS Tool, Energy Histogram and Cluster Output screens may be useful in interpreting the results.
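
The relationship between "Maximum Rotation" and "torconv" in steps 8 and 9 can be checked with a little arithmetic. The Python sketch below is not part of DockVision. It assumes, as step 9 describes, that the effective torsional step is simply the product of "Maximum Rotation" and "torconv". The function and variable names are illustrative only.

```python
# Illustrative arithmetic only; not a DockVision API.
# Assumption (from step 9): effective torsion step = Maximum Rotation * torconv.

def effective_torsion_step(max_rotation, torconv):
    """Return the approximate maximum torsional change per move, in degrees."""
    return max_rotation * torconv

TORCONV = 1.0e+6  # value entered in step 9

# (label, Temperature, Steps, Maximum Rotation, Maximum Translation) from step 8
schedule = [
    ("high T", 500, 500, 20e-6, 0),
    ("low  T", 1, 1000, 8e-6, 0),
]

for label, temperature, steps, max_rot, max_trans in schedule:
    torsion = effective_torsion_step(max_rot, TORCONV)
    print(f"{label}: rigid-body rotation <= {max_rot:g} deg, "
          f"torsion step ~ {torsion:g} deg over {steps} steps")
```

If you change either "Maximum Rotation" or "torconv", rerunning this check is a quick way to confirm that the torsional step is still the size you intend while the rigid-body rotation stays negligible.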


2.2 Single Ligand Docking

The problem of performing a single-ligand docking calculation is the principal idea behind the Research docking algorithm. We'll assume for this example that the ligand is flexible and that we don't know what the binding conformation will be (hence we have to search the translational, rotational and torsional degrees of freedom).

Here are the steps for performing a general single-ligand docking:

1. Prepare your ligand and target coordinate files. If the ligand file doesn't have polar hydrogens, add them with the addh program:

		addh -i ligand.pdb -o ligand_h.pdb
The new file "ligand_h.pdb" will be used for docking. (If your ligand already has full hydrogens, this will work also.) For the target, you may already have polar or full hydrogens added by another program. If not, run the hydroman program:
		hydroman target.pdb target_h.pdb
You can also specify the pH by using the optional "-p" flag. (This program can also be run within DockVision under "Tools/Add Hydrogens".) Unlike addh, which is a general algorithm, hydroman makes use of specific peptide characteristics and assigns hydrogens based on known properties of amino acids. If the protein target contains non-peptide residues, then place the peptide and non-peptide parts in separate files, use hydroman on the peptide part, addh on the non-peptide part, and merge the resulting files into a single target file. (If the non-peptide part does not require polar hydrogens, then simply run hydroman, as it ignores non-peptide residues.)

2. Determine the binding site. For DockVision to search effectively, the binding site must be identified. Best results are obtained when the XYZ coordinates of a point in the middle of the binding site can be identified. A simple way to do this is to use computer graphics to manually position an average-sized ligand in the middle of the site, then determine the XYZ coordinates of a central atom. If a crystallographic or modelled structure of a bound inhibitor is available, then the XYZ of a central atom will work equally well. Make a note of this value as the information will be needed later.

3. Run DockVision. Make sure the mode pulldown is set to "Research" (the default). In the Setup screen, choose "Research Potential Function". Type a name for your run (this will be used for the name of the .out and .log files).

4. Go to the Ligand screen. Select the ligand file with the added hydrogens ("ligand_h.pdb" generated in step 1). Under "Parameters", you want "Generate Automatically".

5. Go to the Flexibility screen. Select "Flexible Ligand". Under "Topology" choose "Generate Automatically". Under "Conformers", select "Generate Automatically". To the right, there is an input for "Num Confs". This will determine the number of randomly generated conformers used in the Monte Carlo search procedure. Unless the ligand has more than 8 rotatable bonds, use the default value of 100.

6. Go to the Target screen. Select the target file with the added hydrogens ("target_h.pdb" generated in step 1). Under "Parameters", you want "Generate Automatically".

7. Go to the Grid screen. Select "Auto Generate Grid" and turn off "Auto Bounds" immediately below. The form on the right should appear. Enter the XYZ values from step 2, leaving the others in their default values (0.5, 50,50,50). This generates a cubic grid centered on your XYZ coordinates and 25 angstroms per side. Select "No Near Grid".

8. Go to the Options screen. Under "Constraint", select "Build", then enter the XYZ coordinates from step 2. Set the "Sphere Radius" = 6.0. Save and hit "back".

9. Back in the Options screen. Now under "Schedule", select "Build". Here the annealing schedule is built to control how the Monte Carlo algorithm runs. For multiple-start docking, it is usually most efficient to use only low temperature steps, since the generation of the conformers (step 5) removes the need to perform conformational searching. The following schedule is recommended:

   stage 1: Temperature=1, Steps=500, Maximum Rotation=12, Maximum Translation=2
   stage 2: Temperature=1, Steps=500, Maximum Rotation=6,  Maximum Translation=1
Enter these values, in this order, in the "Schedule Builder" screen, save in a file, and then hit "back".

10. Back in the Options screen. Select "Conformational Energy" (so the "yellow light" appears). The parameters "ntrials" and "ecut" will determine the number of output conformations that are generated. "Ntrials" will be the number of independent Monte Carlo runs to be initiated. In general, a good rule of thumb is 50 trials per ligand conformer. Thus, if you selected "Num Confs" to be 100 in step 5, "ntrials" should be 5000. However, you may not have 100 independent conformations and you can get by with a much smaller number. "Ecut" is the final energy value for saving into the output file (runs generating energy values higher than this value will not be saved). The best way to determine "ecut" is to do a test run with a small value for "ntrials" (e.g. 50) and then check the size of the output file. You probably don't want more than 10% of the trials saved.

11. Go to the Research screen. Select "Run". If no errors are encountered, the program may take a few minutes to complete, depending on the value of "ntrials". The progress indicator shows how far the program has advanced through the prescribed number of trials. If the output file is empty, check the value of "ecut".

When the run has been completed, select the Tools mode. The RMS Tool, Energy Histogram and Cluster Output screens may be useful in interpreting the results.
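
Step 2 asks for the XYZ coordinates of a central atom in a bound or hand-placed ligand. If you prefer not to read this off a graphics program, the following Python sketch shows one way to do it. It is not part of DockVision. It reads ATOM and HETATM records using the standard fixed PDB columns, computes the centroid, and reports the atom nearest to it. Taking "central atom" to mean "atom nearest the centroid" is an assumption made here, and the helper names and sample coordinates are made up for the demonstration.

```python
import math

def read_atoms(lines):
    """Collect (atom_name, x, y, z) from ATOM/HETATM records (fixed PDB columns)."""
    atoms = []
    for line in lines:
        if line.startswith(("ATOM", "HETATM")):
            name = line[12:16].strip()          # columns 13-16
            # x, y, z occupy columns 31-38, 39-46 and 47-54
            x, y, z = (float(line[i:i + 8]) for i in (30, 38, 46))
            atoms.append((name, x, y, z))
    return atoms

def central_atom(atoms):
    """Return the atom closest to the centroid of all atoms."""
    n = len(atoms)
    centroid = tuple(sum(a[k] for a in atoms) / n for k in (1, 2, 3))
    return min(atoms, key=lambda a: math.dist(a[1:], centroid))

def sample_record(serial, name, x, y, z):
    # Builds a HETATM line with correctly placed columns (demo data only).
    return f"HETATM{serial:5d} {name:<4s} LIG A   1    {x:8.3f}{y:8.3f}{z:8.3f}"

sample = [
    sample_record(1, "C1", 10.0, 20.0, 30.0),
    sample_record(2, "C2", 11.5, 20.0, 30.0),
    sample_record(3, "O1", 13.0, 20.0, 30.0),
]

name, x, y, z = central_atom(read_atoms(sample))
print(f"central atom {name}: X={x:.3f} Y={y:.3f} Z={z:.3f}")
# prints: central atom C2: X=11.500 Y=20.000 Z=30.000
```

For a real structure, pass an open file instead of the sample list, for example read_atoms(open("inhibitor.pdb")), where "inhibitor.pdb" is a hypothetical file holding only the bound ligand.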


2.3 Single Ligand Docking from a Starting Position

Docking a single ligand from a given starting position is much like general single-ligand docking (section 2.2). Here, we already have an initial position (either modelled graphically or by other means) which we want to improve and study by docking.

Here are the steps for docking from a given starting position:

1. Prepare your ligand and target coordinate files. If the ligand file doesn't have polar hydrogens, add them with the addh program:

		addh -i ligand.pdb -o ligand_h.pdb
The new file "ligand_h.pdb" will be used for docking. (If your ligand already has full hydrogens, this will work also.) For the target, you may already have polar or full hydrogens added by another program. If not, run the hydroman program:
		hydroman target.pdb target_h.pdb
You can also specify the pH by using the optional "-p" flag. (This program can also be run within DockVision under "Tools/Add Hydrogens".) Unlike addh, which is a general algorithm, hydroman makes use of specific peptide characteristics and assigns hydrogens based on known properties of amino acids. If the protein target contains non-peptide residues, then place the peptide and non-peptide parts in separate files, use hydroman on the peptide part, addh on the non-peptide part, and merge the resulting files into a single target file. (If the non-peptide part does not require polar hydrogens, then simply run hydroman, as it ignores non-peptide residues.)

2. Determine the binding site. For DockVision to search effectively, the binding site must be identified. Best results are obtained when the XYZ coordinates of a point in the middle of the binding site can be identified. Use the XYZ of a central atom in your ligand structure, since it is already well-positioned in the binding site. Specifying this value will prevent the ligand from wandering out of the site. Make a note of this value as the information will be needed later.

3. Run DockVision. Make sure the mode pulldown is set to "Research" (the default). In the Setup screen, choose "Research Potential Function". Type a name for your run (this will be used for the name of the .out and .log files).

4. Go to the Ligand screen. Select the ligand file with the added hydrogens ("ligand_h.pdb" generated in step 1). Under "Parameters", you want "Generate Automatically".

5. Go to the Flexibility screen. Select "Flexible Ligand". Under "Topology" choose "Generate Automatically". Under "Conformers", select "No Conformers". This will force DockVision to use the original conformation for the starting point for the Monte Carlo runs.

6. Go to the Target screen. Select the target file with the added hydrogens ("target_h.pdb" generated in step 1). Under "Parameters", you want "Generate Automatically".

7. Go to the Grid screen. Select "Auto Generate Grid" and turn off "Auto Bounds" immediately below. The form on the right should appear. Enter the XYZ values from step 2, leaving the others in their default values (0.5, 50,50,50). This generates a cubic grid centered on your XYZ coordinates and 25 angstroms per side. Select "No Near Grid".

8. Go to the Options screen. Under "Constraint", select "Build", then enter the XYZ coordinates from step 2. Set the "Sphere Radius" = 6.0. Save and hit "back".

9. Back in the Options screen. Now under "Schedule", select "Build". Here the annealing schedule is built to control how the Monte Carlo algorithm runs. Since there is already a (presumably) good starting position, we are mainly interested in minimization of the ligand. Also, since we will only use one run (instead of multiple runs for a full search), we can afford a little more computation to get a more minimized structure. The following schedule is recommended for this type of calculation:

   stage 1: Temperature=1, Steps=500,  Maximum Rotation=12, Maximum Translation=2
   stage 2: Temperature=1, Steps=500,  Maximum Rotation=6,  Maximum Translation=1
   stage 3: Temperature=1, Steps=1000, Maximum Rotation=3,  Maximum Translation=0.5
Enter these values, in this order, in the "Schedule Builder" screen, save in a file, and then hit "back".

10. Back in the Options screen. Select "Refine Mode" and "Conformational Energy" (so the "yellow light" appears). "Refine Mode" will stop the usual random generation of initial position and orientation for the ligand that would be used for a full search. This, combined with the "No Conformers" selected in step 5, forces the starting position to be the one in the ligand input file. Set "ntrials" to be 1, since we just want to minimize from the starting position. Choose "ecut" to be a large number (e.g. 1000) since the size of the output file won't be a problem. If your ligand isn't particularly well-positioned, then you may want to use more than one trial. In this case, you may want to change your annealing schedule in step 9 to have high-temperature steps at the beginning. For most problems with a fixed starting position, you won't want ntrials to be greater than 10.

11. Go to the Research screen. Select "Run". If no errors are encountered, the program may take a few minutes to complete, depending on the value of "ntrials". The progress indicator shows how far the program has advanced through the prescribed number of trials. If the output file is empty, check the value of "ecut".

When the run has been completed, select the Tools mode. The RMS Tool, Energy Histogram and Cluster Output screens may be useful in interpreting the results.
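
Because this recipe starts from a known ligand position, it can be worth confirming before the run that the grid from step 7 and the constraint sphere from step 8 actually enclose the ligand. The Python sketch below is not a DockVision tool. It assumes the grid side equals the spacing times the number of points (0.5 x 50 = 25 angstroms, as stated in step 7). The site coordinates and ligand atoms are made-up values for illustration.

```python
# Sketch only: assumes grid side = spacing * number of points (0.5 * 50 = 25 angstroms).

def grid_bounds(center, spacing=0.5, points=(50, 50, 50)):
    """Return (low_corner, high_corner) of a grid centred on `center`."""
    half = [spacing * n / 2.0 for n in points]
    low = tuple(c - h for c, h in zip(center, half))
    high = tuple(c + h for c, h in zip(center, half))
    return low, high

def sphere_inside(center, radius, low, high):
    """True if a sphere at `center` lies entirely within the box."""
    return all(lo <= c - radius and c + radius <= hi
               for c, lo, hi in zip(center, low, high))

def atoms_outside(coords, low, high):
    """Return the coordinates that fall outside the box."""
    return [p for p in coords
            if not all(lo <= v <= hi for v, lo, hi in zip(p, low, high))]

site = (12.0, 8.5, -3.0)  # hypothetical XYZ from step 2
low, high = grid_bounds(site)
print("grid spans", low, "to", high)
print("constraint sphere (radius 6.0) inside grid:",
      sphere_inside(site, 6.0, low, high))

# Made-up ligand coordinates; the last atom lies beyond the grid edge in X.
ligand = [(12.0, 8.5, -3.0), (14.1, 9.0, -2.2), (26.0, 8.5, -3.0)]
print("atoms outside grid:", atoms_outside(ligand, low, high))
```

An atom reported outside the grid suggests that the XYZ value chosen in step 2 is too far from the ligand's actual position and should be revisited.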


2.4 Single Ligand Docking using a Conformer Database

In this example, we present an alternative method to single-ligand docking, particularly useful when dealing with highly flexible ligands. Here, we pre-generate a set of conformations for the ligand. The advantage of this approach is that the ligands may be more efficiently energy-minimized, since they are minimized alone before being subjected to docking. Here we assume a conformation database was generated as in section 2.1.

1. Prepare your target coordinate file. You may already have polar or full hydrogens added by another program. If not, run the hydroman program:

		hydroman target.pdb target_h.pdb
You can also specify the pH by using the optional "-p" flag. (This program can also be run within DockVision under "Tools/Add Hydrogens".) Unlike addh, which is a general algorithm, hydroman makes use of specific peptide characteristics and assigns hydrogens based on known properties of amino acids. If the protein target contains non-peptide residues, then place the peptide and non-peptide parts in separate files, use hydroman on the peptide part, addh on the non-peptide part, and merge the resulting files into a single target file. (If the non-peptide part does not require polar hydrogens, then simply run hydroman, as it ignores non-peptide residues.)

2. Determine the binding site. For DockVision to search effectively, the binding site must be identified. Best results are obtained when the XYZ coordinates of a point in the middle of the binding site can be identified. A simple way to do this is to use computer graphics to manually position an average-sized ligand in the middle of the site, then determine the XYZ coordinates of a central atom. If a crystallographic or modelled structure of a bound inhibitor is available, then the XYZ of a central atom will work equally well. Make a note of this value as the information will be needed later.

3. Run DockVision. Make sure the mode pulldown is set to "RSDB". This is the docking program for use with databases. Type a name for your run (this will be used for the name of the .out and .log files).

4. Go to the Ligand screen. Select the file containing the conformation database.

5. Go to the Flexibility screen. Select "Flexible Ligand".

6. Go to the Target screen. Select the target file with the added hydrogens ("target_h.pdb" generated in step 1). Under "Parameters", you want "Generate Automatically".

7. Go to the Grid screen. Select "Auto Generate Grid" and turn off "Auto Bounds" immediately below. The form on the right should appear. Enter the XYZ values from step 2, leaving the others in their default values (0.5, 50,50,50). This generates a cubic grid centered on your XYZ coordinates and 25 angstroms per side. Select "Auto Generate Near Grid" and turn off "Auto Bounds" immediately below. The same values will be used for both grids.

8. Go to the Options screen. Under "Constraint", select "Build", then enter the XYZ coordinates from step 2. Set the "Sphere Radius" = 6.0. Save and hit "back".

9. Back in the Options screen. Now under "Schedule", select "Build". Here the annealing schedule is built to control how the Monte Carlo algorithm runs. The following schedule is recommended for single-ligand databases:

   stage 1: Temperature=1, Steps=200, Maximum Rotation=12, Maximum Translation=2
   stage 2: Temperature=1, Steps=200, Maximum Rotation=6,  Maximum Translation=1
Enter these values, in this order, in the "Schedule Builder" screen, save in a file, and then hit "back". Now under "Schedule2", select "Build". This schedule gives the Monte Carlo run for the second stage, used in database docking. This is executed only once per database structure, and hence can be more intense than the first schedule (which is executed for each trial). The following is recommended:
   stage 1: Temperature=1, Steps=500, Maximum Rotation=12, Maximum Translation=2
   stage 2: Temperature=1, Steps=500, Maximum Rotation=6,  Maximum Translation=1

10. Back in the Options screen. Select "Conformational Energy" (so the "yellow light" appears). Choose "ntrials" to be 50, according to the general rule of thumb of 50 trials per conformer, since each input ligand represents a single conformation we wish to dock. Each ligand in the file will be docked with 50 trials, and there will be one output for each input. There is no "ecut" for RSDB.

11. Go to the RSDB screen. Select "Run". If no errors are encountered, the program may take a few minutes to complete, depending on the value of "ntrials" and the size of your database. The progress indicator shows how far the program has advanced through the database.

When the run has been completed, select the Tools mode. The RMS Tool, Energy Histogram and Cluster Output screens may be useful in interpreting the results.


2.5 Screening of a Ligand Database

In this example, we show how to use DockVision to screen a chemical database. Normally, this database will consist of a collection of different compounds, and we want to identify which ones are likely to bind well to our selected binding site.

1. Prepare your target coordinate file. You may already have polar or full hydrogens added by another program. If not, run the hydroman program:

		hydroman target.pdb target_h.pdb
You can also specify the pH by using the optional "-p" flag. (This program can also be run within DockVision under "Tools/Add Hydrogens".) Unlike addh, which is a general algorithm, hydroman makes use of specific peptide characteristics and assigns hydrogens based on known properties of amino acids. If the protein target contains non-peptide residues, then place the peptide and non-peptide parts in separate files, use hydroman on the peptide part, addh on the non-peptide part, and merge the resulting files into a single target file. (If the non-peptide part does not require polar hydrogens, then simply run hydroman, as it ignores non-peptide residues.)

The speed of the docking calculation can be considerably improved by editing the target file and keeping only those residues that are immediately involved in the binding site of interest. This can be done before or after generating hydrogens.

2. Determine the binding site. For DockVision to search effectively, the binding site must be identified. Best results are obtained when the XYZ coordinates of a point in the middle of the binding site can be identified. A simple way to do this is to use computer graphics to manually position an average-sized ligand in the middle of the site, then determine the XYZ coordinates of a central atom. If a crystallographic or modelled structure of a bound inhibitor is available, then the XYZ of a central atom will work equally well. Make a note of this value as the information will be needed later.

3. Prepare the database. DockVision requires an input in PDB format with either polar or full hydrogens. If your database is in a different format, then it will have to be converted. If you only need to add hydrogens, then use the addh program:

		addh -i database.pdb -o database_h.pdb
To convert from either SDF format or MDL/MOL format, use the dbconvert program, using one of the forms below:
		dbconvert -F sdf -i database.sdf -o database.pdb
		dbconvert -F sdf -H -i database.sdf -o database.pdb
		dbconvert -F mol -i database.mol -o database.pdb
		dbconvert -F mol -H -i database.mol -o database.pdb
The "-H" is used to add polar hydrogens if the original database was without them.

4. Run DockVision. Make sure the mode pulldown is set to "RSDB". This is the docking program for use with databases. Type a name for your run (this will be used for the name of the .out and .log files).

5. Go to the Ligand screen. Select the file containing the ligand database prepared in step 3.

6. Go to the Flexibility screen. Select "Flexible Ligand".

7. Go to the Target screen. Select the target file with the added hydrogens ("target_h.pdb" generated in step 1). Under "Parameters", you want "Generate Automatically".

8. Go to the Grid screen. Select "Auto Generate Grid" and turn off "Auto Bounds" immediately below. The form on the right should appear. Enter the XYZ values from step 2, leaving the others in their default values (0.5, 50,50,50). This generates a cubic grid centered on your XYZ coordinates and 25 angstroms per side. Select "Auto Generate Near Grid" and turn off "Auto Bounds" immediately below. The same values will be used for both grids.

9. Go to the Options screen. Under "Constraint", select "Build", then enter the XYZ coordinates from step 2. Set the "Sphere Radius" = 6.0. Save and hit "back".

10. Back in the Options screen. Now under "Schedule", select "Build". Here the annealing schedule is built to control how the Monte Carlo algorithm runs. The following schedule is recommended for database screening:

   stage 1: Temperature=1, Steps=100, Maximum Rotation=12, Maximum Translation=2
   stage 2: Temperature=1, Steps=100, Maximum Rotation=6,  Maximum Translation=1
Enter these values, in this order, in the "Schedule Builder" screen, save in a file, and then hit "back". Now under "Schedule2", select "Build". This schedule gives the Monte Carlo run for the second stage, used in database docking. This is executed only once per database structure, and hence can be more intense than the first schedule (which is executed for each trial). The following is recommended:
   stage 1: Temperature=1, Steps=500, Maximum Rotation=12, Maximum Translation=2
   stage 2: Temperature=1, Steps=500, Maximum Rotation=6,  Maximum Translation=1

11. Back in the Options screen. Select "Conformational Energy" (so the "yellow light" appears). Choose "ntrials" to be 100. Combined with the values chosen in step 10, this will give a total of

	ntrials*(sch_stage1 + sch_stage2) + sch2_stage1 + sch2_stage2
	= 100*(100+100) + 500 + 500 = 21,000 
energy calculations for each database member. This is about the maximum that is normally feasible for the docking to complete in a reasonable amount of computer time. There is no "ecut" for RSDB.

12. Go to the RSDB screen. Select "Run". If no errors are encountered, the program may take some time to complete, depending on the value of "ntrials" and the size of your database. The progress indicator shows how far the program has advanced through the database.

When the run has been completed, use the outman program to sort the output file for the highest scoring dockings. For example,

	outman -n 100 -i screen.out -o screen_top100.out
will place the top 100 scoring dockings from "screen.out" in "screen_top100.out".
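
The cost estimate in step 11 is easy to recompute when you change "ntrials" or either schedule. The Python sketch below simply evaluates the formula given in step 11. It is not a DockVision utility, the function name is illustrative, and the database size of 10,000 compounds is a hypothetical figure chosen only to show how the total scales.

```python
# Evaluates the formula from step 11:
#   ntrials * (steps in Schedule) + (steps in Schedule2)

def energy_evaluations(ntrials, schedule_steps, schedule2_steps):
    """Energy calculations needed for one database member."""
    return ntrials * sum(schedule_steps) + sum(schedule2_steps)

# Values from steps 10 and 11 of this recipe.
per_member = energy_evaluations(100, [100, 100], [500, 500])
print(per_member)  # 21000, matching step 11

# Hypothetical database size, for illustration only.
database_size = 10_000
print(per_member * database_size)  # 210000000
```

Doubling "ntrials" roughly doubles the per-member cost, while lengthening "Schedule2" adds only a fixed amount per database member, since that schedule is executed once per structure.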
