DockVision

An Integrated Software Package for Molecular Docking

Version: 1.0.3

Developed by: Trevor N. Hart and Steven R. Ness, The University of Alberta

Assisted by: Randy J. Read, The University of Alberta

Copyright, 1997, 1998. The University of Alberta.
Please email comments and questions to: support@dockvision.com

Contents

1. Introduction
2. Fundamental Concepts
3. Graphical Interface
4. Research
5. Gamma
6. RSDB
Appendix I: Simulation Parameter Files
Appendix II: File Formats/Utility Programs
Appendix III: Covalent Docking
Appendix IV: DockVision Errors
Appendix V: Converting Databases
Appendix VI: Handling Large Output Files

1. Introduction

The DockVision package is an integrated docking package featuring the Research and Gamma docking programs, both developed at the University of Alberta by Trevor Hart, Steven Ness and Randy Read. In addition to the two docking programs, many utility programs are integrated into the interface, both for setting up docking calculations and for analyzing the results. The graphical interface is designed to lead the user through the essential steps required to set up and run a calculation, perform job control operations and provide for analysis of the generated data.

1.1 Research

Research is a docking program designed to simulate an exhaustive search of a specified region on the surface of a protein or other target macromolecule (see Proteins, Suppl. 1:205-209, 1997). The method is based on Monte Carlo simulated annealing using a variety of score functions, including a grid-based steric score function and a pairwise energy function. Research requires that a set of parameters be specified, which control how it performs calculations. In most cases, these parameters refer to data files supplied by the user; however in a few cases they refer to files that must be generated by utility programs before Research is actually run. These utility programs can be run within the DockVision interface, or as standalone programs.

Research uses a novel hybrid Monte Carlo algorithm to generate energetically favourable positions for a ligand in the specified search region around the target protein. The ligand is first placed at a random position and orientation within the search region. If the ligand is flexible, a random conformer (taken from a pre-calculated conformer set, see below) is selected to facilitate the search of conformational space. The ligand is then checked for steric clashes with the target using the floating algorithm (see "floating algorithm"); if a clash is present, the floating algorithm is run to remove it. The floating algorithm is a Monte Carlo procedure using a grid-based steric score function. Once steric clashes are removed, an energy-based Monte Carlo procedure searches for optimal binding modes for the ligand. A typical docking problem involves between 100 and 10,000 independent runs to completely search all possible binding modes (the number depends on the size of the search region, the size and flexibility of the ligand and other factors). All dockings whose energies fall below a specified energy cutoff are saved into an output file. The energy profile, clustering and other statistics of the output file may be analyzed using the commands available in the DockVision analysis menu (see "analysis").
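
The energy-based Monte Carlo stage can be pictured with a minimal sketch. It assumes a Metropolis acceptance rule and a schedule given as (number of steps, temperature, maximum step) stages; the function names, the one-dimensional toy energy and the schedule values are illustrative assumptions and are not taken from Research itself.

```python
import math
import random

def metropolis_accept(delta_e, temperature, rng):
    """Accept downhill moves always, uphill moves with Boltzmann probability."""
    if delta_e <= 0.0:
        return True
    return rng.random() < math.exp(-delta_e / temperature)

def anneal(energy, x0, schedule, rng):
    """Run each (n_steps, temperature, max_step) stage of the schedule in turn."""
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    for n_steps, temperature, max_step in schedule:
        for _ in range(n_steps):
            trial = x + rng.uniform(-max_step, max_step)
            e_trial = energy(trial)
            if metropolis_accept(e_trial - e, temperature, rng):
                x, e = trial, e_trial
                if e < best_e:
                    best_x, best_e = x, e
    return best_x, best_e

def toy_energy(x):
    """Toy one-dimensional 'energy' with its minimum at x = 2 (illustration only)."""
    return (x - 2.0) ** 2

# Hypothetical three-stage schedule: cool down while shrinking the step size.
schedule = [(500, 5.0, 1.0), (500, 1.0, 0.5), (500, 0.1, 0.1)]
best_x, best_e = anneal(toy_energy, 10.0, schedule, random.Random(0))
```

In a real docking run the single coordinate would be replaced by the ligand position, orientation and torsions, and many independent runs would be started from different random placements.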

1.2 Gamma

Gamma is a docking program that employs a hybrid Evolutionary Algorithm method to search ligand configurational space around a target protein. Gamma and Research have many common features, including conformer sets, constraint sets, topology, atom and charge libraries and the floating algorithm. Unlike classical genetic algorithm approaches, which represent the ligand position, orientation and conformation by a mapping from a set of bit strings, Gamma uses these basic ligand parameters as its "genome" for mutation and crossover operations. This allows for a wide variety of operator types that have been specifically designed to search and optimize ligands in a binding site. Several subpopulation schemes are implemented to ensure population diversity and avoid premature convergence to non-optimal local minima.
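
To make the idea of a non-bit-string genome concrete, the following is a minimal sketch under stated assumptions. The Genome class, its field layout and the two operators are hypothetical illustrations of mutation and crossover acting directly on ligand parameters; they are not Gamma's actual genetic operators.

```python
import random
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Genome:
    position: tuple      # (x, y, z) of the ligand center, in angstroms
    orientation: tuple   # three rotation angles, in degrees
    torsions: tuple      # one angle per rotatable bond, in degrees

def mutate_torsion(genome, max_delta, rng):
    """Perturb one randomly chosen torsion angle by up to max_delta degrees."""
    if not genome.torsions:
        return genome
    i = rng.randrange(len(genome.torsions))
    torsions = list(genome.torsions)
    torsions[i] = (torsions[i] + rng.uniform(-max_delta, max_delta)) % 360.0
    return replace(genome, torsions=tuple(torsions))

def crossover(parent_a, parent_b):
    """Child takes the placement of parent_a and the conformation of parent_b."""
    return Genome(parent_a.position, parent_a.orientation, parent_b.torsions)
```

Because the operators act on position, orientation and torsions directly, each one can be tailored to a specific kind of move in the binding site, which is harder to arrange with a bit-string encoding.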

1.3 RSDB

RSDB (ReSearch DataBase) is a version of Research adapted for rapid screening of structural databases. The principal difference is that RSDB uses a rapid score function based on hydrogen bond formation and steric complementarity instead of the more computationally expensive energy calculation. RSDB has fewer input parameters because many aspects of the calculation are determined on the fly, as RSDB is designed to assign parameters for a large heterogeneous database of structures. Each compound in the database (which is in PDB format) is first docked rigidly to determine the most likely general position of the ligand. The best trial is then refined, allowing for flexibility, to a final position. This algorithm is designed to screen a database for ligands that are likely to fit in the selected binding pocket.

2. Fundamental Concepts

2.1 Energy Calculation

The energy calculation in Research and Gamma is a pairwise calculation between atoms in the ligand and the protein with terms for van der Waals and electrostatics. A distance cutoff is employed and atoms in both the ligand and the protein are organized into charge groups. Intramolecular energies for the ligand are optional: if they are not employed, a simple steric clash detection algorithm is used to ensure that the ligand conformation does not become unreasonable.

There are currently two forcefields implemented in DockVision: the original forcefield developed for Research and the MMFF94 forcefield (T.A. Halgren, J. Comp. Chem. 17:490-519, 1996). The Research forcefield uses a Lennard-Jones 6-12 function together with a standard electrostatic function. All charge groups are neutralized to reduce long range ionic interactions. Intramolecular energies include torsion and van der Waals terms; no electrostatic terms are included. A dielectric constant of 1.0 is used. Charges and atomtypes for the ligand and protein may be automatically assigned within DockVision.
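
A pairwise energy of this general form can be sketched as follows. This is a minimal illustration, not the Research forcefield: the atom tuple layout, the parameter mixing rules, the atom-based (rather than charge-group-based) cutoff, the 8 angstrom default and the unit conversion factor are all assumptions made for the example.

```python
import math

# Commonly used conversion factor, kcal*angstrom/(mol*e^2); assumed units.
COULOMB_KCAL = 332.0637

def pair_energy(r, epsilon, sigma, q_i, q_j, dielectric=1.0):
    """Lennard-Jones 6-12 plus Coulomb energy for one atom pair at distance r > 0."""
    sr6 = (sigma / r) ** 6
    lj = 4.0 * epsilon * (sr6 * sr6 - sr6)
    coulomb = COULOMB_KCAL * q_i * q_j / (dielectric * r)
    return lj + coulomb

def interaction_energy(ligand, target, cutoff=8.0, dielectric=1.0):
    """Sum pair energies over ligand-target atom pairs within the cutoff.

    Each atom is (x, y, z, epsilon, sigma, charge). Epsilon is combined by a
    geometric mean and sigma by an arithmetic mean (an assumed mixing rule).
    """
    total = 0.0
    for xl, yl, zl, eps_l, sig_l, q_l in ligand:
        for xt, yt, zt, eps_t, sig_t, q_t in target:
            r = math.dist((xl, yl, zl), (xt, yt, zt))
            if r > cutoff:
                continue  # pairs beyond the cutoff contribute nothing
            eps = math.sqrt(eps_l * eps_t)
            sig = 0.5 * (sig_l + sig_t)
            total += pair_energy(r, eps, sig, q_l, q_t, dielectric)
    return total
```

The Lennard-Jones term has its minimum of -epsilon at r = 2^(1/6) sigma, which is a convenient check when testing such a function.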

The MMFF94 force field is only partially implemented: DockVision uses the van der Waals and electrostatic terms and, for the ligand, the torsion terms. The other degrees of freedom, involving bond lengths and bond angles, are held constant, so their terms are not needed. The dielectric constant is adjustable.

2.2 Hydrogen Bond Function

To improve computation speed, RSDB employs a hydrogen bond function in place of the pairwise energy function. Hydrogen bond donors and acceptors in the ligand and protein are identified and hydrogen bonding pairs are formed by computing distance and angles between them. An energy-like score is then calculated based on the deviation from ideal values. To efficiently treat close van der Waals contacts, a grid-based "pseudo-van der Waals" function is used to calculate a fast score function that is similar to the pairwise van der Waals term. RSDB employs these two functions together in generating the docking energies.

2.3 Conformer Sets

The purpose of conformers in Research and Gamma is to provide a basis for unbiased sampling of ligand conformational space. Conformer sets may be pre-generated by several utility programs we have developed. A topology file is first created to determine the connectivity and specify rotatable torsions. Conformers are then generated by simply specifying sets of allowed angles for each torsion: a file consisting of each possible combination is then created. This file is then screened for conformers containing steric clashes, resulting in a somewhat smaller set of conformers that represents sterically viable conformations. This file is used by the algorithm to generate initial conformations for the ligand at the beginning of the simulation.
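
The enumerate-then-screen procedure can be sketched in a few lines. The function names and the placeholder clash test below are assumptions made for illustration; the actual utility programs work from a topology file and real ligand geometry.

```python
from itertools import product

def enumerate_conformers(allowed_angles):
    """Return every combination of one allowed angle per rotatable torsion."""
    return list(product(*allowed_angles))

def screen_conformers(conformers, has_clash):
    """Keep only conformers for which the supplied clash test returns False."""
    return [c for c in conformers if not has_clash(c)]

# Three torsions with three allowed angles each give 3**3 = 27 combinations.
allowed = [(60.0, 180.0, 300.0)] * 3
all_conformers = enumerate_conformers(allowed)

def toy_clash(conformer):
    """Placeholder clash test (illustration only): reject conformers whose
    first two torsions are both 60 degrees."""
    return conformer[0] == 60.0 and conformer[1] == 60.0

viable = screen_conformers(all_conformers, toy_clash)
```

The number of combinations grows as the product of the number of allowed angles per torsion, which is why the screening step and a limit on the number of conformers matter for flexible ligands.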

2.4 The Floating Algorithm

The floating algorithm is a Monte Carlo procedure with a grid-based score function that determines whether there is a steric clash between the target and the ligand and, if so, the degree of the clash. The grid is generated by the program Flgrid. The grid represents a rectilinear region around the receptor and has values of 0 for points that are "outside" the receptor and positive values for points "inside" the protein surface. The positive values represent the distance to the nearest surface point and are thus greater the farther the point is from the surface. The score function is simply the average over all the ligand atoms of the grid value at each atom, using a computationally cheap zero order interpolation scheme (i.e. the value of the nearest grid point). The Monte Carlo procedure is run on this score function until the score falls below a user-defined tolerance, at which point the ligand will have minimal or no steric overlap. The ligand is then passed to the next step, the energy-based Monte Carlo optimization.
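
The score function described above can be sketched as follows. The grid layout (nested lists indexed by i, j, k), the origin and spacing arguments, and the treatment of atoms outside the grid are assumptions of this example and do not describe the Flgrid file format.

```python
def floating_score(atoms, grid, origin, spacing):
    """Average nearest-grid-point value over all ligand atoms.

    grid[i][j][k] is 0 outside the receptor and positive inside. Atoms that
    fall outside the grid are scored as 0 (an assumption of this sketch).
    """
    total = 0.0
    for atom in atoms:
        # Zero order interpolation: snap each coordinate to the nearest grid index.
        i, j, k = (round((c - o) / spacing) for c, o in zip(atom, origin))
        if 0 <= i < len(grid) and 0 <= j < len(grid[0]) and 0 <= k < len(grid[0][0]):
            total += grid[i][j][k]
    return total / len(atoms)

# Toy 3 x 3 x 3 grid with a single "inside" point at its center.
grid = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
grid[1][1][1] = 2.0
atoms = [(1.1, 0.9, 1.0), (0.0, 0.0, 0.0)]
score = floating_score(atoms, grid, origin=(0.0, 0.0, 0.0), spacing=1.0)
```

A ligand whose score is at or below the tolerance (ftol) would be considered free of significant overlap and passed on to the energy-based stage.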

2.5 Constraint sets

The algorithms use the concept of a "constraint set" to generate the initial positions for the ligand as well as to restrict their positions during the floating and Monte Carlo runs. A constraint set is specified by a set of spheres, which will generally overlap with each other, although this is not required. Spheres are of two kinds: "include" spheres define allowed regions for the center of the ligand, while "exclude" spheres define regions into which the ligand center is forbidden to go. The constraint set is thus the union of the include spheres with the interiors of all exclude spheres removed. At least one include sphere must be present, so that the set has a finite, non-zero volume. The initial position of the ligand center is selected randomly from the include spheres, weighted according to the sphere volume so that the overall distribution of initial positions will be as near to uniform as possible. New trial positions are required to lie within the constraint set.
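
A minimal sketch of membership testing and volume-weighted sampling for such a constraint set is given below. The representation of a sphere as a (center, radius) pair, the function names and the rejection-sampling strategy are assumptions made for illustration, not the method used inside the docking programs.

```python
import math
import random

def in_sphere(point, sphere):
    center, radius = sphere
    return math.dist(point, center) <= radius

def in_constraint_set(point, include, exclude):
    """True if the point is in some include sphere and in no exclude sphere."""
    return (any(in_sphere(point, s) for s in include)
            and not any(in_sphere(point, s) for s in exclude))

def random_point_in_sphere(sphere, rng):
    """Uniform point inside a sphere, by rejection from the bounding cube."""
    (cx, cy, cz), radius = sphere
    while True:
        dx, dy, dz = (rng.uniform(-radius, radius) for _ in range(3))
        if dx * dx + dy * dy + dz * dz <= radius * radius:
            return (cx + dx, cy + dy, cz + dz)

def initial_position(include, exclude, rng, max_tries=10000):
    """Choose an include sphere with probability proportional to its volume,
    then draw points until one lies outside every exclude sphere."""
    weights = [radius ** 3 for _, radius in include]
    for _ in range(max_tries):
        sphere = rng.choices(include, weights=weights)[0]
        point = random_point_in_sphere(sphere, rng)
        if not any(in_sphere(point, s) for s in exclude):
            return point
    raise RuntimeError("constraint set appears to be empty")
```

Where include spheres overlap, the overlap region is sampled more densely than the rest, which is one reason the resulting distribution is only approximately uniform.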

3. Graphical Interface

The graphical interface for DockVision is designed to assist the user in setting up, running and analyzing docking simulations. Since docking is a complex problem, most docking programs involve elaborate setup procedures that require considerable time to learn, and are often time consuming even for the experienced user. While the underlying docking algorithms in DockVision are no less complex, the graphical interface creates a structured environment that leads the user through the necessary steps to start the docking simulation. In almost all cases, automated algorithms can perform required data assignment (assignment of ligand topology, charges, etc), while the advanced user has the option of complete control over all docking parameters.

3.1 Pulldown Menu

At the top of the interface is the pulldown menu, which allows the user to select features that have an overall effect on the interface.

     File

       Load
          Read a saved job file. A job file consists of a set of
          keywords specific to the particular docking module being
          used. See Appendix I for a complete description of the
          allowed keywords.

       Save 
          Write the current parameters as a job file.

       Quit
          Exit the program. Parameters are not saved.

     Mode

       Research
          Select Research docking algorithm.

       Gamma
          Select Gamma docking algorithm.

       RSDB
          Select RSDB (Research Database) algorithm.

       JobControl
          Change priority and kill background processes.

       Tools
          Menu of setup and analysis tools.

     Edit

       Undo
          Cancel the effects of the previous command.

     Help 

       About DockVision
          Information about the program.

       Help
          Access the program documentation.

3.2 Parameter Forms: Research, Gamma, RSDB

The buttons on the left of the window and the corresponding main form represent the various parameter assignment stages necessary to set up and run a docking simulation. These forms must have appropriate data input before a docking run can start. Initially, the buttons are red in colour but turn green when all necessary data has been assigned. The interface keeps track of which parameter and data files are present and will generate any missing files at the beginning of the docking simulation. The following is a brief description of each form.
     Setup
        Assign a RUNNAME for the simulation. The output
        and log files will be based on this name. Existing
        files may be overwritten. The forcefield is also
        selected at this point.

     Ligand
        Choose a ligand coordinate file. Forcefield parameter
        files are also selected or automatically generated
        at this point.

     Flexibility
        Decide if the ligand is rigid or flexible. If
        flexible, then topology files will have to be chosen
        or generated. Also, the user may decide to use
        conformers, in which case a conformer file may be
        read in or generated.

     Target
        Choose a coordinate file to use as the docking target.
        Forcefield parameter files are also selected or
        automatically generated at this point.

     Grid 
        Define the floating grid. If the target has not
        been used before, then a new grid will need to be
        generated. A set of bounds for the grid may be typed
        in here, or automatically generated. An existing
        grid file may be selected.

     Options
        Select other parameters that affect the docking
        run. These include the atomtype library, the constraint
        file, the annealing schedule for Monte Carlo algorithms
        and all other parameters set by the job file. The effect
        of each parameter is documented in the section
        describing the job file for each docking program.

     Research/Gamma/RSDB
        Begin the docking run for the selected docking module.

3.3 JobControl

Once a docking simulation has been started, it can be controlled within the JobControl menu. Active processes are accessed by their system process number.
     Kill
        Terminate the selected process.

     Renice
        Make the selected process run at a lower priority. Selecting this
        more than once will not lower the priority any further.

3.4 Tools

The Tools menu contains a selection of functions that are useful in either setting up a docking run or in performing analysis of the docking output.

     RMS Tool
        Calculate a scatterplot of the RMS difference between the output
        from the docking simulation and a reference ligand. The ligand
        could be a modelled or crystallographic structure. The X-axis
        is the RMS distance from the reference, while the Y-axis is the
        docked energy. This utility can be run on the output file while
        the simulation is running.

     Energy Histogram
        Calculate a histogram showing the number of docked structures in
        the output that fall within a selected set of energy ranges. This
        utility can be run on the output file while the simulation is running.

     Cluster Output
        Run a cluster algorithm on the output to generate a new output
        file containing the cluster leaders. The utility will eliminate
        repeated docking to basically equivalent binding modes.

     Statistics
        Calculate the number of molecules in the output, the maximum,
        minimum and average energies. This utility can be run on the output
        file while the simulation is running.

     GA view
        Graph the evolution of the ligand population for a Gamma run. This
        utility can be run on the output file while the simulation is running.

     Genop view
        Graph the probabilities for the selected genops (genetic
        operators) for a Gamma run. Genops dynamically change their
        likelihood of being selected based on their success during the
        course of the simulation.

     Build Schedule
        A utility to assist in building an annealing schedule for the
        Monte Carlo runs in Research and RSDB. An annealing schedule
        consists of a sequence of lines specifying a run with a particular
        number of steps, temperature, and maximum rotation and translation.
        Once a sequence of lines has been built up in the window, it can be
        saved to a file. This utility can also be accessed from the Options
        menu in Research and RSDB.

     Build Constraints
        A utility to assist in building a constraint file. A constraint
        file consists of a set of spheres with a center and radius. The
        center of the ligand is restricted to be within the constraint
        sphere during the entire docking run. Once a constraint set has
        been built up in the window, it can be saved to a file.
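
As an illustration of what the RMS Tool and Cluster Output utilities compute, the following is a minimal sketch. It assumes that poses are given as (energy, coordinate list) pairs in a common reference frame and that clustering uses a simple leader scheme with an RMS threshold; the actual clustering algorithm and its parameters are not specified here, so these are assumptions.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square distance between two equally ordered atom lists.

    No superposition is applied, since docked poses share the target's frame.
    """
    total = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))

def cluster_leaders(poses, threshold):
    """Leader clustering: visit poses from lowest to highest energy; a pose
    starts a new cluster unless it is within threshold of an existing leader."""
    leaders = []
    for energy, coords in sorted(poses, key=lambda p: p[0]):
        if all(rmsd(coords, lead) > threshold for _, lead in leaders):
            leaders.append((energy, coords))
    return leaders
```

Plotting rmsd(pose, reference) against the docked energy for every pose gives the kind of scatterplot produced by the RMS Tool, while cluster_leaders reduces repeated dockings to one representative per binding mode.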

4. Research

The following is a description of the use of the parameter forms involved in setting up a Research simulation. While some features are common to all modules, some features differ between them.

Setup

The run name is selected in this form. This name determines the base name for the output files, so if the run name is "run01", then the files "run01.log", "run01.out", etc. are generated. The forcefield is also selected here; the allowed choices are Research and MMFF94.

Ligand

This form involves selecting input involving the ligand. The coordinate file for the ligand is selected here. In addition, force field parameters may be either selected from a file or chosen to be generated automatically. The automatic generation algorithm is implemented only for the Research forcefield. If MMFF94 is being used, then parameters must first be generated by the FFAssign program (Richard Gillilan, Cornell Theory Centre) or built using a file editor.

Flexibility

This form involves determining the treatment of flexibility. The ligand may be rigid or flexible. If flexible is chosen, then a topology file must be either loaded or generated automatically. In addition, handling of conformers must be determined. If "no conformers" is selected, then all docking is done from the conformation represented in the coordinate file. Conformers may be either loaded from a pre-generated conformer file, or automatically generated. If the latter is selected, the number of conformers to be used must also be input.

Target

This form involves selecting the target protein. The target is a PDB file, normally with polar hydrogens added to the heavy atoms. As in the Ligand form, forcefield parameters are either selected or automatically generated.

Grid

The floating grid file is generated in this form. Generally, this needs to be done only once for each active site region of the protein that is being explored. The floating grid is a three-dimensional array of values describing the volume occupied by the protein. The "auto bounds" option will generate a grid that covers the entire protein. However, for large proteins or cases where the active site is well known, the grid may be made somewhat smaller, saving computation time (which may be up to 30 minutes for larger grids, on some computers). In this case, it is often best to select the coordinates of some atom central to the active site as the grid center, and allow the grid to be 25 or more angstroms on each side (this may depend on the size of the ligand and the search site). Once a grid has been generated, the grid file may be selected here on subsequent runs.
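
When the bounds are typed in by hand, they can be worked out from the chosen central atom. The helper below is a hypothetical convenience sketch, not part of DockVision; it assumes a cubic grid with a side length of 25 angstroms by default.

```python
def grid_bounds(center, side=25.0):
    """Return ((xmin, ymin, zmin), (xmax, ymax, zmax)) for a cube of the
    given side length, in angstroms, centered on `center`."""
    half = side / 2.0
    lower = tuple(c - half for c in center)
    upper = tuple(c + half for c in center)
    return lower, upper

# Example: an active site atom at (10.0, 0.0, -5.0).
lower, upper = grid_bounds((10.0, 0.0, -5.0))
```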

The near grid is not currently used in Research and may be ignored.

Options

Here specific parameters which control the behaviour of the docking algorithm are set. A description of each item follows. In most cases, these values correspond directly to values in the job file.

     Constraint
        Build a new constraint file or load a previously built
        one. These determine the search region used for the docking
        by defining a collection of spheres. The spheres define the
        region in space to which the center of the ligand will be
        constrained.

     Schedule
        Build a new annealing schedule or load an existing one. The
        schedule determines the parameters for the simulated annealing,
        such as the temperature, number of steps and maximum translation
        and rotation.

     Refine Mode
        Turning this on will stop the initial randomization of the
        ligand (in position and orientation) from being performed.
        This is useful in docking from a modelled position or refining
        a previously generated output file. Conformers are still
        generated (to stop this, choose "no conformers" in the
        Flexibility menu).

     Energy Only
        Calculate the initial energy of the input ligand and exit. No
        docking is performed.

     Conformational Energy
        Include intramolecular energy of the ligand in the energy
        calculation. If this is off, only steric clash checking is
        performed on the ligand during docking.

     ftol  
        Tolerance value for the floating algorithm. This controls how far
        the algorithm is allowed to "float" an initial ligand with a bad
        contact out of the protein. Normally a value of 1.0 is used.

     ctol 
        Tolerance value for the steric clash detection between the ligand
        and the protein. Normally a value of 1.0 is used.

     ntrials 
        Number of independent Monte Carlo runs to be performed.

     seed 
        Random number seed. A value of 0 forces the seed to be
        generated from the system clock.

     torconv 
        Scaling factor between maximum allowed deviation for torsion
        angles and rotation angles. If torconv is 2, then the torsion
        angles will be randomly generated with twice the magnitude of
        the (orientational) rotation angles.

     Dielectric
        Set the dielectric constant. Implemented for MMFF94 only.

     Cutoff 
        Maximum allowed distance in angstroms between groups in the energy
        calculation. Interactions between groups exceeding this distance
        are ignored.

     ecut 
        Maximum allowed energy for a docking to be saved into the output
        file. Normally, this value is set low enough so a relatively
        small percentage of total trials are saved. Warning: if this
        value is too high, the output file may become very large.

     Blankfile
        Used to perform docking of a covalently bound ligand. See Appendix III
        for information on covalent docking.

Research

Begin a Research docking run. The button will appear green if all required setup steps have been done, otherwise it will appear red and the run is not allowed to start. If it is red, click on the form buttons that are still red in colour and perform the necessary setup steps.

When a run is started, any required setup steps are first performed. These steps are reported in a small screen below the parameter form. Once these steps have been successfully completed, DockVision will begin the docking run itself.

5. Gamma

The following is a description of the use of the parameter forms involved in setting up a Gamma simulation. While some features are common to all modules, some features differ between them.

Setup

The run name is selected in this form. This name determines the base name for the output files, so if the run name is "run01", then the files "run01.log", "run01.out", etc. are generated. Only the Research forcefield is currently implemented for Gamma.

Ligand

This form involves selecting input involving the ligand. The coordinate file for the ligand is selected here. In addition, force field parameters may be either selected from a file or chosen to be generated automatically.

Flexibility

This form involves determining the treatment of flexibility. The ligand may be rigid or flexible. If flexible is chosen, then a topology file must be either loaded or generated automatically. In addition, handling of conformers must be determined. If "no conformers" is selected, then all docking is done from the conformation of the coordinate file. Conformers may be either loaded from a pre-generated conformer file, or automatically generated. If the latter is selected, the number of conformers to be used must also be input.

Target

This form involves selecting the target protein. As in the Ligand form, forcefield parameters are either selected or automatically generated.

Grid

The floating grid file is generated in this form. Generally, this needs to be done only once for each active site region of the protein that is being explored. The floating grid is a three-dimensional array of values describing the volume occupied by the protein. The "auto bounds" option will generate a grid that covers the entire protein. However, for large proteins or cases where the active site is well known, the grid may be made somewhat smaller, saving computation time (which may be up to 30 minutes for larger grids, on some computers). In this case, it is often best to select the coordinates of some atom central to the active site as the grid center, and allow the grid to be 25 or more angstroms on each side (this may depend on the size of the ligand and the search site). Once a grid has been generated, the grid file may be selected here on subsequent runs.

The near grid is not currently used in Gamma and may be ignored.

Options

Here specific parameters which control the behaviour of the docking algorithm are set. A description of each item follows. In most cases, these values correspond directly to values in the job file. For Gamma there are three separate options menus.

Main Options

These options are the principal options which need to be set.

     Constraint
        Build a new constraint file or load a previously built
        one. These determine the search region used for the docking
        by defining a collection of spheres. The spheres define the
        region in space to which the center of the ligand will be
        constrained.

     Genop File
        Select a file specifying the genetic operators to be used.

     Refine Mode
        Turning this on will stop the initial randomization of the
        ligand (in position and orientation) from being performed.
        This is useful in docking from a modelled position or refining
        a previously generated output file. Conformers are still
        generated (to stop this, choose "no conformers" in the
        Flexibility menu).

     Calc. Energy Only
        Calculate the initial energy of the input ligand and exit. No
        docking is performed.

     Conformational Energy
        Include intramolecular energy of the ligand in the energy
        calculation. If this is off, only steric clash checking is
        performed on the ligand during docking.

     Print Int. Pop.
        Save the initial population generated before the genetic
        algorithm begins.

     Elitism
        Only allow offspring to survive if they are more fit than their
        parents.

     Number of Generation
        Total number of generations to run the genetic algorithm.

     Max Population Size
        The maximum number of individuals allowed in the population.
        
     Min Population Size
        The minimum number of individuals allowed in the population. The
        difference between the max population size and the min population
        size determines the number of new individuals generated at each
        generation.
        
     Random Number Seed 
        Random number seed. A value of 0 forces the seed to be
        generated from the system clock.

     Initial Screen Maxsize
        Number of individuals to be generated by Monte Carlo docking for
        the initial population.

     Migration Frequency
        Number of generations between migration events. Migration events
        allow for individuals to move between subpopulations.

     Subpopulations
        Number of subpopulations. Subpopulations are independent collections
        of individuals that can only interact through migration events.
        Subpopulations help maintain diversity in the overall population.

GA Options

GA options involve specific parameters which control the behaviour of the genetic algorithm. These are considered options for the advanced user.
     Outline File
        Not currently supported.

     Organism File
        Specify the initial population from an organism (.org) file. These
        are generated as output from Gamma. The organism file stores the
        position, orientation and torsional information for the ligand.

     Biased Selection
        Select individuals for breeding based on their fitness. More fit
        individuals are more likely to be selected than less fit ones.

     Genop Biased Selection
        Select genetic operators to be used based on their "fitness"
        (success rate).

     Tournament Size 
        Size of tournament population in tournament selection method.

     Deme Migration Frequency
        Number of generations between deme migration events.

     Number of Demes
        Demes are a fitness-biased subpopulation scheme. This selects the
        number of deme subpopulations to be used (with 1, no demes are used).

     Selection Method
        This specifies the method by which the population is pruned from the
        maximum size to the minimum size at the end of each generation. In
        Rank selection, only the most fit individuals are selected. In
        Tournament selection, pairs of individuals are chosen successively
        with the most fit member being selected.

     Biased Selection Type
        This specifies the probability weighting for selecting which
        individuals are used to generate offspring. With Roulette Wheel
        probabilities are determined by their relative fitness. With
        Linear and Quadratic, an appropriate scaling function
        is used to determine the relative probabilities.
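
The selection schemes named above can be illustrated with a minimal sketch. It assumes each individual is a (label, fitness) pair with higher fitness being better and all fitness values positive; the function names and these conventions are assumptions, and Gamma's own implementation may differ in detail.

```python
import random

def rank_prune(population, min_size):
    """Rank selection: keep the min_size individuals with the highest fitness."""
    return sorted(population, key=lambda ind: ind[1], reverse=True)[:min_size]

def tournament_prune(population, min_size, rng):
    """Tournament selection: repeatedly draw a pair and keep the fitter member."""
    pool = list(population)
    survivors = []
    while len(survivors) < min_size and len(pool) >= 2:
        a, b = rng.sample(pool, 2)
        winner = a if a[1] >= b[1] else b
        survivors.append(winner)
        pool.remove(winner)  # a winner cannot be selected twice
    return survivors

def roulette_pick(population, rng):
    """Roulette wheel: pick one parent with probability proportional to fitness."""
    weights = [fitness for _, fitness in population]
    return rng.choices(population, weights=weights)[0]
```

Rank selection is deterministic and strongly favours the current best individuals, whereas tournament selection lets some moderately fit individuals survive, which can help preserve diversity.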

Energy Options

These are options involving calculation of energy.
     Target2 File
        Not implemented.

     Bump Check
        Implement a simple bump check for ligand conformation if
        conformational energy is not used.

     Cutoff 
        Maximum allowed distance in angstroms between groups in the energy
        calculation. Interactions between groups exceeding this distance
        are ignored.

     ftol  
        Tolerance value for the floating algorithm. This controls how far
        the algorithm is allowed to "float" an initial ligand with a bad
        contact out of the protein. Normally a value of 1.0 is used.

     ctol 
        Tolerance value for the steric clash detection between the ligand
        and the protein. Normally a value of 1.0 is used.

     Numb. MC steps
        Number of steps to be run in the Monte Carlo procedure, when used.
        Monte Carlo runs are performed on initial seeding and by special
        "Monte Carlo" genetic operators.

     Bump Penalty
        Not implemented.

     Torconv 
        Scaling factor between maximum allowed deviation for torsion
        angles and rotation angles. If torconv is 2, then the torsion
        angles will be randomly generated with twice the magnitude of
        the (orientational) rotation angles.

     HB Factor
        Not implemented.

     PVDW radius
        Not implemented.

     Energy scale
        Not implemented.

     PVDW factor
        Not implemented.

     Dielectric
        Not implemented.

Gamma

Begin a Gamma docking run. The button will appear green if all required setup steps have been done, otherwise it will appear red and the run is not allowed to start. If it is red, click on the form buttons that are still red in colour and perform the necessary setup steps.

When a run is started, any required setup steps are first performed. These steps are reported in a small screen below the parameter form. Once these steps have been successfully completed, DockVision will begin the docking run itself.

6. RSDB

The following is a description of the use of the parameter forms involved in setting up an RSDB simulation. While some features are common to all modules, some features differ between them.

Setup

The run name is selected in this form. This name determines the base name for the output files, so if the run name was "run01", then the files "run01.log", "run01.out", etc. are generated. The hydrogen bonding function is used for RSDB.

Ligand

The ligand file here is the ligand database, in PDB format. Parameters are automatically generated within the program.

Flexibility

Choose flexibility options. RSDB consists of two simulated annealing stages; in the first, the ligands are treated rigidly. Flexibility for the second stage may be selected here.

Target

This form involves selecting the target protein. Forcefield parameters are also either selected or automatically generated.

Grid

The floating grid file is generated in this form. Generally, this needs to be done only once for each active site region of the protein that is being explored. The floating grid is a three-dimensional array of values that records the volume occupied by the protein. The "auto bounds" option will generate a grid that covers the entire protein. However, for large proteins or cases where the active site is well known, the grid may be made somewhat smaller, saving computation time (which may be up to 30 minutes for larger grids, on some computers). In this case, it is often best to select the coordinates of some atom central to the active site as the grid center, and allow the grid to be 25 or more angstroms on each side (this may depend on the size of the ligand and the search site). Once a grid has been generated, the grid file may simply be selected here on subsequent runs.

The near grid is used in the hydrogen bonding score function and is generated or loaded in a similar manner to the floating grid.

Options

Here specific parameters which control the behaviour of the docking algorithm are set. A description of each item follows. In most cases, these values correspond directly to values in the job file.

     Constraint
        Build a new constraint file or load a previously built
        one. These determine the search region used for the docking
        by defining a collection of spheres. The spheres define the
        location in space where the center of the ligand will be
        constrained.

     Schedule1
        Build a new annealing schedule or load an existing one. This
        schedule is for the first, rigid search phase of the docking.

     Schedule2
        Build a new annealing schedule or load an existing one. This
        schedule is for the second, refinement phase. The ligand
        may be flexible for this phase.

     Refine Mode
        Turning this on will stop the initial randomization of the
        ligand (in position and orientation) from being performed.
        This is useful in docking from a modelled position or refining
        a previously generated output file. Conformers are still
        generated (to stop this, choose "no conformers" in the
        Flexibility menu).

     Energy Only
        Calculate the initial energy of the input ligand and exit. No
        docking is performed.

     Conformational Energy
        Include intramolecular energy of the ligand in the energy
        calculation. If this is off, only steric clash checking is
        performed on the ligand during docking.

     ftol  
        Tolerance value for the floating algorithm. This controls how far
        the algorithm is allowed to "float" an initial ligand with a bad
        contact out of the protein. Normally a value of 1.0 is used.

     ctol 
        Tolerance value for the steric clash detection between the ligand
        and the protein. Normally a value of 1.0 is used.

     ntrials 
        Number of independent Monte Carlo runs to be performed. This
        applies to the rigid docking phase (schedule 1).

     seed 
        Random number seed. A value of 0 forces the seed to be
        generated from the system clock.

     torconv 
        Scaling factor between maximum allowed deviation for torsion
        angles and rotation angles. If torconv is 2, then the torsion
        angles will be randomly generated with twice the magnitude of
        the (orientational) rotation angles.

     pvdw Radius
        A parameter determining the softness of the grid-based steric
        function used for scoring. Larger values make the function softer
        (ie. more tolerant of close contacts).

RSDB

Begin an RSDB docking run. The button will appear green if all required setup steps have been done, otherwise it will appear red and the run is not allowed to start. If it is red, click on the form buttons that are still red in colour and perform the necessary setup steps.

When a run is started, any required setup steps are performed first. These steps are reported in a small screen below the parameter form. Once they have completed successfully, DockVision begins the docking run itself.

Appendix I: Simulation Parameter Files

Research Parameters

The following is a list of keywords used by Research version 2.1 along with a brief description of what each one does.


     forcefield  (string)  
        Determine the type of forcefield to use in the energy calculation.
        Currently legal values are:
           RESEARCH (R or r) 
           MMFF (M or m) 
        The following will be implemented soon:  
           HBOND (H or h) (--- a simple hydrogen-bonding function) 
           AMBER (A or a) 

     atomtable  (string) 
        Specifies the atomtypes and VDW parameters for MMFF94.
        This keyword needs to be specified only if running with the
        MMFF forcefield. Normal value would be "mmff.att". 

     atomlib    (string) 
        Specifies the atomtypes and VDW parameters for RESEARCH FF.
        Required if running the RESEARCH forcefield (the default).
        Normal value would be "stdv2.alib". 

     chargelib  (string) 
        Specify the charge and group parameter files. This format is the
        same for all forcefields, however the files will differ according
        to specific atomtypes and charges. The base name of this file (ie.
        name without the extension) specifies a pair of files, a ".rlib"
        file and a ".glib" file. The ".rlib" file specifies the charges
        for each residue, while the ".glib" file specifies the grouping
        of atoms into charge groups. Normal value would be "ligand.lib"
        or "target.lib".

     chargelib1  (string)
        As above. Specifies additional parameters. 

     chargelib2  (string)
        As above. Specifies additional parameters. 

     grid      (string)
        Specifies the floating grid file. This file is generated prior
        to the simulation run by the program FLGRID. Normal value would
        be "target.grd". Format for the file is the BIOMOL master
        fourier file (MFF) format.

     probe     (string)
        Specifies the probe coordinate file in PDB format. Multiple
        ligands can be specified by putting structures in a single file,
        separated by TER cards, but such ligands must be the same actual
        molecule (possibly in different starting positions or
        different conformations). Normal value would be "ligand.pdb".

     target    (string) 
        Specifies the target coordinate file in PDB format. Normal value
        would be "target.pdb". 

     schedule   (string) 
        Specifies the annealing schedule for each Monte Carlo trial. See
        Appendix II. 

     constraint (string) 
        Specifies the constraint file. These specify the region in space
        to be searched. The center of the ligand starts in this region
        and is constrained to stay inside during the simulation run.
        See Appendix II. 

     cutoff     (float) 
        Specifies distance cutoff for energy calcs. Usually 8.0 angstroms. 

     ecut       (float) 
        Specifies the energy cutoff for saving output from docking trials.
        Docking trials with energy below this value are saved into the
        output file.  

     ftol       (float) 
        Tolerance value for the floating algorithm. This controls how far
        the algorithm is allowed to "float" an initial ligand with a bad
        contact out of the protein. Generally 1.0 is a safe bet. 

     ctol       (float) 
        Tolerance value for the steric clash detection algorithm. This
        controls how bad an overlap with the protein can be before
        pairwise energy is abandoned for this Monte Carlo trial ligand
        position. Again, 1.0 is a safe bet. Smaller values will make the
        simulation run faster, but at risk of penalizing marginally good
        dockings. 

     ntrials    (int) 
        Specifies the total number of independent Monte Carlo runs to
        perform.  

     seed       (int) 
        Specifies the initial value for the random number seed. A value
        of 0 will generate a seed from the system clock. 

     torconv    (int) 
        Specifies scale factor relating dihedral angle changes to rotation
        angle changes for the ligand during the Monte Carlo simulation.
        This value will default to 1.0 if this keyword is not present. 

     dielectric  (float) 
        Specifies the dielectric constant. (Currently, implemented only
        for MMFF.) 

     conf_energy (TRUE/FALSE) 
        If this is TRUE, the Energy calculation includes the conformational
        energy of ligand. If FALSE, then a simple bump check algorithm is
        performed during the Monte Carlo instead and only the interaction
        energy is used. 

     conformers  (TRUE/FALSE) 
        Randomly select conformations from a conformer file when
        generating initial positions for the Monte Carlo simulation. The
        "conformerfile" keyword must also be specified. 

     conformerfile  (string) 
        Specifies the file holding ligand conformers. These are
        pregenerated using either the genconf program or the confman program.
        See Appendix II. 

     flexible    (TRUE/FALSE) 
        If TRUE, ligand torsions are allowed to change during the docking.
        Requires that the topologyfile keyword be set. 

     topologyfile  (string) 
        Specifies the file holding parameters for the ligand topology. This
        can be generated using the otto program. See Appendix II. 

     blankfile   (string) 
        Ignore this for now. (optional). This is for covalent docking
        experiments that are tricky to set up. 

     logfile     (string) 
        Specify the file for holding logging information. 

     outfile      (string) 
        Specify the file for docking coordinate output. 

     refine       (TRUE/FALSE) 
        If TRUE, do not randomize the initial states of the ligands before
        the docking simulation. This can be used to further refine docking
        runs by using previous docking output files as the ligand file. This
        does not turn conformers off, so conformers should be FALSE for the
        ligand positions to remain in the initial position (unless this effect
        is desired). 

     energy_only  (TRUE/FALSE) 
        If TRUE, calculate the energy of the ligand(s) in their starting
        positions and exit. 
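
The dependencies between keywords described above can be checked before a run is started. The following Python sketch is illustrative only and is not part of DockVision. It assumes that the job file holds one whitespace-separated "keyword value" pair per line (as suggested by settings such as "refine TRUE"), that a missing TRUE/FALSE keyword counts as FALSE, and that a missing forcefield keyword means RESEARCH. The function names parse_job and check_job are hypothetical.

```python
def parse_job(text):
    """Parse 'keyword value' lines into a dict (assumed job file layout)."""
    params = {}
    for line in text.splitlines():
        fields = line.split(None, 1)
        if len(fields) == 2:
            params[fields[0]] = fields[1].strip()
    return params


def is_true(params, key):
    # Assumption: a keyword that is absent behaves like FALSE.
    return params.get(key, "FALSE").upper() == "TRUE"


def check_job(params):
    """Return a list of problems based on the keyword descriptions above."""
    problems = []
    # Assumption: RESEARCH is used when no forcefield keyword is given.
    forcefield = params.get("forcefield", "RESEARCH").upper()
    if forcefield.startswith("M") and "atomtable" not in params:
        problems.append("MMFF forcefield needs atomtable")
    if forcefield.startswith("R") and "atomlib" not in params:
        problems.append("RESEARCH forcefield needs atomlib")
    if is_true(params, "conformers") and "conformerfile" not in params:
        problems.append("conformers TRUE needs conformerfile")
    if is_true(params, "flexible") and "topologyfile" not in params:
        problems.append("flexible TRUE needs topologyfile")
    return problems


job_text = """\
forcefield RESEARCH
atomlib stdv2.alib
probe ligand.pdb
target target.pdb
flexible TRUE
"""
print(check_job(parse_job(job_text)))
```

For the job text shown, the only problem reported is the missing topologyfile keyword.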

Research Command Line Flags

Research may be run within the DockVision interface or as a standalone program. Parameters that have been saved from the interface into a file may be used directly by the standalone version, research-2.1. The usage is:
       research-2.1 [options] -j job_file 
There are a number of optional flags (the -j flag is required) which can alter the behaviour of research. In most cases, the behaviour can also be controlled within the job file itself.

      -h           Help  
                   Print out a brief help message  
 
      -v           Verbose 
                   Print logging information onto the screen. All logging  
                   information will still be sent to the log file. 
 
      -r           Refine 
                   Equivalent to setting "refine TRUE" in the job file. 
 
      -l log_file  Logfile 
                   Equivalent to setting "logfile log_file" in the job file. 
 
      -o out_file  Outfile 
                   Equivalent to setting "outfile out_file" in the job file. 
 
      -e           Energy Only 
                   Equivalent to setting "energy_only TRUE" in the job file. 
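
When many standalone runs are launched from a script, it can help to assemble the command line in one place. The Python sketch below is only an illustration: the function research_command and the file name "run01.job" are hypothetical, and only the flags listed above are used.

```python
def research_command(job_file, verbose=False, refine=False,
                     energy_only=False, log_file=None, out_file=None,
                     program="research-2.1"):
    """Build the argument list for a standalone Research run."""
    args = [program]
    if verbose:
        args.append("-v")
    if refine:
        args.append("-r")
    if energy_only:
        args.append("-e")
    if log_file is not None:
        args += ["-l", log_file]
    if out_file is not None:
        args += ["-o", out_file]
    # The -j flag is required and names the job file.
    args += ["-j", job_file]
    return args


cmd = research_command("run01.job", verbose=True, out_file="run01.out")
print(" ".join(cmd))
# To launch the run: import subprocess; subprocess.run(cmd, check=True)
```

The program name is a parameter so that the same helper can be pointed at a differently named executable.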

Gamma Parameters

The following is a list of keywords used by gamma, along with a brief description of what each one does.
  atomlib       (string) 
     Specifies the atomtypes and VDW parameters for RESEARCH FF. 
     Required if running the RESEARCH forcefield (the default). 
     Normal value would be "stdv2.alib". 
 
  chargelib      (string) 
     Specify the charge and group parameter files. This format is the same 
     for all forcefields, however the files will differ according to specific 
     atomtypes and charges. The base name of this file (ie. name without the 
     extension) specifies a pair of files, a ".rlib" file and a ".glib" file. 
     The ".rlib" file specifies the charges for each residue, while the ".glib" 
     file specifies the grouping of atoms into charge groups. Normal value 
     would be "ligand.lib" or "target.lib". 
 
  chargelib1      (string) 
     As above. Specifies additional parameters. 
 
  chargelib2      (string) 
     As above. Specifies additional parameters. 
 
  grid      (string) 
     Specifies the floating grid file. This file is generated prior to the 
     simulation run by the program FLGRID. Normal value would be 
     "target.grd". Format for the file is the BIOMOL master fourier file 
     (MFF) format. 
 
  probe      (string) 
     Specifies the probe coordinate file in PDB format. Multiple ligands can 
     be specified by putting structures in a single file, separated by TER 
     cards, but such ligands must be the same actual molecule (possibly in 
     different starting positions or different conformations). Normal value 
     would be "ligand.pdb".

  target      (string) 
     Specifies the target coordinate file in PDB format. Normal value would 
     be "target.pdb". 
 
  constraint      (string) 
     Specifies the constraint file. These specify the region in space to be 
     searched. The center of the ligand starts in this region and is constrained 
     to stay inside during the simulation run. See Appendix II. 
 
  cutoff       (float) 
     Specifies distance cutoff for energy calcs. Usually 8.0 angstroms. 
 
  ftol       (float) 
     Tolerance value for the floating algorithm. This controls how far the 
     algorithm is allowed to "float" an initial ligand with a bad contact 
     out of the protein. Generally 1.0 is a safe bet. 
 
  ctol      (float) 
     Tolerance value for the steric clash detection algorithm. This controls how 
     bad an overlap with the protein can be before pairwise energy is abandoned 
     for this Monte Carlo trial ligand position. Again, 1.0 is a safe bet. Smaller 
     values will make the simulation run faster, but at risk of penalizing 
     marginally good dockings. 
 
  n_mc_steps      (int) 
     Specifies the number of Monte Carlo steps to run in the initial population 
     generation and Monte Carlo operators. 
 
  seed       (int) 
     Specifies the initial value for the random number seed. A value of 0 will 
     generate a seed from the system clock. 
 
  torconv        (int) 
     Specifies scale factor relating dihedral angle changes to rotation angle 
     changes for the ligand during the Monte Carlo simulation. This value will 
     default to 1.0 if this keyword is not present. 
 
  selection_method       (RANK/TOURN)  
     Specify the method by which organisms are selected for the next generation.
     RANK   --- take the best of the population. 
     TOURN   --- randomly select using tournament selection method. 

  tourn_size       (int)  
     Size of tournament population in tournament selection method.

  biased_selection       (TRUE/FALSE)  
     If TRUE, weight choice of organisms selected for breeding by fitness.   

  gen_op_biased_sel       (TRUE/FALSE)  
     If TRUE, weight choice of genetic operators selected to perform breeding 
     by genetic operator fitness.   

  biased_sel_type       (ROULETTE/LINEAR/QUADRATIC)  
     Choose biased selection method.   

  num_gens       (int)  
     Number of generations to run in the simulation. 

  maxpopsize       (int)  
     The maximum allowed population size. At each generation, new organisms
     will be added to the population until the total population size is equal
     to this number.   

  minpopsize       (int)  
     The minimum population size at each generation. After breeding, the
     population is pruned down to this size before starting the next generation.
     In general, there will be (maxpopsize - minpopsize) organisms generated at
     each breeding step. 

  elitism       (TRUE/FALSE)  
     If TRUE, don't keep offspring that are less fit than their parents. 

  initial_screen       (int)  
     Generate this number of initial organisms by Monte Carlo annealing.  

  outlinefile       (string)  
     Specifies the molecular outline of the probe if outline operators are used. 

  fitness_function       (int)  
     Determine the fitness function.   
     1   - unscaled (normal) interaction energy. 
     2   - contact score function. 
     3   - scaled interaction energy. 
     4   - hbond score function.  

  number_of_demes       (int)  
     Number of deme subpopulations per environment.   
     A deme is a subpopulation where the fitness is weighted by the
     subpopulation size.

  number_of_envirs       (int)  
     Number of subpopulations. Fitness is not weighted.  

  deme_migration_frequency   (int)  
     How often to allow organisms to migrate between deme subpopulations. A
     deme frequency of N would allow migration every N generations. 

  envir_migration_frequency   (int)  
     How often to allow organisms to migrate between subpopulations. A frequency
     of N would allow migration every N generations. 

  topologyfile      (string) 
     Specifies the file holding parameters for the ligand topology. This can be 
     generated using the otto program. See Appendix II. 
 
  conformerfile      (string) 
     Specifies the file holding ligand conformers. These are pregenerated using
     either the genconf program or the confman program. See Appendix II. 
 
  flexible       (TRUE/FALSE) 
     If TRUE, ligand torsions are allowed to change during the docking. In this
     case, the topologyfile keyword must also be set.
 
  conformers       (TRUE/FALSE) 
     Randomly select conformations from a conformer file when generating initial 
     positions for the Monte Carlo simulation. The "conformerfile" keyword must 
     also be specified. 
 
  bump_check       (TRUE/FALSE)  
     Include steric clash penalty in score function. 

  conf_energy      (TRUE/FALSE)
     If this is TRUE, the Energy calculation includes the conformational energy
     of ligand. If FALSE, then a simple bump check algorithm is performed during
     the Monte Carlo instead and only the interaction energy is used. 

  bump_penalty        (float)  
     Value of steric clash penalty. 

  target2_file      (string)  
     Target to be used for hbond function. Can be the same as target. 

  neargrid      (string) 
     Specifies the neargridfile. This is used to calculate a "pseudo-VDW" 
     score function that represents the steric outline of the target. This file 
     is generated prior to the simulation run by the program SURFGRID. 
     Normal value would be "target_near.grd". Also in BIOMOL MFF format. 
  
  pvdwrd      (float) 
     This parameter controls the softness of the steric score function. The default 
     value is 2.0 but larger values can be used to soften the steric function. 
     Larger values make the function softer, smaller values make it harder. 
 
  pvdw_factor       (float)  
     Relative weighting of the pvdw score.   

  hb_factor       (float)  
     Relative weighting of the hbond score.   

  energy_scale       (float)  
     Scale the energy by this factor. 

  organismfile       (string)  
     Read initial organisms from this file. (optional)  

  genopfile       (string)  
     Read the genetic operators (genops) from this file.   

  organismfile_out       (string)  
     Output organisms into this file.   

  genopfile_out       (string)  
     Output genetic operators (genops) to this file.  

  outfile       (string)  
     Output PDB coordinates to this file.

  logfile       (string)  
     Logging info for each generation here.

  infofile       (string)  
     Some other info here.

  demefile       (string)  
     Deme population data output into this file.

  histfile       (string)  
     Output histogram data into this file.

  genedbgfile       (string)  
     Debugging information.

  snapshotfile       (string)  
     Snapshot of each generation here.

  bestfile       (string)  
     Output fitness of best members for each generation.

  outctrlfile       (string)  
     Parsing and other logging information to this file.
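
The interplay of the minpopsize, maxpopsize, selection_method and tourn_size keywords can be pictured with a toy example. The Python sketch below is not Gamma's implementation. It assumes that a lower score means a fitter organism, uses plain numbers as stand-in "organisms", and removes tournament winners from the pool so that none is kept twice. All function names are hypothetical.

```python
import random


def rank_select(population, fitness, size):
    """RANK: keep the `size` best organisms (lower score = fitter here)."""
    return sorted(population, key=fitness)[:size]


def tournament_select(population, fitness, size, tourn_size, rng):
    """TOURN: fill each slot with the winner of a random tournament."""
    pool = list(population)
    survivors = []
    while len(survivors) < size and pool:
        entrants = rng.sample(pool, min(tourn_size, len(pool)))
        winner = min(entrants, key=fitness)
        survivors.append(winner)
        pool.remove(winner)  # assumption: a winner is kept only once
    return survivors


def one_generation(population, fitness, breed, minpopsize, maxpopsize, rng):
    """Breed up to maxpopsize, then prune back to minpopsize by RANK."""
    population = list(population)
    while len(population) < maxpopsize:
        parent_a, parent_b = rng.sample(population, 2)
        population.append(breed(parent_a, parent_b, rng))
    return rank_select(population, fitness, minpopsize)


# Toy problem: organisms are numbers and the best possible value is 3.0.
rng = random.Random(0)
fitness = lambda x: (x - 3.0) ** 2
breed = lambda a, b, rng: 0.5 * (a + b) + rng.gauss(0.0, 0.1)
population = [rng.uniform(-10.0, 10.0) for _ in range(20)]
for _ in range(5):
    population = one_generation(population, fitness, breed, 20, 50, rng)
print(len(population), min(population, key=fitness))
```

With minpopsize 20 and maxpopsize 50, each call to one_generation creates 30 new organisms, matching the (maxpopsize - minpopsize) rule given above.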

RSDB Parameters

The following is a list of keywords used by RSDB, along with a brief description of what each one does.
  atomlib       (string) 
     Specifies the atomtypes and VDW parameters for RESEARCH FF. 
     Required if running the RESEARCH forcefield (the default). 
     Normal value would be "stdv2.alib". 
 
  chargelib      (string) 
     Specify the charge and group parameter files. This format is the same 
     for all forcefields, however the files will differ according to specific 
     atomtypes and charges. The base name of this file (ie. name without the 
     extension) specifies a pair of files, a ".rlib" file and a ".glib" file. 
     The ".rlib" file specifies the charges for each residue, while the ".glib" 
     file specifies the grouping of atoms into charge groups. Normal value 
     would be "ligand.lib" or "target.lib". 
 
  chargelib1      (string) 
     As above. Specifies additional parameters. 
 
  chargelib2      (string) 
     As above. Specifies additional parameters. 
 
  grid      (string) 
     Specifies the floating grid file. This file is generated prior to the 
     simulation run by the program FLGRID. Normal value would be 
     "target.grd". Format for the file is the BIOMOL master fourier file 
     (MFF) format. 
 
  neargrid      (string) 
     Specifies the neargridfile. This is used to calculate a "pseudo-VDW" 
     score function that represents the steric outline of the target. This file 
     is generated prior to the simulation run by the program SURFGRID. 
     Normal value would be "target_near.grd". Also in BIOMOL MFF format. 
  
  probe      (string) 
     Specifies the probe coordinate file in PDB format. Multiple ligands can 
     be specified by putting structures in a single file, separated by TER 
     cards, but such ligands must be the same actual molecule (possibly in 
     different starting positions or different conformations). Normal value 
     would be "ligand.pdb". 
 
  target      (string) 
     Specifies the target coordinate file in PDB format. Normal value would 
     be "target.pdb". 
 
  schedule1      (string) 
     Specifies the annealing schedule for the first stage of rigid Monte Carlo 
     trials. See Appendix II. 
 
  schedule2      (string) 
     Specifies the annealing schedule for the second stage of Monte Carlo 
     trials. These trials refine the best docking found in the first stage. 
 
  constraint      (string) 
     Specifies the constraint file. These specify the region in space to be 
     searched. The center of the ligand starts in this region and is constrained 
     to stay inside during the simulation run. See Appendix II. 
 
  ftol       (float) 
     Tolerance value for the floating algorithm. This controls how far the 
     algorithm is allowed to "float" an initial ligand with a bad contact 
     out of the protein. Generally 1.0 is a safe bet. 
 
  ctol      (float) 
     Tolerance value for the steric clash detection algorithm. This controls how 
     bad an overlap with the protein can be before pairwise energy is abandoned 
     for this Monte Carlo trial ligand position. Again, 1.0 is a safe bet. Smaller 
     values will make the simulation run faster, but at risk of penalizing 
     marginally good dockings. 
 
  ntrials      (int) 
     Specifies the total number of independent Monte Carlo runs to perform.  
 
  seed       (int) 
     Specifies the initial value for the random number seed. A value of 0 will 
     generate a seed from the system clock. 
 
  torconv        (int) 
     Specifies scale factor relating dihedral angle changes to rotation angle 
     changes for the ligand during the Monte Carlo simulation. This value will 
     default to 1.0 if this keyword is not present. 
 
  pvdwrd      (float) 
     This parameter controls the softness of the steric score function. The default 
     value is 2.0 but larger values can be used to soften the steric function. 
     Larger values make the function softer, smaller values make it harder. 
 
  conf_energy      (TRUE/FALSE)
     If this is TRUE, the Energy calculation includes the conformational
     energy of ligand. If FALSE, then a simple bump check algorithm is
     performed during the Monte Carlo instead and only the interaction
     energy is used. 
 
  flexible       (TRUE/FALSE) 
     If TRUE, ligand torsions are allowed to change during the docking. The program 
     generates the topology automatically (ie. no topology file is required). 
 
  logfile       (string) 
     Specify the file for holding logging information. 
 
  outfile       (string) 
     Specify the file for docking coordinate output. 
 
  refine      (TRUE/FALSE) 
     If TRUE, do not randomize the initial states of the ligands before the docking 
     simulation. This can be used to further refine docking runs by using previous 
     docking output files as the ligand file. This does not turn conformers off, so 
     conformers should be FALSE for the ligand positions to remain in the initial 
     position (unless this effect is desired). 
 
  energy_only        (TRUE/FALSE) 
     If TRUE, calculate the energy of the ligand(s) in their starting positions and 
     exit. 

RSDB Command Line Flags

Like Research, RSDB may also be run as a standalone program. Parameters are saved in the job file, either from the DockVision interface or with an editor. Usage for RSDB:
     rsdb-2.0 [options] -j job_file 
There are a number of optional flags (the -j flag is required) which can alter the behaviour of RSDB. In most cases, the behaviour can also be controlled within the job file itself.
      -h           Help  
                   Print out a brief help message  
 
      -v           Verbose 
                   Print logging information onto the screen. All logging  
                   information will still be sent to the log file. 
 
      -r           Refine 
                   Equivalent to setting "refine TRUE" in the job file. 
 
      -l log_file  Logfile 
                   Equivalent to setting "logfile log_file" in the job file. 
 
      -o out_file  Outfile 
                   Equivalent to setting "outfile out_file" in the job file. 
 
      -e           Energy Only 
                   Equivalent to setting "energy_only TRUE" in the job file. 

Appendix II: File Formats and Utility Programs

PDB Coordinate files

PDB coordinate files for the probe and target should conform to PDB standards. The coordinate file reading code has been made to be as robust as possible and it should accept many of the commonly used PDB variants. However, in order for library charge/atomtype assignment to work properly, atom names MUST be distinct within a given residue.

The majority of errors typically encountered in running DockVision have to do with PDB coordinate files. In particular, if an error occurs either in reading the coordinate file or in the assignment of charge parameters, the coordinate file is most often the cause. The primary culprit is improper use of the atom name field in the PDB file. If an error occurs, view the target and probe coordinate files and check the columns for the atom names.

According to PDB standard, columns 13-16 are reserved for the atom name, although many programs will allow 4 letter atom names to extend into column 17 (ie. 14-17). DockVision will support either style of atom names, provided the alphabetic component of the name in columns 13 and 14 corresponds to the element of that particular atom (again, according to PDB standard). Hence, if the atom name, starting from column 13, was " HE11" then the element would be "H" (hydrogen). The same is true if the name was (again, from column 13) "1HE1 ", since the numerical components are ignored in determining elements. However, if the name was (from column 13) "HE11 ", then the element would be assigned incorrectly to be "HE" (helium). This would lead to errors in the charge and atomtype assignment algorithms. Similarly, an atom name "CA " would have element "CA" (calcium), while " CA " would be "C" (carbon).
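
The rule above can be written out in a few lines, which is handy for checking a suspect coordinate file. This Python sketch only mirrors the description in this section (letters in columns 13 and 14, digits and blanks ignored); it is not the code DockVision itself uses, and the function name is hypothetical.

```python
def element_from_atom_name(name):
    """Element implied by a PDB atom name field that starts at column 13.

    Only the first two characters (columns 13 and 14) are inspected;
    digits and blanks are dropped, as described in the text above.
    """
    return "".join(ch for ch in name[:2] if ch.isalpha()).upper()


# The atom names discussed above, each given from column 13 onward.
for name in (" HE11", "1HE1 ", "HE11 ", "CA ", " CA "):
    print(repr(name), "->", element_from_atom_name(name))
```

For a full ATOM record held in a string, the atom name field would be taken from columns 13 onward before calling the function.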

Adding polar hydrogens

Before running a simulation, polar hydrogens must be placed on the target protein and the ligand. This can be done with a number of commercial and academic programs, and in many cases the user will already have hydrogen positions assigned. Full hydrogens may also be used in DockVision.

Alternatively, polar hydrogens may be added by using the "hydroman" program. To use this program, go

      hydroman [ -p  pH ]  coordfile outfile  
where the pH is optional. It will assign hydrogens according to standard geometries, but does no optimization.

Grid files

All DockVision docking modules utilize the floating algorithm, which requires a precalculated grid that defines the inside and outside of the protein. This grid is generated using the FLGRID program, which takes a PDB coordinate file as input and produces a grid file in MFF format as output.

There are two methods for getting the parameters, the automatic way and the manual way. The program GRIDMAN can be used to automatically determine a set of parameters which will cover the complete protein with a grid. If the protein is very large and you are searching a small part of it, you may choose the manual method, below. To use GRIDMAN and FLGRID together, go:

    gridman -F  target | flgrid -o  gridfile target 
To generate grids without using gridman, run flgrid and type the appropriate values into the command line. To use:
    flgrid -o  grid target 
You are then prompted for parameters which determine the position, stepsize and extension of the grid region, as well as a "probe" size. Generally, the grid should encompass all regions where the probe could possibly go (see constraint file, below) and that might overlap with the protein. The safest bet is to generate a cube which includes the entire protein, plus maybe 4 angstroms on each side. For very large proteins, this might not be feasible.

Once you decide where you want the grid, run flgrid and you will be prompted for:

  grid center x 
  grid center y  
  grid center z 
      Position of the center of the box, in angstroms. 

  grid stepsize 
     This is the distance between gridpoints in the x, y and z directions. 
     A value of .5 is generally used. 

  number of gridsteps x 
  number of gridsteps y 
  number of gridsteps z 
     Total number of grid spaces in each direction (number of gridpoints minus one). 

  probe size 
     A value of 3.0 is recommended.
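
When entering these values by hand, the center and the number of gridsteps can be worked out from the atom coordinates. The Python sketch below is a rough aid, not part of FLGRID: it assumes a cubic grid around the bounding box of the supplied coordinates, with the 4 angstrom margin and 0.5 stepsize suggested above as defaults. The function name is hypothetical.

```python
import math


def cubic_grid_parameters(coords, margin=4.0, stepsize=0.5):
    """Return (center, gridsteps) for a cube enclosing coords plus a margin.

    coords is a list of (x, y, z) tuples in angstroms. gridsteps is the
    number of grid spaces per side, to be used for x, y and z alike.
    """
    lows = [min(c[i] for c in coords) for i in range(3)]
    highs = [max(c[i] for c in coords) for i in range(3)]
    center = tuple(0.5 * (lo + hi) for lo, hi in zip(lows, highs))
    # The longest edge of the bounding box sets the side of the cube.
    side = max(hi - lo for lo, hi in zip(lows, highs)) + 2.0 * margin
    gridsteps = math.ceil(side / stepsize)
    return center, gridsteps


# Two made-up atoms spanning 10 x 20 x 30 angstroms.
center, gridsteps = cubic_grid_parameters([(0, 0, 0), (10, 20, 30)])
print(center, gridsteps)
```

A box of fixed size around a single active site atom can be obtained by passing just that atom and a margin equal to half the desired side length.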

Annealing schedule

The annealing schedule for the Monte Carlo runs is specified in the annealing file. To make things clear, we use the following terminology:
   step 
    a single random trial and Metropolis test.
   stage 
    a sequence of steps with the same parameters (temperature, etc). 
   trial 
    execution of a complete sequence of stages. 
The annealing schedule specifies exactly how a single trial will run. The following is the format for the .sched file:
  keyword 
  nstages 
  temperature    nsteps    max_rotn   max_transl 
  ... 
  temperature    nsteps    max_rotn   max_transl 
The following is a description of the keywords
   keyword 
      There are two legal keywords, which determine the type of 
      schedule to be run.
      FIXED 
        - annealing schedule with a fixed number of steps for each stage.
      TESTCOUNT 
        - annealing schedule in which each stage continues until a fixed 
          number of accepts or rejects take place.

   nstages 
      The number of stages to be run. There should be exactly  nstages 
      lines following this number, specifying the parameters for each stage. 

   temperature 
      The temperature in degrees Kelvin.

   nsteps 
      The number of steps in this stage.

   max rotn 
      The maximum magnitude for a random rotation (in degrees). 

   max transl 
      The maximum magnitude for a random translation (in angstroms). 
Here is an example annealing schedule with four stages. The initial temperature is 300K, the maximum rotation is 16 degrees, and the maximum translation is 4 angstroms. There are 100 steps at each stage, and all other parameters are decreased by a factor of 2 at each successive stage. (The # lines are comments.)
  # -- begin 
  FIXED 
  4 
  300.0   100  16.0   4.0 
  150.0   100   8.0   2.0 
   75.0   100   4.0   1.0 
   37.5   100   2.0   0.5 
  # -- end
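
A schedule file can be checked against the format described above before a long run is started. The following Python sketch is illustrative only and is not the parser used by Research: the names Stage and parse_schedule are hypothetical, and it assumes that comments occupy whole lines beginning with # and that keywords are written in upper case.

```python
from typing import List, NamedTuple, Tuple


class Stage(NamedTuple):
    temperature: float  # degrees Kelvin
    nsteps: int         # steps in this stage
    max_rotn: float     # degrees
    max_transl: float   # angstroms


def parse_schedule(text: str) -> Tuple[str, List[Stage]]:
    """Parse keyword, nstages and the stage lines of a .sched file."""
    # Drop blank lines and whole-line '#' comments (an assumption).
    lines = [ln.strip() for ln in text.splitlines()]
    lines = [ln for ln in lines if ln and not ln.startswith("#")]
    if len(lines) < 2:
        raise ValueError("schedule needs a keyword and a stage count")

    keyword = lines[0]
    if keyword not in ("FIXED", "TESTCOUNT"):
        raise ValueError(f"unknown keyword: {keyword}")

    nstages = int(lines[1])
    rows = lines[2:]
    if len(rows) != nstages:
        raise ValueError(f"expected {nstages} stage lines, found {len(rows)}")

    stages = []
    for row in rows:
        temperature, nsteps, rotn, transl = row.split()
        stages.append(
            Stage(float(temperature), int(nsteps), float(rotn), float(transl))
        )
    return keyword, stages
```

The check that exactly nstages stage lines follow the count catches the most common hand-editing mistake, a stage added or removed without updating the count.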

Constraint File

This file defines the search region for the docking run. The constraint file represents a set of spheres, which may overlap, and which serve to define the allowed region for the *center* of the probe in all docking steps. There are two types of spheres: 'include' spheres, which specify an allowed region, and 'exclude' spheres, which specify a forbidden region. The actual allowed region is the union of all the include spheres, minus (i.e. intersected with the complement of) the union of all the exclude spheres. There must be at least one include sphere.

The format for the file is any number of lines of the form:

     keyword X Y Z R 
keyword can either be "include" or "exclude". X Y Z are the coordinates of a point in space and R is a positive value, representing the radius of a sphere centered at these coordinates.
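
The include/exclude rule can be made concrete with a small membership test. The following Python sketch is an illustration under stated assumptions, not DockVision code: the function names parse_constraints and is_allowed are hypothetical, and points lying exactly on a sphere surface are treated as inside, which the file format description does not specify.

```python
import math


def parse_constraints(text):
    """Read 'keyword X Y Z R' lines into (keyword, center, radius) tuples."""
    spheres = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        keyword, x, y, z, r = fields
        if keyword not in ("include", "exclude"):
            raise ValueError(f"unknown keyword: {keyword}")
        radius = float(r)
        if radius <= 0.0:
            raise ValueError("R must be positive")
        spheres.append((keyword, (float(x), float(y), float(z)), radius))
    if not any(kind == "include" for kind, _, _ in spheres):
        raise ValueError("at least one include sphere is required")
    return spheres


def is_allowed(point, spheres):
    """True if point is in some include sphere and in no exclude sphere."""
    def inside(center, radius):
        return math.dist(point, center) <= radius

    in_include = any(inside(c, r) for kind, c, r in spheres if kind == "include")
    in_exclude = any(inside(c, r) for kind, c, r in spheres if kind == "exclude")
    return in_include and not in_exclude
```

For example, a file with the two lines "include 0 0 0 10" and "exclude 5 0 0 2" allows the probe center at the origin but forbids it at (5, 0, 0).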

Topology files

The topology file is designed along the lines of the XPLOR topology file, although at present this implementation is much less general. The topology file gives the connectivity and other bond information for the ligand and at present must be generated "by hand".

The two main keywords are the bond and dihedral keywords. The bond keyword specifies that two atoms are to be bonded. Each bond in the molecule must be represented exactly once. The dihedral keyword specifies which bonds are to be rotatable, how the angles are to be measured, how to calculate torsion energy, and how the torsion is to be identified. Unlike bonds, dihedrals do not need to be defined: leaving out a naturally occurring dihedral has the effect of "freezing" that rotatable bond.

Before the bond and dihedral keywords, there must be at least one residue keyword. This keyword specifies the residue name and number to be the default.

   nbtype type 
      Set the type of nonbonded interactions. This will determine 
      which type of interaction will be excluded: 1-2, 1-3 or 1-4. 
      The default is 1-4. Of course, 1-4 will exclude 1-3 and 1-2 
      interactions as well. The following are the legal values  
      for type: 1-2, onetwo, 1-3, onethree, 1-4, onefour. 
 
      * Bump checks are always performed onefour, regardless of the 
        value to which nbtype is set. 
 
   dhtype  name n_mins chi_0  E_max 
      Define a dihedral type.  Name is the type name,  n_mins is 
      the number of minima through 360 degrees,  chi_0 is the 
      angle of the first minimum in degrees, and  E_max is the 
      maximum energy barrier ( the minima have energy = 0). 
 
      * MMFF dihedrals are implemented automatically from a library 
        and are not controlled by the topology file. 
  
   residue  name number 
      This specifies that the bond and dihedral input applies to 
      this specific residue. This keyword MUST precede all 'bond' 
      and 'dihedral' keywords. 
  
   end 
      End input.  
  
   bond name1 name2 
      Form a bond between these two atoms. See note below regarding 
      atom names. 
  
   dihedral name1 name2 name3 name4 dihedral_type_name dihedral_name direction 
      Form a dihedral. The first four strings are the atom names  
      of the dihedral, in order. The next string is the name  
      of the dihedral type, which must be read in earlier using  
      the 'dhtype' keyword. The last string is an arbitrary name  
      for the specific dihedral and should be unique. 

     "direction" is optional and can be FORWARD, REVERSE or DEFAULT. 
     This controls which end of the dihedral bond is actually rotated, 
     and which part is fixed. FORWARD forces the last atom to be 
     moved, REVERSE the first, and DEFAULT selects whichever has the 
     fewest atoms which would be moved due to the dihedral rotation. 
The field name in 'bond' and 'dihedral' can be of two possible forms. First, they can be simply a single word, in which case they are assumed to belong to the current residue (name and number). The other form includes the residue name and number explicitly, and overrides the default values set by the 'residue' keyword. The two forms are:
   atomname 

   atomname:resname,res#  

Notes

The two forms can be freely mixed, allowing bonds and dihedrals to be made between different residues. Also, even if all names explicitly include residue names and numbers, the 'residue' keyword must be included.

Topology files usually have the .top extension.
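
The two atom-name forms accepted by 'bond' and 'dihedral' can be resolved as in the following Python sketch. It is an illustration only: the function name resolve_atom is hypothetical, and residue numbers are assumed to be plain integers.

```python
def resolve_atom(field, default_resname, default_resnum):
    """Return (atomname, resname, resnum) for either atom-field form.

    'atomname' alone takes the defaults set by the 'residue' keyword;
    'atomname:resname,res#' overrides them.
    """
    if ":" not in field:
        return field, default_resname, default_resnum
    atomname, residue = field.split(":", 1)
    resname, resnum = residue.split(",", 1)
    return atomname, resname, int(resnum)
```

With the default residue set to SER 1, resolve_atom("CA", "SER", 1) stays within that residue, while resolve_atom("F1:TFA,256", "SER", 1) refers to atom F1 of residue TFA 256. Mixing the two forms in one 'bond' line is what allows a bond between different residues.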

Conformer files

Conformer files are used by research to select alternate conformations for a molecule. Here, we are only concerned with the description of the conformer file format. For the generation and use of conformer files, see the documentation for the specific programs, such as genconf, genmole, pareconf, and research.

A conformer file consists of a sequence of lines of the form:

  conformer 
  dvalue  dname  angle 
  ... 
  end 
  ... 
  conformer  
  dvalue  dname  angle 
  ... 
  end 
The conformer file must contain at least one 'conformer' keyword, and each 'conformer' keyword must be followed by a matching 'end' keyword. Between a 'conformer' keyword and its 'end', there can be any number of 'dvalue' lines, including none at all. Each dvalue line represents an action: the named dihedral should have its angle set to the given value in degrees. Dihedrals that are not listed are left unchanged. The dname field refers to the name given to the specific dihedral in the topology file (the dihedral_name argument of the 'dihedral' keyword, see above).

When more than one 'conformer' ... 'end' sequence is present in a conformer file, the file represents a set of alternate conformers for the molecule.

The trivial conformer file consisting only of the lines:

  conformer 
  end 
is legal and is the "dummy" conformer file. Its effect is to leave the molecule unchanged.

Conformer files usually have the .conf extension.
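
The conformer file format is simple enough to read with a few lines of code, which can be useful for inspecting a generated conformer set. The following Python sketch is illustrative and is not the reader used by Research: the function name parse_conformers is hypothetical, and nested 'conformer' blocks are rejected as an assumption.

```python
def parse_conformers(text):
    """Return a list of {dihedral_name: angle_in_degrees} dicts, one per conformer."""
    conformers = []
    current = None  # dict for the conformer being read, or None outside a block
    for raw in text.splitlines():
        fields = raw.split()
        if not fields:
            continue
        key = fields[0]
        if key == "conformer":
            if current is not None:
                raise ValueError("'conformer' found before the previous 'end'")
            current = {}
        elif key == "dvalue":
            if current is None:
                raise ValueError("'dvalue' outside a conformer block")
            _, dname, angle = fields
            current[dname] = float(angle)
        elif key == "end":
            if current is None:
                raise ValueError("'end' without a matching 'conformer'")
            conformers.append(current)
            current = None
        else:
            raise ValueError(f"unknown keyword: {key}")
    if current is not None:
        raise ValueError("missing final 'end'")
    if not conformers:
        raise ValueError("no conformers found")
    return conformers
```

The dummy conformer file parses to a single empty dictionary, meaning that no dihedral is changed.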

Appendix III: Covalent docking

DockVision may be used to perform covalent docking. The following steps give an outline for the procedure.

1) 
    Modify the ligand structure to include the entire residue on the
    protein to which it is covalently bound. Thus if "L" is the
    ligand and it is covalently bound through the O_gamma on a Ser
    (eg. an ester), then you would build the ligand structure:


        N
        |
        CA-CB-OG-L
        |
        C

    It is best to leave the carbonyl oxygen off of the new structure.
    This new structure is the "probe" for DockVision.

2)
    Modify the target protein so that this residue is replaced by a Gly.
    You don't have to rename the residue, just delete the sidechain atoms.


3)
    Superimpose the modified ligand onto the modified residue of the target.
    You can superimpose by matching the atoms N,CA,C of the ligand with
    N,CA,C of the modified target. Use a program to do this (most modelling
    packages can do this) or do it graphically.

4)
    Generate a topology file for the modified ligand. If the coordinate
    file for this is "mod_lig.pdb", then run

      otto -o mod_lig.top mod_lig.pdb

5)
    Edit the topology file. The important thing to change is the direction
    with which the dihedrals act. When rotating a dihedral, there are two
    ways it can act, according to which end is considered fixed and which 
    end is allowed to move. By default, DockVision assigns dihedrals so the
    fewest number of atoms are moved. But for covalent docking, we want to 
    keep the superimposed part always fixed and allow the free end to move.
    The topology file allows us to control each dihedral separately. Dihedrals
    are specified by the "dihedral" keyword, for example:

    dihedral O:TFA,256 C:TFA,256  CT:TFA,256  F1:TFA,256 CC3 DH1 DEFAULT

    The first 4 strings after the keyword specify the 4 atoms defining the
    dihedral (atoms are specified as "atomname:residuename,residuenumber").
    The next specifies the dihedral type (here CC3), then a label identifying
    this particular dihedral (here DH1... used by the conformer sets). Finally,
    the optional keyword DEFAULT determines which end of the dihedral will
    be actually rotated when the angle changes. DEFAULT means that whichever
    end has the fewest atoms will be changed. However, there are two other legal
    values: FORWARD and REVERSE. FORWARD means rotate the end with the last
    atom (here F1:TFA,256), REVERSE means rotate the end with the first atom
    (ie. O:TFA,256).

    For covalent docking, you must change the values from DEFAULT to either
    FORWARD or REVERSE, so that the end of the ligand with the covalent
    modification, that is the end with the bound residue on the modified
    ligand, is always fixed. You may need to sketch the ligand out to do this.

6)
    Build a blank file. A blank file determines atoms in the ligand which will
    be invisible to the protein. These must include atoms which would be a
    1-3 interaction or less from any atom on the covalently bound ligand (since
    DockVision normally includes 1-4 interactions). In the example in 1),
    we would exclude all atoms that form the modification. Assuming we call
    these atoms to be SER 1, then our blank file should be:

      N:SER,1
      CA:SER,1
      C:SER,1
      CB:SER,1
      OG:SER,1

    The blank file should contain just these lines.


7)

    Build a special annealing schedule. You can use your favourite schedule,
    but we have to turn the translation and rotation searches off. To do this,
    we use a work-around. Suppose your favourite schedule was:

      #
      FIXED
      4
      500  100     12.0     2.0
      200  100      6.0     1.0
      50   200      5.0     1.0
      10   400      2.5     0.5

    Then you want to change this to:

      #
      FIXED
      4
      500   100     12.0e-6     0.0
      200   100      6.0e-6     0.0
      50    200      5.0e-6     0.0
      10    400      2.5e-6     0.0

    This makes the translational changes zero and the rotational searches
    negligible. The torsional search will be handled below. 

8)

    Now start DockVision. You want to run Research in refine mode. You will
    want to use the modified ligand as the probe, and use the topology file
    you already made (under Flexibility:Topology you want to "chose file").
    You may use conformers if you like. Alternatively, you may have modelled
    the ligand in and want to refine. The automatically generated conformers
    should work okay.

9)

    Choose your modified target (with the covalent residue changed to a Gly).

10)

    The grid doesn't matter. If you already generated one, use that. If not,
    then generate a new one but type in some bounds that are small. We will
    work around using the grid (Note: in the official release, DockVision 
    will not demand that a grid be present. But for now, we use a work-around).
    No near grid.

11)

    Options. 

    Constraint: use something big; the exact region doesn't matter

    Annealing schedule: pick the one you made (see 7)

    Blankfile: choose the file you made earlier (see 6)

    Choose "refine mode" and probably "conformation energy"

    ftol, ctol: something large: 1000.0 for each should do
    (this has the effect of turning the grid off completely)

    torconv: 1.0e+6
    (torconv is a factor, which is multiplied by the rotational scale factor
    in the annealing schedule to give the magnitude of changes to torsions
    during the Monte Carlo. Since the rotational magnitudes were 12.0e-6, etc,
    this will give a net change to torsions of 12.0 degrees, etc)

    other options: as normal for your system (if you're using "conformational 
    energy", then you probably want ecut > 0)

12)
    Start the docking. This will dock the ligand in the initial position you
    placed it (after superposition), changing only the dihedral angles.


  If things go wrong:

    You may find the dockings do not have the covalently attached part of the
    ligand in the correct place. In this case, you need to check:

      - that the original modified ligand structure was positioned correctly
        (look at your modified ligand with your modified target)

      - you made ftol and ctol big (use larger values, if necessary 1.0e+12,etc)

      - check that your dihedrals FORWARD and REVERSE values are assigned 
        correctly.

      - that you're using the correct topology file

    If nothing seems to be moving around:

      - check your annealing schedule

      - check your torconv value

      - check your blank file
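
The work-around in steps 7 and 11 can be checked numerically before starting a run. The following Python sketch is an illustration only: the function names freeze_rigid_body and torsion_steps are hypothetical, stages are plain (temperature, nsteps, max_rotn, max_transl) tuples, and the torsion magnitude is taken to be max_rotn multiplied by torconv, as described in step 11.

```python
def freeze_rigid_body(stages, rot_scale=1.0e-6):
    """Scale rotations to a negligible size and set translations to zero (step 7)."""
    return [(t, n, rot * rot_scale, 0.0) for (t, n, rot, _transl) in stages]


def torsion_steps(stages, torconv):
    """Net torsion change per stage: rotational magnitude times torconv (step 11)."""
    return [rot * torconv for (_t, _n, rot, _transl) in stages]


# The "favourite" schedule from step 7.
favourite = [
    (500, 100, 12.0, 2.0),
    (200, 100, 6.0, 1.0),
    (50, 200, 5.0, 1.0),
    (10, 400, 2.5, 0.5),
]

frozen = freeze_rigid_body(favourite)
for stage, step in zip(frozen, torsion_steps(frozen, torconv=1.0e6)):
    print(stage, "torsion change (degrees):", round(step, 6))
```

If the printed torsion changes are close to zero, the torconv value is too small for the scaled schedule, which is one cause of the "nothing seems to be moving around" symptom listed above.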

Appendix IV: DockVision Errors

When an error occurs in DockVision, an "error box" will usually pop up reporting the program or routine in which the error occurred, a brief message describing the nature of the error and a number corresponding to the error type. While the brief message in some cases gives a sufficiently complete explanation so the problem can be fixed, more detailed information is sometimes required. This appendix gives a more complete description of each error and possible work-arounds. It is organized according to program/routine and error number for easy reference when looking up an error.

Select the program that generated the error (this will be indicated in the dialogue box):

otto
chargeit
confman
surfgrid
flgrid
Research
Gamma
RSDB

Otto

   Error 1

      Improper use of flags/command line.

      The most likely problem is the wrong version of otto is being run.

      From a shell, type

         which otto

      to determine if the correct version of otto is being used. If an earlier
      version of DockVision has been installed on your system, this could be
      the problem. The solution is to change your path so the location of the
      newest version is earlier in the path than the older version, or remove
      the old version from your path completely.

   Error 2

      Input (pdb) file doesn't exist or is unreadable.

      An error has occurred reading the PDB coordinate file. Either the
      coordinate file doesn't exist, has an incorrect format or has been
      corrupted (or is empty). Check the PDB file being read (usually the
      ligand coordinate file).

   Error 3

      The output or status file cannot be opened.

      Most likely cause of this problem is the current directory does not have
      ownership or permission correctly set to open a file for writing (output).
      Check the ownership and permission on the current directory. Another
      possibility is that the disk being written to is full.

   Error 4

      The bond data file cannot be opened or read.

      The most likely cause is the DockVision environment variables are not
      properly set. From a shell, type

         printenv

      and check that the DOCKVISION_DATA environment variable has been set. If
      not, then you need to source the DockVision startup file, which should
      set this environment variable correctly. If this variable is set, then
      perform an 'ls' on the directory referred to by the variable, and check
      that the directory exists and contains an 'otto.dat' file. You should be
      able to view (read) this file.

      If the environment variable is set but there is no corresponding directory,
      then the DockVision startup file has not been properly configured. Edit it
      to set the DOCKVISION_HOME variable to the base level directory containing the
      DockVision package.

   Error 5

      The molecule fails connectivity test.

      There are two possibilities: either the molecule is not completely connected
      (in which case the attempt to assign a topology fails) or there is a
      problem with the coordinate file. In the former case, it is likely the
      molecule has unusual bond lengths or consists of more than one distinct
      molecule. In the latter, it is likely there are no atoms read in or there
      was a problem with the format of the file.
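
The manual checks for Otto errors 1 and 4 ("which otto", "printenv", and 'ls' on the data directory) can be combined into one script. The following Python sketch is illustrative and is not shipped with DockVision: the function name check_otto_setup is hypothetical, and it tests only what the text above describes (otto on the path, DOCKVISION_DATA set, and a readable 'otto.dat' in that directory).

```python
import os
import shutil


def check_otto_setup(environ=None):
    """Return a list of human-readable problems; the list is empty if none are found."""
    environ = os.environ if environ is None else environ
    problems = []

    # Equivalent of 'which otto'.
    if shutil.which("otto", path=environ.get("PATH")) is None:
        problems.append("otto not found on PATH")

    # Equivalent of 'printenv' followed by 'ls' on the data directory.
    data_dir = environ.get("DOCKVISION_DATA")
    if not data_dir:
        problems.append("DOCKVISION_DATA is not set")
    elif not os.path.isdir(data_dir):
        problems.append(f"DOCKVISION_DATA directory does not exist: {data_dir}")
    elif not os.access(os.path.join(data_dir, "otto.dat"), os.R_OK):
        problems.append(f"otto.dat missing or unreadable in {data_dir}")
    return problems


if __name__ == "__main__":
    for problem in check_otto_setup():
        print("problem:", problem)
```

An empty result does not guarantee that the newest otto is the one found first; the location reported by "which otto" still needs to be compared with the current installation directory.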

Chargeit

   Error 1

      Improper use of flags/command line.

      The most likely problem is the wrong version of chargeit is being run.

      From a shell, type

         which chargeit

      to determine if the correct version of chargeit is being used. If an earlier
      version of DockVision has been installed on your system, this could be
      the problem. The solution is to change your path so the location of the
      newest version is earlier in the path than the older version, or remove
      the old version from your path completely.

   Error 2

      The output or status file cannot be opened.

      Most likely cause of this problem is the current directory does not have
      ownership or permission correctly set to open a file for writing (output).
      Check the ownership and permission on the current directory. Another
      possibility is that the disk being written to is full.

   Error 3

      Input (pdb) file doesn't exist or is unreadable.

      An error has occurred reading the PDB coordinate file. Either the
      coordinate file doesn't exist, has an incorrect format or has been
      corrupted (or is empty). Check the PDB file being read (usually the
      ligand coordinate file).

   Error 4

      The bond data file cannot be opened or read.

      The most likely cause is the DockVision environment variables are not
      properly set. From a shell, type

         printenv

      and check that the DOCKVISION_DATA environment variable has been set. If
      not, then you need to source the DockVision startup file, which should
      set this environment variable correctly. If this variable is set, then
      perform an 'ls' on the directory referred to by the variable, and check
      that the directory exists and contains an 'otto.dat' file. You should be
      able to view (read) this file.

      If the environment variable is set but there is no corresponding directory,
      then the DockVision startup file has not been properly configured. Edit it
      to set the DOCKVISION_HOME variable to the base level directory containing the
      DockVision package.

Confman

   Error 1

      Improper use of flags/command line.

      The most likely problem is the wrong version of confman is being run.

      From a shell, type

         which confman

      to determine if the correct version of confman is being used. If an earlier
      version of DockVision has been installed on your system, this could be
      the problem. The solution is to change your path so the location of the
      newest version is earlier in the path than the older version, or remove
      the old version from your path completely.

   Error 2

      Input (pdb) file doesn't exist or is unreadable.

      An error has occurred reading the PDB coordinate file. Either the
      coordinate file doesn't exist, has an incorrect format or has been
      corrupted (or is empty). Check the PDB file being read (usually the
      ligand coordinate file).

   Error 3

      The input topology (.top) file doesn't exist or is unreadable.

      Either the appropriate topology file doesn't exist or has an incorrect
      format (or is empty). Make sure in the flexibility form that
      a correct topology file has been generated or selected.

   Error 4

      The output or status file cannot be opened.

      Most likely cause of this problem is the current directory does not have
      ownership or permission correctly set to open a file for writing (output).
      Check the ownership and permission on the current directory. Another
      possibility is that the disk being written to is full.

Surfgrid

   Error 1

      Improper use of flags/command line.

      The most likely problem is the wrong version of surfgrid is being run.

      From a shell, type

         which surfgrid

      to determine if the correct version of surfgrid is being used. If an earlier
      version of DockVision has been installed on your system, this could be
      the problem. The solution is to change your path so the location of the
      newest version is earlier in the path than the older version, or remove
      the old version from your path completely.

   Error 2

      The output or status file cannot be opened.

      Most likely cause of this problem is the current directory does not have
      ownership or permission correctly set to open a file for writing (output).
      Check the ownership and permission on the current directory. Another
      possibility is that the disk being written to is full.

   Error 3

      The atom parameter file has failed to be opened and read.

      The most likely cause is the DockVision environment variables are not
      properly set. From a shell, type

         printenv

      and check that the DOCKVISION_PARAMETERS environment variable has been set.
      If not, then you need to source the DockVision startup file, which should
      set this environment variable correctly. If this variable is set, then
      perform an 'ls' on the directory referred to by the variable, and check
      that the directory exists and contains an 'stdv2.alib' file. You should be
      able to view (read) this file.

      If the environment variable is set but there is no corresponding directory,
      then the DockVision startup file has not been properly configured. Edit it
      to set the DOCKVISION_HOME variable to the base level directory containing the
      DockVision package.

   Error 4

      The input parameter file doesn't exist or is unreadable.

      A required molecular parameter file doesn't exist or has incorrect format.
      Ensure that the appropriate parameter has been selected or generated in
      the Target form.

   Error 5

      Input (pdb) file doesn't exist or is unreadable.

      An error has occurred reading the PDB coordinate file. Either the
      coordinate file doesn't exist, has an incorrect format or has been
      corrupted (or is empty). Check the PDB file being read (here the target
      coordinate file).

Flgrid

   Error 1

      Improper use of flags/command line.

      The most likely problem is the wrong version of flgrid is being run.

      From a shell, type

         which flgrid

      to determine if the correct version of flgrid is being used. If an earlier
      version of DockVision has been installed on your system, this could be
      the problem. The solution is to change your path so the location of the
      newest version is earlier in the path than the older version, or remove
      the old version from your path completely.

   Error 2

      The data input file doesn't exist or is unreadable.

      This error should not normally occur when running within the GUI interface.

   Error 3

      The output file cannot be opened.

      Most likely cause of this problem is the current directory does not have
      ownership or permission correctly set to open a file for writing (output).
      Check the ownership and permission on the current directory. Another
      possibility is that the disk being written to is full.

   Error 4

      The status file cannot be opened.

      Most likely cause of this problem is the current directory does not have
      ownership or permission correctly set to open a file for writing (output).
      Check the ownership and permission on the current directory. Another
      possibility is that the disk being written to is full.

   Error 5

      Input (pdb) file doesn't exist or is unreadable.

      An error has occurred reading the PDB coordinate file. Either the
      coordinate file doesn't exist, has an incorrect format or has been
      corrupted (or is empty). Check the PDB file being read (here the target
      coordinate file).

   Error 6

      Illegal grid parameters have been assigned.

      Check the grid parameters selected in the Grid form. This error could
      be due to selecting a non-positive number for the X, Y or Z grid steps, or
      a non-positive number for the gridsize.

   Error 7
      An internal error has occurred. Please report this to:
      support@dockvision.com

Research

   Error 1

      The output file cannot be opened.

      Most likely cause of this problem is the current directory does not have
      ownership or permission correctly set to open a file for writing (output).
      Check the ownership and permission on the current directory. Another
      possibility is that the disk being written to is full.

   Error 2

      Failure in building potential function.

      The force field (potential function) requires a set of atom and molecular
      library files to be read. The most likely problem is that one of the
      required files has an improper format or doesn't exist. Go back and
      re-run the simulation making sure the "Generate Automatically" option
      has been selected in both the Ligand and the Target form.
      Another possibility is the DOCKVISION_PARAMETERS environment variable is not
      correctly set.

   Error 3

      Illegal or unimplemented option or value used.

      This error may occur if a job parameter file with an incorrect format
      is read.

   Error 4

      Input (pdb) file doesn't exist or is unreadable.

      An error has occurred reading the PDB coordinate file. Either the
      coordinate file doesn't exist, has an incorrect format or has been
      corrupted (or is empty). Check the PDB file being read (usually the
      ligand coordinate file).

   Error 5

      Failure in molecule parameter assignment.

      The ligand or target molecule failed in assigning of van der Waals
      or charge parameters from the potential function. Go back and
      re-run the simulation making sure the "Generate Automatically" option
      has been selected in both the Ligand and the Target form.

      If the MMFF forcefield has been selected, then automatic parameter
      generation will not work. The MMFF parameters must be generated
      using the forcefield builder written by Richard Gillilan at the
      Cornell Theory Centre.

   Error 6

      The input topology (.top) file doesn't exist or is unreadable.

      Either the appropriate topology file doesn't exist or has an incorrect
      format (or is empty). Make sure in the flexibility form that
      a correct topology file has been generated or selected.

   Error 7

      Failure in building conformers.

      The conformer file either doesn't exist or is in an incorrect format.
      Make sure in the flexibility form that the "Generate Automatically"
      option is selected (this will generate the necessary conformer file).

   Error 8

      Failure in assigning dihedral parameters.

      A rare error you will not likely encounter. Please report this to:
      support@dockvision.com

   Error 9

      Failure to open/read grid file.

      The input grid file either doesn't exist or is in an incorrect format
      (including possibly an empty file). The file may be empty due to
      the premature termination of the grid-generating program (flgrid or
      surfgrid). Re-run the "Auto Generate Grid" options in the Grid form.
      Make sure to unselect "Auto Bounds" if you wish to use the parameters
      entered on the right side of the form.

   Error 10

      Failure to open/read annealing file.

      The file doesn't exist, is unreadable or is in an incorrect format. Either
      select another file in the Options form or generate a new file.

   Error 11

      Failure to open/read constraint file.

      The file doesn't exist, is unreadable or is in an incorrect format. Either
      select another file in the Options form or generate a new file.

   Error 12
   
      Initial ligand position not in constraint set.

      In refine mode, the initial position of the ligand is outside the current
      set of constraints. This is a manifestation of a known bug which will be
      corrected in a future release. The current work-around is to create a larger
      constraint set specifically for doing refine mode calculations.

Gamma

   Error 1

      The output file cannot be opened.

      Most likely cause of this problem is the current directory does not have
      ownership or permission correctly set to open a file for writing (output).
      Check the ownership and permission on the current directory. Another
      possibility is that the disk being written to is full.

   Error 2

      Illegal or unimplemented option or value used.

      This error may occur if a job parameter file with an incorrect format
      is read.

   Error 3

      User interrupt.

      Not technically an error, this error number appears when the user has
      issued a kill signal (control-C) in the DockVision window.

   Error 4

      Input (pdb) file doesn't exist or is unreadable.

      An error has occurred reading the PDB coordinate file. Either the
      coordinate file doesn't exist, has an incorrect format or has been
      corrupted (or is empty). Check the PDB file being read (usually the
      ligand coordinate file).

   Error 5

      The input topology (.top) file doesn't exist or is unreadable.

      Either the appropriate topology file doesn't exist or has an incorrect
      format (or is empty). Make sure in the flexibility form that
      a correct topology file has been generated or selected.

   Error 6

      Failure in building conformers.

      The conformer file either doesn't exist or is in an incorrect format.
      Make sure in the flexibility form that the "Generate Automatically"
      option is selected (this will generate the necessary conformer file).

   Error 7

      Failure to open/read grid file.

      The input grid file either doesn't exist or is in an incorrect format
      (including possibly an empty file). The file may be empty due to
      the premature termination of the grid-generating program (flgrid or
      surfgrid). Re-run the "Auto Generate Grid" options in the Grid form.
      Make sure to unselect "Auto Bounds" if you wish to use the parameters
      entered on the right side of the form.

   Error 8

      Failure in reading system data file.

      Make sure the DOCKVISION_DATA environment variable is correctly set.

   Error 9

      Failure in reading genop file.

      The file doesn't exist, is unreadable or is in an incorrect format. Either
      select another file in the Options form or generate a new file.

   Error 10

      Failure in reading organism file.

      The file doesn't exist, is unreadable or is in an incorrect format. Either
      select another file in the Options form or generate a new file.
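
Several of the errors above come down to an input file that is missing, unreadable or empty. The following Python sketch shows one way such files might be checked before a job is started. It is not part of DockVision. The function name check_input_file and its pdb argument are hypothetical, and the PDB test only looks for standard ATOM or HETATM records; it does not verify that the file is otherwise well formed.

```python
import os


def check_input_file(path, pdb=False):
    """Return a list of problems found with an input file (empty if none).

    Hypothetical helper, not a DockVision utility.
    """
    if not os.path.isfile(path):
        return [f"{path}: file does not exist"]
    if not os.access(path, os.R_OK):
        return [f"{path}: file is not readable"]
    if os.path.getsize(path) == 0:
        return [f"{path}: file is empty"]

    problems = []
    if pdb:
        # Standard PDB coordinate records begin with ATOM or HETATM.
        with open(path, "r", errors="replace") as handle:
            has_atoms = any(
                line.startswith(("ATOM", "HETATM")) for line in handle
            )
        if not has_atoms:
            problems.append(f"{path}: no ATOM or HETATM records found")
    return problems
```

For example, check_input_file("ligand.pdb", pdb=True) returns an empty list when the (hypothetical) file ligand.pdb exists, is readable and contains at least one coordinate record.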

RSDB

   Error 1

      The output file cannot be opened.

      The most likely cause of this problem is that the current directory
      does not have its ownership or permissions set correctly for opening a
      file for writing (output). Check the ownership and permissions on the
      current directory. Another possibility is that the disk being written
      to is full.

   Error 2

      Failure in reading system data file.

      Make sure the DOCKVISION_DATA environment variable is correctly set.

   Error 3

      Failure in building potential function.

      The force field (potential function) requires a set of atom and molecular
      library files to be read. The most likely problem is that one of the
      required files has an improper format or doesn't exist. Go back and
      re-run the simulation making sure the "Generate Automatically" option
      has been selected in the Target form. Another possibility is that the
      DOCKVISION_PARAMETERS environment variable is not correctly set.

   Error 4

      Input (pdb) file doesn't exist or is unreadable.

      An error has occurred reading the PDB coordinate file. Either the
      coordinate file doesn't exist, has an incorrect format or has been
      corrupted (or is empty). Check the PDB file being read (usually the
      ligand coordinate file).

   Error 5

      Failure to open/read grid file.

      The input grid file either doesn't exist or is in an incorrect format
      (including possibly an empty file). The file may be empty due to the
      premature termination of the grid-generating program (flgrid or
      surfgrid). Re-run the "Auto Generate Grid" options in the Grid form.
      Make sure to deselect "Auto Bounds" if you wish to use the parameters
      entered on the right side of the form.

   Error 6

      Failure to open/read annealing file.

      The file doesn't exist, is unreadable or is in an incorrect format. Either
      select another file in the Options form or generate a new file.

   Error 7

      Failure to open/read constraint file.

      The file doesn't exist, is unreadable or is in an incorrect format. Either
      select another file in the Options form or generate a new file.

   Error 8

      Illegal or unimplemented option or value used.

      This error may occur if a job parameter file with an incorrect format
      is read.
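
The environment and output-directory problems described above can be checked before a run. The following Python sketch is one possible pre-flight check and is not part of DockVision. The function names are hypothetical. The environment check only confirms that the variables named in this appendix are set to a non-empty value, since what they should point to is not described here.

```python
import os
import shutil


def check_output_directory(directory=".", min_free_bytes=1):
    """Return problems that would prevent writing output into directory."""
    if not os.path.isdir(directory):
        return [f"{directory}: not a directory"]
    problems = []
    # Creating a file needs both write and search (execute) permission.
    if not os.access(directory, os.W_OK | os.X_OK):
        problems.append(f"{directory}: no write permission")
    if shutil.disk_usage(directory).free < min_free_bytes:
        problems.append(f"{directory}: disk is full")
    return problems


def check_environment(
    names=("DOCKVISION_DATA", "DOCKVISION_PARAMETERS"), environ=None
):
    """Return a problem for each named variable that is unset or empty."""
    environ = os.environ if environ is None else environ
    return [
        f"{name}: environment variable is not set"
        for name in names
        if not environ.get(name)
    ]
```

Both functions return an empty list when nothing is wrong, so their results can simply be concatenated and printed before a job is started.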

Appendix V: Converting Databases

In order to perform database screening with DockVision, the databases must first be converted into PDB format. A utility program is included with DockVision to convert two of the more common formats, SDF and MOL. This program is called dbconvert.
   usage:
      dbconvert -F format [-d -H -i input_file -o output_file]
      dbconvert -h
      Flags: 
         -i                   specify input file (default stdin)
         -o                   specify output file (default stdout)
         -d                   print debugging information
         -H                   add polar hydrogens to output
         -F                   specify input format
                              -F mol
                              -F sdf
         -h                   print help
The -i flag specifies the database file to convert from, and the -o flag specifies the name of the new PDB file to be created. The -F flag chooses the format to convert from, either SDF or MOL. Note that the input database must be a 3D structural database: dbconvert will not assign 3D structure from 2D information. The optional -H flag can be specified if the original database does not include polar hydrogens on the structures and these are desired. When screening databases with DockVision, polar hydrogens should normally be added if they are not present. Full hydrogen representations can also be used.
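
When many databases have to be converted, it can be convenient to assemble the dbconvert command line from a script. The following Python sketch uses only the flags listed above. The function build_dbconvert_command and the file names are hypothetical, and the sketch only builds the argument list; it does not run dbconvert.

```python
import shlex


def build_dbconvert_command(fmt, input_file=None, output_file=None,
                            add_polar_hydrogens=False, debug=False):
    """Build a dbconvert argument list from the documented flags."""
    if fmt not in ("mol", "sdf"):
        raise ValueError("format must be 'mol' or 'sdf'")
    args = ["dbconvert", "-F", fmt]
    if debug:
        args.append("-d")
    if add_polar_hydrogens:
        args.append("-H")
    # Without -i and -o, dbconvert reads stdin and writes stdout.
    if input_file is not None:
        args += ["-i", input_file]
    if output_file is not None:
        args += ["-o", output_file]
    return args


command = build_dbconvert_command(
    "sdf", "database.sdf", "database.pdb", add_polar_hydrogens=True
)
print(shlex.join(command))
# prints: dbconvert -F sdf -H -i database.sdf -o database.pdb
```

The returned list can be passed to subprocess.run on a system where dbconvert is installed and on the search path.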

Appendix VI: Handling Large Output Files

DockVision can generate large output files, particularly in database screening applications, where the output file can be almost as large as the original database. The program outman can be used to select subsets of the output files for further analysis. This program takes as input a multiple PDB file generated by DockVision and generates a new multiple PDB file, ordered by lowest energy first. It can be used to reduce the size of a database screening or other large docking run, generating a new file that is more manageable in size. The -n flag limits the output file to a maximum number of molecules, so that only that many of the lowest-energy dockings are saved to the output file.
   usage: outman [flags] -i input_file -o output_file
          outman -h
   User flags:
     -n         specify a maximum number of molecules for the
                output file
     -i         name of input file
     -o         name of output file
   
   Examples
   
      Take the 100 best dockings from a database screen:
   
         outman -i database.out -n 100 -o best.pdb
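
The selection that outman performs (order by lowest energy first, then keep at most the number given with -n) can be illustrated with a short Python sketch. This is not the outman implementation and it does not read DockVision output files. The function best_dockings is hypothetical, and each docking is assumed to have already been reduced to an (energy, record) pair.

```python
import heapq


def best_dockings(dockings, max_count=None):
    """Return dockings ordered by lowest energy first.

    dockings  -- iterable of (energy, record) pairs (assumed layout)
    max_count -- keep at most this many entries; None keeps all
    """
    def energy(pair):
        return pair[0]

    if max_count is None:
        return sorted(dockings, key=energy)
    # nsmallest returns its result already sorted from lowest energy up.
    return heapq.nsmallest(max_count, dockings, key=energy)
```

Using heapq.nsmallest avoids sorting the whole list when only a small number of dockings is wanted from a large run.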