The following validation sets are available for download:
The Astex Non-native Set consists of the following:
Sixty-five directories, one for each native protein selected from the Astex Diverse Set.
Each "native" directory contains one subdirectory for each non-native structure. For example, Astex Diverse Set
entry 1gm8 has four non-native structures, so the top-level directory 1gm8 has four subdirectories:
1fxh (non-apo), 1fxv (non-apo), 1gkf (apo) and 1gm7 (non-apo).
Each subdirectory will contain some or all of the following:
- protein.mol2 file
- ligand001.pdb file
The co-crystallised active site ligand extracted from the non-native protein; present only for non-apo structures.
Note that the ligand files have not been prepared for docking, i.e. bond types and protonation states are likely to be
incorrect.
- ligand_other.pdb file
Any other ligands from the protein file that are not in the active site.
- other.pdb file
Contains disordered atom coordinates and any co-factors that are not ligands or water.
- water.pdb file
Contains the extracted water atoms.
To download the Astex Non-native Set simply click on the link below:
astex_non_native_set.tar.gz (~185 MB)
Note: to obtain the native protein and ligand files for the Non-native Set you will also need to download the Astex Diverse Set below.
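The directory layout described above can be checked after extraction with a short script. The following is a minimal sketch, not part of the distribution: the function name `summarise_non_native_set` is hypothetical, and it assumes that a subdirectory without a ligand001.pdb file corresponds to an apo structure, which follows from the description above but is not verified against the archive itself.

```python
from pathlib import Path

# File names taken from the description above; any of them may be absent.
EXPECTED_FILES = (
    "protein.mol2",
    "ligand001.pdb",
    "ligand_other.pdb",
    "other.pdb",
    "water.pdb",
)


def summarise_non_native_set(root):
    """Map (native, non_native) directory names to the files found there."""
    summary = {}
    root = Path(root)
    for native_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        for sub in sorted(p for p in native_dir.iterdir() if p.is_dir()):
            present = [name for name in EXPECTED_FILES if (sub / name).is_file()]
            summary[(native_dir.name, sub.name)] = {
                "files": present,
                # Assumption: no active site ligand file means an apo structure.
                "apo": "ligand001.pdb" not in present,
            }
    return summary
```

Calling `summarise_non_native_set("path/to/extracted/set")` returns one entry per non-native structure, for example keyed by `("1gm8", "1gkf")`.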
The Astex Diverse Set consists of the following files:
- protein.mol2 file
- ligand.mol file
This is both the input and the reference file.
- protein_opt_h_gs.mol2 file
A SYBYL MOL2 file for the protein for which the flexible hydrogen atoms
on Ser/Thr/Tyr/Lys residues have been optimised with the GoldScore function.
- protein_opt_h_cs.mol2 file
A SYBYL MOL2 file for the protein for which the flexible hydrogen atoms on
Ser/Thr/Tyr/Lys residues have been optimised with the ChemScore function.
To download the Astex Diverse Set simply click on the link below:
astex_diverse_set.tar.gz (~56 MB)
CCDC/Astex Validation Set
The CCDC/Astex set consists of the following files:
- protein.mol2 file
- ligand_reference.mol2 file
This contains the ligand pose as found in the PDB entry. Entries with
multiple binding modes, such as 1abe, are stored as follows: ligand_reference1.mol2,
ligand_reference2.mol2, with the accompanying protein files protein1.mol2 and
protein2.mol2.
- ligand_reference_min.mol2 file
This file contains a 'normalised' version of the ligand_reference; a
short minimisation run was performed to clean up bond lengths and bond
angles. It is the input file used for the docking experiments.
- gold.conf file
The GOLD configuration file can be used with the GOLD docking program. It also contains the centre and radius of the binding site.
For covalently-bound ligands, a flag is set in this file and the atom numbers of the link are stored.
- water.mol2 file
This file is available for those PDB entries that include a water set;
it is currently only available for entries that were not included in the previous GOLD validation set.
To download the CCDC/Astex validation set simply click on the link below:
ccdc_astex_set.tar.gz (~32 MB)
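For entries with multiple binding modes, the numbered reference ligands need to be matched with their numbered protein files. The sketch below shows one way to do that for a single entry directory. The function name `pair_references` is hypothetical, and the code assumes only the naming pattern described above (ligand_reference.mol2 with protein.mol2, ligand_reference1.mol2 with protein1.mol2, and so on).

```python
import re
from pathlib import Path

# Matches ligand_reference.mol2 and ligand_reference<N>.mol2,
# but not ligand_reference_min.mol2.
_REFERENCE = re.compile(r"^ligand_reference(\d*)\.mol2$")


def pair_references(entry_dir):
    """Return (ligand file name, protein file name or None) pairs for one entry."""
    entry_dir = Path(entry_dir)
    pairs = []
    for path in sorted(entry_dir.iterdir()):
        match = _REFERENCE.match(path.name)
        if match is None:
            continue
        # The numeric suffix is empty for single binding mode entries.
        protein = entry_dir / f"protein{match.group(1)}.mol2"
        pairs.append((path.name, protein.name if protein.is_file() else None))
    return pairs
```

A `None` in the second position indicates that the expected protein file was not found, which is worth checking by hand rather than treating as an error.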
Original GOLD Validation Set
GOLD was originally validated in two phases: first on a set of
100 complexes, and later on an additional 34 complexes
as a check against over-training. A file containing coordinates for all 134
test complexes is also available to download:
original_set.tar.gz (~9 MB)
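Because the internal layout of this archive is not described here, it can be useful to inspect a downloaded file before extracting it. The following is a minimal sketch using Python's standard `tarfile` module; the function name `top_level_entries` is hypothetical, and nothing is assumed about the archive contents beyond it being a gzip-compressed tar file.

```python
import tarfile
from pathlib import PurePosixPath


def top_level_entries(archive_path):
    """Return the sorted first path components found in a .tar.gz archive."""
    names = set()
    with tarfile.open(archive_path, "r:gz") as archive:
        for member in archive.getmembers():
            # PurePosixPath drops a leading "./", so "./1abe/x" yields "1abe".
            parts = PurePosixPath(member.name).parts
            if parts:
                names.add(parts[0])
    return sorted(names)
```

The same function works for any of the .tar.gz downloads listed on this page.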