In the life sciences, the lack of data often makes testing new
algorithms difficult; the only way to ensure that a method works is
by extensive application to test cases. Validation is invaluable for
gaining insight into the strengths and deficiencies of programs,
both during their development and on an ongoing basis.
A simple test of the effectiveness of a docking program is to take
a protein-ligand complex from the Protein Data Bank and extract the
ligand. The docking program can then be used to predict the binding
mode of the ligand, and the prediction compared with the crystallographically
observed position. This methodology has been used to validate GOLD on three occasions.
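
As an illustration of the comparison step, the sketch below computes the root-mean-square deviation (RMSD) between a predicted ligand pose and the crystallographic pose. RMSD is a commonly used measure for this kind of comparison, but the text above does not state which measure was used for GOLD, so treat it as an assumption. The function names and the 2.0 Angstrom cutoff are likewise hypothetical choices made for this example. The sketch assumes both poses list the same atoms in the same order and ignores ligand symmetry.

```python
import math


def pose_rmsd(predicted, observed):
    """RMSD between two poses given as lists of (x, y, z) coordinates.

    Assumes identical atom ordering in both poses. No superposition is
    applied: both poses are taken to be in the protein's coordinate frame.
    """
    if not predicted or len(predicted) != len(observed):
        raise ValueError("poses must be non-empty and contain the same atoms")
    squared = sum(
        (px - ox) ** 2 + (py - oy) ** 2 + (pz - oz) ** 2
        for (px, py, pz), (ox, oy, oz) in zip(predicted, observed)
    )
    return math.sqrt(squared / len(predicted))


def success_rate(rmsds, cutoff=2.0):
    """Fraction of complexes whose RMSD is at or below the cutoff.

    The default cutoff is an assumed, illustrative value.
    """
    if not rmsds:
        raise ValueError("at least one RMSD value is required")
    return sum(1 for r in rmsds if r <= cutoff) / len(rmsds)


# Toy coordinates, not taken from any real complex.
crystal_pose = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
docked_pose = [(1.0, 0.0, 0.0), (2.5, 0.0, 0.0), (2.5, 1.5, 0.0)]
example_rmsd = pose_rmsd(docked_pose, crystal_pose)
```

Applied across a whole test set, a per-complex value like this can be summarised with `success_rate` to give a single figure for how often the docking program reproduces the observed binding mode.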
The Original Test Set
GOLD was originally validated in a two-phase test, initially on a set
of 100 complexes and later on an additional 34 complexes as a check
against over-training.
The CCDC/Astex Test Set
This set is the result of a CCDC collaboration with Astex and comprises 305 protein-ligand
complexes that were used to validate both GOLD and SuperStar.
The Astex Diverse Set
A set of 85 diverse, high-quality protein-ligand complexes selected from the PDB
using newly developed analysis and classification techniques.
The Astex Non-native Set
A set of 1112 non-native structures for 65 targets, built on the Astex Diverse Set.
It is an extensive set for assessing docking performance against non-native protein
conformations.
All test sets are freely available from the CCDC website.