The CCDC maintains internal research and collaborations in the field of protein structure and modelling, and has co-developed CSD-CrossMiner, a sophisticated and highly versatile pharmacophore query tool. It delivers an interactive search experience across databases of protein-ligand binding sites from the PDB as well as small-molecule crystal structures from the CSD. Application areas include intermolecular interaction searching, scaffold hopping and the identification of novel fragments for specific protein environments.
The high-quality structural information available in the Cambridge Structural Database (CSD) can also be used to good effect when studying protein geometry and non-bonded interactions in proteins. Finding ways to use this information effectively is a primary research area for the CCDC.
Other recent research includes the development and delivery of methodologies for non-sequence-based cavity searching. The CSD Python API now contains three different approaches for cavity and pocket searching in proteins which allow a trade-off between very high speed and very high accuracy.

Research Plans

One research area is the development of methods to validate ligand models refined against X-ray data from protein-ligand complexes. Such models often appear highly strained, and it is an important open question how much of this apparent strain is due to errors in the refinement process and how much is genuine. Similarly, intermolecular interactions in proteins can be validated against CSD data. Another research interest is to develop knowledge-based assessments of the quality of a protein-ligand interaction, which may prove more reliable than other computational techniques. Such assessments could be used in approaches to predict the binding affinity of a given ligand.