CSD Molecular Complementarity Tool Domain of Applicability
November 29, 2022
The Molecular Complementarity component is used to assess the likelihood of two molecules forming a co-crystal. We have recently validated this approach with a new dataset, and have found limitations to the model’s applicability. Here we explain the validation process, and advise when this model should be used.
What Does the Molecular Complementarity Analyser Do?
The Molecular Complementarity component in Mercury, as part of CSD-Materials, is used to help identify molecules most likely to form co-crystals with one or more candidate active molecules. This means that the list of possible co-formers for a given target can be reduced before running more effective but more computationally-demanding approaches like motif search or multi-component hydrogen bond propensity calculations.
How Was the Molecular Complementarity Component Developed?
The original research and validation of the Molecular Complementarity component in CSD-Materials were performed on small, neutral molecules, using a dataset containing only positive observations (i.e. experimentally observed co-crystals) and no negative observations (i.e. experimentally observed failures to co-crystallize). Five key molecular descriptors were identified; for co-crystallization to be likely, the difference between the two co-crystal components’ values for each descriptor should fall below a determined threshold. These descriptors are the fraction of nitrogen and oxygen atoms, the dipole moment, and three simple shape descriptors based on a molecular bounding box: the length of the short axis, the short/long axis ratio, and the medium/long axis ratio. Full details are available in Laszlo Fabian’s original (2009) co-crystal research into molecular complementarity: https://doi.org/10.1021/cg800861m
The CCDC later incorporated this Molecular Complementarity approach into a workflow of knowledge-based approaches to co-crystal design. The recommended workflow uses the Molecular Complementarity component within CSD-Materials as an early stage of co-former screening, to remove co-formers that are highly unlikely to form co-crystals. Learn more in this 2014 publication on a workflow of knowledge-based approaches to co-crystal design using CSD software and data: https://doi.org/10.1039/C4CE00316K
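The descriptor test described in this section can be illustrated with a minimal Python sketch. It is not the CSD-Materials implementation: the class and function names are hypothetical, the descriptor values are made up, and the thresholds are placeholders rather than the published values, which should be taken from the 2009 paper linked above.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Descriptors:
    """The five descriptors named in the text, for one molecule (hypothetical container)."""
    fraction_n_o: float       # fraction of nitrogen and oxygen atoms
    dipole_moment: float      # dipole moment
    short_axis: float         # length of the bounding-box short axis
    short_long_ratio: float   # short/long axis ratio
    medium_long_ratio: float  # medium/long axis ratio


def failed_descriptors(a, b, thresholds):
    """Names of descriptors whose absolute difference is NOT below its threshold."""
    return [
        f.name
        for f in fields(Descriptors)
        if abs(getattr(a, f.name) - getattr(b, f.name)) >= getattr(thresholds, f.name)
    ]


def is_complementary(a, b, thresholds):
    """A pair passes only if every descriptor difference is below its threshold."""
    return not failed_descriptors(a, b, thresholds)


# PLACEHOLDER thresholds for illustration only; these are not the published values.
EXAMPLE_THRESHOLDS = Descriptors(0.3, 6.0, 3.0, 0.3, 0.3)

# Made-up descriptor values for one API and two candidate co-formers.
api = Descriptors(0.25, 3.1, 4.2, 0.45, 0.70)
coformer_ok = Descriptors(0.30, 2.0, 3.9, 0.50, 0.65)
coformer_bad = Descriptors(0.80, 2.0, 3.9, 0.50, 0.65)

print(is_complementary(api, coformer_ok, EXAMPLE_THRESHOLDS))
print(failed_descriptors(api, coformer_bad, EXAMPLE_THRESHOLDS))
```

Returning the list of failed descriptors, rather than only a pass/fail flag, makes it easier to see why a co-former was removed at this early screening stage.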
How Was the Latest Validation Performed?
For the latest validation exercise, CCDC researchers identified a dataset of approximately 2,500 co-crystal experiment observations from the literature. From this full list, the team created a dataset of 45 APIs (Active Pharmaceutical Ingredients), each of which had more than 15 co-formers in its screen, totalling about 1,500 observations. This list included both positive and negative observations. The 45 APIs were then used to validate several different co-crystal screening methods, including molecular complementarity. As seen in the table below, for 39 of the 45 APIs, experimentally observed forms were predicted by at least one of the co-crystal screening methods. Precision was below 0.5 for most screens. Accuracy varied widely: some screens were highly accurate and others were not, with low accuracy particularly common for APIs with a molecular weight greater than 300 Da.
|Dataset||% of screens with Accuracy > 0.5 (at least half of the observed co-crystals are predicted)||% of screens with Precision > 0.5 (at least half of the predicted co-crystals are correct)||% of screens with F1 > 0.5|
|39 screens that predicted experimental forms||64%||38%||38%|
|8 screens for APIs with MW > 300 Da||0%||50%||12.5%|
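To make the table's columns concrete, the following Python sketch computes per-screen metrics from a set of predicted co-formers and a set of experimentally observed co-formers. It assumes, based on the parenthetical descriptions in the column headers, that the "Accuracy" column is the fraction of observed co-crystals that were predicted (often called recall), that Precision is the fraction of predicted co-crystals that were observed, and that F1 is the usual harmonic mean of the two. The function name and the example co-former labels are hypothetical and are not taken from the validation dataset.

```python
def screen_metrics(predicted, observed):
    """Metrics for a single API screen.

    predicted: co-formers the method predicts will form a co-crystal
    observed:  co-formers experimentally observed to form a co-crystal
    """
    predicted, observed = set(predicted), set(observed)
    true_positives = len(predicted & observed)

    # Fraction of predicted co-crystals that were actually observed.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Fraction of observed co-crystals that were predicted
    # (assumed here to be what the table labels "Accuracy").
    observed_found = true_positives / len(observed) if observed else 0.0
    # Harmonic mean of the two quantities; defined as 0 when nothing is correct.
    if true_positives:
        f1 = 2 * precision * observed_found / (precision + observed_found)
    else:
        f1 = 0.0

    return {"precision": precision, "observed_found": observed_found, "f1": f1}


# Hypothetical screen: four co-formers predicted, three observed, two in common.
example = screen_metrics(predicted={"A", "B", "C", "D"}, observed={"A", "B", "E"})
print(example)
```

In this made-up screen, half of the predictions are correct and two of the three observed co-crystals are found, so the screen would count towards the first column but sit exactly on the 0.5 boundary for precision.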