Solution

The similarity calculation in WebCSD is based on molecular fingerprints, which are calculated from chemical features of the molecule such as atom types, bond types and bonded paths through the molecule. When a molecule is drawn in the similarity sketcher, its fingerprint is calculated and compared with the pre-calculated fingerprints of all the structures in the CSD. The comparison uses either the Tanimoto or the Dice coefficient, which gives a measure of the similarity between the molecules based on their fingerprints. Each coefficient produces a similarity value in the range 0 to 1, where 0 means completely dissimilar and 1 means identical, always in terms of fingerprints (i.e. a value of 1 does not always mean that the structures are identical). To produce a manageable set of similar structures, a cut-off value for the similarity coefficient is applied, and matches that score below this value are discarded (the default is 0.7 for Tanimoto and 0.975 for Dice).

N.B. The two types of similarity coefficient are not directly comparable, so a similarity value calculated with one coefficient cannot be compared quantitatively with a value calculated with the other.
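
As an illustration of the comparison described above, here is a minimal Python sketch of the two coefficients in their standard form for binary fingerprints. It is not the WebCSD implementation: each fingerprint is assumed to be a set of "on" bit positions, the function names (tanimoto, dice, similar_structures) and the example fingerprints are hypothetical, and scoring two empty fingerprints as 1.0 is a convention chosen only for this sketch.

```python
from typing import Callable, Dict, List, Set, Tuple

Fingerprint = Set[int]  # assumption: a fingerprint is the set of its "on" bit positions


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Shared bits divided by bits set in either fingerprint."""
    union = len(a | b)
    # Assumption for this sketch: two empty fingerprints count as identical.
    return len(a & b) / union if union else 1.0


def dice(a: Fingerprint, b: Fingerprint) -> float:
    """Twice the shared bits divided by the total number of bits set in both."""
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 1.0


def similar_structures(
    query: Fingerprint,
    library: Dict[str, Fingerprint],
    coefficient: Callable[[Fingerprint, Fingerprint], float],
    cutoff: float,
) -> List[Tuple[str, float]]:
    """Score every library entry and discard matches below the cut-off."""
    scored = [(name, coefficient(query, fp)) for name, fp in library.items()]
    kept = [(name, score) for name, score in scored if score >= cutoff]
    return sorted(kept, key=lambda hit: hit[1], reverse=True)


# Hypothetical fingerprints, invented for the example.
query = {1, 2, 3, 4, 5, 6, 7}
library = {
    "A": {1, 2, 3, 4, 5, 6, 7},     # same bits as the query
    "B": {1, 2, 3, 4, 5, 6, 7, 8},  # one extra bit
    "C": {1, 2, 3, 20, 21},         # only three bits shared
}

tanimoto_hits = similar_structures(query, library, tanimoto, cutoff=0.7)
dice_hits = similar_structures(query, library, dice, cutoff=0.975)
```

For the pair query/B, the Tanimoto value is 7/8 = 0.875 and the Dice value is 14/15 (about 0.933), so the same two fingerprints receive different scores under the two coefficients. With the default cut-offs quoted above, the Tanimoto search keeps A and B, while the Dice search keeps only A.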

