How to use the new Subsets in the 2021.3 CSD Release to improve your structural science research

In the 2021.3 release, we launched four new subsets: Electron Diffraction, Polymorphs, Hydrates, and High Pressure. We also updated the navigation and API access for all CSD Subsets. In this blog post, we look at real-world use cases for the new CSD Subsets and share tips for making the most of the new functionality.

 

We’re excited to add the following new subsets to our CSD Subset collection:

  • Electron Diffraction
  • Polymorphs
  • Hydrates
  • High Pressure

Uses for the Electron Diffraction Subset

Routine structure solution from X-ray diffraction requires well-ordered single crystals, ideally 100 µm or larger. However, many known active pharmaceutical ingredients are available only as crystalline powders that do not easily form measurable single crystals. Electron diffraction is an emerging field that provides a way to solve such structures. As of the 2021.3 CSD Release (December 2021), the CSD contains about 150 structures solved via electron diffraction. Typically, though, the data quality isn’t as high as in single-crystal X-ray diffraction, so researchers who filter the CSD for “typical” structures may miss them. This subset keeps these valuable structures easily accessible.

Uses for the Hydrates and Polymorphs Subsets

Hydrates and polymorphs are key areas of concern for drug design. These subsets can help pharmaceutical researchers identify polymorphic families and the kinds of molecules that form hydrates. This is an important step in de-risking a new drug candidate, and focused subsets for such research have real-world applications. In fact, the Hydrates and Polymorphs Subsets originated from work with our Crystal Form Consortium (CFC). The CFC consists of chemists from industry-leading pharmaceutical companies around the world who collaborate with CCDC software, database, and research experts on potential product improvements.

Uses for the High Pressure Subset

We define a high-pressure structure as one measured at 0.1 gigapascals (GPa) or higher. As with electron diffraction, high-pressure structures sometimes appear lower quality than typical structures in the CSD, since the apparatus required can complicate data collection and the pressure distorts the structure. This can make them difficult to find, and historically finding them has required advanced sorting. Drawing on customer feedback, we created the High Pressure Subset to help researchers focus their analyses on the most relevant structures. Identifying polymorphs that form under high-pressure conditions is a key use for such structures. The structures are also highly relevant to research on metal-organic frameworks (MOFs) for gas storage.
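To put the 0.1 GPa threshold in more familiar units, here is a minimal Python sketch that converts a pressure to GPa and checks it against that cutoff. The function names and the set of supported units are our own illustrative choices, not part of the CSD or its API, and the sketch makes no assumption about how pressures are recorded in CSD entries.

```python
import math

# Conversion factors to gigapascals, from the standard definitions of each unit.
TO_GPA = {
    "GPa": 1.0,
    "MPa": 1e-3,
    "kbar": 0.1,
    "bar": 1e-4,
    "atm": 101325e-9,  # 1 atm = 101325 Pa
}

# Threshold quoted in the text above.
HIGH_PRESSURE_THRESHOLD_GPA = 0.1


def to_gpa(value, unit):
    """Convert a pressure in the given unit to gigapascals."""
    return value * TO_GPA[unit]


def is_high_pressure(value, unit):
    """Return True if the pressure is at or above the 0.1 GPa threshold."""
    gpa = to_gpa(value, unit)
    # isclose guards against floating-point rounding right at the threshold.
    return gpa > HIGH_PRESSURE_THRESHOLD_GPA or math.isclose(
        gpa, HIGH_PRESSURE_THRESHOLD_GPA
    )


print(is_high_pressure(1, "kbar"))  # 1 kbar is exactly the threshold
print(is_high_pressure(1, "atm"))   # ambient pressure is far below it
```

In other words, the cutoff corresponds to 1 kbar, or roughly a thousand times atmospheric pressure.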

Key new functionality for the CSD Subsets

Accessing the new and existing subsets is now easier than ever. We updated the Mercury interface to improve Subset navigation. As shown in the image below, the Subsets are now organized in dropdowns under the CSD-Core tab.

 

Updates to the CSD Python API

We also updated the CSD Python API to make writing queries easier. If you’re using our Python API, you’ll now see prompts to autofill your code with the correct Subset name.
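As a rough illustration of what working with a subset looks like in a script, the sketch below restricts a list of search hits to the members of a subset. It deliberately does not call the CSD Python API. It assumes you have the subset available as a plain-text list of refcodes, one per line, and the identifiers shown are made-up placeholders rather than real subset contents. For the actual subset names and calls, check the CSD Python API documentation for your release.

```python
from typing import Iterable, List, Set


def parse_refcode_list(lines: Iterable[str]) -> Set[str]:
    """Collect refcodes from lines of text, one identifier per line."""
    refcodes = set()
    for line in lines:
        code = line.strip().upper()
        if code:  # skip blank lines
            refcodes.add(code)
    return refcodes


def restrict_to_subset(hits: Iterable[str], subset: Set[str]) -> List[str]:
    """Keep only the hits that belong to the subset, preserving hit order."""
    return [code for code in hits if code.upper() in subset]


# Hypothetical identifiers, for illustration only.
subset_lines = ["ABCDEF", "ABCDEF01", "", "GHIJKL"]
search_hits = ["GHIJKL", "MNOPQR", "ABCDEF01"]

high_pressure = parse_refcode_list(subset_lines)
filtered = restrict_to_subset(search_hits, high_pressure)
print(filtered)
```

In practice, the lines would come from a file object (for example, `open(path)`), and the hits from whatever search you have already run.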

Read more

Learn about additional updates and new features in the 2021.3 CSD Release.

Read more about our CSD MOF Collection, which is freely accessible for academic research.

Learn more about CSD-Core, an essential search, visualization, and analysis suite that delivers knowledge from the Cambridge Structural Database (CSD).