CSD data update 2021.3 - December 2021

We are pleased to announce the launch of the 2021.3 CSD Data Release containing 1,161,919 entries and 1,138,368 unique structures. Here we explore the data changes over this year.

This is an increase of over 63,000 entries this year, including over 400 historical entries converted from hardcopy data. Alongside all this new data, over 25,000 existing CSD entries have been improved and enhanced through our annual CSD Improvements programme, and a series of new CSD subsets is included with your CSD Software Portfolio.

 

What can you expect to find in our new data release?

Your latest release contains four new CSD subsets, as described in our recent blog: How to use the new subsets in the 2021.3 CSD release. We hope the latest additions to the subsets available within the CSD Software Portfolio will help more researchers find structures of interest quickly and easily by providing a convenient starting point for their queries. An example from one of the new subsets is given below: the CSD entry OZEMIS, published in 2021 by Omar Yaghi and co-workers, is an electron diffraction refinement of the metal-organic framework MOF-303. The structure is of interest for the separation of the noble gases xenon and krypton.

 

CSD Entry OZEMIS, the structure of MOF-303 determined by electron diffraction

 

As always, the new structures added to the CSD this year are diverse in chemistry and of interest to a number of different research areas. CSD Entry USAZOG has a highly unusual linear tri-magnesium unit. The structure is described in an article published in Nature, and the authors suggest the complex may help to explain the formation of Grignard reagents.

 

CSD Entry USAZOG, a magnesium complex with an unusual Mg(I)-Mg(0)-Mg(I) coordination

 

Alongside structures associated with scientific articles, this release increases the number of CSD Communications shared through the database to over 43,000 entries in total, with over 5,000 added in the 2021 releases. Together with our efforts to convert data not previously shared electronically, these structures increase the amount of data available to scientists worldwide and the insights that can be derived from the CSD. As described in the blog for our September 2021 data release, this year we have manually converted several hundred hardcopy datasets, including the CSD entry EXOKIO, a structure of interest as a laser dye. This structure was published in the journal Russian Chemical Bulletin over 30 years ago, and the data is now available in an electronic format for the first time.

 

CSD Entry EXOKIO, from literature data manually added to the CSD

 

What data has been improved in this release?

As well as releasing new entries, the CCDC has undertaken extensive work to enhance existing datasets. This year over 25,000 existing entries have been enhanced, and improvements include:

  • Standardisation and improvements to the pressure field
  • Improved labelling of non-standard structures, including:
    • Synchrotron structures
    • Electron diffraction structures
  • Continued improvements to melting point data
  • Increasing the amount of bioactivity and natural source information available
  • Ensuring polymorphs in the CSD Drug subset have appropriate labels
  • Providing more information in entries with large void spaces

 

See what else is new in this update, and learn how to update, here.

Stay up to date on every new release - register for our monthly newsletter here.