You may be wondering why we’ve added the structures in two separate databases. Here at the CCDC we believe CSD users want two things: crystal data available as soon as possible, and a database that contains consistent, high quality structural data. To achieve these two things we provide two separate solutions, which can be used together or independently.
CSD X-Press, as the name suggests, contains the most recent data available. This includes structures that are available online but not yet formally published in a journal (sometimes referred to as ‘Advance Articles’, ‘Early View’ or ‘Articles ASAP’), and the most recently published structures. To make these data available so rapidly, structures are added to CSD X-Press automatically. By working closely with major journal publishers, such as the ACS, RSC and IUCr, we can ensure that data are made freely available to CSD users as quickly as possible via CSD X-Press. However, this approach comes with certain caveats, which is why crystal data treated this way, as shown in the screenshot below, are held in a separate CSD X-Press database.
An example of a CSD X-Press database entry, with provisional CSD refcode LODVEH00, first published as an Early View article on the 3rd March and currently without a full publication reference.
Automatic validation of crystallographic data is a challenging process. Structures may exhibit complex disorder, including cases where some atoms (especially hydrogen) may not be modelled at all. Situations like this can result in an automatic determination of the structure’s chemistry that may be uncertain. Whilst the systems developed here at the CCDC can successfully handle the majority of structures, some entries in CSD X-Press will not contain data found in a normal CSD entry, such as a compound name or 2D diagram.
Structures added to the CSD have been assessed by an expert Scientific Editor here at the CCDC. This important quality control stage allows us to compare newly added data with structures already in the CSD; by doing this we can add related structures (e.g. polymorphs) to refcode families, and include additional information, such as cross-references between stereoisomers. Using the expertise of our Scientific Editors we can also, in consultation with the published manuscript, accurately assign the chemistry of structures in complex cases. In practice this means that after publication crystallographic data flow first to CSD X-Press and then on to the CSD, as we undertake processes to assess and validate the chemistry and crystallography of each crystal structure.
CSD entry LOBSEC, one of the more than 700,000 structures in the CSD, is a compound extracted from the roots of Stemona tuberosa, a herb that is commonly used in traditional Chinese medicine.
By using a combination of automatic and manual validation procedures and regular CSD data updates, the CCDC delivers the highest quality crystal data to users as soon as possible. In the latest WebCSD update you’ll find our 700,000th example of this!