Glossary of Cambridge Structural Database (CSD) Terms
What is a CSD refcode? What is a CSD deposition number? What’s the difference between a CIF and a GCD file? Here we present a quick glossary of key terms to help your work with the Cambridge Structural Database, or CSD.
This collection of over 1.2 million small-molecule organic and metal-organic crystal structures, curated for use in cheminformatics and computational chemistry work, is the result of the global scientific community’s contributions since 1965. The data are used by scientists around the world, in commercial and academic research.
Specialist terms are used to reference individual structures, talk about data at different stages, or define properties of structures in the CSD. This glossary defines many of the technical terms used when working with the CSD. If you think of a term that we haven’t covered, contact us here and we can update the list!
Top tip: a lot of technical terms and definitions are also included in the ConQuest User guide.
CSD Technical Terms Definitions
Term | Definition |
CCDC |
|
CCDC Number |
|
checkCIF |
|
CIF |
CIF is also sometimes referred to as a Crystallographic Information Framework, which reflects that a CIF has dictionaries and rules that enable many aspects of an experiment to be meaningfully (i.e. semantically) captured to enable reuse by researchers and machines. |
Compound name |
|
CSD |
|
CSD Communication |
|
Density (CCDC) |
Density of the crystal, calculated from the reported chemical formula and unit cell data, using the relationship: Density = (1.66 × formula weight × Z) / unit cell volume, where Z is the number of molecules in the unit cell. With the formula weight in g/mol and the unit cell volume in Å³, the result is in g/cm³. |
Deposition number |
Used to connect datasets with articles. Remains the same if the dataset is updated up to the point of publication. Is persistent, unique and can be resolved via identifiers.org - e.g. https://identifiers.org/ccdc:631927 |
DOI |
DOIs can be resolved through the Digital Object Identifier System and when a DOI is minted, relevant metadata is shared through the DOI provider. |
Entry |
|
ORCID iD |
|
Refcode |
Each CSD entry is assigned a unique identifier comprising 6 letters, sometimes followed by an additional 2 digits (see refcode family). Provides a way to quickly find an entry within the CSD. For example, the structure of acetaminophen or paracetamol has the refcode COTZAN. Early refcodes aimed to reflect the compound name associated with the structure, but new substances are now assigned a randomly generated refcode. See our blog, A potted history of the CSD refcode, for how refcodes have evolved. Researchers are encouraged to quote CSD refcodes when referencing entries in the CSD. Refcodes can be resolved via identifiers.org - e.g. https://identifiers.org/csd:COTZAN |
Refcode family |
Families do not group together:
Families can provide a convenient way to identify different polymorphs of the same structure. |
Remarks/Chemical Notes |
|
Subset |
CSD subsets are targeted collections of structures that are a convenient starting point for research into a particular field. See: |
Structure |
|
Teaching Subset |
|
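Three of the definitions above are concrete enough to sketch in code: the refcode format (6 letters, optionally followed by 2 digits), the identifiers.org links for refcodes and deposition numbers, and the Density (CCDC) relationship. The Python below is a minimal illustration only. The function names are hypothetical and are not part of any CCDC software, the refcode pattern is taken solely from the wording of the Refcode definition, and the numbers passed to the density function are made-up values chosen to show the arithmetic.

```python
import re

# Pattern based only on the glossary wording: six letters, optionally
# followed by two digits. This is an assumption, not an official rule.
REFCODE_PATTERN = re.compile(r"([A-Z]{6})(\d{2})?")


def refcode_family(refcode: str) -> str:
    """Return the six-letter family part of a refcode (hypothetical helper)."""
    match = REFCODE_PATTERN.fullmatch(refcode.strip().upper())
    if match is None:
        raise ValueError(f"not in the expected refcode format: {refcode!r}")
    return match.group(1)


def refcode_url(refcode: str) -> str:
    """Build the identifiers.org link for a refcode, as in the glossary example."""
    refcode_family(refcode)  # raises ValueError if the format is unexpected
    return f"https://identifiers.org/csd:{refcode.strip().upper()}"


def deposition_url(deposition_number: int) -> str:
    """Build the identifiers.org link for a CCDC deposition number."""
    return f"https://identifiers.org/ccdc:{deposition_number}"


def ccdc_density(formula_weight: float, z: int, cell_volume: float) -> float:
    """Density = (1.66 x formula weight x Z) / unit cell volume.

    formula_weight in g/mol and cell_volume in cubic angstroms give g/cm^3.
    """
    if cell_volume <= 0:
        raise ValueError("unit cell volume must be positive")
    return 1.66 * formula_weight * z / cell_volume


if __name__ == "__main__":
    # "COTZAN01" is used here only to illustrate the letters-plus-digits format.
    print(refcode_family("COTZAN01"))
    print(refcode_url("COTZAN"))
    print(deposition_url(631927))
    # Illustrative inputs only, not data for any real structure.
    print(round(ccdc_density(formula_weight=100.0, z=4, cell_volume=500.0), 3))
```

Grouping refcodes by the value returned from `refcode_family` gives one bucket per family, which is a simple way to collect the entries that share the same six letters.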