A snapshot of the CSD at 50 shows roughly exponential growth, with 783,501 entries at the time I started writing this blog post (and more by the time I finish). It isn’t just the number of entries that is on the rise; we are also seeing increasing complexity in the structures being submitted. A simple way to see this is to look at how the molecular weight of structures has changed over the years.
When the CSD first started, our editors had to work hard to manually abstract data from publications, and depositors would sometimes send us hardcopy data to be typed up. Since then things have changed considerably and, as you can see from our depositor map, we are now receiving electronic files from all over the world.
Depositors now have the option of sharing their otherwise unpublished structures through the CSD at the point of deposition. Last year saw a record number of depositors choosing to do this and 2015 looks set to break the records again. For some of our depositors, this can result in huge contributions to the CSD that may otherwise be lost to the world. Indeed, the race is on to be the first to share over 1,000 otherwise unpublished structures through the CSD. You can see from our wordle that Frank Fronczek is currently leading with over 900.
We are predicting that around 70,000 structures will have been added to the CSD this year, but that really does depend on how many you publish or share with us - we are ready and waiting! You could also help us share one million structures with the world earlier than we are currently predicting by following Frank’s lead and digging out your unpublished structures, too.
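For a rough feel for the numbers, here is a back-of-the-envelope sketch in Python. It assumes the 783,501 entries quoted at the start of this post and a constant 70,000 additions per year. The constant rate is purely an illustrative assumption (growth has been roughly exponential, so the real milestone should arrive sooner), and the function name is made up for this example; it is not our actual forecasting method.

```python
# Back-of-the-envelope estimate only; the constant rate is an assumption.
CURRENT_ENTRIES = 783_501   # entry count quoted at the start of this post
ANNUAL_ADDITIONS = 70_000   # this year's predicted additions, assumed constant
TARGET = 1_000_000          # the one-million-structure milestone


def years_to_target(current, per_year, target):
    """Years needed to reach `target` at a fixed number of additions per year."""
    if per_year <= 0:
        raise ValueError("per_year must be positive")
    # Clamp at zero in case the target has already been reached.
    return max(0.0, (target - current) / per_year)


years = years_to_target(CURRENT_ENTRIES, ANNUAL_ADDITIONS, TARGET)
print(f"About {years:.1f} years at a constant {ANNUAL_ADDITIONS:,} structures per year")
```

Every extra structure shared raises the yearly rate and shortens that estimate, which is where your unpublished structures come in.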
One of the latest structures added to the CSD this week is OJEGIV, a bicyclo[3.2.1]octadienone scaffold in naturally occurring octaketide dimers (Yoshio Ando, Subaru Hori, Takumi Fukazawa, Ken Ohmori, Keisuke Suzuki, Angewandte Chemie, International Edition, DOI: 10.1002/anie.201503442). The corresponding publication highlights a new method for constructing the bicyclo[3.2.1]octadienone structure which serves as the core unit in the naphthocyclinone class of dimeric natural products.
As the diversity of the structures we receive increases, we also see other trends as we process structures into the CSD. For example, did you know that the number of structures containing atoms modelled over disordered sites is still increasing steadily? If you carry on at present rates, then 50% of the structures deposited by 2055 will be disordered. One thing that does seem to be constant now is the average R-factor for new structures. Since the 1990s, this has been roughly constant at 5%, so it appears you have collectively agreed on what an acceptable R-factor is!
Thank you to everyone who has contributed their data over the past half century, helping to create a resource of enormous value. Let’s raise a glass to the next 50!
You can download our CSD50 commemorative Newsletter and check out the recorded presentations from the CSD50 symposium at www.ccdc.cam.ac.uk/CSD50
Cambridge Structural Database Manager