CSD in Action: Advancing Metal Organic Frameworks (MOFs) using Big Data and High Throughput Screening

Written by

Michael Francis

Posted on

August 15, 2022

Here we highlight MOF research from the Sarkisov group at the University of Manchester on calculating pore volume and accessible surface area from crystal structures, and on applying and analysing those calculations across the 12K MOF structures in the Cambridge Structural Database (CSD). [1]

This is part of our series highlighting examples of the Cambridge Crystallographic Data Centre (CCDC) tools in action by scientists around the world.


Metal-organic frameworks (MOFs) have seemingly endless practical applications, including gas purification, storage and adsorption, sensors, semiconductors, carbon capture, and desalination. The drawback of this variety is an equally vast landscape of potential structural candidates: over 90K MOFs have been synthesised, with over 500K potential structures predicted. Many potential candidates combined with many potential applications call for smart ways to focus research efforts.

Focusing Research Efforts

High throughput screening using experimental data from existing structures is being used to overcome this ‘nice problem to have’ and scientists are now beginning to unlock the full potential of MOFs. Materials informatics tools that mine and analyse existing MOF structural data are being used to focus MOF experiments by predicting future successes and risks and identifying patterns. However, MOF structural data presents unique challenges including:
  • No clear consensus on inclusion of 1D, 2D and 3D networks and no IUPAC definition of organic[2]
  • Unbound solvents and labile solvent ligands must be properly handled to enable analysis
  • Data avalanche: new MOFs are characterised constantly, with an almost exponential rise in the last 40 years. [3]
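To make the solvent-handling challenge concrete, here is a minimal sketch of one common idea: treat the structure as a graph of bonded atoms and discard any bonded fragment that contains no metal. This is an illustration only, not the CSD Python API or the method used in the study. The atom labels, the `METALS` set, and both function names are hypothetical, and a real crystal would also need periodic bonds to be taken into account.

```python
from collections import defaultdict

# Hypothetical, deliberately short list of metals for this toy example.
METALS = {"Zn", "Cu", "Zr", "Fe", "Co", "Ni", "Mg", "Al", "Cr"}


def connected_components(atoms, bonds):
    """Group atom labels into bonded fragments with a depth-first search.

    atoms: dict mapping atom label -> element symbol
    bonds: iterable of (label, label) pairs
    """
    neighbours = defaultdict(set)
    for a, b in bonds:
        neighbours[a].add(b)
        neighbours[b].add(a)

    seen, components = set(), []
    for label in atoms:
        if label in seen:
            continue
        seen.add(label)
        stack, component = [label], set()
        while stack:
            current = stack.pop()
            component.add(current)
            for nxt in neighbours[current]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        components.append(component)
    return components


def strip_unbound_solvent(atoms, bonds):
    """Keep only the fragments that contain at least one metal atom."""
    kept = set()
    for component in connected_components(atoms, bonds):
        if any(atoms[label] in METALS for label in component):
            kept |= component
    return {label: element for label, element in atoms.items() if label in kept}


# Toy input: a Zn-O-C framework fragment plus one free water molecule.
atoms = {"Zn1": "Zn", "O1": "O", "C1": "C", "O2": "O", "H1": "H", "H2": "H"}
bonds = [("Zn1", "O1"), ("O1", "C1"), ("O2", "H1"), ("O2", "H2")]

framework = strip_unbound_solvent(atoms, bonds)
print(sorted(framework))  # ['C1', 'O1', 'Zn1']
```

Note that this sketch only removes unbound solvent. A labile solvent ligand coordinated to a metal stays in the metal-containing fragment, which is why the two cases are listed as separate handling problems.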

Turning the MOF Structural Data Challenges into Opportunities

We present a study where materials informatics, including use of the Cambridge Structural Database (CSD), has been used to turn the structural data challenges into opportunities. Researchers can systematically search and analyse the existing MOF landscape, including over 100K validated and curated MOF crystal structures held in the CSD. Established in 1965 with historical structures dating back to the 1920s, the CSD now contains over 1.1M accurate 3D structures with data from X-ray and neutron diffraction analyses and additional curation from the CCDC. The database is used by researchers across the pharmaceutical, agrochemical and fine chemicals industries to predict and guide future discoveries. A Python API adds flexibility by supporting custom programmatic search and analysis.

Calculating Pore Volume and Accessible Surface Area

Reporting in Chemistry of Materials, [1] Lev Sarkisov and team at the University of Manchester used the CSD and the CSD Python API to calculate the structural properties of porous materials. The researchers updated PoreBlazer, an open-access code developed to calculate these properties from structures, and then applied it to 12K MOF structures from the CSD. Non-bonded solvents were then filtered out using the CSD Python API. The researchers compared the calculated properties with those that can be measured experimentally, and the results showed unexpected and interesting correlations between MOF geometric properties.
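As a rough illustration of what a geometric pore calculation involves, the sketch below estimates the fraction of a unit cell in which the centre of a spherical probe can sit without overlapping any atom, using Monte Carlo sampling. It is not PoreBlazer's algorithm. It assumes a cubic periodic cell, hard-sphere atoms with user-supplied radii, and no check that the open regions are connected to one another, and all function and parameter names are hypothetical.

```python
import random


def min_image_distance_sq(p, q, box):
    """Squared distance between p and q under cubic periodic boundaries."""
    total = 0.0
    for a, b in zip(p, q):
        d = abs(a - b) % box
        d = min(d, box - d)  # nearest periodic image along this axis
        total += d * d
    return total


def probe_centre_free_fraction(atoms, box, probe_radius, n_samples=20000, seed=0):
    """Monte Carlo estimate of the cell fraction open to a probe centre.

    atoms: list of ((x, y, z), radius) tuples, lengths in angstrom
    box: edge length of the cubic cell, in angstrom
    probe_radius: radius of the spherical probe, in angstrom
    """
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    hits = 0
    for _ in range(n_samples):
        point = (rng.uniform(0, box), rng.uniform(0, box), rng.uniform(0, box))
        # The point counts if the probe centre clears every atom.
        if all(
            min_image_distance_sq(point, centre, box) >= (radius + probe_radius) ** 2
            for centre, radius in atoms
        ):
            hits += 1
    return hits / n_samples


# Toy cell: one atom of radius 2 angstrom in a 10 angstrom cubic box.
toy_atoms = [((5.0, 5.0, 5.0), 2.0)]
point_probe = probe_centre_free_fraction(toy_atoms, box=10.0, probe_radius=0.0)
larger_probe = probe_centre_free_fraction(toy_atoms, box=10.0, probe_radius=1.0)
print(point_probe, larger_probe)
```

For the toy cell, the point-probe estimate should land close to the analytical value 1 - (4/3)π·2³/1000 ≈ 0.966, and a larger probe gives a smaller fraction. Multiplying the fraction by the cell volume gives a volume estimate under the same assumptions.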


This work shows that materials informatics can greatly aid MOF research and discovery, identifying many potential candidates from the plethora of predicted MOF structures for further investigation in wide-ranging applications. By focusing experimental work, predicting results and searching the known landscape, materials informatics will increasingly allow these materials to reach their full potential.

Next Steps

  • Download the Advancing MOFs R&D with Materials Informatics ebook.
  • Read about optimizing metal-organic frameworks for the recovery of volatile organic compound emissions.
  • Read more case studies about CCDC data and software being used in industry and academia.
  • Learn more about the CSD and the CSD Python API.


[1] Lev Sarkisov, Rocio Bueno-Perez, Mythili Sutharson, and David Fairen-Jimenez, Chem. Mater. 2020, 32, 23, 9849–9867; https://pubs.acs.org/doi/10.1021/acs.chemmater.0c03575.

[2] S. R. Batten et al., CrystEngComm 2012, 14, 3001–3004; Coordination polymers, metal–organic frameworks and the need for terminology guidelines.

[3] Moghadam et al., Chem. Mater. 2017, 29, 7, 2618–2625; Development of a Cambridge Structural Database Subset: A Collection of Metal–Organic Frameworks for Past, Present, and Future.