Using Materials Informatics For Porous Materials Development
December 13, 2023
This blog is based on the workshop “Analysing Porous Materials Using Mercury and ConQuest”. It provides tips and tricks on how to perform advanced searches for porous materials in ConQuest, and how to visualize and analyse polymeric bonds in Mercury. The full workshop can be accessed from the link here.
The Importance of Porous Materials
The Cambridge Structural Database (CSD) contains over 25,000 2D and 28,000 3D coordination polymers. Because many of these compounds have voids or channels, reliable methods for analysing porous structures are becoming increasingly important.
Porous materials are used for a variety of industrial applications, such as gas storage and separation, catalysis, carbon capture, and for the development of energetic materials, batteries, and semiconductors. Materials informatics can be used in these industrial applications to support materials development. The data can help to focus experimental efforts, predict future successes and risks, identify patterns for further investigation, and systematically search the existing landscape.
The CCDC Search Software: ConQuest
ConQuest is our desktop search software. It is composed of file menus, tab options and a number of different search options including “Draw”, which can be seen in Figure 1. Once the search has been constructed, the list of results can be seen on the right-hand side, and different options are available to visualize the details of each of the structures.
To learn more about the basics of ConQuest, a self-guided workshop is available at the QR code in Figure 1.
Tips and Tricks for Finding MOFs With ConQuest
- When searching the CSD for a polymeric motif, it is always better to search for a small repeat unit and update or combine searches later, if needed.
As the CSD is a database of crystal structures and not chemical structures, the polymeric units that can be found in it depend upon the crystallographic symmetry of the structure. As a consequence, the repeat unit chosen can vary between structures with similar overall connectivity, and if the full substructure is drawn as the query, some motifs may not appear in the results.
- Use existing chemical diagrams from drawing programs such as ChemDraw, or CSD diagrams as a query.
ConQuest allows the user to copy and paste simple chemical diagrams from other drawing programs, or to use a specific diagram in the CSD, such as the molecule of caffeine (Figure 2) as a starting point to run the query. Different options can be selected to specify the contents of the query, such as the inclusion or exclusion of hydrogen atoms, and the chemical units of interest.
- Select a variable or generic bond type when performing the search.
Using the “Any” bond type can help to find more complicated structures such as polymers and delocalised or pi-bonded systems. The search can always be filtered down later.
Figure 3 shows three ways to search for copper surrounded by four oxygens, and the types of structures each approach would find. With the default “Single” bond type, the user is more likely to obtain substructures that look like the one on the top right in Figure 3. Selecting the less specific bond type “Any” instead gives access to a wider variety of hits, including all the ones shown on the right in Figure 3.
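The effect of the bond type constraint can be illustrated with a small toy example. This is only a sketch in plain Python, not how ConQuest matches substructures: the environments, their names and the `matches_cu_o4` helper are all hypothetical, and each copper environment is reduced to a list of (neighbour element, bond type) pairs.

```python
from typing import Dict, List, Tuple

# One bond from the copper centre: (neighbour element, bond type label).
Bond = Tuple[str, str]


def matches_cu_o4(bonds: List[Bond], required_type: str = "single") -> bool:
    """Return True if at least four Cu-O bonds satisfy the bond type constraint.

    required_type="any" accepts every bond type, mimicking a generic bond.
    """
    count = 0
    for element, bond_type in bonds:
        if element != "O":
            continue
        if required_type == "any" or bond_type == required_type:
            count += 1
    return count >= 4


# Hypothetical, hand-made copper environments (not real CSD entries).
environments: Dict[str, List[Bond]] = {
    "discrete_complex": [("O", "single")] * 4,
    "polymeric_node": [("O", "single")] * 2 + [("O", "polymeric")] * 2,
    "delocalised_carboxylate": [("O", "delocalised")] * 4,
}

single_hits = sorted(
    name for name, bonds in environments.items() if matches_cu_o4(bonds, "single")
)
any_hits = sorted(
    name for name, bonds in environments.items() if matches_cu_o4(bonds, "any")
)

print("Single:", single_hits)
print("Any:   ", any_hits)
```

In this toy data set the “single” constraint only retrieves the discrete complex, while “any” also retrieves the polymeric and delocalised environments, which can then be filtered down afterwards.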
- Use a combination of searches to help gain insights into results and to better target a query.
As well as setting up an individual query, the user can also combine searches and results in ConQuest. There are three ways to do this: the “Store” tab is used for storing queries that can be combined later in the same search; the “Combine Queries” tab allows the user to sort the queries into “Must have”, “Must not have” and “Must have at least one of” boxes; lastly, the “Manage Hitlists” tab can be used to manage the results of the searches already performed.
Figure 4 (left) shows the “Search Setup” tab that opens when a search is run in ConQuest. Here, various options are available, including the possibility to select a specific subset as a starting point for the search. This subset could be the list of results of a previous search that the user ran in the session, or a predefined hit list from one of the CSD subsets available (Figure 4, right), such as the CSD MOF subset.
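The logic behind “Must have”, “Must not have” and “Must have at least one of” can be thought of as set operations on hit lists. The following is a minimal sketch in plain Python, assuming each hit list is simply a set of refcodes; the `combine_hits` function and the refcodes are hypothetical and this is not ConQuest's own implementation.

```python
from typing import Iterable, Set


def combine_hits(
    must_have: Iterable[Set[str]] = (),
    must_not_have: Iterable[Set[str]] = (),
    at_least_one_of: Iterable[Set[str]] = (),
) -> Set[str]:
    """Combine hit lists (sets of refcodes) with Boolean logic."""
    candidates = None

    # "Must have": a structure has to appear in every one of these hit lists.
    for hits in must_have:
        candidates = set(hits) if candidates is None else candidates & set(hits)

    # "Must have at least one of": a structure has to appear in any of these.
    at_least_one_of = list(at_least_one_of)
    if at_least_one_of:
        union = set().union(*at_least_one_of)
        candidates = union if candidates is None else candidates & union

    if candidates is None:
        return set()

    # "Must not have": remove structures found by any of these queries.
    for hits in must_not_have:
        candidates -= set(hits)
    return candidates


# Hypothetical hit lists; the refcodes are placeholders, not real entries.
cu_o4 = {"ABCDEF", "GHIJKL", "MNOPQR"}
carboxylate = {"ABCDEF", "GHIJKL", "STUVWX"}
contains_water = {"GHIJKL"}

combined = combine_hits(
    must_have=[cu_o4, carboxylate],
    must_not_have=[contains_water],
)
print(sorted(combined))
```

Restricting a search to a subset, such as a previous hit list or the CSD MOF subset, corresponds to adding one more set to `must_have`.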
- Refcode families may not correspond to different frameworks.
When a new structure is added to the CSD, it is assessed against all the existing ones. If the new structure is considered the same as one that is already present in the database, it will be grouped together with the existing one into the same refcode family. This is the case for structures with the same formula that were collected at different temperatures or pressures or reported in different publications by independent research groups, and for polymorphs.
While this method works well for most systems, it can be more challenging for MOFs and porous materials. Figure 5 (left) shows an example of an aluminium formate MOF where researchers were looking at CO2 adsorption and collected a number of datasets with increasing CO2 pressure. In this case, the structures were grouped in the same refcode family, despite having different CO2 contents, and hence different formulas. The example in Figure 5 (right) instead involves HKUST-1 and two different publications in which scientists were unable to identify the solvent molecules present within the MOF pores. As it can’t be said for sure that the two structures are identical, they were assigned to two different refcode families, despite presenting the same 3D model.
For analysis of framework properties such as void space, multiple refcode families may hence give the same results. In this case, further investigation of a results hitlist using the CSD Python API may help to perform an in-depth analysis.
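As a starting point for this kind of follow-up, a hit list can be grouped by refcode family and checked for families whose members differ in composition. The sketch below uses plain Python rather than the CSD Python API, so it runs without a CSD installation. It assumes the usual refcode convention of six letters followed by an optional two digits; the refcodes and composition labels are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def refcode_family(refcode: str) -> str:
    """Return the family part of a refcode (its first six characters)."""
    return refcode[:6].upper()


def formulas_by_family(entries: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Group (refcode, formula) pairs by refcode family."""
    families: Dict[str, Set[str]] = defaultdict(set)
    for refcode, formula in entries:
        families[refcode_family(refcode)].add(formula)
    return dict(families)


def mixed_families(entries: Iterable[Tuple[str, str]]) -> List[str]:
    """Return families whose members do not all share the same formula."""
    grouped = formulas_by_family(entries)
    return sorted(family for family, formulas in grouped.items() if len(formulas) > 1)


# Hypothetical hit list: placeholder refcodes and composition labels.
hits = [
    ("ABCDEF", "framework"),
    ("ABCDEF01", "framework + 0.2 CO2"),
    ("ABCDEF02", "framework + 0.5 CO2"),
    ("GHIJKL", "framework B"),
    ("GHIJKL01", "framework B"),
]

print(mixed_families(hits))
```

Families flagged in this way are candidates for a closer look, for example by comparing guest content or void space entry by entry. The opposite case, identical frameworks split across different families, cannot be detected from refcodes alone and needs a structural comparison.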
Explore Polymeric Bonds With Mercury
While ConQuest is ideal for performing advanced 3D searches of structures in the CSD, Mercury is our software for crystal structure visualization, exploration and analysis.
To learn more about how the use of ConQuest and Mercury can support the analysis of porous materials, follow the link here. To find out more about the most recent functionalities of Mercury for the investigation of porous materials, including the new Pore Analyser, follow the link here.
Some options for visualizing and analysing polymeric frameworks efficiently in Mercury are presented below.
- Polymeric bond representations and editing
Each of the structures in the CSD is edited and enhanced by editors at the CCDC to ensure it has the correct chemical connectivity. However, if users open their own CIF file with Mercury, the bond information will be missing. The recommendation in this case is to go to Edit > Auto Edit Structures and use the “Identify polymeric bonds” functionality. Note that in Mercury, polymeric bonds are drawn as alternating long and short lines, as shown in Figure 6.
- Polymer expansion component
Since the structure of polymers can be complex, seeing how a polymer expands helps in understanding how it is constructed. Following the path Edit > Polymer Expansion, the polymeric bonds can be expanded by “Sub unit” (Figure 7) or by “Whole unit” (this will add an additional repeat of the initial crystal chemical unit to the structure). In both cases, it is possible to control the polymer expansion by clicking on the atoms shared with the next unit or sub unit.
- Crystal packing feature search
The crystal packing feature search allows users to perform a substructure search in Mercury in a similar way to ConQuest. It can be found in the CSD-Materials tab (Figure 8) and can be used to investigate conformations of molecules or bonded fragments, and to search for non-covalent interactions such as π-π or hydrogen bond interactions. It can also be used to search for particular spatial arrangements of functional groups or molecules.
Featured Questions and Answers Asked at the Workshop
Is it possible to search for COF structures using ConQuest?
Owing to the numerous challenges encountered when collecting single-crystal X-ray diffraction data on COFs, those structures are quite rare in the CSD. The tips presented in this blog are, however, also applicable to searching for COFs, when they are present.
Section 8.2.2 of the ConQuest manual might help with this as well. It contains information on different chemical classes, and the user can refer to the class of “organic polymer” as a kind of ‘COF subset’.
Can we visualize a polymer structure like polyaniline, and combine it with a metal oxide structure, such as in Co3O4/polyaniline composites?
The software does not perform energy minimization, and therefore the chemistry of the combined model may not be accurate. However, two different crystal structures can be overlaid using the “Multiple Structures” option.
How can we calculate the distance between dimers?
You can use the measuring picking mode in Mercury to measure the distances between atoms and molecules. There is a self-guided workshop on how to measure objects in Mercury here.
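If atomic coordinates have been exported, the same kind of measurement can also be reproduced numerically. One common choice, assumed here, is to define the distance between two dimers as the distance between their centroids. The sketch below uses plain Python with made-up Cartesian coordinates in ångströms; the function names are hypothetical and no crystallographic symmetry or unit cell is taken into account.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def centroid(coords: Sequence[Point]) -> Point:
    """Return the unweighted mean position of a set of Cartesian coordinates."""
    n = len(coords)
    return tuple(sum(point[i] for point in coords) / n for i in range(3))


def centroid_distance(group_a: Sequence[Point], group_b: Sequence[Point]) -> float:
    """Return the distance between the centroids of two groups of atoms."""
    return math.dist(centroid(group_a), centroid(group_b))


# Made-up coordinates for the atoms of two dimers (not taken from a real structure).
dimer_a = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
dimer_b = [(0.0, 3.0, 4.0), (2.0, 3.0, 4.0)]

distance = centroid_distance(dimer_a, dimer_b)
print(f"Centroid-to-centroid distance: {distance:.2f} Å")
```

Other definitions, such as the shortest atom-to-atom contact between the two dimers, may be more appropriate depending on the question being asked.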