The second script “SI_MOF_solvent_remover.py” works in a similar way via the command line. This command-line script has a couple of extra features – mainly the ability to be run over multiple structures from a CSD refcode list. A refcode list can be generated from ConQuest: once a search has been run, the results can be exported via File > Export Entries as… by selecting the option “Refcode: CSD entry identifier list”. This file is simply a list of the CSD refcodes and is stored in .gcd format.
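As a rough sketch of how such a list could be consumed in your own script, the snippet below reads a .gcd file into a Python list. It assumes only that the file is plain text with one refcode per line; the function name read_refcodes, the demo file name and the second identifier are made up for illustration and are not taken from the SI script.

```python
import os
import tempfile


def read_refcodes(path):
    """Return the refcodes listed in a .gcd file, skipping blanks and repeats."""
    refcodes = []
    seen = set()
    with open(path) as handle:
        for line in handle:
            refcode = line.strip()
            if refcode and refcode not in seen:
                seen.add(refcode)
                refcodes.append(refcode)
    return refcodes


# Throwaway file standing in for a ConQuest export (identifiers are placeholders).
demo_path = os.path.join(tempfile.mkdtemp(), "demo_hits.gcd")
with open(demo_path, "w") as handle:
    handle.write("ABEBUF\nREFCOD01\n\nABEBUF\n")

demo_refcodes = read_refcodes(demo_path)
print(demo_refcodes)
```

Each refcode in the resulting list could then be passed in turn to whatever per-structure processing you need.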
A second option is the ability to choose a list of solvents to remove from a framework (the default is the CCDC list that comes with the standard CSD installation). This solvent list should be in mol2 format. There is also an option to simply remove any monodentate entity from a framework. It should be stressed that the analysis carried out as part of the above paper only used these scripts on selected subsets of MOF entries where solvent removal would be chemically reasonable – although the scripts will in theory work on any metal-containing structure.
Hopefully the scripts may be of wider interest too – the ability to manipulate a structure and output a modified cif could potentially be useful in other projects.
This first script "SI_MOF_solvent_remover_Hg_version.py" is intended to be used via the CSD Python API menu in Mercury. Navigating to CSD Python API > Options... shows the location of built-in scripts in Mercury and also allows you to include additional locations to host your own scripts and specify where output files are to be written. Adding this script to either the built-in scripts folder or your own script location will allow you to use it from the Mercury menu.
The script will run on whatever structure is currently displayed in the main Mercury visualiser. The script will identify any monodentate solvent molecules (based on a SMILES comparison with a list of solvents present within Mercury), remove these from the MOF framework and output a CIF with only this solvent-stripped main residue present for use in third-party analysis tools.
We've recently published a Chemistry of Materials perspective "Development of a Cambridge Structural Database Subset: A Collection of Metal–Organic Frameworks for Past, Present, and Future". http://dx.doi.org/10.1021/acs.chemmater.7b00441
The Supporting Information for the paper includes two scripts that use the CSD Python API as part of the MOF analysis. The scripts' aim is to ‘clean up’ metal-organic frameworks, i.e. to remove solvent molecules and labile ligands from the framework so that pore-size and surface-area analysis can be carried out. I'll add the scripts to this thread, where I can include a bit more information.
Hi Marc - that's a really interesting question!
Creating a spreadsheet containing database entry names and selected metal-ligand bond lengths is certainly possible through the normal ConQuest graphical interface and the Mercury Data Analysis Module.
From your Search Query in ConQuest you can select the 'ADD 3D' option to record the bond distance between the metal and phosphorus atom. From the resulting hitlist click the 'Analyse Hitlist' button at the top of the list and choose 'Analyse Data'. This will then be exported into the Mercury Data Analysis Module where the values can be investigated further or output as a spreadsheet.
The second part of your question about making a list of coordinates excluding particular atoms is more challenging, but should be achievable using the CSD Python API. I'd suggest a script that took this sort of approach:
From your hitlist (this can be exported from ConQuest as a CSD entry identifier list in .gcd format), loop through each molecule of the structure to select only those that contain a metal atom (i.e. ignoring any solvent molecules). You could also do the substructure search directly through the API, but by this point I’m assuming you already have a hitlist!
For the selected molecule the bonds from the metal atom can then be analysed to determine the neighbouring atom. If this atom isn't the phosphorus, the bond can then be removed.
By removing the bond you now have a new set of molecular components, so as above you could again loop through these to find the part with the metal atom. This single component can then be re-written into a cif file (or other format you require) to give you a list of coordinates.
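To make the logic of these steps a bit more concrete, here is a toy sketch in plain Python. It does not use the CSD Python API: the molecule is just a list of element symbols plus a list of bonded index pairs, and the METALS set, the function name metal_fragment and the example complex are all invented for illustration. In a real script the same idea would be applied to the molecule, atom and bond objects provided by the API.

```python
from collections import deque

# Illustrative subset of metals only; a real script would use a proper metal test.
METALS = {"Fe", "Co", "Ni", "Cu", "Zn", "Ru", "Rh", "Pd", "Pt"}


def metal_fragment(elements, bonds, keep_element="P"):
    """Return the indices of the atoms still connected to the metal(s) after
    removing every metal-ligand bond whose ligand atom is not keep_element."""
    kept_bonds = []
    for a, b in bonds:
        a_is_metal = elements[a] in METALS
        b_is_metal = elements[b] in METALS
        if a_is_metal != b_is_metal:
            neighbour = b if a_is_metal else a
            if elements[neighbour] != keep_element:
                continue  # cut this metal-ligand bond
        kept_bonds.append((a, b))

    # Build the connectivity that remains after the cuts.
    adjacency = {index: set() for index in range(len(elements))}
    for a, b in kept_bonds:
        adjacency[a].add(b)
        adjacency[b].add(a)

    # Breadth-first walk outwards from the metal atom(s).
    start = [i for i, element in enumerate(elements) if element in METALS]
    seen = set(start)
    queue = deque(start)
    while queue:
        current = queue.popleft()
        for neighbour in adjacency[current]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return sorted(seen)


# Made-up complex: Pd bonded to P, Cl and N; P carries two C; N carries one C.
demo_elements = ["Pd", "P", "Cl", "C", "C", "N", "C"]
demo_bonds = [(0, 1), (0, 2), (1, 3), (1, 4), (0, 5), (5, 6)]
demo_fragment = metal_fragment(demo_elements, demo_bonds)
print(demo_fragment)
```

The returned indices identify the atoms whose coordinates you would then write out. Note that with a chelating ligand the non-phosphorus donor atom stays in the fragment, because it is still connected to the metal through the ligand backbone.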
I hope that's clear - do let me know if you'd like more detail on any of this information!
I've had some feedback on the script, and following that I'm adding a revised version. The script works by using the CSD refcode for the structure currently viewed in Mercury to create a URL to the Access Structures page for that structure. Whilst this works well, the previous version of the script would create the URL for *any* structure viewed in Mercury, so could create QR codes with non-valid URLs. To try and fix this I've added a check in the script to make sure the identifier of the structure viewed in Mercury is 'CSD-like'. If the identifier doesn't look like a CSD refcode, the script will warn you accordingly!
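For anyone curious what a 'CSD-like' check might look like, here is a minimal sketch. It is not the code from the attached script; it simply assumes that a CSD refcode is six upper-case letters optionally followed by two digits, and the name looks_like_refcode is just for illustration.

```python
import re

# Assumed shape of a CSD refcode: six upper-case letters, optionally two digits.
REFCODE_PATTERN = re.compile(r"[A-Z]{6}(?:[0-9]{2})?\Z")


def looks_like_refcode(identifier):
    """Return True if the identifier has the assumed refcode shape."""
    return REFCODE_PATTERN.match(identifier.strip()) is not None


for name in ["ABEBUF", "ABEBUF01", "my_structure", "ABEBUF1"]:
    print(name, looks_like_refcode(name))
```

A check like this only tests the shape of the identifier, so a local file that happens to be named like a refcode would still pass.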
I think you're after entry.chemical_name and/or entry.synonyms, as in ConQuest the name of the entry is found in the compound field.
So, for example
>>> from ccdc import io
>>> csd_reader = io.EntryReader('CSD')
>>> entry_abebuf = csd_reader.entry('ABEBUF')
>>> print(entry_abebuf.chemical_name)
5H,11H-Dibenzo(b,f)(1,5)diazocine-6,12-dione pyridine clathrate
>>> print(entry_abebuf.synonyms)
In this case entry ABEBUF only has one name; if there are multiple trivial/common names these will appear in the synonyms field.
I hope that's what you're after!
I hope this script will be useful for posters and teaching materials. It uses a separate Python QR code generating package https://github.com/lincolnloop/python-qrcode that I installed easily using pip.
It's designed to be used via the Mercury scripts menu, and (hopefully!) for any CSD entry you are looking at it will generate a QR code that links to the entry in the CCDC's Access Structures service, where anyone can view and retrieve structures in the CSD. From here you can download the original cif and see the compound name, 2D diagram, 3D structure etc. The QR code is displayed via a pop-up window, but also saved as a .png file in your specified output directory.
I'd be really interested to hear any comments or feedback!
For a bit more of an in-depth explanation:
To install the Python QR code generating package: download the .zip file, then from a command prompt (I'm using a Windows system here) type "pip install <location of the zip file you've just downloaded>". (Pip is installed automatically as part of recent Python versions.) If this produces an error saying something like "'pip' is not recognized as an internal or external command", you need to modify the PATH environment variable to point to your Python scripts folder, e.g. set PATH=%PATH%;C:\Python27;C:\Python27\Scripts
To make the script accessible from Mercury: The script should be saved into your scripts folder - for a typical Windows installation this will be C:\Program Files (x86)\CCDC\CSD_2016\mercury 3.8\scripts. The script name will then appear in the 'CSD Python API' menu of Mercury, and will generate the QR code for the structure currently in the visualiser.
I've attached a script that I've found very useful - the main purpose is to take a set of chemical diagrams (in mol format) and run a similarity search as available in WebCSD.
The script takes an .sdf file as the input, and this means it can also be used to search any other attributes present in the .sdf file. In this case if I can't find an example of a similar molecule in the CSD from the chemical diagram I'm running a second check on the compound name.
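To illustrate how the extra attributes in an .sdf file can be picked up, here is a small stand-alone sketch (not the attached script). It assumes the usual SDF layout: each record ends with a line containing $$$$, and each data item is a header line starting with > that carries the field name in angle brackets, followed by the value and a blank line. The field names COMPOUND_NAME and SOURCE and the empty placeholder mol block are invented for the example.

```python
import re

# A data header looks like "> <FIELD_NAME>"; extra text may sit before the name.
FIELD_HEADER = re.compile(r">.*?<([^<>]+)>")


def read_sdf_fields(text):
    """Return one dict of data fields per record in SDF-formatted text."""
    records = []
    fields = {}
    current = None      # name of the field whose value is being read
    in_data = False     # True once the mol block ("M  END") has been passed
    for line in text.splitlines():
        if line.strip() == "$$$$":  # end of record
            records.append({k: "\n".join(v) for k, v in fields.items()})
            fields, current, in_data = {}, None, False
            continue
        if not in_data:
            if line.startswith("M  END"):
                in_data = True
            continue
        header = FIELD_HEADER.match(line)
        if header:
            current = header.group(1)
            fields[current] = []
        elif current is not None:
            if line.strip():
                fields[current].append(line.strip())
            else:
                current = None  # a blank line closes the data item
    return records


# Placeholder record: an empty mol block plus two hypothetical data fields.
demo_sdf = """aspirin
  demo

  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
> <COMPOUND_NAME>
aspirin

> <SOURCE>
in-house list

$$$$
"""

demo_records = read_sdf_fields(demo_sdf)
print(demo_records)
```

With the fields available as a dictionary per record, a fallback such as the compound-name check becomes a simple lookup on whichever field holds the name in your file.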
I'd be really interested if anyone has any comments or feedback - I hope somebody finds it useful!