• Script to check your structure prior to publication (from 2016 Spring BCA & 2016 ACA meetings)

    This is a script to check some of the geometric and crystallographic features of a structure before publishing it (including publishing it as a CSD-Communication). It's intended to complement CheckCIF and flag features that are worth a second look. It was presented by CCDC's Pete Wood at the BCA Spring Meeting 2016 in Nottingham, UK and by me (CCDC's Paul Sanschagrin) during the "Would You Publish This?" session at the 2016 ACA Conference in Denver, Colorado, USA. To use it, you must have the CSD Python API installed. If you don't already have a script folder listed in Mercury's CSD Python API menu, add one via the options there, then place the script in that folder. We welcome feedback, criticisms, and suggestions for improvement.

    Thanks,
    Paul

  • Providing input to a script called from Mercury

    I am looking to write a script, to be called via the Mercury scripts menu, that will ask the user for some input and use it to do something with a structure loaded in Mercury. Any ideas on the best way to do this? PyQt vs. Tkinter?

    Thanks,

    Paul

  • RE: Output search results as program runs

    Fio,

    Unfortunately, it isn't possible to have the API produce output as it goes. It would be possible to break your search into two searches: one for your donor patterns, and then a second for your acceptor patterns that runs only against the hits from the first. I've never tested this, so it may not be any faster, but you would get an intermediate hit list. If you are also defining distance and/or angle constraints, you could run the second search with the same setup as you originally had but only against the filtered set.

    Paul

  • RE: Technical Help FAQs

    We have seen some cases on Windows where the pip installation of the CSD Python API fails due to an error building the lxml Python package. The end of the installation output will look something like:

    *********************************************************************************
        Could not find function xmlCheckVersion in library libxml2. Is libxml2 installed?
    *********************************************************************************
        error: command 'C:\\Users\\someuser\\AppData\\Local\\Programs\\Common\\Microsoft\\Visual C++ for Python\\9.0\\VC\\Bin\\cl.exe' failed with exit status 2

        ----------------------------------------
    Command "C:\Python27\python.exe -c "import setuptools, tokenize;__file__='c:\\users\\someuser\\appdata\\local\\temp\\pip-build-bznxq6\\lxml\\setup.py';
    exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))"
    install --record c:\users\db-ser~1\appdata\local\temp\pip-9laak8-record\install-record.txt
    --single-version-externally-managed --compile" failed with error code 1 in
    c:\users\db-ser~1\appdata\local\temp\pip-build-bznxq6\lxml

    (The last line may be wrapped differently as it is very long.)

    The simplest solution is to download the binary wheel package from here:
    http://www.lfd.uci.edu/~gohlke/pythonlibs/chfyvn4n/lxml-3.5.0-cp27-none-win32.whl
    (If that link doesn't work, you can go here:
    http://www.lfd.uci.edu/~gohlke/pythonlibs/#lxml
    and download the file named lxml-3.5.0-cp27-none-win32.whl . )

    You would then install it using pip install lxml-3.5.0-cp27-none-win32.whl. Once this is done, rerun the CSD API pip install.

  • RE: smiles search

    Geoff,

     

    I agree it would be a worthwhile addition. I have added it to our system as an enhancement request. However, it won't be included in the next release in November as that is currently undergoing final bug fixing. Hopefully next year.

  • RE: smiles search

    Geoff,

     

    It's not possible to do an exact SMILES search at the moment. The best option is likely to do the substructure search and then check each resulting hit for the expected atom count. You do have to take care to consider if you want to include hydrogens in the count and how you want to handle matches in multi-component systems. The hit.molecule object will include all components. There is also a hit.matched_components list which will contain only those molecular components which match the search so you would check these. If you want to match complete systems, you can just check the hit.molecule. To quickly count the atoms use the following:

    len(mol.atoms) # all atoms
    len([a for a in mol.atoms if a.atomic_number > 1]) # non-hydrogens
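
The atom-count check described above could be wrapped up along these lines. This is only a minimal sketch: the function names are made up for illustration, and it assumes nothing about the molecule objects beyond the atoms list and atomic_number attribute used in the two lines above.

```python
def heavy_atom_count(mol):
    """Count non-hydrogen atoms on any object exposing .atoms,
    where each atom has an .atomic_number attribute."""
    return sum(1 for a in mol.atoms if a.atomic_number > 1)


def exact_matches(components, expected_heavy_atoms):
    """Keep only the components whose heavy-atom count equals the
    heavy-atom count of the query molecule (hypothetical helper)."""
    return [m for m in components
            if heavy_atom_count(m) == expected_heavy_atoms]
```

For each hit from the substructure search you would then call something like exact_matches(hit.matched_components, n), where n is the number of non-hydrogen atoms in your SMILES query.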

    Paul

  • Downloading the SQLite Database

    The API documentation has the wrong link to download the SQLite DB. The correct location/method to obtain this is to request a set of download links from the main CSDS download page (you will need your site number and confirmation code):

    http://www.ccdc.cam.ac.uk/SupportandResources/Downloads/Pages/CSDS-Downloads.aspx

    The email you receive in response will contain links to the SQLite download.

  • Script to generate Mogul data for a series of compounds

    This is a script to generate Mogul data for a series of compounds from the command line. A complete usage message can be displayed by using the -h or --help options. Basic usage is:

    python mogul_data.py test_molecules.mol2 mogul_data.csv

    The output file will contain a header line followed by lines of Mogul data for each bond, bond angle, torsion angle, and ring of each molecule in the input file.
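
If you want to post-process the output, something like the following may be a reasonable starting point. It is a sketch that assumes the file is comma-delimited (suggested by the .csv extension, not confirmed here) and makes no assumption about the column names, which are simply read from the header line. The function name load_mogul_rows is made up for illustration.

```python
import csv


def load_mogul_rows(handle):
    """Read a header line followed by data lines from an open text file.

    Returns (column_names, rows), where each row is a dict keyed by
    whatever column names the header line contains.
    """
    reader = csv.DictReader(handle)
    rows = list(reader)
    return reader.fieldnames, rows
```

Typical use would be: with open("mogul_data.csv", newline="") as f: columns, rows = load_mogul_rows(f)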

  • Script to obtain PubChem information for a CSD entry

    This script generates a simple HTML-formatted output file of PubChem information for a CSD entry. It uses RDKit to generate an InChIKey, which is used to query UniChem to obtain the PubChem ID. The PubChem ID is then used in a PubChem REST query to obtain the information. Note that the path to RDKit must be set at the top of the script (line 32).

    Example usage:

    python pubchem_info.py aabhtz aabhtz_info.html
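
For anyone adapting the last step, a PubChem REST query by compound ID can be built roughly as below. This is a sketch using the public PUG REST URL pattern; the properties requested here are chosen only for illustration, and it is an assumption that the script's own query looks like this. The function name is hypothetical.

```python
PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def pubchem_property_url(cid, properties=("MolecularFormula", "MolecularWeight")):
    """Build a PUG REST URL requesting the given properties, as JSON,
    for one PubChem compound ID (illustrative property list)."""
    return "{}/compound/cid/{}/property/{}/JSON".format(
        PUG_REST, int(cid), ",".join(properties))
```

The resulting URL can be fetched with any HTTP client and the response parsed as JSON.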

    Please note that at least some versions of Firefox may have issues reading the diagram image from the default temporary location. The location can be changed by adding the dir= option to the tempfile.mkstemp call on line 145.
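
As a rough illustration of that change, the standard library call accepts a dir argument that controls where the temporary file is created. The wrapper function and the .png suffix below are assumptions for the sake of the example, not what the script necessarily uses.

```python
import os
import tempfile


def make_image_tempfile(directory, suffix=".png"):
    """Create an empty temporary file inside `directory` and return its path.

    mkstemp returns an open OS-level file descriptor together with the
    path; the descriptor is closed here so the path can be reopened later.
    """
    fd, path = tempfile.mkstemp(suffix=suffix, dir=directory)
    os.close(fd)
    return path
```

Pointing directory at a folder the browser can read (for example, the folder holding the HTML output) should avoid the problem.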

  • Script to filter molecular conformers with unusual torsion angles

    This is a script to filter a set of conformers of a molecule or set of molecules based on the number of unusual torsion angles. Running the script with the -h or --help option will give a full usage description, but the simplest usage is the following:

    python .\conformer_filter_density.py .\2uwd_conformers.sdf .\2uwd_passed.sdf