• RE: "A bond is not in the molecule"

    The problem is that crystal.molecule returns a fresh copy of a molecule object and so the bonds in crystal.molecule.bonds are in a different molecule from that on which you are calling remove_bonds.

    The solution is to extract the molecule once, then re-assign it to the crystal, i.e.

    mol = crystal.molecule

    mol.remove_bonds(b for b in mol.bonds if b.bond_type == 'Unknown')

    crystal.molecule = mol

    You may also consider using molecule.assign_bond_types('Unknown') which will deduce bond types where possible.

    I think the API might be a little nicer if the crystal object had methods to remove atoms and bonds.  I shall certainly consider it for a  future release.

    Best wishes

    Richard

     

  • RE: why the default h-bond distance range is negative (-5.0, 0.0)?

    This is because the HBond may come from different symmetry transformed copies of the unit cell molecule.  The API seeks to provide only a single copy of each HBond, for example O1-H21-N2, though this behaviour can be changed.  To see all HBonds,

    hbonds = cry.hbonds(distance_range=(0, 5),  unique=False)
    for hb in hbonds:
        print hb, hb.length, vdw_correct(hb), hb.symmetry_operators[-1]

    So the O1-H21-N2 has a symmetry operator of  1-x,1-y,1-z.

    In the first example with a distance range of (-5, 0) O1-H21-N2 has a symmetry operator of x,y,z so they are the unit cell atoms.  It does not appear in the second list because the vdw-corrected distance is less than 0,

    from ccdc.descriptors import MolecularDescriptors as MD
    mol = cry.molecule

    MD.atom_distance(mol.atom('O1'), mol.atom('N2)) - mol.atom('O1').vdw_radius - mol.atom('N2').vdw_radius

    -0.11999

    You can check other distances by instantiating the symmetry transformed molecule:

    symm_mol = cry.symmetric_molecule('1-x,1-y,1-z')

     

    Best wishes

    Richard

     

  • RE: why the default h-bond distance range is negative (-5.0, 0.0)?

    One of the options to the detection of HBonds either in the crystal or in the molecule is whether or not the distance range is absolute or Van der Waals corrected, i.e. a distance minus the sum of the radii of the atoms concerned.  This is the default.  We can see this more clearly in the following script:

    from ccdc import io

    csd = io.EntryReader('csd')
    cry = csd.crystal('TEHREG')

    def vdw_correct(hb):
        return hb.length - hb.atoms[0].vdw_radius - hb.atoms[-1].vdw_radius

    hbonds_vdw = cry.hbonds(distance_range=(-5, 0))
    for hb in hbonds_vdw:
        print hb, hb.length, vdw_correct(hb)

    hbonds_vdw = cry.hbonds(distance_range=(0, 5))
    for hb in hbonds_vdw:
        print hb, hb.length, vdw_correct(hb)

    If you prefer, you may calculate HBonds with the absolute distance:

    hbonds_abs = cry.hbonds(distance_range=(0, 5), vdw_corrected=False)

    I hope this is helpful

    Best wishes

    Richard

  • RE: Python3 Version of the Python API

    We do not currently support Python3 but intend to do so during 2019.

    Best wishes

    Richard

  • RE: Change spacegroup to subgroup

    Could you post it here as well, please Jonas.  It seems the sort of script that other people would be interested in.

    Thanks

    Richard

  • RE: Change spacegroup to subgroup

    Sorry, Aurora, it's not available yet.  I'll make a note that you would like it and see if I can get it into a release sometime.

    Best wishes

    Richard

  • RE: Writing crystals

    Hi Puck,

    always happy to help.  Let me know if there are any other problems.

    Best wishes
    Richard

  • RE: Writing crystals

    Hi Puck,

    it sounds as if you are trying to write the refcodes directly to the CrystalWriter instance, but this needs a crystal structure.  If you have the refcodes in a list you can create the crystals from the CSD as follows:

    csd = io.EntryReader('csd')
    for refcode in list_of_refcodes:
        with io.CrystalWriter('/path/to/pdb/directort/%s,pdb' % refcode) as writer:
            writer.write(csd.crystal(refcode))

    Alternatively, if you have the refcodes in a file, they can be read through a CrystalReader instance:

    with io.CrystalReader('/path/to/refcode/file', format='identifiers') as reader:
        for crystal in reader:
            with io.CrystalWriter('/path/to/pdb/directory/%s.pdb' % crystal.identifier) as writer:
                writer.write(crystal)

    Hope this is helpful.  Please let me know if I've misunderstood your problem.

    Best wishes
    Richard

  • RE: Compare Molecule Similarity Pseudocode

    Hi Hope,

    Can you tell me which methods you have used and for which you would like documentation and pseudocode?  Is it the similarity searching that Alex used, where the reference given above should be sufficient?

    Best wishes

    Richard

  • RE: Downloading a long list of structures

    Dear Anthony,

    I'm very sorry about the long delay in getting back to you.

    It sounds as if you need to do a substructure search on the CSD which for 22,000 structures is going to take some time.

    This script will do something like what you want.  It will process your target file, search the CSD for structures containing your target molecule and will write the hits to a results file.  The hits will contain matches where your target molecule is a proper substructure of the matched structure; if you need exact matches, then please let me know.

    --


    import time

    from ccdc import io, search, utilities

    target_file = '/path/to/your/file/of/structures.mol2'
    hit_file = '/path/to/where/you/want/to/put/results.gcd'

    def one_search(m):
        searcher = search.SubstructureSearch()
        searcher.add_substructure(search.MoleculeSubstructure(m))
        hits = searcher.search()
        print 'CSD contained %d hits for %s' % (len(hits), m.identifier)
        return [h.identifier for h in hits]

    reader = io.MoleculeReader(target_file)
    start = time.time()
    with open(hit_file, 'w') as gcd_writer:
        for i, m in enumerate(reader):
            gcd_writer.write('\n'.join(one_search(m)))
            utilities.Timer.progress(start, i+1, len(reader), '')
    --

    As I say, 20,000 structures is a lot of searching.  It may be advisable to split the input file into smaller chunks and process them separately.

    Once again, I am terribly sorry about the delay in answering.

    If you need any more help, please get back in touch.

    Best wishes

    Richard