You can generate molecules outside the unit cell with the translate parameter of the symmetric_molecule() method if you want explicit control of the symmetry operator, or you can generate expanded representations of the crystal through methods such as packing_shell() and molecular_shell(). For example, you could use the molecular shell:
from ccdc.descriptors import MolecularDescriptors
mol = crystal.molecular_shell()
atoms_of_interest = [a for a in mol.atoms if a.label == 'Te1']
min_dist = min(MolecularDescriptors.atom_distance(a, b) for a in atoms_of_interest for b in atoms_of_interest if a != b)
Or, by hand, you can calculate all translated symmetric molecules:
expansions = [
    crystal.symmetric_molecule(symmop, (i, j, k))
    for symmop in crystal.symmetry_operators
    for i in range(-2, 3)
    for j in range(-2, 3)
    for k in range(-2, 3)
]
min_dist = min(
    MolecularDescriptors.atom_distance(expansions[i].atom('Te1'), expansions[j].atom('Te1'))
    for i in range(len(expansions)) for j in range(len(expansions))
    if i != j
)
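The by-hand expansion can also be pictured without the CSD at all. The sketch below is plain Python and rests on several assumptions that are not taken from the API: an orthogonal cell with made-up lengths, a single atom given in fractional coordinates, and a hypothetical two-operator list (identity and inversion). Unlike the snippet above, it measures from one reference atom to its images rather than comparing every pair.

```python
import itertools
import math

# Hypothetical orthogonal cell lengths in Angstrom (not taken from any CSD entry).
CELL = (10.0, 10.0, 10.0)

# Symmetry operators as functions on fractional coordinates: identity and inversion.
OPERATORS = [
    lambda x, y, z: (x, y, z),
    lambda x, y, z: (-x, -y, -z),
]


def expand(frac, operators, span=2):
    """Return every symmetry image of frac translated by -span..span cells."""
    images = []
    for op in operators:
        x, y, z = op(*frac)
        for i, j, k in itertools.product(range(-span, span + 1), repeat=3):
            images.append((x + i, y + j, z + k))
    return images


def distance(a, b, cell=CELL):
    """Cartesian distance between two fractional positions in an orthogonal cell."""
    return math.sqrt(sum(((p - q) * length) ** 2 for p, q, length in zip(a, b, cell)))


def min_image_distance(frac, operators, span=2):
    """Shortest distance from frac to any image of itself, skipping the atom itself."""
    distances = [distance(frac, image) for image in expand(frac, operators, span)]
    return min(d for d in distances if d > 1e-9)


shortest = min_image_distance((0.1, 0.2, 0.3), OPERATORS)
```

For these made-up values the shortest contact is 6.0 Angstrom, to the inverted image shifted by one cell along the third axis, which is shorter than the 10.0 Angstrom to a purely translated copy.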
Is either of these solutions what you are after?
For this I think you will need to know the symmetry operators of the crystal, and which of them you are interested in using. For example:
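from ccdc import io, descriptors
csd = io.EntryReader('csd')
crystal = csd.crystal('AABHTZ')
base_mol = crystal.molecule
# symmetric_molecule() takes a single operator; index 1 is only an illustration,
# so substitute whichever operator you are interested in.
symm_mol = crystal.symmetric_molecule(crystal.symmetry_operators[1])
print(descriptors.MolecularDescriptors.atom_distance(base_mol.atom('N2'), symm_mol.atom('N2')))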
Hope this is helpful; please let me know if you would like any more help.
Thanks for the suggestions, Paul, I'll certainly consider them for the next release.
Firstly, you are right to spot that TextNumericSearch doesn't take a settings parameter. I shall fix this for the next release.
Secondly, you can exclude entries with specific elements using the search_settings class:
search_settings.must_not_have_elements = [
    'Ar', 'K', ...
]
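# Alternatively, build the list from a range of atomic numbers
from ccdc.molecule import Atom
ats = []
for i in range(18, 93):
    ats.append(Atom())  # assumed step: create a blank atom before setting its number
    ats[-1].atomic_number = i
search_settings.must_not_have_elements = [a.atomic_symbol for a in ats]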
This is slightly clumsy because there is no atomic_number keyword for Atom creation.
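If creating Atom objects just to obtain symbols feels too roundabout, one alternative is a hard-coded symbol table. The sketch below is plain Python; the SYMBOLS list and the symbols_for() helper are illustrative names and not part of the API.

```python
# Element symbols for atomic numbers 1 to 92, in order.
SYMBOLS = (
    "H He Li Be B C N O F Ne Na Mg Al Si P S Cl Ar K Ca "
    "Sc Ti V Cr Mn Fe Co Ni Cu Zn Ga Ge As Se Br Kr Rb Sr Y Zr "
    "Nb Mo Tc Ru Rh Pd Ag Cd In Sn Sb Te I Xe Cs Ba La Ce Pr Nd "
    "Pm Sm Eu Gd Tb Dy Ho Er Tm Yb Lu Hf Ta W Re Os Ir Pt Au Hg "
    "Tl Pb Bi Po At Rn Fr Ra Ac Th Pa U"
).split()


def symbols_for(first, last):
    """Return the symbols for atomic numbers first..last inclusive."""
    return SYMBOLS[first - 1:last]


# Argon (18) to uranium (92), the same span as range(18, 93) above.
excluded = symbols_for(18, 92)
```

The resulting list could then be assigned to search_settings.must_not_have_elements in the same way as the lists above.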
Thirdly, you don't have to use a bogus search to extract all entries of a database matching specific search criteria. You can set up the search_settings as above, then iterate over the csd:
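from ccdc import io
csd = io.EntryReader('csd')
for e in csd:
    # Assumed loop body: keep only the entries that pass the settings above
    if search_settings.test(e):
        print(e.identifier)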
Fourthly, the year range is a pair of numbers, interpreted as an inclusive range, rather than the list of values you have given. The first two values of your list have been used as the inclusive range, so you are getting hits from 1970-1971. The query should be written:
It would have been helpful if the API had made this clear.
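The behaviour described can be mimicked in plain Python. The year_matches() helper below is purely illustrative and is not the API's implementation; it assumes, as stated above, that only the first two values are read and that both ends are inclusive. The years after 1971 are made up for the example.

```python
def year_matches(year, year_range):
    """Return True if year lies in the inclusive range given by the first two values."""
    low, high = year_range[0], year_range[1]
    return low <= year <= high


# A list of individual years is read as the pair (1970, 1971); later values are ignored.
as_list = [1970, 1971, 1972, 1973]
# A pair gives an inclusive range (the end year here is hypothetical).
as_pair = (1970, 1973)

hits_from_list = [y for y in range(1968, 1976) if year_matches(y, as_list)]
hits_from_pair = [y for y in range(1968, 1976) if year_matches(y, as_pair)]
```

With the list, only 1970 and 1971 pass; with the pair, every year from 1970 to 1973 passes.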
Lastly, it is perhaps counter-intuitive that a TextNumericSearch with no criteria returns no hits. It would probably be better to raise an exception as the other classes do. I shall consider this for the next release.
Thank you for your questions; it is feedback like this that helps me to make the API better.
Thinking about your search requests, I realise that an SQL-derived database really is overengineering a solution to the problem. Since the ReducedCellSearch is very fast and the number of hits returned is very small, a simple filtering of the hits is more than fast enough. I've attached a simple example script which performs a couple of queries of the sort you are describing.
I'm sure you'll have no difficulty adjusting the script for your purposes, but if you do, please raise the issue here.
Okey-doke, Dean, I'll rustle something up. Might take me a couple of days, so please be patient.
We don't have any methods to search by spacegroup symbol or formula, so iterating over hit structures would be the only way to do it. If you have to do many of these searches, it would be fairly simple to make an SQLite database containing terms of interest, then to join the results of a ReducedCellSearch with a query of this database.
If you like I can provide a prototype of how to go about this.
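A minimal sketch of the SQLite idea, using only the standard library, might look as follows. The table layout, the filter_hits() helper and all identifiers and values are made up for illustration; in practice the rows would be extracted from the database entries and the hit identifiers would come from the reduced cell search.

```python
import sqlite3

# Made-up identifiers and terms, standing in for data extracted from real entries.
ROWS = [
    ("ENTRY01", "P21/c", "C10 H12 O2"),
    ("ENTRY02", "P-1", "C6 H6"),
    ("ENTRY03", "P21/c", "C6 H6"),
    ("ENTRY04", "Pbca", "C10 H12 O2"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE terms (identifier TEXT PRIMARY KEY, spacegroup TEXT, formula TEXT)"
)
conn.executemany("INSERT INTO terms VALUES (?, ?, ?)", ROWS)


def filter_hits(conn, hit_identifiers, spacegroup):
    """Return the identifiers among hit_identifiers that have the given space group."""
    if not hit_identifiers:
        return []
    # One placeholder per identifier keeps the query parameterised.
    placeholders = ", ".join("?" for _ in hit_identifiers)
    sql = (
        "SELECT identifier FROM terms "
        "WHERE spacegroup = ? AND identifier IN (%s) ORDER BY identifier" % placeholders
    )
    params = [spacegroup] + list(hit_identifiers)
    return [row[0] for row in conn.execute(sql, params)]


# Identifiers as they might come back from a reduced cell search (hypothetical).
reduced_cell_hits = ["ENTRY01", "ENTRY02", "ENTRY04"]
matching = filter_hits(conn, reduced_cell_hits, "P21/c")
```

Here only ENTRY01 is both among the hits and in P21/c; ENTRY03 has the right space group but was not a hit.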
I'm not entirely clear what you are trying to do here. Let me know if I've got the wrong end of the stick:
You run a reduced cell search on the CSD, or another database of structures, retrieving some hits. You then wish to filter these results according to further criteria, e.g. chemical formula, or space groups.
Assuming there are not too many hits, you can filter them simply by iterating over them:
for h in hits:
    c = h.crystal
    if c.spacegroup_symbol == ...
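The same loop can be tried out on stand-in objects. In the sketch below the Hit and Crystal tuples, the filter_by_crystal() helper and all the data are hypothetical; only the spacegroup_symbol attribute is taken from the snippet above, and formula is assumed to be available on the crystal in a similar way.

```python
from collections import namedtuple

# Stand-ins for search hits; only the attributes used below are modelled.
Crystal = namedtuple("Crystal", ["spacegroup_symbol", "formula"])
Hit = namedtuple("Hit", ["identifier", "crystal"])

# Made-up data for illustration.
hits = [
    Hit("ENTRY01", Crystal("P21/c", "C10 H12 O2")),
    Hit("ENTRY02", Crystal("P-1", "C6 H6")),
    Hit("ENTRY03", Crystal("P21/c", "C6 H6")),
]


def filter_by_crystal(hits, spacegroup=None, formula=None):
    """Keep hits whose crystal matches every criterion that is not None."""
    kept = []
    for h in hits:
        c = h.crystal
        if spacegroup is not None and c.spacegroup_symbol != spacegroup:
            continue
        if formula is not None and c.formula != formula:
            continue
        kept.append(h)
    return kept


selected = [
    h.identifier
    for h in filter_by_crystal(hits, spacegroup="P21/c", formula="C6 H6")
]
```

Leaving a criterion as None skips that test, so the same helper covers filtering by space group alone, by formula alone, or by both.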
Alternatively you can use any of the search classes except TextNumericSearch on an individual crystal structure.
Hope this is helpful; if not please ask again.
The HBond in CATKIT is not found because the default path length range for detecting hydrogen bonds is set to (4, 999), which excludes contacts between separate components of the molecule. You can include such contacts by setting the path_length_range to (-1, 999), i.e.:
from ccdc import io
csd = io.MoleculeReader('csd')
catkit = csd.molecule('CATKIT')
print(catkit.hbonds(path_length_range=(-1, 999)))
The value -1 is used to cope with both options to the 'require_hydrogens' parameter of the hbonds() method. I appreciate that this is not clear from the documentation, and this will be rectified in a forthcoming release.
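To see why a minimum of 4 screens out contacts between separate components, it may help to picture the path length as the number of bonds on the shortest route between two atoms. The sketch below is plain Python on a made-up bond graph. The use of -1 for atoms with no connecting route, and the inclusive range test, are conventions chosen here for illustration and are only assumed to resemble what the API does.

```python
from collections import deque

# Made-up bond graph: atoms 'a'..'f' form one chain, 'x'-'y' is a separate component.
BONDS = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("e", "f"), ("x", "y")]


def neighbours(bonds):
    """Build an adjacency mapping from a list of bonded atom pairs."""
    adjacency = {}
    for p, q in bonds:
        adjacency.setdefault(p, set()).add(q)
        adjacency.setdefault(q, set()).add(p)
    return adjacency


def path_length(adjacency, start, end):
    """Breadth-first search: number of bonds from start to end, or -1 if unconnected."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        atom, dist = queue.popleft()
        if atom == end:
            return dist
        for nxt in adjacency.get(atom, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return -1


def allowed(length, path_length_range):
    """Inclusive test of a path length against a (minimum, maximum) pair."""
    low, high = path_length_range
    return low <= length <= high


adjacency = neighbours(BONDS)
within = path_length(adjacency, "a", "e")    # same component, four bonds apart
between = path_length(adjacency, "a", "x")   # different components, no route
```

Under these assumptions the contact between components is rejected by (4, 999) and accepted by (-1, 999), which mirrors the behaviour described for CATKIT.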
I think the default behaviour is somewhat counterintuitive; I shall discuss with colleagues whether the default should be made more permissive.
Hope this is helpful.
I agree - or rather a friendly chemist agrees - that the structure is a bit rubbish. The first kekulize misassigns the double bonds in the carbon you mentioned, so the second aromatic assignment does not regard these bonds as aromatic, then the second kekulize does not operate on the same structure as the first. I agree that this is not ideal behaviour, but it is comprehensible.
The only solution I can think of is to assign all bond types:
where the double bond to the phosphorus is detected, the five-membered ring is no longer aromatic and the kekulisation works as expected.
I have mailed the database group to see if they want to fix the bonds in the structure, but this will be too late for the forthcoming November release.