I don't know whether you can easily read your molecule from your pandas DataFrame. Check the supported file formats (https://downloads.ccdc.cam.ac.uk/documentation/API/descriptive_docs/io.html#id2).
How do you create this dataframe?
So the most complicated step should be creating a molecule object that the CCDC API understands. After that, it is relatively easy to use the functions that mimic the auto-edit structure option and to export to mol2, using for instance:
add_hydrogens: Add hydrogen atoms to the molecule.
Parameters: mode – 'all' to generate all hydrogens (throws away existing hydrogens) or 'missing' to generate hydrogens deemed to be missing.
Raises: RuntimeError if any heavy atom has no site.
Raises: RuntimeError if any atoms are of unknown type.
Raises: RuntimeError if any bonds are of unknown type.
assign_bond_types: Assign bond types to the molecule.
Parameters: which – may be 'all' or 'unknown'
Raises: ValueError if an unrecognised
Same for me. The Python results should reflect what you would obtain with ConQuest or Mercury, and in both there is no CCDC number for your structures. I do not know why there is a number in WebCSD but not in the 2020.1 CSD (a database update problem?), and I have never tried with a previous version of the CSD.
Maybe the CCDC staff can help with this question?
You can use the "formula" and "chemical_name" attributes.
+1, because I also have some issues with chirality assessment.
For your structures BOBDON, BOSMOM and LAWCOC, I think the problem is related to disorder. If you look carefully at the molecules, you will notice that two atoms occupy the same position. For instance, in BOBDON, look at the C3U and N1U atoms in the asymmetric unit (AU), which is half of one molecule: when you reconstruct the whole molecule, the disorder means that C3 is generated at the position of N1, and N1 at the position of C3. Applying the routine mol.assign_bond_types() will then probably generate four bonds for carbon atoms. You can check this with the atom.bonds attribute, e.g. for the C2 atom we have (Bond(Single Atom(Br2) Atom(C2)), Bond(Single Atom(C1) Atom(C2)), Bond(Single Atom(C2) Atom(C3)), Bond(Single Atom(N1B) Atom(C2))).
The atom is therefore assigned as a chiral center... Maybe the developers could add an exception for this particular case (when two atoms share the same coordinates)?
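A quick way to spot this kind of overlap before running the chirality check is to look for atom pairs that sit on (almost) the same coordinates. This is only a minimal sketch: the function name coincident_pairs, the 0.05 Å tolerance and the example coordinates are all made up for illustration. With the CCDC API, I assume the input list could be built from atom.label and atom.coordinates for each atom in mol.atoms.

```python
from itertools import combinations
from math import dist


def coincident_pairs(atoms, tol=0.05):
    """Return label pairs whose positions lie within `tol` angstroms.

    `atoms` is an iterable of (label, (x, y, z)) tuples.
    The default tolerance is an arbitrary choice, not a CCDC value.
    """
    atoms = list(atoms)
    return [
        (label_a, label_b)
        for (label_a, xyz_a), (label_b, xyz_b) in combinations(atoms, 2)
        if dist(xyz_a, xyz_b) <= tol
    ]


# Made-up coordinates for illustration only (not taken from BOBDON).
example = [
    ("C2", (0.000, 0.000, 0.000)),
    ("C3", (1.200, 0.700, 0.000)),
    ("N1B", (1.200, 0.700, 0.010)),
    ("Br2", (-1.900, 0.000, 0.000)),
]

print(coincident_pairs(example))
```

Any structure for which this list is not empty is a candidate for a disorder-related false chiral center and could be checked by hand.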
For HODKER, the results are the same in Python and Mercury. The difference you noticed is probably because the auto-edit option in Mercury assigns bond types to "Unknown" bonds by default, whereas your routine assigns them to "All" bonds... Compare with mol.assign_bond_types('unknown').
In my case, I noticed that many structures containing boron, phosphorus or nitrogen atoms that are detected as chiral in Mercury are not detected as chiral by the Python API, for instance XONPUO, XONMOF and YOWQIM. This seems to occur when mol.add_hydrogens() is applied: it seems to remove some H atoms! For instance, YOWQIM has only 8 hydrogens after mol.add_hydrogens(), instead of 10 without this function.
Why do Mercury's "add Missing H" and Python's mol.add_hydrogens() behave differently?
What would be the best procedure to determine the molecular point group from CSD data?
I currently proceed as follows:
Nevertheless, I feel it is not the best procedure, because the point group of many molecules is incorrectly determined (a lot of them fall into C1). I guess this is due to the coordinates of the molecules in the solid state. How can I add some tolerance on these coordinates?
I would like to have more information about the function MolecularDescriptors.rmsd.
The documentation gives this description:
static MolecularDescriptors.rmsd(mol1, mol2, atoms=None, overlay=False, exclude_hydrogens=True, with_symmetry=True)
Return the RMSD of two molecules.
Both molecules should have the same atoms if atoms is None.
My question is: how do I use the keyword "invert"? When I tried it, I got a TypeError...
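One way to see whether an installed function accepts a given keyword at all is to inspect its signature before calling it. The sketch below uses a stand-in function, rmsd_stub, which only copies the signature quoted above and is not the real CCDC function; accepts_keyword is a hypothetical helper name. I assume the same check would work when MolecularDescriptors.rmsd itself is passed in.

```python
import inspect


def accepts_keyword(func, name):
    """Return True if `func` can be called with keyword argument `name`."""
    params = inspect.signature(func).parameters
    if name in params and params[name].kind is not inspect.Parameter.POSITIONAL_ONLY:
        return True
    # A **kwargs parameter would accept any keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


# Stand-in with the signature quoted above; NOT the real CCDC function.
def rmsd_stub(mol1, mol2, atoms=None, overlay=False,
              exclude_hydrogens=True, with_symmetry=True):
    return 0.0


print(accepts_keyword(rmsd_stub, "overlay"))
print(accepts_keyword(rmsd_stub, "invert"))
```

With a signature like the one quoted, passing invert=... raises a TypeError for an unexpected keyword argument, which matches the error described here.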
Thank you very much. It is "crystal clear" now!
I added the code and it's working.
I am pretty new to the Python language, and I am trying to run a script on a large dataset (from a gcg file) using the functions:
Sometimes I get the message "IndexError: list index out of range" or a RuntimeError (mainly due to the complexity of the structure, I guess).
How can I handle this and move on to the next structure when the error occurs?
Thank you!
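The usual Python pattern for this is a try/except block inside the loop, so that a failing structure is recorded and the loop carries on. This is a minimal sketch: process_all and fake_process are hypothetical names, and fake_process only stands in for the real per-structure work (which is not shown in the post).

```python
def process_all(identifiers, process):
    """Apply `process` to each identifier, skipping the ones that raise."""
    results, failures = {}, {}
    for identifier in identifiers:
        try:
            results[identifier] = process(identifier)
        except (IndexError, RuntimeError) as exc:
            # Record the problem; the loop then moves on to the next structure.
            failures[identifier] = f"{type(exc).__name__}: {exc}"
    return results, failures


# Hypothetical stand-in for the real per-structure work.
def fake_process(identifier):
    if identifier == "BAD001":
        raise IndexError("list index out of range")
    if identifier == "BAD002":
        raise RuntimeError("structure too complex")
    return len(identifier)


results, failures = process_all(["ABC", "BAD001", "DEFGH", "BAD002"], fake_process)
print(results)
print(failures)
```

Catching only the two exception types named in the question, rather than a bare except, means that other unexpected errors still stop the script and get noticed. Keeping the failures dictionary makes it easy to review the skipped structures afterwards.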