I am currently using the CSD API to search for hydrogen bonds. The search finds a lot of data and the job takes a long time. At the moment, I define my donors and acceptors as SMARTS substructures, add them to the search and run the search.search() command. Output only starts once the search has finished.

Would it be possible to write the code so that it processes and outputs each hit as it is found? Ideally, one could also restart the search at a specific point (for example, by giving the last identifier that was output) when the job crashes.

Thank you!



Unfortunately, it isn't possible to have the API produce output as it goes. You could, however, break your search into two searches: one for your donor patterns, then a second for your acceptor patterns that runs only against the hits from the first. I have never tested this, so it may not be any faster, but you would get an intermediate hit list. If you are also defining distance and/or angle constraints, you could run the second search with the same setup as you originally had, but only against the filtered set.
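
As a rough illustration of the two-search idea, here is a minimal sketch in plain Python. It does not call the CSD API itself: `donor_search` and `full_search` are hypothetical callables that you would write around your own search.search() calls, and `two_stage_search` and `intermediate_path` are names made up for this example. It assumes the first search can return refcode identifiers and that the second search can be restricted to a list of them, which you should check against the API documentation for your version. Saving the intermediate hit list to a file means a crashed job does not have to repeat the first stage.

```python
import json
from pathlib import Path
from typing import Callable, Iterable, List


def two_stage_search(
    donor_search: Callable[[], Iterable[str]],
    full_search: Callable[[List[str]], list],
    intermediate_path: str,
) -> list:
    """Run a donor-only search, save its hit list, then run the full search on it.

    donor_search and full_search are hypothetical wrappers around your own
    CSD API searches; they are not part of the API.
    """
    path = Path(intermediate_path)
    if path.exists():
        # Restart case: reuse the intermediate hit list saved by an earlier run.
        donor_ids = json.loads(path.read_text())
    else:
        # Stage 1: unique identifiers of structures matching the donor patterns.
        donor_ids = sorted(set(donor_search()))
        path.write_text(json.dumps(donor_ids))
    # Stage 2: the original donor + acceptor setup (with any distance/angle
    # constraints), run only against the filtered set.
    return full_search(donor_ids)
```

Calling `two_stage_search(my_donor_search, my_full_search, "donor_hits.json")` a second time after a crash would skip the donor stage and go straight to the second search, since the identifier list is read back from the file.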

