• RE: empty result for what seem a reasonable query

    Ok, all sorted. I attached the Python script for the curious. Beware: it crawls the IUCr website, which is probably against their T&C.

    Here is what I am doing. It is not nicely written, but it does the job:

    I use the CSD to search for structures containing a Cp* ligand. For each hit, I look up the publication link.
    Then I crawl the website (I only keep Acta papers) to download the CIF file.
    From the matched atoms, I also write a script file for CRYSTALS to calculate a TLS model on these atoms.

    The idea is to support the common assumption that Cp* ligands are rigid and can be refined as such.

    So if there are plans to include ADPs in the database that would be great :)

     

  • RE: empty result for what seem a reasonable query

    Thanks!

    How did you get the SMARTS string? I tried to use SMARTS but could not find any tool to get the string from a 2D drawing.

    Is there any way to save the result of a search to a file and load it later? Something similar to pickle, but for a ccdc object.
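
    One possible workaround, sketched under the assumption that only the refcode and the labels of the matched atoms are needed later: write those to a JSON file and read them back. The record layout, the refcodes and the helper names `save_hits` and `load_hits` are hypothetical and are not part of the ccdc API.

```python
import json
import os
import tempfile


def save_hits(records, path):
    """Write a list of plain dict records to a JSON file."""
    with open(path, "w") as handle:
        json.dump(records, handle, indent=2)


def load_hits(path):
    """Read the records back from the JSON file."""
    with open(path) as handle:
        return json.load(handle)


# Hypothetical records: one dict per hit, with made-up refcodes and atom labels.
records = [
    {"identifier": "AAAAAA", "atoms": ["C1", "C2", "C3", "C4", "C5"]},
    {"identifier": "BBBBBB", "atoms": ["C11", "C12", "C13", "C14", "C15"]},
]

path = os.path.join(tempfile.mkdtemp(), "cp_star_hits.json")
save_hits(records, path)
restored = load_hits(path)
```

    The stored identifiers could then be used to restrict a later search to those entries.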

  • RE: empty result for what seem a reasonable query

    Hi,

    Thanks, it works. However, it takes more than 2 hours to do the search????

     

  • empty result for what seem a reasonable query

    Hi,

    I am looking for Cp* in the database via the API and I get zero results. I tried the same search in ConQuest and got several hits.

    Did I make a mistake somewhere?


    #from mercury_interface import MercuryInterface
    from ccdc.search import TextNumericSearch
    from ccdc.io import EntryReader
    from ccdc.search import SubstructureSearch, QuerySubstructure, ConnserSubstructure
              
    s = SubstructureSearch()
    cps_substructure = QuerySubstructure()
    c1 = cps_substructure.add_atom('C')
    c2 = cps_substructure.add_atom('C')
    c3 = cps_substructure.add_atom('C')
    c4 = cps_substructure.add_atom('C')
    c5 = cps_substructure.add_atom('C')
    b1 = cps_substructure.add_bond('Any', c1, c2)
    b2 = cps_substructure.add_bond('Any', c2, c3)
    b3 = cps_substructure.add_bond('Any', c3, c4)
    b4 = cps_substructure.add_bond('Any', c4, c5)
    b5 = cps_substructure.add_bond('Any', c5, c1)

    c11 = cps_substructure.add_atom('C')
    c12 = cps_substructure.add_atom('C')
    c13 = cps_substructure.add_atom('C')
    c14 = cps_substructure.add_atom('C')
    c15 = cps_substructure.add_atom('C')
    b11 = cps_substructure.add_bond('Single', c1, c11)
    b12 = cps_substructure.add_bond('Single', c2, c12)
    b13 = cps_substructure.add_bond('Single', c3, c13)
    b14 = cps_substructure.add_bond('Single', c4, c14)
    b15 = cps_substructure.add_bond('Single', c5, c15)

    h11 = cps_substructure.add_atom('H')
    h12 = cps_substructure.add_atom('H')
    h13 = cps_substructure.add_atom('H')
    h21 = cps_substructure.add_atom('H')
    h22 = cps_substructure.add_atom('H')
    h23 = cps_substructure.add_atom('H')
    h31 = cps_substructure.add_atom('H')
    h32 = cps_substructure.add_atom('H')
    h33 = cps_substructure.add_atom('H')
    h41 = cps_substructure.add_atom('H')
    h42 = cps_substructure.add_atom('H')
    h43 = cps_substructure.add_atom('H')
    h51 = cps_substructure.add_atom('H')
    h52 = cps_substructure.add_atom('H')
    h53 = cps_substructure.add_atom('H')
    bh11 = cps_substructure.add_bond('Single', c11, h11)
    bh12 = cps_substructure.add_bond('Single', c11, h12)
    bh13 = cps_substructure.add_bond('Single', c11, h13)
    bh21 = cps_substructure.add_bond('Single', c12, h21)
    bh22 = cps_substructure.add_bond('Single', c12, h22)
    bh23 = cps_substructure.add_bond('Single', c12, h23)
    bh31 = cps_substructure.add_bond('Single', c13, h31)
    bh32 = cps_substructure.add_bond('Single', c13, h32)
    bh33 = cps_substructure.add_bond('Single', c13, h33)
    bh41 = cps_substructure.add_bond('Single', c14, h41)
    bh42 = cps_substructure.add_bond('Single', c14, h42)
    bh43 = cps_substructure.add_bond('Single', c14, h43)
    bh51 = cps_substructure.add_bond('Single', c15, h51)
    bh52 = cps_substructure.add_bond('Single', c15, h52)
    bh53 = cps_substructure.add_bond('Single', c15, h53)

    s.add_substructure(cps_substructure)    

    hits = s.search()

    print(hits)
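
    The atom and bond list in the script above can also be generated with loops. The sketch below only builds plain Python data (element symbols and bond tuples), not ccdc objects; the function name `cp_star_graph` is hypothetical. It produces the same connectivity as the script, although the atoms are created in a different order.

```python
def cp_star_graph():
    """Return (atoms, bonds) describing a Cp* fragment.

    atoms is a list of element symbols; bonds is a list of
    (bond_type, atom_index, atom_index) tuples.
    """
    atoms = []
    bonds = []

    # Five ring carbons, closed into a ring with 'Any' bonds.
    ring = []
    for _ in range(5):
        atoms.append("C")
        ring.append(len(atoms) - 1)
    for k in range(5):
        bonds.append(("Any", ring[k], ring[(k + 1) % 5]))

    # One methyl group (C plus three H) on each ring carbon.
    for ring_atom in ring:
        atoms.append("C")
        methyl = len(atoms) - 1
        bonds.append(("Single", ring_atom, methyl))
        for _ in range(3):
            atoms.append("H")
            bonds.append(("Single", methyl, len(atoms) - 1))

    return atoms, bonds


atoms, bonds = cp_star_graph()
```

    Each entry of `atoms` corresponds to one `add_atom` call and each tuple in `bonds` to one `add_bond` call in the script above.
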
  • RE: Combine queries?

    I think the best approach is to run a single substructure search and then filter on the text fields during processing.
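
    A minimal sketch of that filtering step, assuming each hit can be reduced to an identifier, a journal name and a publication year. The `Hit` tuple, the refcodes and the helper `keep` are hypothetical stand-ins, not ccdc objects.

```python
from collections import namedtuple

# Hypothetical stand-in for the data read from each substructure hit.
Hit = namedtuple("Hit", "identifier journal year")


def keep(hit, journal_fragment, years):
    """True if the hit comes from the wanted journal and year range."""
    return journal_fragment in hit.journal and hit.year in years


hits = [
    Hit("AAAAAA", "Acta Crystallogr.,Sect.E:Struct.Rep.Online", 2014),
    Hit("BBBBBB", "J.Am.Chem.Soc.", 2015),
    Hit("CCCCCC", "Acta Crystallogr.,Sect.E:Struct.Rep.Online", 2010),
]

# Keep only Acta hits published between 2013 and 2016 inclusive.
selected = [h.identifier for h in hits
            if keep(h, "Acta Crystallogr.", range(2013, 2017))]
```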

     

  • RE: Combine queries?

    The 2 queries are fine independently, but when I combine them the search is extremely slow.

    I have done a few searches with a limit on the number of hits from the first one:
    1000 hits: 10 s
    2000 hits: 21 s
    4000 hits: 42 s

    The first search returns ~39000 hits, so it would take more than 5 min...
    If I run the second search on the full database, it takes 20 s.

     

        print("Text search...")
        text_numeric_search = TextNumericSearch()
        text_numeric_search.add_citation(journal='Acta Crystallogr.,Sect.E:Struct.Rep.Online')
        #text_numeric_search.settings.max_hit_structures = 1000
        texthits=text_numeric_search.search()

        s = SubstructureSearch()
        cf3_substructure = QuerySubstructure()
        c = cf3_substructure.add_atom('C')
        F1 = cf3_substructure.add_atom('F')
        b1 = cf3_substructure.add_bond('Single', c, F1)
        F2 = cf3_substructure.add_atom('F')
        b2 = cf3_substructure.add_bond('Single', c, F2)
        F3 = cf3_substructure.add_atom('F')
        b3 = cf3_substructure.add_bond('Single', c, F3)
        c1 = cf3_substructure.add_atom('C')
        b4 = cf3_substructure.add_bond('Single', c, c1)

        search_settings = s.Settings()
        search_settings.has_3d_coordinates = True
        search_settings.max_r_factor = 5
        search_settings.no_errors = True
        search_settings.no_disorder = True    
        search_settings.no_powder = True

        s.add_substructure(cf3_substructure)    
        s.settings=search_settings
        print("Substructure search...")
        hits = s.search([h.identifier for h in texthits], max_hit_structures=500, max_hits_per_structure=1)    
        print(len(hits))
        sys.exit()
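
    The extrapolation from the timings quoted above can be checked with a short calculation. This is only a sketch, assuming the run time grows linearly with the number of identifiers passed to the substructure search; the function name `seconds_per_hit` is made up for the example.

```python
# (number of text hits passed on, measured run time in seconds)
timings = [(1000, 10.0), (2000, 21.0), (4000, 42.0)]


def seconds_per_hit(samples):
    """Least-squares slope of time against hit count, forced through the origin."""
    numerator = sum(n * t for n, t in samples)
    denominator = sum(n * n for n, t in samples)
    return numerator / denominator


rate = seconds_per_hit(timings)
estimate_s = rate * 39000  # the full text search returns ~39000 hits
print("about %.0f s, or %.1f min" % (estimate_s, estimate_s / 60.0))
```

    Under that assumption the combined search needs roughly 410 s, close to 7 minutes, compared with 20 s for the substructure search on the full database.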

  • RE: Combine queries?

    Thanks.

    So if I understand correctly, I can pass a list of identifiers to restrict the search to them?

    At the moment, I am just manually testing the text field in the loop during processing, which is more or less the same approach.

  • Combine queries?

    Hi,

    Is it possible to combine queries?

    I have a TextNumericSearch:
        text_numeric_search = TextNumericSearch()
        text_numeric_search.add_citation(journal='Acta Crystallogr.,Sect.E:Struct.Rep.Online')
        text_numeric_search.add_citation(year=range(2013, 2017))

    And a SMARTS search:
        pattern=['[CH2][CH2][CH2][CH2][CH2]']
        q = SMARTSSubstructure(pattern[0])
        s = SubstructureSearch()
        s.add_substructure(q)    

    Can I combine the 2 queries to find entries that satisfy both of them? Like in ConQuest.
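
    Failing a built-in way to combine them, one fallback is to run both searches separately and intersect the identifiers of their hits. The sketch below uses plain lists of made-up refcodes; `intersect_hits` is a hypothetical helper, not a ccdc function. It drops repeated identifiers, since a substructure search may match the same structure more than once.

```python
def intersect_hits(text_ids, substructure_ids):
    """Identifiers present in both lists, in substructure-hit order, without repeats."""
    wanted = set(text_ids)
    seen = set()
    common = []
    for identifier in substructure_ids:
        if identifier in wanted and identifier not in seen:
            seen.add(identifier)
            common.append(identifier)
    return common


# Made-up refcodes standing in for [h.identifier for h in hits].
text_ids = ["AAAAAA", "BBBBBB", "CCCCCC"]
substructure_ids = ["CCCCCC", "DDDDDD", "AAAAAA", "CCCCCC"]

both = intersect_hits(text_ids, substructure_ids)
```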