Rdkit: ringMatchesRingOnly=True produces a SMARTS query that returns no substructure matches

Created on 2 Oct 2020 · 3 Comments · Source: rdkit/rdkit

Configuration:


  • RDKit Version: 2020.03.5
  • Operating system: Linux
  • Python version (if relevant): 3.8.5
  • Are you using conda? Yes
  • If you are using conda, which channel did you install the rdkit from? conda-forge
  • If you are not using conda: how did you install the RDKit? --

Description:

We are showcasing the MCS code in our TeachOpenCADD notebooks. We query for an MCS and then highlight the matched substructure. When ringMatchesRingOnly is enabled, the returned SMARTS string produces no substructure match in any of the molecules in the sample.

from rdkit import Chem
from rdkit.Chem import rdFMCS

smiles = [
 'Brc1cccc(Nc2ncnc3cc4ccccc4cc23)c1',
 'CCOc1cc2ncnc(Nc3cccc(Br)c3)c2cc1OCC',
 'CN(C)c1cc2c(Nc3cccc(Br)c3)ncnc2cn1',
 'CNc1cc2c(Nc3cccc(Br)c3)ncnc2cn1',
 'Brc1cccc(Nc2ncnc3cc4[nH]cnc4cc23)c1',
 'Cn1cnc2cc3ncnc(Nc4cccc(Br)c4)c3cc21',
 'Cn1cnc2cc3c(Nc4cccc(Br)c4)ncnc3cc21',
 'COc1cc2ncnc(Nc3cccc(Br)c3)c2cc1OC',
 'C#CCNC/C=C/C(=O)Nc1cc2c(Nc3ccc(F)c(Cl)c3)c(C#N)cnc2cc1OCC',
 'C=CC(=O)Nc1ccc2ncnc(Nc3cc(Cl)c(Cl)cc3F)c2c1'
]

mols = [Chem.MolFromSmiles(smi) for smi in smiles]

print("ringMatchesRingOnly=True")
mcs = rdFMCS.FindMCS(mols, ringMatchesRingOnly=True, threshold=0.8)
print("SMARTS:", mcs.smartsString)

for m in mols:
    print(m.GetSubstructMatch(Chem.MolFromSmarts(mcs.smartsString)))

print("\nringMatchesRingOnly=False")
mcs2 = rdFMCS.FindMCS(mols, ringMatchesRingOnly=False, threshold=0.8)
print("SMARTS:", mcs2.smartsString)

for m in mols:
    print(m.GetSubstructMatch(Chem.MolFromSmarts(mcs2.smartsString)))

returns:

ringMatchesRingOnly=True
SMARTS: [#6&R]:&@[#6&R]:&@[#6&R]1:&@[#6&R](-&!@[#7&!R]-&!@[#6&R]2:&@[#6&R]:&@[#6&R]:&@[#6&R]:&@[#6&R](:&@[#6&R]:&@2)-&@[#35&R]):&@[#7&R]:&@[#6&R]:&@[#7&R]:&@[#6&R]:&@1:&@[#6&R]
()
()
()
()
()
()
()
()
()
()

ringMatchesRingOnly=False
SMARTS: [#6]:[#6]:[#6]1:[#6](-[#7]-[#6]2:[#6]:[#6]:[#6]:[#6](:[#6]:2)-[#35]):[#7]:[#6]:[#7]:[#6]:1:[#6]
(18, 19, 20, 7, 6, 5, 4, 3, 2, 1, 21, 0, 8, 9, 10, 11, 12)
(20, 19, 18, 9, 10, 11, 12, 13, 14, 15, 17, 16, 8, 7, 6, 5, 4)
(3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 13, 15, 16, 17, 18, 19)
(2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 12, 14, 15, 16, 17, 18)
(17, 18, 19, 7, 6, 5, 4, 3, 2, 1, 20, 0, 8, 9, 10, 11, 12)
(21, 20, 19, 10, 11, 12, 13, 14, 15, 16, 18, 17, 9, 8, 7, 6, 5)
(4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 15, 14, 16, 17, 18, 19, 20)
(19, 18, 17, 8, 9, 10, 11, 12, 13, 14, 16, 15, 7, 6, 5, 4, 3)
()
()
bug
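
One detail in the ringMatchesRingOnly=True SMARTS above may explain the empty matches: the bromine appears as `-&@[#35&R]`, i.e. a ring atom attached through a ring bond, whereas bromine is a terminal substituent in every input molecule that contains it. The snippet below is a minimal sketch of that check on the first molecule only. It is one reading of the output, not a confirmed diagnosis of the underlying bug, and the variable names are illustrative.

```python
from rdkit import Chem

# First molecule from the sample; atom 0 is the bromine.
first_mol = Chem.MolFromSmiles('Brc1cccc(Nc2ncnc3cc4ccccc4cc23)c1')
bromine = first_mol.GetAtomWithIdx(0)

# Bromine hangs off the phenyl ring, so it is not a ring atom itself.
print(bromine.GetSymbol(), "in ring:", bromine.IsInRing())

# The atom primitive used for bromine in the ringMatchesRingOnly=True SMARTS.
ring_bromine = Chem.MolFromSmarts('[#35&R]')
# The same primitive without the ring constraint, as in the second SMARTS.
any_bromine = Chem.MolFromSmarts('[#35]')

print("matches [#35&R]:", first_mol.HasSubstructMatch(ring_bromine))
print("matches [#35]:  ", first_mol.HasSubstructMatch(any_bromine))
```

If the ring-constrained bromine cannot match, the full query cannot match either, which is consistent with the empty tuples printed above.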

All 3 comments

Hi @jaimergp thanks for reporting this and for providing an example. I'll look into this over the weekend.

@jaimergp @AndreaVolkamer This is fixed now. Thank you very much for reporting this with a reproducible example. Apparently, until now no unit test covered ring queries on atoms and bonds combined with Threshold < 1.0; I have now added one with your molecules. Thanks again!

Wow, that was quick! Thank you so much! 🥳
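
For readers who want to confirm the fix on their own installation, the following is a minimal sketch of a regression-style check built from the molecules in this report. It is not the unit test that was added to RDKit. The helper name `count_mcs_matches` is hypothetical, and the sketch assumes that `threshold` is the fraction of input molecules that must contain the MCS.

```python
import math

from rdkit import Chem
from rdkit.Chem import rdFMCS

smiles = [
    'Brc1cccc(Nc2ncnc3cc4ccccc4cc23)c1',
    'CCOc1cc2ncnc(Nc3cccc(Br)c3)c2cc1OCC',
    'CN(C)c1cc2c(Nc3cccc(Br)c3)ncnc2cn1',
    'CNc1cc2c(Nc3cccc(Br)c3)ncnc2cn1',
    'Brc1cccc(Nc2ncnc3cc4[nH]cnc4cc23)c1',
    'Cn1cnc2cc3ncnc(Nc4cccc(Br)c4)c3cc21',
    'Cn1cnc2cc3c(Nc4cccc(Br)c4)ncnc3cc21',
    'COc1cc2ncnc(Nc3cccc(Br)c3)c2cc1OC',
    'C#CCNC/C=C/C(=O)Nc1cc2c(Nc3ccc(F)c(Cl)c3)c(C#N)cnc2cc1OCC',
    'C=CC(=O)Nc1ccc2ncnc(Nc3cc(Cl)c(Cl)cc3F)c2c1',
]
mols = [Chem.MolFromSmiles(smi) for smi in smiles]


def count_mcs_matches(mols, threshold=0.8, **mcs_kwargs):
    """Hypothetical helper: run FindMCS and count molecules hit by its SMARTS."""
    mcs = rdFMCS.FindMCS(mols, threshold=threshold, **mcs_kwargs)
    query = Chem.MolFromSmarts(mcs.smartsString)  # parse the SMARTS once
    n_matched = sum(m.HasSubstructMatch(query) for m in mols)
    return mcs.smartsString, n_matched


# Assumption: with threshold=0.8, at least 80% of the molecules must match.
required = math.ceil(0.8 * len(mols))

for ring_only in (True, False):
    smarts, n_matched = count_mcs_matches(mols, ringMatchesRingOnly=ring_only)
    status = "ok" if n_matched >= required else "FAIL"
    print(f"ringMatchesRingOnly={ring_only}: {n_matched}/{len(mols)} matched "
          f"(need {required}) {status}")
```

On the affected version (2020.03.5) the ringMatchesRingOnly=True line would be expected to report 0 matches, as in the output in the issue description.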
