Skip to content

Could validate_smirks fall back to an RDKit/OE backend instead of or addition to chemper? #373

@lilyminium

Description

@lilyminium

Description

I think I'm running into MobleyLab/chemper#99 or a variant when I try to create the following:

from openff.bespokefit.schema.smirnoff import AngleSMIRKS
AngleSMIRKS(
    smirks="[H][C:2]~1~[C:1](~N(~C(~[N:3]1)[H])[H])C([H])([H])[C]([H])([C]=O)[N][H]",
    attributes={"k", "angle"}
)

raises

Warning: Unable to parse SMARTS
Warning: [H]-,:[C:2]~[C:1](-,:[C](-,:[C](-,:[C]=[O])(-,:[H])-,:[N]-,:[H])(-,:[H])-,:[H])~[N](~[C](-,:[H])~[N:3]1)-,:[H]
Warning:                                                                                                               ^

---------------------------------------------------------------------------
SMIRKSParsingError                        Traceback (most recent call last)
Cell In[2], line 1
----> 1 AngleSMIRKS(
      2     smirks="[H][C:2]~1~[C:1](~N(~C(~[N:3]1)[H])[H])C([H])([H])[C]([H])([C]=O)[N][H]",
      3     attributes={"k", "angle"}
      4 )

File ~/micromamba/envs/openff-param-fit/lib/python3.10/site-packages/pydantic/main.py:339, in pydantic.main.BaseModel.__init__()

File ~/micromamba/envs/openff-param-fit/lib/python3.10/site-packages/pydantic/main.py:1074, in pydantic.main.validate_model()

File ~/micromamba/envs/openff-param-fit/lib/python3.10/site-packages/pydantic/fields.py:895, in pydantic.fields.ModelField.validate()

File ~/micromamba/envs/openff-param-fit/lib/python3.10/site-packages/pydantic/fields.py:1154, in pydantic.fields.ModelField._apply_validators()

File ~/micromamba/envs/openff-param-fit/lib/python3.10/site-packages/pydantic/class_validators.py:304, in pydantic.class_validators._generic_validator_cls.lambda4()

File ~/micromamba/envs/openff-param-fit/lib/python3.10/site-packages/openff/bespokefit/schema/smirnoff.py:64, in BaseSMIRKSParameter._check_smirks(cls, value)
     62 @validator("smirks")
     63 def _check_smirks(cls, value: str) -> str:
---> 64     return validate_smirks(value, cls._expected_n_tags())

File ~/micromamba/envs/openff-param-fit/lib/python3.10/site-packages/openff/bespokefit/schema/smirnoff.py:23, in validate_smirks(smirks, expected_tags)
     18 def validate_smirks(smirks: str, expected_tags: int) -> str:
     19     """
     20     Make sure the supplied smirks has the correct number of tagged atoms.
     21     """
---> 23     smirk = ChemicalEnvironment(smirks=smirks)
     24     tagged_atoms = len(smirk.get_indexed_atoms())
     26     assert tagged_atoms == expected_tags, (
     27         f"The smirks pattern ({smirks}) has {tagged_atoms} tagged atoms, but should "
     28         f"have {expected_tags}."
     29     )

File ~/micromamba/envs/openff-param-fit/lib/python3.10/site-packages/chemper/graphs/environment.py:561, in ChemicalEnvironment.__init__(self, smirks, label, replacements)
    559 # Check that the created Environment is valid
    560 if not self.is_valid():
--> 561     raise SMIRKSParsingError("Input SMIRKS (%s), converted to %s \
    562             is now invalid" % (smirks, self.as_smirks()))
    564 return

SMIRKSParsingError: (SMIRKSParsingError(...), 'Input SMIRKS ([H][C:2]~1~[C:1](~N(~C(~[N:3]1)[H])[H])C([H])([H])[C]([H])([C]=O)[N][H]), converted to [C:1](~[C:2]-,:[H])(-,:[C](-,:[C](-,:[C]=[O])(-,:[H])-,:[N]-,:[H])(-,:[H])-,:[H])~[N](~[C](-,:[H])~[N:3]1)-,:[H]                     is now invalid')

Since the validation just checks that the correct number of atoms is tagged, would it be possible to substitute that with a fuller cheminformatics toolkit backend or even a regex?

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions