Understand performance of string comparisons #196

@nsmith-

For corrections that use string input fields extensively, string comparison could have a significant performance impact. The strings may be relatively static (e.g. constant for an entire dataset or a subset of one), in which case one would expect branch prediction to do a good job of amortizing the cost. Nevertheless, some profiling to understand the extent of the issue would be useful.
There are a few improvements we could make to reduce the cost of string comparisons:

  • Add an API that lets the caller pre-create an integer token representing the string and pass it as an argument in the Correction::evaluate call, which would then do a faster lookup internally
  • Project out the string dimension and return a reduced correction, as discussed in "Partially evaluated correction object" (#38)
  • Provide a context manager in which certain nodes of the correction's evaluation tree are frozen to pre-defined values (cf. @arizzi)
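
To make the first two ideas concrete, here is a minimal toy sketch in Python. It is not the correctionlib API: `CategoryCorrection`, `Token`, `tokenize`, `evaluate_token` and `project` are hypothetical names invented for illustration, and the "correction" is just a table keyed by a string category and an integer bin.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """Opaque handle for a pre-resolved string key (hypothetical)."""
    index: int


class CategoryCorrection:
    """Toy correction: a string category selects a row, an integer bin a value."""

    def __init__(self, table: dict[str, list[float]]):
        self._keys = list(table)
        self._rows = [table[k] for k in self._keys]
        self._index = {k: i for i, k in enumerate(self._keys)}

    def evaluate(self, key: str, ibin: int) -> float:
        # Baseline: the string is hashed and compared on every call.
        return self._rows[self._index[key]][ibin]

    def tokenize(self, key: str) -> Token:
        # Resolve the string once; raises KeyError for an unknown key.
        return Token(self._index[key])

    def evaluate_token(self, token: Token, ibin: int) -> float:
        # Token path: plain list indexing, no string handling per call.
        return self._rows[token.index][ibin]

    def project(self, key: str):
        # Partial evaluation: fix the string and return a reduced callable.
        row = self._rows[self._index[key]]
        return lambda ibin: row[ibin]


corr = CategoryCorrection({"nominal": [1.0, 1.1], "up": [1.2, 1.3]})
tok = corr.tokenize("up")      # done once per dataset
reduced = corr.project("up")   # reduced correction without the string axis

result_string = corr.evaluate("up", 1)
result_token = corr.evaluate_token(tok, 1)
result_reduced = reduced(1)
```

All three paths return the same value; only the per-call work differs. Whether the saving is significant in the real evaluator is exactly what the profiling would need to establish.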

Labels: evaluator, help wanted