NestedFrame.query() should handle mixed base and nested columns without erroring #154

Open
@gitosaurus

Description

Today, `NestedFrame.query` checks the expression it is given and raises `ValueError` if it mixes nested and base columns. Handling such expressions correctly would take three steps: tease out the sub-expressions that are strictly against the nested columns (by traversing the abstract syntax tree of the input expression), apply them and re-pack the result into an intermediate frame, and then apply the base column expressions to that intermediate result.
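To make the "tease out the sub-expressions" step concrete, here is a minimal sketch using Python's standard `ast` module. It is not how `NestedFrame.query` is implemented; the helper names `column_levels` and `split_top_level` are hypothetical, the set of nested column names is passed in by hand, and the expression is parenthesized so that plain Python parsing groups the comparisons the same way the pandas query parser would.

```python
import ast


def column_levels(node, nested_names):
    """Return which levels ("base", "nested") an AST node refers to."""
    # `nested.flux` parses as Attribute(value=Name("nested"), attr="flux").
    if (
        isinstance(node, ast.Attribute)
        and isinstance(node.value, ast.Name)
        and node.value.id in nested_names
    ):
        return {"nested"}
    # Any other bare name is assumed to be a base column.
    if isinstance(node, ast.Name):
        return {"base"}
    levels = set()
    for child in ast.iter_child_nodes(node):
        levels |= column_levels(child, nested_names)
    return levels


def split_top_level(expr, nested_names):
    """Split a top-level `&` or `|` and label each side by level."""
    tree = ast.parse(expr, mode="eval").body
    if isinstance(tree, ast.BinOp) and isinstance(tree.op, (ast.BitAnd, ast.BitOr)):
        parts = [tree.left, tree.right]
    else:
        parts = [tree]
    # ast.unparse requires Python 3.9 or newer.
    return [(ast.unparse(p), sorted(column_levels(p, nested_names))) for p in parts]


parts = split_top_level("(a > 2) & (nested.flux > 50)", {"nested"})
for text, levels in parts:
    print(text, levels)
```

A sub-expression labelled only `nested` could be applied inside the nest first, one labelled only `base` afterwards, and one carrying both labels is the mixed case this issue is about.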

In an expression like `a > 2 & nested.flux > 50`, for example, the user would expect the resulting NestedFrame to have no `a` values <= 2 and no `nested.flux` values <= 50. In an expression like `a > 2 | nested.flux > 50`, the user would still expect to retain rows where `a <= 2` so long as they had some `nested.flux > 50`, but within those rows they would not expect to see any `nested.flux <= 50`. For rows where `a > 2`, though, they would expect to see all the `nested.flux` rows. In other words, as soon as an expression mixes levels, the nested rows sometimes need to be queried and repacked before continuing, or at least that should be the final effect.
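The expected results can be spelled out on a toy model. The sketch below uses plain dicts and lists rather than a real NestedFrame: each row has a base value `a` and a list `flux` standing in for `nested.flux`. The functions `query_and` and `query_or` are hypothetical and hard-code the two example expressions. One assumption goes beyond the text above: a row left with no nested rows is dropped entirely.

```python
def query_and(rows):
    """Model of `a > 2 & nested.flux > 50`."""
    result = []
    for row in rows:
        kept = [f for f in row["flux"] if f > 50]
        # Assumption: rows with no surviving nested rows are dropped.
        if row["a"] > 2 and kept:
            result.append({"a": row["a"], "flux": kept})
    return result


def query_or(rows):
    """Model of `a > 2 | nested.flux > 50`."""
    result = []
    for row in rows:
        if row["a"] > 2:
            # The base condition already holds, so every nested row passes.
            kept = list(row["flux"])
        else:
            # Only nested rows that satisfy the nested condition pass.
            kept = [f for f in row["flux"] if f > 50]
        if kept:
            result.append({"a": row["a"], "flux": kept})
    return result


rows = [
    {"a": 1, "flux": [10, 60]},
    {"a": 3, "flux": [20, 70]},
    {"a": 0, "flux": [5, 15]},
]

print(query_and(rows))  # [{'a': 3, 'flux': [70]}]
print(query_or(rows))   # [{'a': 1, 'flux': [60]}, {'a': 3, 'flux': [20, 70]}]
```

In the `|` case the row with `a == 1` survives but loses its flux of 10, while the row with `a == 3` keeps both nested rows, which is the asymmetry described above.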

Logically, if there were a method to unpack all nests and broadcast all base columns across them, we could take the result of `self.eval(expr)` and do something like `self.flatten_all().loc[result].repack_all()`, but this would likely not be performant.
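As a rough illustration of that flatten, mask, repack idea, the sketch below models it with plain Python lists instead of a NestedFrame. `flatten_all` and `repack_all` here are stand-in functions, not existing methods, and the `predicate` callable stands in for the boolean result of `self.eval(expr)`.

```python
def flatten_all(rows):
    """Broadcast the base column across nested rows: one flat record each."""
    return [
        {"idx": i, "a": row["a"], "flux": f}
        for i, row in enumerate(rows)
        for f in row["flux"]
    ]


def repack_all(flat):
    """Group flat records back into one row per original index."""
    packed = {}
    for rec in flat:
        entry = packed.setdefault(rec["idx"], {"a": rec["a"], "flux": []})
        entry["flux"].append(rec["flux"])
    return [packed[i] for i in sorted(packed)]


def query_mixed(rows, predicate):
    """Flatten, keep the records where the predicate holds, then repack."""
    flat = flatten_all(rows)
    mask = [predicate(rec) for rec in flat]  # stand-in for self.eval(expr)
    return repack_all([rec for rec, keep in zip(flat, mask) if keep])


rows = [
    {"a": 1, "flux": [10, 60]},
    {"a": 3, "flux": [20, 70]},
    {"a": 0, "flux": [5, 15]},
]

result = query_mixed(rows, lambda r: r["a"] > 2 or r["flux"] > 50)
print(result)  # [{'a': 1, 'flux': [60]}, {'a': 3, 'flux': [20, 70]}]
```

The sketch also hints at the cost: the flat form copies every base value once per nested row before any filtering happens. A row whose nest is empty produces no flat records at all, so it disappears regardless of its base columns, which a real implementation would need to decide how to handle.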

Metadata

Labels: enhancement (New feature or request)
