feat: Execute on Ibis backend via Narwhals #193

@zilto

Proposal

Currently

Pointblank handles Ibis tables and columns by converting them to Polars or pandas. The workflow is:

  1. Data lives on the backend (DuckDB, Snowflake, etc.)
  2. Data is loaded into memory by converting the Ibis table to Polars
  3. Checks are evaluated eagerly in memory via Polars
  4. The checks report is available in memory

Proposed feature

#192 requests LazyFrame support for Polars lazy execution. Once that's supported, it should be trivial to add lazy support through Narwhals. With the recent Narwhals-Ibis integration, it should then be possible to execute checks on the backend. The workflow would become:

  1. Construct a query plan from all lazy checks via Narwhals (Polars-compatible API)
  2. Pass the Ibis Table reference to the query plan
  3. The query plan is executed on the backend via Ibis; the full dataset never leaves the backend
  4. Only the query plan results (i.e., the checks report) are loaded into memory

This Narwhals-Ibis bridge would enable on-backend compute for most features (e.g., DataScan).
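
To make the four proposed steps concrete, here is a minimal sketch of the pattern. It deliberately uses Python's built-in `sqlite3` as a stand-in backend rather than Narwhals or Ibis, so it is not Pointblank's implementation or the Narwhals API. The `orders` table, the `CHECKS` list, and the `build_plan` / `run_checks` helpers are all hypothetical names made up for illustration.

```python
import sqlite3

# Hypothetical check definitions: (report column, SQL expression counting failing rows).
# In the proposal these would be Narwhals expressions, not SQL strings.
CHECKS = [
    ("id_null_count", "SUM(CASE WHEN id IS NULL THEN 1 ELSE 0 END)"),
    ("amount_negative_count", "SUM(CASE WHEN amount < 0 THEN 1 ELSE 0 END)"),
]


def build_plan(table: str) -> str:
    """Step 1: combine all lazy checks into a single aggregate query."""
    columns = ", ".join(f"{expr} AS {name}" for name, expr in CHECKS)
    return f"SELECT COUNT(*) AS n_rows, {columns} FROM {table}"


def run_checks(conn: sqlite3.Connection, table: str) -> dict:
    """Steps 2-4: bind the plan to a table, run it on the backend,
    and bring back only the one-row report."""
    cursor = conn.execute(build_plan(table))
    row = cursor.fetchone()
    names = [description[0] for description in cursor.description]
    return dict(zip(names, row))


# Stand-in backend with a small table; a real backend would be DuckDB, Snowflake, etc.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 10.0), (2, -5.0), (None, 3.0), (4, None)],
)

report = run_checks(conn, "orders")
print(report)
```

The point of the sketch is that only the aggregated report row crosses from the backend into Python; the table rows themselves are never fetched. With the Narwhals-Ibis bridge, the query construction would presumably be expressed through the Polars-compatible API instead of hand-written SQL.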
