Open
Description
Prework
- Read and abide by the Pointblank code of conduct and contributing guidelines.
- Search for duplicates among the existing issues (both open and closed).
Proposal
Currently
Currently, Pointblank handles Ibis tables and columns by converting them to Polars or pandas. This means:
- Data lives on the backend (DuckDB, Snowflake, etc.)
- Data is loaded into memory via Ibis to Polars
- Checks are evaluated eagerly in-memory via Polars
- Checks report is available in-memory
Proposed feature
#192 requests LazyFrame support for Polars lazy execution. Once that's supported, it should be trivial to add lazy support for Narwhals. Via the recent Narwhals-Ibis integration, it should then be possible to execute checks on-backend. The flow would become:
- Construct a query plan from all lazy checks via Narwhals (polars-compatible API)
- Pass the Ibis Table reference to the query plan
- The query plan is executed on-backend via Ibis; the full dataset never leaves the backend
- The query plan results are loaded into memory (i.e., checks report)
This Narwhals-Ibis bridge would enable on-backend compute for most features (e.g., DataScan).
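
To make the proposed flow concrete, here is a minimal sketch of the idea using only the standard library. It is not Pointblank, Narwhals, or Ibis code: `sqlite3` stands in for the backend, and `CHECKS`, `build_plan`, and `run_checks` are hypothetical names made up for illustration. All checks are folded into a single aggregate query, the backend executes it, and only the per-check failure counts come back into memory.

```python
import sqlite3

# Hypothetical check definitions: check name -> SQL predicate a passing row satisfies.
CHECKS = {
    "id_not_null": "id IS NOT NULL",
    "amount_positive": "amount > 0",
}


def build_plan(table, checks):
    """Fold every check into one aggregate query (the 'query plan')."""
    failure_counts = ", ".join(
        f"SUM(CASE WHEN {predicate} THEN 0 ELSE 1 END) AS {name}"
        for name, predicate in checks.items()
    )
    return f"SELECT COUNT(*) AS n_rows, {failure_counts} FROM {table}"


def run_checks(conn, table, checks):
    """Execute the plan on the backend; only one summary row is fetched."""
    row = conn.execute(build_plan(table, checks)).fetchone()
    n_rows, failures = row[0], row[1:]
    return {"n_rows": n_rows, "failures": dict(zip(checks, failures))}


# Stand-in backend with a small table (in practice: DuckDB, Snowflake, etc.).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 10.0), (2, -5.0), (None, 3.0), (4, 0.0)],
)

report = run_checks(conn, "orders", CHECKS)
print(report)
```

With Narwhals and Ibis, the plan would presumably be built from lazy expressions rather than SQL strings, but the shape is the same: the rows stay on the backend and the in-memory result is just the checks report.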