Releases: unionai-oss/pandera

v0.26.1: Multi-index, `@check_types` Bugfixes

26 Aug 16:48
f8384ae

Full Changelog: v0.26.0...v0.26.1

v0.26.0: Add support for Python 3.13

13 Aug 01:12
24fe938

⭐️ Highlight

📣 Pandera now supports Python 3.13! Now go forth and use bare forward reference types to your heart's content 🤗

Full Changelog: v0.25.0...v0.26.0

v0.25.0: 🦩 Support Ibis table validation

08 Jul 19:19
c49b18f

⭐️ Highlight

Pandera now supports Ibis 🦩! You can now validate data on all available ibis backends using the pandera.ibis module.

In-memory table example:

import ibis
import pandera.ibis as pa

class Schema(pa.DataFrameModel):
    state: str
    city: str
    price: int = pa.Field(in_range={"min_value": 5, "max_value": 20})

t = ibis.memtable(
    {
        'state': ['FL','FL','FL','CA','CA','CA'],
        'city': [
            'Orlando',
            'Miami',
            'Tampa',
            'San Francisco',
            'Los Angeles',
            'San Diego',
        ],
        'price': [8, 12, 10, 16, 20, 18],
    }
)
Schema.validate(t).execute()

SQLite example (reusing the imports and Schema defined above):

con = ibis.sqlite.connect()
t = con.create_table(
    "table",
    schema=ibis.schema(dict(state="string", city="string", price="int64"))
)

con.insert(
    "table",
    obj=[
        ("FL", "Orlando", 8),
        ("FL", "Miami", 12),
        ("FL", "Tampa", 10),
        ("CA", "San Francisco", 16),
        ("CA", "Los Angeles", 20),
        ("CA", "San Diego", 18),
    ]
)

Schema.validate(t).execute()

What does this mean?

This release unlocks in-database validation on some of the most widely used data platforms, including PostgreSQL, Snowflake, BigQuery, MySQL, and more ✨. You can validate data at scale, on the database or data framework of your choice, before fetching it for downstream analysis or modeling work.

Naturally, this also means that you can develop your schemas locally on a DuckDB or SQLite backend and then use the same schemas in production on a remote database like PostgreSQL.

Learn more about the integration here.

What's Changed

  • Add Polars pydantic integration with format support and native JSON schema generation by @halicki in #1979
  • exclude python 3.12 and pyspark combo in ci by @cosmicBboy in #2005
  • Delete previously-added foo.txt and new_example.py by @deepyaman in #2013
  • Pin PySpark due to test failures/incompatibilities by @deepyaman in #2010
  • Temporarily pin polars due to test failure in CI by @deepyaman in #2011
  • Replace event_loop removed in pytest-asyncio 1.0 by @deepyaman in #2014
  • Fix typehint in unique_values_eq (issue #1492) by @AhmetZamanis in #2015
  • fix pyarrow string issue, fix docs failing issues by @cosmicBboy in #2026
  • bugfix: PANDERA_VALIDATION_ENABLED=False should disable validation by @cosmicBboy in #2028
  • Expect Python slice index errors after Python 3.10 by @deepyaman in #2033
  • Ibis dev by @deepyaman in #2040
  • handle dataframe-level failure cases: convert row to dict by @cosmicBboy in #2050
  • bugfix/1927 by @Jarek-Rolski in #2019
  • [🐻‍❄️ polars] Limit reported failure cases if Check.n_failure_cases is defined. by @cosmicBboy in #2055
  • [🦩 ibis] Limit reported failure cases if Check.n_failure_cases is defined. by @cosmicBboy in #2056
  • Add link to the documentation about Ibis datatypes by @deepyaman in #2057
  • Test column presence, mark other features not impl by @deepyaman in #2060
  • Run pre-commit on all files to fix linter issues by @deepyaman in #2063
  • Implement regex option and add additional checks by @deepyaman in #2061
  • Implement binary and boolean types (and test them) by @deepyaman in #2064
  • Add unit test suite for Ibis components, fix a bug by @deepyaman in #2065
  • bugfix: fix format_vectorized_error_message to properly format nested pyarrow failed cases by @AndrejIring in #2036
  • handle empty dataframes with PydanticModel: show warning by @cosmicBboy in #2066
  • bugfix/2031: Allow strict='filter' and coerce='True' at the same time for PySpark schemas by @gfilaci in #2032
  • Set validation scope for pandas run_checks methods by @amerberg in #2003
  • DataFrameSchema.update_index correctly sets title, description, and metadata by @cosmicBboy in #2067
  • [ibis 🦩] remove inplace=True in column validate call by @cosmicBboy in #2068
  • [ibis 🦩] check backend: use positional join for duckdb and polars, fix ibis DataFrameModel.validate types by @cosmicBboy in #2071

Full Changelog: v0.24.0...v0.25.0

v0.25.0rc0: Support ibis table validation

07 Jul 00:34
ad8f08d

What's Changed

  • Add Polars pydantic integration with format support and native JSON schema generation by @halicki in #1979
  • exclude python 3.12 and pyspark combo in ci by @cosmicBboy in #2005
  • Delete previously-added foo.txt and new_example.py by @deepyaman in #2013
  • Pin PySpark due to test failures/incompatibilities by @deepyaman in #2010
  • Temporarily pin polars due to test failure in CI by @deepyaman in #2011
  • Replace event_loop removed in pytest-asyncio 1.0 by @deepyaman in #2014
  • Fix typehint in unique_values_eq (issue #1492) by @AhmetZamanis in #2015
  • fix pyarrow string issue, fix docs failing issues by @cosmicBboy in #2026
  • bugfix: PANDERA_VALIDATION_ENABLED=False should disable validation by @cosmicBboy in #2028
  • Expect Python slice index errors after Python 3.10 by @deepyaman in #2033
  • Ibis dev by @deepyaman in #2040
  • handle dataframe-level failure cases: convert row to dict by @cosmicBboy in #2050
  • bugfix/1927 by @Jarek-Rolski in #2019
  • [🐻‍❄️ polars] Limit reported failure cases if Check.n_failure_cases is defined. by @cosmicBboy in #2055
  • [🦩 ibis] Limit reported failure cases if Check.n_failure_cases is defined. by @cosmicBboy in #2056
  • Add link to the documentation about Ibis datatypes by @deepyaman in #2057
  • Test column presence, mark other features not impl by @deepyaman in #2060
  • Run pre-commit on all files to fix linter issues by @deepyaman in #2063
  • Implement regex option and add additional checks by @deepyaman in #2061
  • Implement binary and boolean types (and test them) by @deepyaman in #2064
  • Add unit test suite for Ibis components, fix a bug by @deepyaman in #2065
  • bugfix: fix format_vectorized_error_message to properly format nested pyarrow failed cases by @AndrejIring in #2036
  • handle empty dataframes with PydanticModel: show warning by @cosmicBboy in #2066
  • bugfix/2031: Allow strict='filter' and coerce='True' at the same time for PySpark schemas by @gfilaci in #2032
  • Set validation scope for pandas run_checks methods by @amerberg in #2003
  • DataFrameSchema.update_index correctly sets title, description, and metadata by @cosmicBboy in #2067
  • [ibis 🦩] remove inplace=True in column validate call by @cosmicBboy in #2068

Full Changelog: v0.24.0...v0.25.0rc0

v0.24.0

15 May 14:07
88bb609

✨ Highlights ✨

Import pandera.pandas to define schemas for pandas objects

🚨 Breaking Change

pandera==0.24.0 has dropped the dependency on pandas and numpy and has introduced a pandas extra. This will break any users who relied on pandera to install pandas as a transitive dependency. To fix this, do the following:

Install pandas explicitly or use the pandas extra

pip install 'pandera[pandas]'  # recommended
# or
pip install pandas pandera

Change your import to pandera.pandas

All pandas-specific symbols that were exposed by the top-level pandera module are now defined in the pandera.pandas module.

# old import
import pandera as pa

# new import
import pandera.pandas as pa

Using import pandera as pa to define pandas schemas still works for now but raises a warning. It will raise an ImportError in 5 minor releases (0.29.0).
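
For code that has to run against pandera versions on both sides of this change, one option is to try the new module first and fall back to the legacy top-level import. The sketch below is only an illustration: first_importable is a hypothetical helper, not part of pandera, and it uses nothing beyond the standard library's importlib.

```python
import importlib
from types import ModuleType
from typing import Optional, Sequence


def first_importable(names: Sequence[str]) -> Optional[ModuleType]:
    """Return the first module in `names` that imports cleanly, else None.

    Hypothetical helper for illustration; not part of pandera.
    """
    for name in names:
        try:
            return importlib.import_module(name)
        except ImportError:
            # ModuleNotFoundError is a subclass of ImportError, so this also
            # covers a missing package or a missing submodule.
            continue
    return None


# Prefer the pandas-specific module (pandera >= 0.24.0), then fall back to
# the legacy top-level module used by older releases.
pa = first_importable(["pandera.pandas", "pandera"])
if pa is None:
    print("pandera is not installed")
```

If pandera.pandas cannot be imported, for example because pandas itself is missing, the helper falls back to the top-level module. Once a project's minimum pandera version is 0.24.0 or later, a plain import pandera.pandas as pa is simpler and this fallback can be dropped.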

Full Changelog: v0.23.1...v0.24.0

v0.24.0rc0: Drop pandas and numpy dependency, introduce pandas extra

25 Apr 02:00
c32293a

✨ Highlights ✨

Import pandera.pandas to define schemas for pandas objects

🚨 Breaking Change

pandera==0.24.0 has dropped the dependency on pandas and numpy and has introduced a pandas extra. This will break any users who relied on pandera to install pandas as a transitive dependency. To fix this, do the following:

Install pandas explicitly or use the pandas extra

pip install pandas pandera
# or
pip install 'pandera[pandas]'

Change your import to pandera.pandas

All pandas-specific symbols that were exposed by the top-level pandera module are now defined in the pandera.pandas module.

# old import
import pandera as pa

# new import
import pandera.pandas as pa

Full Changelog: v0.23.1...v0.24.0

v0.23.1

08 Mar 02:24
88ee1bb

New Contributors

Special shoutout to the new contributors!

Full Changelog: v0.23.0...v0.23.1

v0.23.0: Improve pydantic compatibility, add `json_normalize`, bugfixes

01 Mar 21:18
d3f80e6

Full Changelog: v0.22.1...v0.23.0

v0.23.0b2: Testing new pypi publishing system

01 Mar 18:48
f59edfc

Full Changelog: v0.22.1...v0.23.0b2

Release v0.22.1: Fix `check_input` decorator regression

26 Dec 21:21

What's Changed

  • bugfix: check_input decorator handles functions with kwargs by @cosmicBboy in #1888

Full Changelog: v0.22.0...v0.22.1