ExpectColumnValueLengthsToBeBetween is raising an exception in DBR 15.4 LTS #10947

Open
suchintakp5 opened this issue Feb 17, 2025 · 0 comments

Describe the bug
ExpectColumnValueLengthsToBeBetween fails and raises an exception on DBR 15.4 LTS with a Unity Catalog-enabled cluster.
To Reproduce

import great_expectations as gx
import great_expectations.expectations as gxe

# Retrieve your Data Context
data_context = gx.get_context(mode="ephemeral")
# Define the Data Source name
data_source_name = "source_system_name_spark_dataframe"
# Add the Data Source to the Data Context
data_source = data_context.data_sources.add_spark(name=data_source_name)
# Define the Data Asset name
data_asset_name = "dataset_name"
# Add a Data Asset to the Data Source
data_asset = data_source.add_dataframe_asset(name=data_asset_name)
# Define the Batch Definition name
batch_definition_name = "dataset_batch_definition"

# Add a Batch Definition to the Data Asset
batch_definition = data_asset.add_batch_definition_whole_dataframe(
    batch_definition_name
)

# Hypothetical sample dataframe standing in for the original
# "<A pyspark dataframe>" placeholder; any dataframe with a string
# column reproduces the issue (`spark` is the session Databricks
# provides on the cluster)
df = spark.createDataFrame([("abc",), ("de",), ("fghij",)], ["col_a"])

batch_parameters = {"dataframe": df}
# Get the dataframe as a Batch
batch = batch_definition.get_batch(batch_parameters=batch_parameters)

# "col_a" is the hypothetical column created above
test = gxe.ExpectColumnValueLengthsToBeBetween(
    column="col_a", min_value=1, max_value=5, catch_exceptions=True
)
# Test the Expectation
validation_results = batch.validate(test, result_format="COMPLETE")

Expected behavior
No exception should be raised; with catch_exceptions=True, any failure should be captured on the result object instead of propagating.
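A minimal sketch of inspecting that captured state (attribute names assume the GX 1.x ExpectationValidationResult shape):

# Hypothetical inspection of the result returned above; with
# catch_exceptions=True the error should land here, not be raised
print(validation_results.success)         # False when the expectation errors
print(validation_results.exception_info)  # captured exception details, if any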

Environment:

  • Operating System: Azure Databricks cluster with DBR 15.4 LTS and Unity Catalog enabled
  • Great Expectations Version: 1.3.6
  • Data Source: PySpark dataframe
  • Cloud environment: Azure Databricks