I used to use this data source to connect to a Redshift server and it worked flawlessly (Redshift through the Postgres datasource with SQLAlchemy 1.4):
CONNECTION_STRING = (
    "postgresql+psycopg2://${REDSHIFT_USERNAME}:${REDSHIFT_PASSWORD}"
    f"@{REDSHIFT_HOST}:5439/master?sslmode=allow"
)
datasource_name = "redshift_datasource"
datasource = context.data_sources.add_or_update_postgres(
    name=datasource_name, connection_string=CONNECTION_STRING
)
Then I updated to the gx-sqlalchemy-redshift fork (SQLAlchemy 2.0) with the following data source:
redshift_connection_details = RedshiftConnectionDetails(
    user="${REDSHIFT_USERNAME}",
    password="${REDSHIFT_PASSWORD}",
    host=REDSHIFT_HOST,
    port=5439,
    database="master",
    sslmode=RedshiftSSLModes.ALLOW,
)
datasource_name = "redshift_datasource"
datasource = context.data_sources.add_or_update_redshift(
    name=datasource_name,
    connection_string=redshift_connection_details,
)
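For reference, the table asset and batch definition referenced below are created along these lines (the asset, table, schema, and batch names here are placeholders, since that part of my setup is not shown above):

table_asset = datasource.add_table_asset(
    name="my_table",        # placeholder asset name
    table_name="my_table",  # placeholder table name
    schema_name="public",   # placeholder schema
)
batch_name = "whole_table"
table_asset.add_batch_definition_whole_table(batch_name)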
If I use it and test with a batch:
batch_definition = table_asset.get_batch_definition(batch_name)
batch = batch_definition.get_batch()
print("batch.head():", batch.head())
print("batch.columns():", batch.columns())
batch.head() returns the dataframe as expected. However, batch.columns() returns [].
This breaks every expectation I try to run against the batch.
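As a sanity check (not part of my original setup), column reflection can be tested directly with SQLAlchemy against the same connection, bypassing GX; table and schema names below are placeholders. If this also comes back empty, it may point at the dialect's reflection rather than GX itself:

import os
import sqlalchemy as sa

# Placeholder URL using the redshift+psycopg2 dialect from gx-sqlalchemy-redshift
url = (
    f"redshift+psycopg2://{os.environ['REDSHIFT_USERNAME']}:{os.environ['REDSHIFT_PASSWORD']}"
    f"@{REDSHIFT_HOST}:5439/master"
)
engine = sa.create_engine(url)
inspector = sa.inspect(engine)
print(inspector.get_columns("my_table", schema="public"))  # placeholder table/schema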
Versions:
great-expectations (1.4.4)
gx-sqlalchemy-redshift (0.8.20)