Commit
Allow users to disable schema check and creation on `load_file` (#1922)
Support running `load_file` without checking whether the table schema exists or trying to create it.

Recently a user reported that the cost of checking whether the schema exists is very high for Snowflake:

"I have a (`load_file`) task that took 1:36 minutes to run, and it was 1:30 running the information schema query."

This is likely happening for other databases as well.

Introduce two ways of disabling schema checks:

1. On a per-task basis, by exposing the argument `schema_exists` in `aql.load_file`. When this argument is `True`, the SDK will not check if the schema exists or try to create it. It is `False` by default, in which case the Python SDK behaves as it did in 1.6 (running the schema check and, if needed, trying to create the schema).
2. Globally, by exposing the Airflow configuration `load_table_schema_exists` in the `[astro-sdk]` section. This can also be set using the environment variable `AIRFLOW__ASTRO_SDK__LOAD_TABLE_SCHEMA_EXISTS`. The global configuration can be overridden per task, using option 1.

Closes: #1921
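The precedence described in this commit message (a per-task value wins over the global setting, which defaults to `False`) can be illustrated with a small standalone sketch. This is not the SDK's code: `resolve_schema_exists` is a hypothetical helper written only for illustration, and the set of strings treated as true is an assumption. Only the environment variable name comes from the commit message.

```python
import os
from typing import Mapping, Optional

# Environment variable name as given in the commit message.
ENV_VAR = "AIRFLOW__ASTRO_SDK__LOAD_TABLE_SCHEMA_EXISTS"

# Assumption for this sketch: which strings count as "true".
_TRUTHY = {"1", "t", "true"}


def resolve_schema_exists(
    task_value: Optional[bool] = None,
    environ: Optional[Mapping[str, str]] = None,
) -> bool:
    """Hypothetical helper: pick the effective `schema_exists` value.

    A value passed on the task takes precedence; otherwise the global
    setting is read from the environment, falling back to False.
    """
    if task_value is not None:
        return task_value
    env = os.environ if environ is None else environ
    raw = env.get(ENV_VAR, "False")
    return raw.strip().lower() in _TRUTHY


if __name__ == "__main__":
    # No global setting, no per-task value: schema check still runs.
    print(resolve_schema_exists(environ={}))
    # Global setting enabled: schema check is skipped for every task.
    print(resolve_schema_exists(environ={ENV_VAR: "True"}))
    # Per-task value overrides the global setting.
    print(resolve_schema_exists(task_value=False, environ={ENV_VAR: "True"}))
```

Passing `environ` explicitly keeps the sketch easy to test without mutating the real process environment.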
Showing 9 changed files with 117 additions and 14 deletions.
```diff
@@ -0,0 +1,22 @@
+import importlib
+import os
+from unittest.mock import patch
+
+import astro
+from astro import settings
+from astro.files import File
+
+
+def test_settings_load_table_schema_exists_default():
+    from astro.sql import LoadFileOperator
+
+    load_file = LoadFileOperator(input_file=File("dummy.csv"))
+    assert not load_file.schema_exists
+
+
+@patch.dict(os.environ, {"AIRFLOW__ASTRO_SDK__LOAD_TABLE_SCHEMA_EXISTS": "True"})
+def test_settings_load_table_schema_exists_override():
+    settings.reload()
+    importlib.reload(astro.sql.operators.load_file)
+    load_file = astro.sql.operators.load_file.LoadFileOperator(input_file=File("dummy.csv"))
+    assert load_file.schema_exists
```