From 96d525da6f6b2e09b50dddd46d11a364212df1d4 Mon Sep 17 00:00:00 2001
From: John Mount
Date: Tue, 12 Sep 2023 10:44:10 -0700
Subject: [PATCH] edit

---
 Examples/data_schema/schema_check.ipynb | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/Examples/data_schema/schema_check.ipynb b/Examples/data_schema/schema_check.ipynb
index d70e7a0d..76467a59 100644
--- a/Examples/data_schema/schema_check.ipynb
+++ b/Examples/data_schema/schema_check.ipynb
@@ -8,7 +8,7 @@
     "\n",
     "However, a common missing component remains: a general \"Pythonic\" [data schema](https://en.wikipedia.org/wiki/Database_schema) definition, documentation, and invariant enforcement mechanism.\n",
     "\n",
-    "It turns out it is quite simple to add such functionality using Python decorators. This isn't particularly useful for general functions (such as `pd.merge()`), where the function is supposed to support arbitrary data schemas. However, it can be *very* useful in adding checks and safety to specific applications and analysis workflows built on top such generic functions. In fact, it is a good way to copy schema details from external data sources such as databases or CSV into enforced application invariants. Application code that transforms fixed tables into expected exported results can benefit greatly from schema documentation and enforcement.\n",
+    "It turns out it is quite simple to add such functionality using Python decorators. This isn't particularly useful for general functions (such as `pd.merge()`), where the function is supposed to support arbitrary data schemas. However, it can be *very* useful in adding checks and safety to specific applications and analysis workflows built on top of such generic functions. In fact, it is a good way to copy schema details from external data sources such as databases or CSV files into enforced application invariants. Application code that transforms fixed tables into expected exported results can benefit greatly from such schema documentation and enforcement.\n",
     "\n",
     "I propose the following simple check criteria for both function signatures and data frames that applies to both inputs and outputs:\n",
     "\n",
@@ -123,7 +123,7 @@
    "source": [
     "The decorator defines the types schemas of at least a subset of positional and named arguments. Declarations are either values (converted to Python types), Python types, or sets of types. A special case is dictionaries, which specify a subset of the column structure of function signatures or data frames. \"return_spec\" is reserved to name the return schema of the function.\n",
     "\n",
-    "We are deliberately concentrating on data frames, and not the inspection of arbitrary composite Python types. This is because we what to enforce data frame or table schemas, and not inflict an arbitrary runtime type system on Python. Schemas over atomic types is remains a sweet spot for data definitions.\n",
+    "We are deliberately concentrating on data frames, and not the inspection of arbitrary composite Python types. This is because we want to enforce data frame or table schemas, and not inflict an arbitrary runtime type system on Python. Schemas over tables of atomic types remain a sweet spot for data definitions.\n",
     "\n",
     "Our decorator documentation declares that `fn()` expects at least:\n",
     "\n",
@@ -208,7 +208,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "Or, and this is where we start to get benefits, we can call with a wrong argument type."
+    "Or, and this is where we start to see benefits, we can call with a wrong argument type."
    ]
   },
   {
@@ -241,7 +241,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "And we show that this checking pushes down into the structure of data frame arguments! In our next example we see the argument is missing a required column.\n"
+    "And, we show that this checking pushes down into the structure of data frame arguments! In our next example we see an argument is missing a required column.\n"
    ]
   },
   {
@@ -590,7 +590,7 @@
     "\n",
     "A downside is, the technique *can* run into what I call \"the first rule of meta-programming\". Meta-programming only works as long as it doesn't run into other meta-programming (also called the \"its only funny when I do it\" theorem). That being said, I feel these decorators can be very valuable in Python data science projects.\n",
     "\n",
-    "This documentation and demo can be found [here](https://github.com/WinVector/data_algebra/tree/main/Examples/data_schema).\n"
+    "This documentation and demo can be found [here](https://github.com/WinVector/data_algebra/tree/main/Examples/data_schema)."
    ]
   },
   {
@@ -749,7 +749,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "In conclusion: the `SchemaCheck` decoration is simple and effective tool to add schema documentation and enforcement to your analytics projects."
+    "In conclusion: the `SchemaCheck` decoration is a simple and effective tool to add schema documentation and enforcement to your analytics projects."
    ]
   },
   {