[Package] Semantic Types (Typed Entities) #4370
Replies: 3 comments 3 replies
Hello, I am new to the hackathon and really interested in contributing to Semantic Types! Yesterday I submitted a PR that adds an Expectation that column values are valid IPv4 addresses. It's a trivial feature, and I am also using it to learn the contribution process. It probably has a number of problems, and your review would help me understand and fix them. @austiezr @kyleaton
Hey Hackathon People! As we come into the final hours of the Great Expectations Hackathon, we wanted to briefly highlight what we believe would be some quick wins in the Semantic Types category, enabled by two kinds of Expectations:
Regex-Based Column Map Expectations
Set-Based Column Map Expectations
Creating expectations of these types can be as simple as identifying a valid regex or value-set and writing a couple of tests; the new |
What Are Semantic Types?
That's a great question! Semantic Types categorize data by the kind of information it represents. For example, consider a column with a STRING data type. How do you know what the data means? You could be looking at a list of streets, cities, counties, etc. Without background information, it can be hard to know what you're looking at. That's where Semantic Types come in. An entity's Semantic Type defines how it should be interpreted.
Why should we test on Semantic Types?
Testing on these Semantic Types allows us to create more explicit, fit-for-purpose tests, unlocking questions like
Does this column contain US State abbreviations?
instead of asking
Does this column contain strings in the set [AL, AK, AR, AZ...]?
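To make the contrast concrete, here is a minimal sketch of a set-based check that is named after the Semantic Type rather than after its raw values. It is plain Python, not the Great Expectations API; the names `US_STATE_ABBREVIATIONS`, `is_us_state_abbreviation`, and `column_contains_us_state_abbreviations` are hypothetical, and the set is assumed to cover only the 50 states (no DC or territories).

```python
# Illustrative sketch only: plain Python, not the Great Expectations API.
# All names here are hypothetical and chosen for this example.

# Assumption: the value set is the 50 US states, without DC or territories.
US_STATE_ABBREVIATIONS = frozenset(
    "AL AK AZ AR CA CO CT DE FL GA HI ID IL IN IA KS KY LA ME MD "
    "MA MI MN MS MO MT NE NV NH NJ NM NY NC ND OH OK OR PA RI SC "
    "SD TN TX UT VT VA WA WV WI WY".split()
)


def is_us_state_abbreviation(value):
    """Return True if a single value is a two-letter US state abbreviation."""
    return isinstance(value, str) and value in US_STATE_ABBREVIATIONS


def column_contains_us_state_abbreviations(values):
    """Return True if every value in the column passes the semantic check."""
    return all(is_us_state_abbreviation(v) for v in values)
```

The caller asks the semantic question directly, for example `column_contains_us_state_abbreviations(["AL", "AK", "AZ"])`, and the value set stays an implementation detail that can be maintained in one place.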
How do you know when data entities are good candidates for Semantic Types testing?
Any data that you can tie to a real-world category or reference is ideal for this kind of test, e.g., phone numbers, ZIP codes, countries, coordinates, URLs, email addresses, etc.
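As a rough illustration of how such categories turn into checks, the sketch below validates two of them with the Python standard library alone: US ZIP codes via a regular expression, and IPv4 addresses via the `ipaddress` module. This is not the Great Expectations API; the function names and the ZIP pattern (five digits with an optional four-digit extension) are assumptions made for the example.

```python
# Illustrative sketch only: standard-library Python, not the Great Expectations API.
# Function names and the ZIP pattern are assumptions made for this example.
import ipaddress
import re

# Assumption: a US ZIP code is 5 digits, optionally followed by "-" and 4 digits.
US_ZIP_CODE_PATTERN = re.compile(r"[0-9]{5}(-[0-9]{4})?")


def is_us_zip_code(value):
    """Return True if the whole string matches the assumed ZIP code pattern."""
    return isinstance(value, str) and US_ZIP_CODE_PATTERN.fullmatch(value) is not None


def is_ipv4_address(value):
    """Return True if the string parses as a dotted-quad IPv4 address."""
    if not isinstance(value, str):
        return False
    try:
        ipaddress.IPv4Address(value)
    except ValueError:  # AddressValueError is a subclass of ValueError
        return False
    return True


def all_valid(values, check):
    """Apply a per-value semantic check to every value in a column."""
    return all(check(v) for v in values)
```

A pattern works well when the category has a fixed textual shape, as with ZIP codes. For IPv4 addresses a parser is a safer choice than a hand-written regex, because it also enforces that each octet is in the range 0 to 255.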
Linked below are a number of Expectations on Semantic Types that have already been contributed to Great Expectations!