Description
Often the values of a column are publicly known, and there is no reason to suppress them.
I want to be able to tag these columns as such in the API, and then declare them value-safe in the microdata class.
I want this to work for all column types. A side-effect of this is that, if a numeric or datetime column is declared value-safe, then only the values present in the original column will appear in the microdata. I'm tempted to implement this by sorting on the numeric value, then converting to string. This is a bit weird. My concern is that, once converted to string, any sense of numeric distance is lost: a value might be pushed to a very far-away value simply because there is a gap in the continuous sequence.
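To make the concern concrete, here is a minimal sketch of the sort-then-stringify idea. The function name `to_ordered_labels` and the sample values are made up for illustration and are not part of any existing API.

```python
def to_ordered_labels(values):
    """Sort distinct values numerically, then convert each one to a string label."""
    ordered = sorted(set(values))
    return ordered, [str(v) for v in ordered]


# Hypothetical value-safe column with a large gap between 3 and 100.
ordered, labels = to_ordered_labels([100, 2, 1, 3, 2])

# Numeric distance between neighbouring values before conversion.
numeric_gaps = [b - a for a, b in zip(ordered, ordered[1:])]

# After conversion only the position in the list remains, so every
# neighbouring pair looks exactly one step apart.
positional_gaps = [1] * (len(labels) - 1)

print(labels)
print(numeric_gaps)
print(positional_gaps)
```

In this sketch the labels "3" and "100" end up adjacent, so anything that treats the labels as ordered categories sees them as one step apart even though the underlying values differ by 97.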
Another approach for continuous data would be to build continuous microdata, but then afterwards force each value to the nearest safe value. A possible problem here is that we could have several outliers in a given range, and we'd need to pick one. But this doesn't seem like much of a problem: since assignment within a range is random, we'd effectively be picking a random value, more or less. This sounds like the right behavior.
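A rough sketch of the "force to nearest safe value" step, assuming the safe values are available as a sorted list. The name `snap_to_nearest_safe` is hypothetical, and breaking ties toward the lower value is an arbitrary choice made only for this example.

```python
from bisect import bisect_left


def snap_to_nearest_safe(value, safe_sorted):
    """Return the element of safe_sorted (ascending, non-empty) closest to value.

    Ties go to the lower neighbour; this is an assumption of the sketch.
    """
    i = bisect_left(safe_sorted, value)
    if i == 0:
        return safe_sorted[0]
    if i == len(safe_sorted):
        return safe_sorted[-1]
    lo, hi = safe_sorted[i - 1], safe_sorted[i]
    return lo if value - lo <= hi - value else hi


# Hypothetical safe values and hypothetical continuous microdata.
safe_values = sorted([1, 2, 3, 100])
synthetic = [2.4, 60.0, -5.0, 500.0]
snapped = [snap_to_nearest_safe(v, safe_values) for v in synthetic]
print(snapped)
```

Because the continuous values are generated first and snapped afterwards, the numeric distance between safe values is still taken into account, which is what the string-conversion approach loses.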