dataLoadStatus and dateTimes #30

Open
mobb opened this issue Feb 26, 2020 · 3 comments
Comments

@mobb
Contributor

mobb commented Feb 26, 2020

Background: Our list of preferred dateTimeFormatString values is based on ISO 8601.

There are datetime format strings in this list that PostgreSQL (pg) does not understand, so passing one unchanged to to_timestamp(), e.g. for the value '2018-08-08 09:00-08', does not always work.

Example dataset: https://portal-s.edirepository.org/nis/reportviewer?packageid=knb-lter-ble.9.1
with dateTimeFormatString = YYYY-MM-DDThh:mm-hh. The trailing -hh is the offset to UTC, and is correct.
The error you get is:

ERROR: conflicting values for "mm" field in formatting string Detail: This value contradicts a previous setting for the same field type.

To see which format strings postgres allows, see
https://www.postgresqltutorial.com/postgresql-to_timestamp/
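
One possible way around the error, sketched here under assumptions rather than as anything PASTA actually does: translate the EML-style format string into a PostgreSQL template before calling to_timestamp. The error above is consistent with PostgreSQL matching template patterns case-insensitively, so MM and mm are both read as month. The function name `eml_to_pg_template` and the token mapping are hypothetical. The sketch assumes PostgreSQL 11 or later (where to_timestamp accepts TZH/TZM), handles only the tokens shown, ignores fractional seconds, and has not been run against a server.

```python
import re

# Tokens of the EML-style format string; a signed "hh" is tried before a bare "hh".
_TOKEN = re.compile(r"YYYY|MM|DD|[-+]hh|hh|mm|ss|.", re.DOTALL)


def eml_to_pg_template(fmt: str) -> str:
    """Translate an EML-style dateTimeFormatString into a PostgreSQL template.

    Hypothetical helper: the mapping below is an assumption, not PASTA's code.
    """
    out = []
    seen_hour = False   # True once the clock hour has been emitted
    in_offset = False   # True once we are inside the UTC offset
    for tok in _TOKEN.findall(fmt):
        if tok in ("YYYY", "MM", "DD"):
            out.append(tok)                      # same spelling in PostgreSQL
        elif tok == "hh":
            if seen_hour:                        # a second hh is the offset hours
                out.append("TZH")
                in_offset = True
            else:
                out.append("HH24")
                seen_hour = True
        elif tok in ("-hh", "+hh"):
            if seen_hour:                        # TZH reads the sign itself
                out.append("TZH")
                in_offset = True
            else:                                # separator followed by clock hour
                out.append(tok[0] + "HH24")
                seen_hour = True
        elif tok == "mm":
            out.append("TZM" if in_offset else "MI")  # minutes, never month
        elif tok == "ss":
            out.append("SS")
        elif tok.isalpha():
            out.append('"' + tok + '"')          # literal letters such as T
        else:
            out.append(tok)                      # punctuation passes through
    return "".join(out)


if __name__ == "__main__":
    print(eml_to_pg_template("YYYY-MM-DDThh:mm-hh"))
```

For the ble.9.1 format string this returns YYYY-MM-DD"T"HH24:MITZH, which could then be tried in the pg GUI in place of the original string.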

@mobb
Contributor Author

mobb commented Feb 26, 2020

To test which strings postgres handles, we used the pg GUI:

select to_timestamp('2018-08-08 09:00-08','YYYY-MM-DDThh:mm-hh');

Datasets with dateTimeFormatStrings we may want to test further:
ble.9.1
sbc.5001.8

@mobb
Contributor Author

mobb commented Feb 26, 2020

No solution right now. Handling every dateTime format in the preferred list may not be worth the effort, but the dataLoadStatus check is valuable for checking data types (see #25).

@mobb
Contributor Author

mobb commented Feb 26, 2020

Also, the error message from that check could be friendlier, although it would have to be customized for different types of failure.

In the case of ble.9.1, here is Mark's response:
it appears that the warning message is the result of PASTA's inability to transform your datetime format into one that is acceptable by PostgreSQL. PASTA uses PostgreSQL to validate datatypes during the quality check, and the large variety of datetime/timestamp formats are quite difficult to support in this manner. Until a better type checker is implemented, this warning will likely persist. Sorry.

Maybe a general response from dataLoadStatus could be prefixed, e.g.:
PASTA uses PostgreSQL in the dataLoadStatus quality check. One purpose is data type checking. Something about your data table caused this step to fail. The error message was ...
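
A minimal sketch of how such a prefix could be attached, with an optional hint per failure type. Everything here is hypothetical: the names `PREFIX`, `HINTS` and `friendly_load_status`, and the hint wording, are illustrations only and not part of PASTA.

```python
# General explanation, taken from the wording suggested above.
PREFIX = (
    "PASTA uses PostgreSQL in the dataLoadStatus quality check. "
    "One purpose is data type checking. "
    "Something about your data table caused this step to fail. "
    "The error message was: "
)

# Hypothetical hints keyed by a fragment of the raw PostgreSQL error text.
HINTS = {
    "conflicting values for": (
        "This may point to a dateTimeFormatString that PostgreSQL "
        "cannot interpret."
    ),
}


def friendly_load_status(pg_error: str) -> str:
    """Wrap a raw PostgreSQL error in the general explanation plus any hint."""
    message = PREFIX + pg_error.strip()
    for fragment, hint in HINTS.items():
        if fragment in pg_error:
            message += " " + hint
    return message


if __name__ == "__main__":
    raw = ('ERROR: conflicting values for "mm" field in formatting string '
           "Detail: This value contradicts a previous setting for the same field type.")
    print(friendly_load_status(raw))
```

Keeping the raw error in the output means nothing is lost for maintainers, while the prefix and hint give data providers some context.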

We want to keep something like dataLoadStatus, however. People have asked for checks that confirm bounds and coverage, and that would require loading the data into something with functions like min/max/mean.
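
As a rough illustration of the kind of bounds check meant here: the sketch below loads a column into a database and compares min/max against declared bounds. It uses Python's built-in SQLite as a stand-in for PostgreSQL so that it runs without a server. The table name `obs` and the function names are hypothetical.

```python
import sqlite3


def column_summary(values):
    """Load numeric values into an in-memory table; return (min, max, mean)."""
    con = sqlite3.connect(":memory:")  # SQLite stand-in for PostgreSQL
    try:
        con.execute("CREATE TABLE obs (v REAL)")
        con.executemany("INSERT INTO obs (v) VALUES (?)", [(v,) for v in values])
        return con.execute("SELECT MIN(v), MAX(v), AVG(v) FROM obs").fetchone()
    finally:
        con.close()


def within_declared_bounds(values, declared_min, declared_max):
    """True if every loaded value lies inside the bounds declared in metadata."""
    lowest, highest, _mean = column_summary(values)
    if lowest is None:  # empty column: nothing to confirm
        return False
    return declared_min <= lowest and highest <= declared_max


if __name__ == "__main__":
    print(column_summary([1.0, 2.0, 6.0]))
    print(within_declared_bounds([1.0, 2.0, 6.0], 0, 10))
```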
