We're currently using parts of the private pandas API in parse.fix_dates, specifically the parsing.parse_time_string function, which was renamed to parsing.parse_datetime_string_with_reso in pandas v2.
Now, in pandas v2.2 (which we don't support yet; we don't even support v2, though I'm looking into that right now) the import suddenly breaks. No surprise there: it's not in the public API, so that can happen anytime.
While we can change the import for this to continue working, it's probably a good idea to move away from using an internal function like this.
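As a stopgap, the import could be made tolerant of the rename. Below is a minimal sketch, not what augur does today: `resolve_first` and `load_pandas_parser` are hypothetical helpers, and the module path `pandas._libs.tslibs.parsing` is my assumption about where the private function lives; it may well move again.

```python
import importlib


def resolve_first(candidates):
    """Return the first importable attribute from (module_name, attr_name) pairs, else None."""
    for module_name, attr_name in candidates:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            # Module is missing in this environment; try the next candidate.
            continue
        if hasattr(module, attr_name):
            return getattr(module, attr_name)
    return None


# Assumed private locations, newest name first; not part of the public pandas API.
PANDAS_PARSER_CANDIDATES = [
    ("pandas._libs.tslibs.parsing", "parse_datetime_string_with_reso"),
    ("pandas._libs.tslibs.parsing", "parse_time_string"),
]


def load_pandas_parser():
    """Hypothetical helper: look up whichever private pandas parser exists, or None."""
    return resolve_first(PANDAS_PARSER_CANDIDATES)
```

If `load_pandas_parser()` returns `None`, the caller would have to fall back to something else or fail with a clear message, which is exactly the situation a real replacement would avoid.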
However, it's hard to find a replacement. I've searched for quite a while but haven't found anything that is a drop-in replacement, sadly.
Some things I looked into:
https://github.com/ixc/python-edtf can parse a wide range of strings into a rich internal representation that we could use to derive our -XX uncertainty strings. Unfortunately, the parser doesn't seem to be configurable regarding dayfirst/monthfirst.
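To make the requirement concrete, here is a rough stdlib-only sketch of what a replacement would need to do: detect the resolution of the input and honor a dayfirst switch. `to_ambiguous_date` is a hypothetical name, it only covers purely numeric formats, and unlike a lenient parser it raises on an impossible day/month order instead of swapping them. It is not a proposal for the final implementation.

```python
import re
from datetime import date


def to_ambiguous_date(text, dayfirst=True):
    """Normalize a numeric date string to YYYY-MM-DD, with XX for missing parts."""
    text = text.strip()

    # Year only, e.g. "2020" -> "2020-XX-XX"
    if re.fullmatch(r"\d{4}", text):
        return f"{text}-XX-XX"

    # Year and month, e.g. "2020-03" -> "2020-03-XX"
    m = re.fullmatch(r"(\d{4})[-/.](\d{1,2})", text)
    if m:
        year, month = int(m[1]), int(m[2])
        date(year, month, 1)  # raises ValueError for an invalid month
        return f"{year:04d}-{month:02d}-XX"

    # Year-first full date; treated as year-month-day regardless of dayfirst
    m = re.fullmatch(r"(\d{4})[-/.](\d{1,2})[-/.](\d{1,2})", text)
    if m:
        return date(int(m[1]), int(m[2]), int(m[3])).isoformat()

    # Year-last full date; dayfirst decides which leading field is the day
    m = re.fullmatch(r"(\d{1,2})[-/.](\d{1,2})[-/.](\d{4})", text)
    if m:
        first, second, year = int(m[1]), int(m[2]), int(m[3])
        day, month = (first, second) if dayfirst else (second, first)
        return date(year, month, day).isoformat()

    raise ValueError(f"Unrecognized date format: {text!r}")
```

For example, `to_ambiguous_date("03/04/2020")` reads the string as 3 April, while passing `dayfirst=False` reads it as 4 March. Month names, two-digit years and the like are deliberately left out, and those are where the pandas parser currently does the heavy lifting.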
For now we can hope that pandas doesn't completely remove the function, but the moving around is a warning that we cannot rely on this indefinitely. I also don't think we promise total reliability of the date parsing, so maybe a bit of breaking change here would be ok.
It would be good to unit test this a little more in any case; it's a perfect example of something that's easy to unit test. What we currently have is in augur/tests/test_parse.py.
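Beyond the existing tests, a table-driven check would make it cheap to pin down behavior before swapping the parser. The sketch below is hypothetical: `failing_cases` is a made-up helper, it assumes the function under test accepts a `dayfirst` keyword, and the expected values are only what I'd assume from the -XX convention, so they need to be confirmed against the current implementation.

```python
# (input string, dayfirst, expected output); illustrative values, not verified
CASES = [
    ("2020", True, "2020-XX-XX"),
    ("2020-03", True, "2020-03-XX"),
    ("2020-03-25", True, "2020-03-25"),
]


def failing_cases(fix_dates, cases=CASES):
    """Run fix_dates over the table and return the rows that did not match."""
    failures = []
    for text, dayfirst, expected in cases:
        try:
            actual = fix_dates(text, dayfirst=dayfirst)
        except Exception as error:
            # Record the exception type instead of aborting the whole run.
            actual = f"raised {type(error).__name__}"
        if actual != expected:
            failures.append((text, dayfirst, expected, actual))
    return failures
```

Running the same table against the current pandas-based function and against any candidate replacement would show exactly where they disagree.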
I think all the date formats supported by augur parse are also supported in other subcommands such as augur filter, yet there are no calls to the pandas API there.
There's a case to be made that augur parse should be more lenient because it's supposed to take a broad range of input files and then normalize them to something more standard.
We could check, however, how often augur parse is actually used in practice. I have a feeling it's not used that often anymore, at least not in our workflows.
Referenced code: augur/augur/parse.py, lines 36 to 42 at commit 8731851
Referenced tests: augur/tests/test_parse.py, lines 12 to 33 at commit 8731851