Fix/from native update version#3515
Conversation
in tests/translate/from_native_test.py test_init_already_narwhals and test_init_already_narwhals_unstable have the exact same source code, when the former should verify behavior in a stable version. This function used to test this, but seemed to have been mistakenly edited.
from_native on a narwhals.{DataFrame,Series} would always return
the same object even if the DataFrame._version and the version
passed into from_native disagreed.
Now, if the versions mismatch we create a new narwhals object
with the appropriate version. If the versions match, we short cut
and return the same object.
761f14c to
f1cf356
Compare
| ``` | ||
| """ | ||
|
|
||
| _version: ClassVar[Version] = Version.MAIN |
There was a problem hiding this comment.
Hoping this has been an oversight and wasn't excluded for a strong reason. This change makes the DataFrame/LazyFrame api a bit more consistent internally.
There was a problem hiding this comment.
Yeah nice one, thanks @camriddell!
If you dig through the blame (I think) I added these on Series & DataFrame at the same time as the from_* constructors.
LazyFrame didn't get any new methods, so it slipped through 😅
|
@MarcoGorelli I believe the nightly failure isn't due to this change but perhaps something that needs to be accounted for in pandas. Shall I open up an Issue with this? Or do you believe it is relevant to this PR? |
| pl_frame = pl.DataFrame({"a": [1, 2, 3]}).lazy() | ||
| nw_frame = nw_v1.from_native(pl_frame) | ||
| with pytest.raises(AttributeError): | ||
| with pytest.raises(AssertionError): |
There was a problem hiding this comment.
IIRC, there may be a comment somewhere mentioning the error?
(Ignore me if this is too vague 😂)
There was a problem hiding this comment.
I found this comment in the comparable tests for DataFrames
# NOTE: (#2629) combined with passing in `nw_v1.DataFrame` (w/ a `_version`) into itself changes the error
with pytest.raises(AssertionError):
nw_v1.DataFrame(nw_frame, level="full")
The LazyFrame had an AttributeError because the LazyFrame.__init__ would try to check the ._version attribute which didn't exist. This change more closely aligns this LazyFrame with its DataFrame and Series counterparts since they all successfully fail with the same error now.
|
@camriddell I should probably check what was mentioned on discord - but from reading the PR I'm a bit confused on the problem that's being solved. (#3515 (comment)) is all good - but something seems strange to me that calling FWIW, I don't think |
|
thanks @camriddell ! on board with the suggestions, i think this just needs addressing the nightly failure is just due to pandas-dev/pandas#64749 and is unrelated to your pr |
The primary concern here is surprise that the
I agree with this from a structurally-correct perspective. However, conveniences are nice for DX especially since our primary audience is dataframe-agnostic libraries. This code let's them treat any incoming frame the same way without having to manually check if it is already a narwhals DataFrame. Likely their end users won't be passing in narwhals DataFrames, but they could have multiple entry points may need to translate to a narwhals object and our "recursive" type support means they don't have to write explicit |
b596ed5 to
835c683
Compare
|
Thanks for the context @camriddell! I have a couple more questions, are you okay if we sit on this until later today? |
Absolutely! I'm not in a rush to merge. |
|
@camriddell thanks for your patience 😄 1. Similar issueDid you know about this one? 2. What's the worst that could happen?I'm curious if we can reveal cases downstream that are (unknowingly?) relying on this behavior. As an example, if I change Show diff
diff --git a/tests/v1_test.py b/tests/v1_test.py
index 330211162..59ccdae61 100644
--- a/tests/v1_test.py
+++ b/tests/v1_test.py
@@ -447,7 +447,7 @@ def test_with_row_index(constructor: Constructor) -> None:
pytest.skip()
data = {"abc": ["foo", "bars"], "xyz": [100, 200], "const": [42, 42]}
- frame = nw_v1.from_native(constructor(data))
+ frame = nw.from_native(nw_v1.from_native(constructor(data)))
What happens if you try running the CI but change the let's see who this really is to raising an exception? 3. Is this discoverable?Part of what I was getting at in (#3515 (comment)) is that this looks like 2 user errors to me: # (1) Mixed versions
# (2) Asking for something you already have
nw.from_native(nw_v1.DataFrame(...))If you think the feature to switch versions is something we need, wouldn't this be a better fit?: nw_v1.DataFrame(...).to_version("main")
nw_v1.DataFrame(...).to_version("v1")
nw_v1.DataFrame(...).to_version("v2")There's less ambiguity with something along those lines IMO - and it avoids defining new behavior for |
Description
Calling
from_native(nw.DataFrame(…), version=…)would always return the passed DataFrame/Series object.This ignores the version that is specified in the version argument.
Instead, for a given narwhals DataFrame/Series
from_nativenow returns the same object if the versions match and creates a new object if the versions do not match.Note, 2 clean up items are also added in this PR.
Before:
After:
reported by hoxbro (Holoviews) on Discord #help channel
What type of PR is this? (check all applicable)
Related issues
Checklist