Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Restrict the type of Data.List.NonEmpty.unzip #86

Closed
mixphix opened this issue Sep 8, 2022 · 57 comments
Closed

Restrict the type of Data.List.NonEmpty.unzip #86

mixphix opened this issue Sep 8, 2022 · 57 comments
Labels
approved Approved by CLC vote base-4.20 Implemented in base-4.20 (GHC 9.10)

Comments

@mixphix
Copy link
Collaborator

mixphix commented Sep 8, 2022

Currently we have unzip :: Functor f => f (a, b) -> (f a, f b) exported from Data.List.NonEmpty. The implementation can remain the same, but I propose that the type should be restricted to unzip :: NonEmpty (a, b) -> (NonEmpty a, NonEmpty b) for analogy with Data.List as well as to avoid exporting overgeneralized functions from specific modules. Really, why should we be exporting something that works with any Functor from Data.List.NonEmpty?

@treeowl
Copy link

treeowl commented Sep 8, 2022

I would want to change the implementation as well, so as not to be gratuitously lazy (and memory leaky).

unzip ((a, b) :| abs) = (a :| as, b :| bs)
  where
    (as, bs) = Data.List.unzip abs

@mixphix
Copy link
Collaborator Author

mixphix commented Sep 8, 2022

Many of the other functions in that module are defined using lazy pattern matching ~(x :| xs). Why pick unzip to be more strict than its relatives?

@treeowl
Copy link

treeowl commented Sep 8, 2022

Many things in that module may be lazier than they should be. I haven't looked at it recently. The current unzip seems pretty awful, but maybe there should be a separate proposal to stricten everything up?

@Bodigrim
Copy link
Collaborator

Bodigrim commented Sep 8, 2022

Data.List.NonEmpty API is extremely lazy, but let's leave this for another proposal.

> Data.List.NonEmpty.map id undefined `seq` ()
()

@mixphix
Copy link
Collaborator Author

mixphix commented Sep 10, 2022

I'm hesitant to remove this function completely, though. It's already here and useful. I wonder if it would be better to export this function from Data.Functor? A Hackage search shows a few duplicates, indicating a level of utility.

@ocharles
Copy link

I opened #88 today (completely independently of this - weird timing!). That proposal discusses having Data.Functor.unzip

@mixphix
Copy link
Collaborator Author

mixphix commented Sep 12, 2022

Haha, well in any case, they seem to be in tandem with each other :)

@josephcsible
Copy link
Contributor

overgeneralized functions

I don't understand why this is a bad thing. With the exception of functions whose sole purpose is to restrict types (asTypeOf), why would you ever want a less general function when you could have an equivalent more general one instead?

@mixphix
Copy link
Collaborator Author

mixphix commented Sep 12, 2022

As @treeowl mentioned, strictness is one factor. Another is performance, though here it may not apply: for specific types one might be able to write it more efficiently than the general abstraction could perform, since the latter mustn't assume anything about the representation.

@treeowl
Copy link

treeowl commented Sep 12, 2022

Since I was being a bit vague above.... For [], say, we have

traverseBia :: Biapplicative f => (a -> f b c) -> [a] -> f [b] [c]
traverseBia _ [] = bipure [] []
traverseBia f (x : xs) = biliftA2 (:) (:) (f x) (traverseBia f xs)

In particular,

traverseBia :: (a -> (b, c)) -> [a] -> ([b], [c])

The Biapplicative instance for (,) is still a mess on Hackage (not matching the Bifunctor instance), but in the next release they'll both be lazy. One could also write a custom pair type with stricter versions. Anyway, we'll have

bipure = (,)
biliftA2 f g ~(a, b) ~(c, d) = (f a c, g b d)

Inlining,

traverseBia :: (a -> (b, c)) -> [a] -> ([b], [c])
traverseBia _ [] = ([], [])
traverseBia f (x : xs) = (fx1 : r1, fx2 : r2)
  where
    (fx1, fx2) = f x
    (r1, r2) = traverseBia f xs

Walking one of the result lists will cause the other one to be realized, with selector thunks for elements (assuming the optimizer doesn't do anything horrible). Memory leak averted, despite remaining very lazy. It'll still be quite a bit lazier than the hand-written unzip, but I think it's about the only reasonable way to write a lazy unzip for an arbitrary Traversable. Of course, if you want a totally eager unzip, you can do that too, by using a custom pair type with strict Bifunctor and Biapplicative instances.

@treeowl
Copy link

treeowl commented Sep 12, 2022

To sum up my position, I think Data.Functor, Data.Traversable, and Data.NonEmpty are all good places for unzip, each with a different implementation.

@ocharles
Copy link

FWIW, I would be fine to have all three.

Also,

why would you ever want a less general function when you could have an equivalent more general one instead?

More specific types can lead to more specific type errors.

@phadej
Copy link

phadej commented Sep 13, 2022

I think this is a corollary of #22/#76/... When that is done, this could be done as well.

@xplat
Copy link

xplat commented Sep 14, 2022

To sum up my position, I think Data.Functor, Data.Traversable, and Data.NonEmpty are all good places for unzip, each with a different implementation.

While I see the merits of each of these versions of unzip on their own, having them all seems like a namespacing nightmare.

@tomjaguarpaw
Copy link
Member

I described the conditions under which I'm prepared to vote in favour of this breaking change on a related discussion

@Bodigrim
Copy link
Collaborator

Since this is a breaking change, an impact assessment is due. It's quite possible that someone exported Data.List.NonEmpty.unzip and used it with another Functor.

@tomjaguarpaw
Copy link
Member

It's quite possible that someone exported Data.List.NonEmpty.unzip and used it with another Functor.

Here is a real world example in non-public code: #88 (comment)

@Bodigrim
Copy link
Collaborator

@mixphix are you still interested in this? Could we make some progress towards impact assessment?

@Bodigrim Bodigrim added the awaits-impact-assessment Awaits an impact assessment (e. g., using https://github.com/Bodigrim/clc-stackage) label Nov 20, 2022
@ocharles
Copy link

I'd like to take this proposal over, if you're stuck/need help @mixphix. Is the next step still the impact assessment?

@Bodigrim
Copy link
Collaborator

Yes, impact assessment.

@mixphix
Copy link
Collaborator Author

mixphix commented Jan 13, 2023

If you wouldn't mind, @ocharles, I'd be grateful to hand this proposal over to you.

@ocharles
Copy link

ocharles commented Jan 13, 2023

You got it! I'll try and put together an impact assessment in the next few days

@ocharles
Copy link

If I did everything correctly, I believe I've completed the impact assessment. I used https://gitlab.haskell.org/ghc/ghc/-/merge_requests/9757 to build a new version of GHC, then built clc-stackage without needing to patch any libraries*.

So, to reiterate: the proposed change doesn't seem to break anything in clc-stackage. I'd like to submit the proposal (precisely what's in the above MR) to the committee.

@Bodigrim
Copy link
Collaborator

Thanks, @ocharles. I'll trigger the vote once a new cast of CLC is selected in February.

@ocharles
Copy link

@Bodigrim Good to go with a vote now?

@Bodigrim
Copy link
Collaborator

Dear CLC members, the proposal is to restrict Data.List.NonEmpty.unzip from unzip :: Functor f => f (a, b) -> (f a, f b) to a more monomorphic unzip :: NonEmpty (a, b) -> (NonEmpty a, NonEmpty b). The impact assessment found no Stackage packages affected by this change.

Strategically the proposed change is in line with the monomorphisation of Data.List, approved earlier.

CLC has approved addition of polymorphic unzip to Data.Functor in #88, but this change happened to late to be included in base-4.18 to be shipped with GHC 9.6.

Before we trigger a vote, I would like to hear (non-binding) opinions from new members of CLC.
CC @chshersh @angerman @parsonsmatt @hasufell

@tomjaguarpaw
Copy link
Member

tomjaguarpaw commented Feb 18, 2023

It's a gratuitous breaking change, so -1 from me, as outlined in a previous comment.

[EDIT: oh sorry, my opinion wasn't actually requested at this time, but FWIW that's my opinion, and my vote]

@Bodigrim
Copy link
Collaborator

Bodigrim commented Feb 18, 2023

What worries me in the current plan is that if anywhere out there exists a consumer of polymorphic unzip from Data.List.NonEmpty, they won't be able to maintain backward compatibility without CPP, because Data.Functor.unzip has not been in base for long enough (in fact, it is not in base yet ;).

#88 (comment) describes a possible approach to gradual replacement, but I think inroducing Data.List.NonEmpty.unzipNonEmpty only to deprecate it later is too grotesque for the task at hand.

If there is an appetite for a smoother transition, I'd suggest the following:

  • GHC 9.8 introduces Data.Functor.unzip (already approved, @ocharles please land the MR) and adds a warning to Data.List.NonEmpty.unzip, suggesting to use Data.Functor.unzip instead.
  • GHC 9.10 restricts the type of Data.List.NonEmpty.unzip to f ~ NonEmpty and removes the warning.

This is slightly antithetical to the purpose of the proposal, because until GHC 9.10 users would not have access to a monomorphic unzip, but this is no worse than status quo.

@Bodigrim
Copy link
Collaborator

@ocharles since you took over the proposal, what would you like to do? We can vote on the original proposal, but judging from the discussion above this is likely to fail. I think that changing the proposal to a warning now and monomorphisation later (say, in GHC 9.14) has much better chances to succeed.

@ocharles
Copy link

I would like to vote with the original proposal. Personally with my stackage evaluation I don't see any significant breakage, but the CLC are of course free to see things differently. I don't think I have the resources to do much more than that though, and took this from @mixphix to try and keep things moving forward

@mixphix
Copy link
Collaborator Author

mixphix commented Feb 23, 2023

I'm obviously in favour of the proposal as written; preferably with @parsonsmatt's fast-tracked warning/restriction two-release plan, but I'll gladly accept any schedule for the sake of a more sensible type.

@ocharles
Copy link

ocharles commented Mar 7, 2023

@Bodigrim could you remove "awaits-impact-assessment", as that has been done (#86 (comment))

@Bodigrim Bodigrim removed the awaits-impact-assessment Awaits an impact assessment (e. g., using https://github.com/Bodigrim/clc-stackage) label Mar 7, 2023
@Bodigrim
Copy link
Collaborator

I'll gladly accept any schedule for the sake of a more sensible type.

I'm going to use your permission and put for a vote a plan from #86 (comment) as it's likely to gain more support.

Dear CLC members, let's vote on the following plan:

  1. GHC 9.8: deprecate Data.List.NonEmpty.unzip with {-# DEPRECATED unzip "This function will be made monomorphic in GHC 9.14, consider switching to Data.Functor.unzip" #-}. The latter is already there.
  2. GHC 9.10: no changes.
  3. GHC 9.12: no changes.
  4. GHC 9.14: change type signature of Data.List.NonEmpty.unzip to monomorphic unzip :: NonEmpty (a, b) -> (NonEmpty a, NonEmpty b) and remove the deprecation warning.

@tomjaguarpaw @chshersh @hasufell @mixphix @angerman @parsonsmatt


+1 from me. This is a very conservative migration plan for a virtually unused function, eventually bringing in more consistency.

@parsonsmatt
Copy link

+1 - would also +1 a more aggressive schedule

@ocharles
Copy link

ocharles commented Mar 22, 2023

I don't quite understand point 1. If a user of Data.List.NonEmpty.unzip uses it on a NonEmptyList they shouldnt use Data.Functor.unzip. Should they just disable the depreciation warning? If so, I think we should say that. I say this because I would expect most users are using it on NonEmptyLists, and the more general type is the exception. The impact assessment seems to corroborate this.

I'm ok with this proposal, but it doesn't sit great because legitimate uses will have to turn the warning off, but that's (at least) module wide, and while you get a warning about the deprecation, you get no such warning to stop ignoring deprecations when you reach 9.14.

@hasufell
Copy link
Member

As there is no suggestion for an implementation change, I find the motivation insufficient and vote -1.

@chshersh
Copy link
Member

+1


The deprecation schedule indeed may look too conservative. But I think it would be huge success if we could enforce and follow the same schedule for every breaking change.

I also wouldn't block this improvement on the grounds of "not doing enough". If someone in the future wants to create a proposal to change the implementation of unzip for NonEmpty, they're more than welcome to do so 😌

@tomjaguarpaw
Copy link
Member

+1


However, I really think we should also add Data.List.NonEmpty.unzipNonEmpty with a specialized implementation at the same time, and then later do Data.List.NonEmpty.unzip = unzipNonEmpty.

I'm sympathetic to @hasufell's point of view, but I still think that it's better to bring it to the attention of users that Data.List.NonEmpty.unzip is currently not a specialized implementation.

@mixphix
Copy link
Collaborator Author

mixphix commented Mar 24, 2023

The proposal to strengthen to unzip :: NonEmpty (a, b) -> (NonEmpty a, NonEmpty b) will not specialize the implementation in any way. There are other places for similar unzip functions to live that may have different implementations, but are functionally the same when specialized to NonEmpty. The names can happily coexist.

@Bodigrim
Copy link
Collaborator

@hasufell making NonEmpty functions less gratuitously lazy is tracked in #107. IMO it would be strange to change laziness of one function only, but leave everything else as is, as it would introduce even more accidental inconsistency.

@mixphix could you please cast your vote on #86 (comment)?

@mixphix
Copy link
Collaborator Author

mixphix commented Mar 24, 2023

+1

@Bodigrim
Copy link
Collaborator

Thanks all, we have enough votes for the proposal to pass.

@mixphix would you like to execute on Step 1 of #86 (comment) and raise an MR?

@Bodigrim Bodigrim added the approved Approved by CLC vote label Mar 25, 2023
@mixphix
Copy link
Collaborator Author

mixphix commented Mar 26, 2023

Raised: https://gitlab.haskell.org/ghc/ghc/-/merge_requests/10189

@Kleidukos
Copy link
Member

Kleidukos commented Apr 22, 2023

Hi from the cabal team! Do y'all know if there is a retrie or hlint rule that can be distributed over communication channels? Perhaps something as simple as replacing unzip with (fmap fst &&& fmap snd)

@mixphix
Copy link
Collaborator Author

mixphix commented Apr 23, 2023

I'm no expert but something along the lines of

- warn: {name: "Avoid NonEmpty.unzip", lhs: "Data.List.NonEmpty.unzip", rhs: "(fmap fst &&& fmap snd)"}

(or Data.Functor.unzip, depending on your preference) may work. I'm not sure how hlint handles qualified names like that.

@ndmitchell
Copy link

@mixphix - that works. Given the user already made the choice to use a function for this, and that &&& isn't all that common, I'd suggest the Data.Functor.unzip variant. Happy to take a patch adding that to HLint, probably with a note explaining and that points at this issue (add , note = "Because the function is being deprecated")

@Bodigrim Bodigrim added the base-4.20 Implemented in base-4.20 (GHC 9.10) label Jun 20, 2024
hubot pushed a commit to ghc/ghc that referenced this issue Oct 3, 2024
hubot pushed a commit to ghc/ghc that referenced this issue Oct 4, 2024
hubot pushed a commit to ghc/ghc that referenced this issue Oct 5, 2024
hubot pushed a commit to ghc/ghc that referenced this issue Oct 5, 2024
hubot pushed a commit to ghc/ghc that referenced this issue Oct 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Approved by CLC vote base-4.20 Implemented in base-4.20 (GHC 9.10)
Projects
None yet
Development

No branches or pull requests