Move lib to a separate repository #381862

Open
infinisil opened this issue Feb 13, 2025 · 27 comments

@infinisil
Member

infinisil commented Feb 13, 2025

Important

This is a long thread, so before making a comment, please read the full proposal and discussion, and only post what hasn't been brought up before. Thanks!

I've previously suggested this on Discourse (and again here), which has garnered a surprising number of likes, so let's consider this! Here's a first draft proposal to discuss.

Motivation

lib (note: not everything, see the plan below) is arguably the most independent part of Nixpkgs: it has effectively no dependency on anything other than itself (and Nix), and virtually no PRs touch both lib and non-lib code (which is also in the lib PR guidelines).

  • The vast majority of Nixpkgs issues and PRs are unrelated to lib, which makes lib development hard to track. Yes there's a label, but e.g. people can't just watch the entire repo.
    • Also, the label can't be applied to issues automatically.
  • Having a separate repo allows more granular repo permissions (though for the start let's keep the status quo)
  • A separate repo allows avoiding the dependency on the rest of Nixpkgs for those who don't need it.
  • With a separate repo, it's impossible for lib to depend on Nixpkgs; this encourages stabilisation of interfaces between them and untangles any messy dependencies between them.
  • With a separate repo, CI becomes less messy and more efficient: no need to evaluate all of Nixpkgs for lib changes, and the other way around too.
  • With a separate repo it's easier to contribute: Less code to navigate, can have a more specific PR and issue template, a more specific CONTRIBUTING.md/README.md, etc.

Plan

  • Figure out which parts of lib should stay in Nixpkgs, off the top of my head at least the systems stuff and maintainers (could also consider moving that to a different repo at some point, but that's orthogonal). Refactor Nixpkgs so that the lib directory is ready to be moved.
  • Set up CI to automatically mirror the new repo to Nixpkgs' lib directory (Nixpkgs is very self-contained right now, and while this could be changed, it's very tricky, maybe controversial and not necessary for now if we just mirror)
    • We need CI to do the syncing, I've done some testing with automated git subtree mirroring before
    • We need CI in Nixpkgs to tell people to PR to the new repo instead
  • Give all Nixpkgs committers write access (we could also consider limiting this to more specific maintainers, but that's orthogonal and more controversial, so let's maintain the status quo for now)
  • Extract the lib-specific CI actions to the new repo, this should include at least part of the Nixpkgs manual build, which builds docs for lib, but also some others
  • Move the option types docs out of the NixOS manual (this should've been done a while ago, but this change forces us to!)
  • Deprecate https://github.com/nix-community/nixpkgs.lib, or alternatively consider whether it can be reused as the source of truth. Could also look at that to get inspiration for the mirroring CI
  • Inform all the existing lib PRs that they should be moved to the new repo now, or do it automatically (probably not worth it, not many lib PRs are actively being worked on)

I've also done a successful split of a Nixpkgs component into a separate repository before with nixpkgs-vet (previously known as nixpkgs-check-by-name), so we can reuse some of that knowledge. I think @willbush and @philiptaron can also attest the benefits of a separate repo :)


Let's get some general consensus on whether we should do this. If we do agree, somebody will have to lead this effort, which might as well be me (as always, probably sponsored by Tweag/Modus Create and Antithesis! ✨) :)

Ping @roberth @hsjobeki @adisbladis @Profpatsch

@infinisil infinisil added 9.needs: community feedback significant Novel ideas, large API changes, notable refactorings, issues with RFC potential, etc. labels Feb 13, 2025
@infinisil infinisil self-assigned this Feb 13, 2025
@nixos-discourse

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/any-interest-in-separating-out-nixpkgs-lib-into-its-own-repo-flake/39374/6

@nixos-discourse

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/monorepos-dont-map-to-our-social-structure/44162/46

@NotAShelf
Member

Splitting lib into its own repo just makes sense to me. It is independent from the rest of Nixpkgs, barely any PRs touch both (as far as I've observed) and tracking lib changes in the main repo feels counterproductive. I think this move would make iteration of nixpkgs.lib cleaner, CI faster, and dependencies more stable. Plus, it makes consuming just the lib cheaper!

+1 for standalone lib!

@sersorrel
Contributor

sersorrel commented Feb 13, 2025

Also, the label can't be applied to issues automatically.

Is there a technical reason that this is impossible? I would naively assume that a GHA workflow could automatically label PRs that touch files in lib. (particularly if there is an assumption that we will have "CI in Nixpkgs to tell people to PR to the new repo instead".)

no need to evaluate all of Nixpkgs for lib changes,

this makes me a tad uneasy. how do you propose to catch lib changes which result in nixpkgs breakage? (not that such cases are necessarily easily caught currently, I'll admit.)

I think I am mostly curious about the detail of how the syncing will work - if new interfaces are added to lib, how long will it be before they can be used in nixpkgs, for example? ("immediate" automated commits to master? automated PRs daily, weekly, ...? would such PRs be merged automatically, or require review?) and would it become more difficult to test (parts of) nixpkgs against a given lib change?

@ConnorBaker
Contributor

Perhaps I missed it — would lib still be usable within Nixpkgs? If so, how would it be added to the fixed point?

@infinisil
Member Author

infinisil commented Feb 14, 2025

@sersorrel

Is there a technical reason that this is impossible? I would naively assume that a GHA workflow could automatically label PRs that touch files in lib.

It does label PRs already, but it can't label (non-PR) issues automatically :)

this makes me a tad uneasy. how do you propose to catch lib changes which result in nixpkgs breakage?

Oh that's a great point, so we should keep running the full eval in the lib repo.

@ConnorBaker

Yup, because if the repo gets mirrored to Nixpkgs' lib directory, there's effectively no change for how Nix evaluates it :)

@nyabinary
Contributor

@sersorrel

Is there a technical reason that this is impossible? I would naively assume that a GHA workflow could automatically label PRs that touch files in lib.

It does label PRs already, but it can't label (non-PR) issues automatically :)

this makes me a tad uneasy. how do you propose to catch lib changes which result in nixpkgs breakage?

Oh that's a great point, so we should keep running the full eval in the lib repo.

@ConnorBaker

Yup, because if the repo gets mirrored to Nixpkgs' lib directory, there's effectively no change for how Nix evaluates it :)

How exactly will this work? Will lib be removed from nixpkgs and placed into a separate repository, or will it simply be cloned, with all lib-related development happening in that new repository and then mirrored back into nixpkgs? If that's the case, how will all the changes made in the separate repository be synced back into nixpkgs?

@infinisil
Member Author

be cloned, with all lib-related development happening in that new repository and then mirrored back into nixpkgs?

Yes that's the plan.

how will all the changes made in the separate repository be synced back into nixpkgs?

Automation magic 🧙

But oh right, didn't answer this yet

I think I am mostly curious about the detail of how the syncing will work - if new interfaces are added to lib, how long will it be before they can be used in nixpkgs, for example? ("immediate" automated commits to master? automated PRs daily, weekly, ...? would such PRs be merged automatically, or require review?) and would it become more difficult to test (parts of) nixpkgs against a given lib change?

My first thought was to sync it immediately with an automated commit, but I guess we should at least run Nixpkgs CI, so we should have PRs. And at that point I think doing a weekly PR that gets automatically merged if CI passes would be best.

@jakehamilton
Contributor

If it helps any, I did rewrite the Nixpkgs lib as its own standalone library last year. If anything, it could be useful for surveying the surface area of things to keep in Nixpkgs.

https://git.auxolotl.org/auxolotl/labs/src/branch/main/lib

@nyabinary
Contributor

nyabinary commented Feb 14, 2025

be cloned, with all lib-related development happening in that new repository and then mirrored back into nixpkgs?

Yes that's the plan.

how will all the changes made in the separate repository be synced back into nixpkgs?

Automation magic 🧙

But oh right, didn't answer this yet

I think I am mostly curious about the detail of how the syncing will work - if new interfaces are added to lib, how long will it be before they can be used in nixpkgs, for example? ("immediate" automated commits to master? automated PRs daily, weekly, ...? would such PRs be merged automatically, or require review?) and would it become more difficult to test (parts of) nixpkgs against a given lib change?

My first thought was to sync it immediately with an automated commit, but I guess we should at least run Nixpkgs CI, so we should have PRs. And at that point I think doing a weekly PR that gets automatically merged if CI passes would be best.

With weekly merges, I can see some gridlock that comes up if someone wants to rely on new lib functionality.

EDIT: regarding things to look at, Project Ekala might be interesting to look at regarding splitting nixpkgs.

@numinit
Contributor

numinit commented Feb 14, 2025

This is a great idea. I already use nixpkgs.lib in a few projects. Let's formalize what's already useful for people who just want lib without downloading an entire nixpkgs tarball!

@fricklerhandwerk
Contributor

fricklerhandwerk commented Feb 14, 2025

While I agree with the key points of the problem statement, I do have a few serious objections to the idea that splitting lib into a separate repo is the only reasonable or by far the optimal solution. There are enough trade-offs to weigh that I request this be discussed RFC-style in a living document.

Specifically, some of these problems are independent, such as only-lib-download size and issue tracking, and treating them more carefully as such would open room for more creative approaches.

@Ma27
Member

Ma27 commented Feb 14, 2025

The one thing I like about working with nixpkgs is that I have a single repository to look at. A single PR to file. A single Hydra jobset to configure (for larger stuff such as glibc updates).

A bit more context on that: https://discourse.nixos.org/t/contributor-retention/12412/42

Granted, lib may be the one thing we could rip out. But whenever I see discussions about the monorepo, the motivation is to not only remove lib, but to split everything into smaller parts (flake or whatever else) which is why I'm pretty wary of these proposals.

Wouldn't it be another option to explore tooling like https://josh-project.github.io/josh/faq.html ? AFAIU this would provide both the monorepo experience to maintainers and the polyrepo experience to consumers. Because I fully agree with the motivation from a consumer & contributor perspective, to not fetch a full nixpkgs tarball or to not clone all of nixpkgs when becoming a contributor[1] that only cares about a specific subsystem.

However, as a maintainer I must say that I'm afraid that this will make it harder for me to contribute to nixpkgs in the long run.

[1] In fact, my nixpkgs checkout is part of my backup and when I get a new machine, I restore it from the backup because I'm not patient enough to wait for a clone.

@roberth
Member

roberth commented Feb 14, 2025

Fetching

For the goal of making nix-community/nixpkgs.lib obsolete, we can implement efficient subdirectory fetching in Nix.
Both the git and github fetchers can implement this efficiently. It just hasn't been a priority, as nixpkgs/lib appears to be the "only" use case, but this is probably just an information problem regarding what people think can be done.
Now you know :)
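
For context, here is a minimal sketch of what pointing a flake at the lib subdirectory looks like today. It assumes that Nixpkgs ships a `lib/flake.nix` exposing a `lib` output, and that the `dir` query parameter only selects where the flake lives. As far as I know the fetcher still downloads the whole repository in this case, which is the gap that efficient subdirectory fetching would close. The `example` output is hypothetical and only shows a lib function being called.

```nix
{
  # `?dir=lib` selects the flake in the lib subdirectory of Nixpkgs.
  # Assumption: the whole Nixpkgs tree is still fetched today.
  inputs.nixpkgs-lib.url = "github:NixOS/nixpkgs/nixos-unstable?dir=lib";

  outputs = { self, nixpkgs-lib }: {
    # Hypothetical output, only here to demonstrate that lib is usable.
    example = nixpkgs-lib.lib.optionalString true "lib is available";
  };
}
```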

@roberth
Member

roberth commented Feb 14, 2025

Forking

My first thought was to sync it immediately with an automated commit, but I guess we should at least run Nixpkgs CI, so we should have PRs. And at that point I think doing a weekly PR that gets automatically merged if CI passes would be best.

The organizational issue around changes can also be solved with a Linux-style forking workflow. This way, NixOS/nixpkgs-lib is just a fork, and can trivially run the same CI.
However, this disincentivizes proper componentization, specifically good testing of lib by the lib test suite.
It'd be more appropriate if we used Linux-style forking elsewhere. E.g. haskell-updates could have its own repo, but we'd have to figure out the cost/benefit of that then. Might feel like overkill from the perspective of the current situation, but splitting the issue tracker, PR list and notifications may be very valuable not just for lib...

@kampka
Contributor

kampka commented Feb 14, 2025

I'm hugely in favor of having a standalone standard nix library that's maintained by the community without splitting the work between 5 or more different projects or attempts. My main question, though, reading this thread, is what the outcome of this issue is supposed to be.
Will it be a nix stdlib with general stability guarantees and nixpkgs as one of many downstreams, or is the intent to just move the nixpkgs lib out into its own repo? If it's the latter, then I don't think it will be as useful or interesting as it could be to non-nixpkgs downstreams.

@pbsds
Member

pbsds commented Feb 14, 2025

I love the idea and I am very much for it 🚀, but I keep thinking of various edge-cases the more and more I think about automatic mirroring.

Should lib->nixpkgs ports target master or staging? Can this be automatically determined? What should happen if change A is merged into nixpkgs master, but then change B which relies on A has to go through staging before it has synced A from master?

What should happen if additional changes are merged into the lib repo while a lib->nixpkgs port PR is still open? Should the port PR be force-pushed (affecting review and CI), or should additional mirroring PRs wait until the previous PR has been merged?

Can lib also be mirrored to release-xx.yy? Should it? Should we have branches like master, release-xx.yy in the lib repo like in nixpkgs and mirror each automatically, or should we instead do backports manually and allow PRs to touch [email protected]/lib?

Should we support mirroring from nixpkgs back to lib? If no: what happens if something is merged into nixpkgs/lib anyway? This could become necessary in staging-next, or it could happen through merging some old PR that has yet to run the "don't touch lib pls" CI action.

Should mirroring only sync the files, or should the commits and their messages also be mirrored? Can we do so while preserving commit signatures? Should we rewrite commit messages such that references to PRs and issues reference the correct repo? (i.e. nixos/nixpkgs#123 and nixos/lib#123)

@roberth
Member

roberth commented Feb 14, 2025

Rebuilds

nixpkgs/lib changes basically never cause rebuilds, with the possible exclusion of lib.systems, which is its own thing anyway that probably shouldn't even be in a non-Nixpkgs standard library.

If we want to make it a proper component, we should move lib.systems to pkgs.systems or similar, and not manage it as part of the separate lib repo.

Nixpkgs flake lib layout

This touches on the mistake of exposing lib as nixpkgs-the-flake.lib, leaving no real room for flake-level library functions such as nixosSystem or the nixos entrypoint function library. Tangling those things into a single ball is confusing to users, and causes issues when they extend this antipattern, or should I say extend.

So concretely the flake should look like, pseudocode (not arguing for a lib flake input just yet):

  # `inputs@` binds the whole argument set so that `inputs.lib` can be referenced.
  outputs = inputs@{ self, lib, ... }:
  {
    lib = {
      # `inherit` cannot take a dotted path, so lib is passed explicitly.
      nixpkgs.systems = import ./pkgs-lib/systems.nix { lib = inputs.lib.lib; };
      nixos = import ./nixos/lib/default.nix {
        lib = inputs.lib.lib;
        # lib.nixpkgs should also provide the Nixpkgs entrypoint
        # The import <nixpkgs> interface is messy due to the legacy `system` and `*System` params.
        nixpkgs-lib = self.lib.nixpkgs;
      };
      lib = inputs.lib.lib;
    };
    # ...
  }

Rename

Part of the confusion is also that lib is a generic term.
The new lib with its reduced scope could get a different name. liblogic? It's about values and functions, so more like "math" and not about really domain-specific things. Could bikeshed in a new thread. Realistically though, lib -> liblogic would have to be done basically everywhere and I'm not up for it.

@nyabinary
Contributor

The one thing I like about working with nixpkgs is that I have a single repository to look at. A single PR to file. A single Hydra jobset to configure (for larger stuff such as glibc updates).

A bit more context on that: https://discourse.nixos.org/t/contributor-retention/12412/42

Granted, lib may be the one thing we could rip out. But whenever I see discussions about the monorepo, the motivation is to not only remove lib, but to split everything into smaller parts (flake or whatever else) which is why I'm pretty wary of these proposals.

Wouldn't it be another option to explore tooling like https://josh-project.github.io/josh/faq.html ? AFAIU this would provide both the monorepo experience to maintainers and the polyrepo experience to consumers. Because I fully agree with the motivation from a consumer & contributor perspective, to not fetch a full nixpkgs tarball or to not clone all of nixpkgs when becoming a contributor[1] that only cares about a specific subsystem.

However, as a maintainer I must say that I'm afraid that this will make it harder for me to contribute to nixpkgs in the long run.

[1] In fact, my nixpkgs checkout is part of my backup and when I get a new machine, I restore it from the backup because I'm not patient enough to wait for a clone.

I think we should explore Josh more closely, never heard of it before, but it seems great.

@teto
Member

teto commented Feb 14, 2025

I fail to see a compelling reason to kill one of the strengths of nixpkgs, aka the monorepo. Maybe because I don't contribute to lib/ that much (might have added 5/6 functions and some tests). But I do spend a lot of time grepping for stuff in lib/ because nix is not typed and, as a consequence, I often have to consult those functions to understand what's going on.

To me the main reason would be to have a smaller repo to clone when you don't need nixpkgs, but that's already solved by nixpkgs.lib.
As for the other motivations, most of them apply to the whole of nixpkgs or to stuff that I haven't felt was a problem (once again I might be wrong; is the stabilisation of interfaces really a problem? I feel like we are now quite good at that).

@numinit
Contributor

numinit commented Feb 14, 2025

Well, nixpkgs will already be pulling in lib, could we expose it the same way we already do? I don't think anything necessarily needs to break.

@SomeoneSerge
Contributor

SomeoneSerge commented Feb 14, 2025

I only skimmed through the comments, sorry if I am repeating something that was already voiced.

I don't mean to discourage, but my impression is that the main motivation here is to enforce boundaries: ensure that lib is independent of the rest of Nixpkgs. The separate git repo approach feels like hammering a nail with a sledgehammer; we just need some other tools for determining and ensuring the direction of "arrows". The size argument is countered by git supporting sparse check-outs, and that's in spite of git being a really bad tool for the job. Labels and ACL also sound like a GitHub problem?

EDIT(2025-02-15): A later comment of @kampka also mentions flakes, which I hadn't addressed but other people have, downloading the full checkout of Nixpkgs. A predictable knee-jerk reaction is that the solution is neither a fork nor a split-off, but adding a sparse checkout support to flakes
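
One way the "direction of arrows" could be checked inside the monorepo is sketched below. This is an illustration under stated assumptions, not an existing Nixpkgs check: the file name `check-lib-isolated.nix` is hypothetical, and the expression is assumed to sit at the root of a Nixpkgs checkout. It copies only the lib directory into the Nix store and evaluates a few functions from that copy, so code that reaches outside lib through a relative path fails to evaluate.

```nix
# check-lib-isolated.nix: hypothetical file at the root of a Nixpkgs checkout.
let
  # Copy only the lib directory into the Nix store. Inside that copy, a
  # relative path such as ../pkgs points at nothing, so evaluating code
  # that reaches outside lib fails.
  isolatedLib = import (builtins.path {
    path = ./lib;
    name = "lib-isolated";
  });
in
{
  # Evaluation is lazy, so only the code paths forced here are checked.
  joined = isolatedLib.concatStringsSep "," [ "a" "b" ];
  numbers = isolatedLib.lists.range 1 3;
}
```

It could be evaluated with `nix-instantiate --eval --strict check-lib-isolated.nix`. Because of laziness this only covers the attributes listed, so a real check would need to force much more of lib, for example by running the lib test suite against the isolated copy.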

@kampka
Contributor

kampka commented Feb 14, 2025

[...] my impression is that the main motivation here is to enforce boundaries: ensure that lib is independent of the rest of Nixpkgs.

That is definitely not the only motivation. The nixpkgs lib is useful in and of its own to build downstream projects that do not require nixos or pkgs. The current monorepo structure requires such projects to download a 65+MB tarball as a flake just for lib.
For that reason, people have already started maintaining an outside fork of nixpkgs lib.
The question here is whether we want to have an "official" lib that is useful, instead of community ports.
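
As a sketch of the lib-only use case described above, a downstream flake can depend on the community mirror named in the proposal instead of the full Nixpkgs tarball. This assumes that `nix-community/nixpkgs.lib` exposes a `lib` flake output; an official lib repository, if one is created, would be substituted as the input. The `example` output is hypothetical and only shows lib functions in use.

```nix
{
  description = "Sketch: depend on lib only, without the full Nixpkgs tarball";

  # Assumption: the community mirror mentioned in the proposal.
  inputs.nixpkgs-lib.url = "github:nix-community/nixpkgs.lib";

  outputs = { self, nixpkgs-lib }:
    let
      lib = nixpkgs-lib.lib;
    in
    {
      # Hypothetical output, only here to show lib functions being called.
      example = lib.concatMapStringsSep ", " lib.toUpper [ "a" "b" "c" ];
    };
}
```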

@infinisil
Member Author

There is enough weighing trade-offs that I request this should be discussed RFC-style in a living document.

Agreed, as the next step towards this I'd create a living RFC-like document that includes all the arguments people have brought up in this issue, and PR it to the Nixpkgs repo for further discussion (not a fan of the RFC process). Then we can also use GitHub threads!


Because this issue is already pretty long, I'd like to propose something procedural before it gets much longer: Before making a comment, read the full proposal and discussion, and only post what hasn't been brought up before. Since this would be turned into an RFC-like document before being decided, the main goal with this issue should be to just collect arguments for and against this change, there's no need to convince everybody that we should or should not do it. Whether we end up doing this shouldn't be decided by the number of comments/reactions for/against, but rather by solid arguments, so let's make sure we're not missing anything :)

@nyabinary
Contributor

I think we have to address accessibility too, like monorepos have numerous benefits to nixpkgs IMO and the main problem people have with them is the processes surrounding it rather than the monorepo itself. So, is there any way we can improve the processes? We have a shit-ton of PRs open, and it's easy to get lost in the sauce per se, so how can we improve that?

@SomeoneSerge
Contributor

After a night's sleep, I feel like highlighting one more point: one appeal I see in this proposal is that with a separate repository we'd be better incentivized to relax the dependency of Nixpkgs on lib (NB: direction reversed wrt previous comments) as well. Namely, we'd have to stop assuming that pkgs and nixos are used with the one revision of lib, and maybe introduce separate semvers for pkgs and lib, because they can advance at different paces. I find this desirable. That said, I still don't think the split-off is the way to achieve this; an extended CI is (the same applies to mixing nixpkgs versions for pkgs and nixos).

@roberth
Member

roberth commented Feb 15, 2025

collect arguments for and against this change

Multiple problems and multiple solutions have been discussed. Not sure if this format would do the landscape justice, but we'll see.

stop assuming that pkgs and nixos are used with the one revision of lib,

The current stability and simplicity do prevent problems, and most of all, user-facing complexity.
lib should not nearly be the main character in our story, let's say. It can be a lot of fun to obsess over changes, but they generally don't impact users at all, while proportionally they do cause a lot of work for them.

A complete redesign that lives beside a legacy lib seems more appropriate to me, but I would hold off on that until pipe operators are accepted, and then get everyone on board with a redesign, including everyone who has worked on alternative libraries.
Only under those conditions do I think we could carefully push significant change onto users.
