Intermittent cross-comp errors when many builds are queued #258

Open
MattSturgeon opened this issue Aug 18, 2024 · 3 comments

@MattSturgeon
Member

e.g. from this build:

cannot build on 'ssh-ng://[email protected]': error: failed to start SSH connection to '[email protected]'
Failed to find a machine for remote build!
derivation: sjilr3pnp4l1mcl6pfry3yaz4nxyam08-plugins-utils-dashboard.drv
required (system, features): (aarch64-linux, [])
2 available machines:
(systems, maxjobs, supportedFeatures, mandatoryFeatures)
([aarch64-linux], 80, [benchmark, big-parallel, gccarch-armv8-a, kvm, nixos-test], [])
([aarch64-darwin, x86_64-darwin], 8, [big-parallel], [])
error: a 'aarch64-linux' with features {} is required to build '/nix/store/sjilr3pnp4l1mcl6pfry3yaz4nxyam08-plugins-utils-dashboard.drv', but I am a 'x86_64-linux' with features {benchmark, big-parallel, kvm, nixos-test}

This is frustrating, because it usually happens when the load is high, and the only solution is to attempt a re-build of the entire nix-eval, wasting resources further.

@MagicRB
Contributor

MagicRB commented Aug 18, 2024

This seems like an infra/nix issue, not a buildbot-nix issue. There may be a way to detect this class of failures and retry automatically, though.
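One way such detection could look is a small log classifier. This is only a sketch, not anything buildbot-nix currently does: the function name `is_transient_failure` and the pattern list are hypothetical, and the two patterns are simply taken from the log excerpt in the original report. A real list would need tuning against actual infra failures.

```python
import re

# Hypothetical patterns, copied from the log excerpt in this issue.
# They are an assumption about what "transient" looks like, not an exhaustive list.
TRANSIENT_PATTERNS = [
    re.compile(r"failed to start SSH connection to"),
    re.compile(r"Failed to find a machine for remote build!"),
]


def is_transient_failure(log: str) -> bool:
    """Return True if the build log matches any known transient-failure pattern."""
    return any(pattern.search(log) for pattern in TRANSIENT_PATTERNS)
```

A build whose log matches could then be retried on its own, while ordinary build failures (compile errors, failing tests) would still be reported as usual.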

@MattSturgeon
Member Author

This seems like an infra/nix issue, not a buildbot-nix issue.

Thanks for the pointer, I've opened nix-community/infra#1416.

I'll close this for now, unless you think it's worth pursuing a workaround in buildbot-nix?

@MattSturgeon closed this as not planned on Aug 18, 2024
@MagicRB
Contributor

MagicRB commented Aug 18, 2024

I think handling this case would be nice as a long-term thing. Computers suck and stuff happens, sometimes only temporarily. It would be nice not to involve a human and to let the computer clean up its own mess, if you ask me.
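As a rough illustration of what "cleaning up its own mess" might mean, here is a bounded retry loop with exponential backoff. Everything in it is an assumption made for the sketch: `run_with_retries`, the `build` callable (standing in for re-running a single failed build rather than the whole nix-eval), the `should_retry` predicate, and the default delays are all hypothetical and not part of buildbot-nix.

```python
import time
from typing import Callable, Tuple


def run_with_retries(
    build: Callable[[], Tuple[bool, str]],
    should_retry: Callable[[str], bool],
    max_attempts: int = 3,
    base_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[bool, str, int]:
    """Run `build` until it succeeds, fails non-transiently, or attempts run out.

    `build` returns (succeeded, log). `should_retry` decides from the log
    whether a failure looks transient. Returns (succeeded, last_log, attempts).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        ok, log = build()
        # Stop on success, on the last attempt, or on a failure that is not transient.
        if ok or attempt == max_attempts or not should_retry(log):
            return ok, log, attempt
        # Back off before retrying: base_delay, then 2x, 4x, ...
        sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

The backoff matters here because the failures were reported to occur mostly under high load, so retrying immediately would likely hit the same overloaded builder. Capping the attempts keeps a genuinely broken builder from being retried forever.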

@MagicRB reopened this on Aug 18, 2024
@MagicRB added this to the Future milestone on Aug 18, 2024