Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pin only major/minor plugin version with + suffix #4340

Closed
wants to merge 3 commits into from

Conversation

bentsherman
Copy link
Member

Close #4329

This PR changes the plugin system to not download plugins in offline mode (NXF_OFFLINE). Instead, Nextflow will search the plugins directory for the latest plugin version and use that one.

This change makes it possible to run a pipeline in offline mode without pinning the version of each plugin, which allows the pipeline to automatically receive plugin updates. The assumption is that users will download all the plugins they need beforehand and transfer them to the offline system, using these plugins as a local cache.

I tested this PR by installing nf-validation 0.3.0 (nextflow plugin install [email protected]) and running a script with the following config:

plugins {
  id 'nf-validation'
}

Normally, Nextflow would automatically download the newest plugin version (currently 0.3.2), but now if you run Nextflow with NXF_OFFLINE=true, it will use the already downloaded version 0.3.0 and not download anything new. I even disabled my internet connection just to be sure 😉

One limitation however is that the offline version checker doesn't verify that a plugin version is compatible with the current Nextflow version. We can get the version range from the plugin manifest file, but maybe there is some code somewhere that already does this?

@netlify
Copy link

netlify bot commented Sep 25, 2023

Deploy Preview for nextflow-docs-staging canceled.

Name Link
🔨 Latest commit 1de5e8b
🔍 Latest deploy log https://app.netlify.com/sites/nextflow-docs-staging/deploys/65171748e8eb04000819e2f4

Copy link
Member

@ewels ewels left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Super nice, thanks @bentsherman!

This problem is currently causing a lot of pain within nf-core at the moment, since we recently added the first non-core plugin to all pipelines (nf-validation). So suddenly, everyone running offline is hitting a wall.

Getting this PR merged + released ASAP would be a great help. So I'd mark this as one of my highest priority Nextflow PRs currently @pditommaso 🙏🏻

@pditommaso
Copy link
Member

Need to review it. AFAIK there's already in place a logic to pick the current installed plugin in place

@bentsherman
Copy link
Member Author

@ewels in the meantime, can they not work around the issue by specifying the list of plugins with versions in their config or CLI? That should work without needing to fork the pipeline code.

@jfy133
Copy link
Contributor

jfy133 commented Sep 28, 2023

How does NXF_OFFLINE get determined exactly? Does this need to be physically set or will Nextflow assume that if it can't contact a server it should set that env for you?

Overly specific context: I ask as one of my bugbears since the nf-validation plugin being added to nf-core pipelines, is when developing on German trains where the internet connection is awful is I will start testing a pipeline, it works a couple of times, then the internet goes and then the next test run fails because it can't download the pipeline because the internet connection has dropped. That keeps happening until the connection comes back.

It would be nice for the default (IMO) is to always look for a plugin in cache unless a new version is specified, not just when NXF_OFFLINE is specified (in my context above, I don't want to keep having to sporadically having to set/deactivate the parameter...)

@pditommaso
Copy link
Member

It would be nice for the default (IMO) is to always look for a plugin in cache unless a new version is specified

this is what exactly happens for core plugins. For non-core plugins, when no version is specified, nextflow checks for the latest version remotely

@jfy133
Copy link
Contributor

jfy133 commented Sep 28, 2023

It would be nice for the default (IMO) is to always look for a plugin in cache unless a new version is specified

this is what exactly happens for core plugins. For non-core plugins, when no version is specified, nextflow checks for the latest version remotely

Is there a particular reason why the same behaviour could not also apply for non-core (assuming the plugin has already been cached)? Edit: As Ben asked in the issue from Phil

@pditommaso
Copy link
Member

Actually I want to test this scenario

@pditommaso
Copy link
Member

I confirm a plugin has a pinned version e.g. [email protected] and it's configure locally, nextflow does not try to download it

@ewels
Copy link
Member

ewels commented Sep 28, 2023

@ewels in the meantime, can they not work around the issue by specifying the list of plugins with versions in their config or CLI? That should work without needing to fork the pipeline code.

Yes that's what we're doing (actually I forgot that you could do it on the CLI, that's nicer). But I've walked people through this about 5 times in the past week and it's getting tiring 😅 We're also starting to get XY questions related to stripping the nf-validation plugin out of pipeline code, which is bad.

I confirm a plugin has a pinned version e.g. [email protected] and it's configure locally, nextflow does not try to download it

Yup, we know - #4329 is specifically about making this work without pinning.

  • We don't want to pin the plugin version in nf-core pipelines as then we can't ship bugfixes to the plugin.
  • We also don't want to introduce the overhead of making every offline user set up their own config to pin the plugin, just so that the pipeline runs.

The folks at @SciLifeLab @NationalGenomicsInfrastructure are trying to launch from Tower in an offline environment, and they were struggling to get the custom config with the pinned plugin versions to resolve correctly at all, so this is a production-blocking problem for them. Can go into more detail on that in a separate issue if you're interested, I don't want to derail the thread here about Tower config resolution.

I'm hoping that this PR means that non-core plugins will "just work" for people running offline.

Signed-off-by: Ben Sherman <[email protected]>
@ewels
Copy link
Member

ewels commented Nov 1, 2023

This PR is getting very urgent. There's a long thread over in the @nf-core slack about pinning exact plugin versions in all pipelines, because so many people are hitting difficulties fetching plugins in offline systems / systems with patchy internet.

My current thinking is that we should extend the logic for plugins to work in the same manner as Nextflow itself: if a plugin is not pinned, use the latest local version available if there is one.

Currently anyone developing a pipeline on a train can suddenly not test it when they go into a tunnel, even if they've run it 20 times in the past ten minutes. Nextflow requires reaching out to the internet on every run if a plugin is not pinned. We don't want to pin plugins in pipeline code, because pipeline release cycles can be very slow.

What ideally what we need ASAP is:

  • This PR to be merged, preferably without needing NXF_OFFLINE, instead just working like this for all runs
  • To be able to pin minimum versions in pipelines, not only exact versions: Pinning plugin major version #4036

@pditommaso
Copy link
Member

I'm not sure we should encourage the use of unversioned plugins. Even minor version changes can introduce unexpected behaviour.

Likely this has been a mistake done, it would be a practice to make the plugin version mandatory. This a common best practice for many package managers.

To solve the problem of offline execution, I think the plugins should be downloaded along with the pipeline assets, and make nextflow able to recognise the plugins in local project assets path.

How are the pipelines made available in such environment, if there's no network connection?

@bentsherman
Copy link
Member Author

How are the pipelines made available in such environment, if there's no network connection?

You start in an environment that is connected to the internet but also has e.g. SSH access to the offline environment. You download everything you need (pipeline, plugins, containers) and then transfer it to the offline environment.

@bentsherman
Copy link
Member Author

I sketched out a process for preparing an offline environment here: #4330 (comment)

Now that we have nextflow inspect, we could have another command like nextflow download or extend nf-core download to perform this whole process. I originally imagined having to specify the plugins that you need, but I guess Nextflow could also infer it from the configuration.

@pditommaso it seems a lot of nf-core folks would agree with you to simply pin the plugin versions from now on. Still, it makes good sense to make NXF_OFFLINE also apply to plugins. The expectation of offline mode is that it allows you to work completely offline -- no downloading pipeline code, plugins, or other software deps.

@pditommaso
Copy link
Member

Still, it makes good sense to make NXF_OFFLINE also apply to plugins

I agree! as such I think this PR should:

  1. when NXF_OFFLINE=false, keep the current behaviour
  2. when NXF_OFFLINE=true
  • throw an error if the plugin version is missing (do not check remote latest)
  • throw an error if a specific version of the plugin is not available locally

A separate PR to enforce the plugin version for non-core plugins.

Does it make sense ?

@ewels
Copy link
Member

ewels commented Nov 3, 2023

I think the key part of my request is getting missed.

No-one wants unversioned plugins. But there is a middle way:

  • No version at all - bad
  • 👉🏻 Minimum version - good 👈🏻
  • Exact version - bad

I want to be able to say >= 1.4.0 - basically every package manager can do this.

This allows us to patch bugs in plugins without having to go and manually update and re-release ~100 nf-core pipelines.

It's the same behaviour that we have for Nextflow, where we also specify a minimum required version. Again, minor changes in Nextflow can introduce unexpected behaviour, but no-one pins exact versions of Nextflow into pipeline code.

This is the second of my two bullet points, which I originally requested in #4036 (but we hit a wall in implementation there).

@bentsherman
Copy link
Member Author

I agree it would be useful to specify >= to automatically fetch patch releases. In offline mode, Nextflow could simply grab the minimum version from the local cache.

However, suppose I specify >= 1.4.0, then Nextflow downloads 1.4.1, then I go into offline mode. Then it is not clear how Nextflow should recover 1.4.1 from the local cache when the pipeline specified 1.4.0 as the minimum version. It needs the plugin index file from GitHub, or we need to have some kind of lock file for plugins.

For now, I have created a separate PR #4487 to improve the error message with the expectation that pipeline devs will have to pin their plugin versions. That will give a solution for now while we debate over longer-term options.

@marcodelapierre marcodelapierre added this to the 23.11.0-edge milestone Nov 7, 2023
@bentsherman bentsherman removed this from the 23.11.0-edge milestone Nov 7, 2023
@ewels
Copy link
Member

ewels commented Nov 7, 2023

Then it is not clear how Nextflow should recover 1.4.1 from the local cache when the pipeline specified 1.4.0 as the minimum version.

This is just standard semver version matching, no? We have a list of versions that we have locally, we have a range of accepted versions that we can use. It's just a case of getting a subset of acceptable versions and then picking the one with the most recent version number..

Nextflow is already using the jsemver library to work with version numbers, and that has functionality to compare versions.

@bentsherman
Copy link
Member Author

bentsherman commented Nov 15, 2023

Okay, based on our discussion from the last weekly meeting, we will adapt this PR to provide a >= mechanism to fetch the latest patch release (e.g. 1.0.x) or use the local one when offline. The semver library can be used for this.

This way, plugin versions must be pinned to a specific version or minor release when offline.

@bentsherman bentsherman self-assigned this Nov 21, 2023
@bentsherman bentsherman changed the title Disable automatic plugin download in offline mode Pin only major/minor plugin version with >= Mar 24, 2024
@bentsherman bentsherman requested a review from a team as a code owner April 3, 2024 14:39
@bentsherman bentsherman marked this pull request as draft April 29, 2024 14:47
@bentsherman bentsherman changed the title Pin only major/minor plugin version with >= Pin only major/minor plugin version with + suffix May 28, 2024
@bentsherman
Copy link
Member Author

Given the discussion here -- #5016 (comment) -- regarding the meaning of + when checking the Nextflow version, we should use that instead of >= to implement this feature, so that they are consistent.

I always thought + just meant >=, but actually it means "use this exact version or a greater patch version"... which is exactly what we want here

@bentsherman
Copy link
Member Author

Closing in favor of #5435

@bentsherman bentsherman deleted the 4329-offline-disable-plugin-update branch October 25, 2024 13:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Plugins: Use any available cached version if NXF_OFFLINE is true
5 participants