No non-zero exit code when running show over a non-existent module #743

Open
xdelaruelle opened this issue Jan 8, 2025 · 6 comments

@xdelaruelle
Contributor

Hello Lmod Team,

To be more in line with Lmod, I have made the modulecmd.tcl command return a non-zero exit code when a module evaluation fails. This is the usual behavior for any command when something goes wrong, for instance, in our field, when trying to evaluate a module that does not exist.

I am working to adapt EasyBuild to this behavior change in Environment Modules v5.5 (easybuilders/easybuild-framework#4739) and I found something that may be of interest to you.

Describe the issue
EasyBuild internally runs the show sub-command to determine whether a module exists.

When trying to load a non-existent module, the lmod script sets a non-zero exit code.

But when running the show sub-command, the lmod script sets a zero exit code whether the module exists or not.

To Reproduce

$ $LMOD_CMD bash --version

Modules based on Lua: Version 8.7.55 (8.7.55-5-g1f416cc2a) 2025-01-08 07:32 +01:00
    by Robert McLay [email protected]

$ $LMOD_CMD bash load unknown_module
Lmod has detected the following error:  The following module(s) are unknown: "unknown_module"

Please check the spelling or version number. Also try "module spider ..."
It is also possible your cache file is out-of-date; it may help to try:
  $ module --ignore_cache load "unknown_module"

Also make sure that all modulefiles written in TCL start with the string #%Module




false
$ echo $?
1
$ $LMOD_CMD bash show unknown_module
Lmod Warning:  Failed to find the following module(s): "unknown_module" in your MODULEPATH
Try:

    $ module spider unknown_module

to see if the module(s) are available across all compilers and MPI implementations.



MODULEPATH=/usr/share/Modules/modulefiles:/etc/modulefiles:/usr/share/modulefiles;
export MODULEPATH;
_ModuleTable001_=X01vZHVsZVRhYmxlXyA9IHsKTVR2ZXJzaW9uID0gMywKY19yZWJ1aWxkVGltZSA9IDcyMDAuMCwKY19zaG9ydFRpbWUgPSAwLjA2NjI1MTAzOTUwNTAwNSwKZGVwdGhUID0ge30sCmZhbWlseSA9IHt9LAptVCA9IHt9LAptcGF0aEEgPSB7CiIvdXNyL3NoYXJlL01vZHVsZXMvbW9kdWxlZmlsZXMiLCAiL2V0Yy9tb2R1bGVmaWxlcyIsICIvdXNyL3NoYXJlL21vZHVsZWZpbGVzIiwKfSwKc3lzdGVtQmFzZU1QQVRIID0gIi91c3Ivc2hhcmUvTW9kdWxlcy9tb2R1bGVmaWxlczovZXRjL21vZHVsZWZpbGVzOi91c3Ivc2hhcmUvbW9kdWxlZmlsZXMiLAp9Cg==;
export _ModuleTable001_;
_ModuleTable_Sz_=1;
export _ModuleTable_Sz_;
$ echo $?
0

Expected behavior
Since the show sub-command evaluates a module just as the load sub-command does, it may be worth producing an error (and having the lmod script set a non-zero exit code) when trying to evaluate a non-existent module.
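
For illustration, once show fails on a missing module, a caller could rely on the exit status alone. This is only a minimal sketch, assuming $LMOD_CMD points at the lmod script as in the reproduction above; module_exists is a hypothetical helper name, not part of Lmod or EasyBuild:

```shell
# Hypothetical helper (not part of Lmod or EasyBuild): succeed only if
# "show" exits with status 0 for the given module name.
# Assumes $LMOD_CMD is set, as in the reproduction above.
module_exists() {
    # The generated shell code (stdout) and the messages (stderr) are
    # discarded; only the exit status is used.
    "${LMOD_CMD:?LMOD_CMD is not set}" bash show "$1" >/dev/null 2>&1
}
```

With Lmod 8.7.55 as shown above, module_exists unknown_module would still succeed, because show exits with 0 either way.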

Desktop (please complete the following information):

  • OS: Linux
  • Linux distribution: Fedora 40
  • Lmod Version: 8.7.55
@rtmclay
Member

rtmclay commented Jan 9, 2025

Thanks very much for reporting this issue. I have uploaded branch IS743-show to GitHub that fixes this for me. The fix comes down to changing the warning to an error when show cannot find a module.

@xdelaruelle: Please test out this branch if you get the chance and report back.
@boegel: Could you also test this change with EasyBuild?

Thanks!

@xdelaruelle
Contributor Author

Many thanks, Robert. I have tested the new branch and it gives me the expected result with the show sub-command.

I have also run the modules and modulestool sections of the easybuild-framework test suite against this new Lmod version. With the regular easybuild-framework code from the 5.0.x branch, the tests test_exist and test_modpath_extensions_for fail. But with the patch proposed in easybuilders/easybuild-framework#4739, all tests pass.

Input from @boegel is important to determine whether this change breaks existing installations of EasyBuild using Lmod when people update Lmod to the latest version.
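
As one possible option for callers that have to cope with Lmod releases on both sides of this change, either signal could be treated as "missing": a non-zero exit status, or the "Failed to find" warning text. The sketch below is an assumption for illustration only, not the approach taken in easybuilders/easybuild-framework#4739; module_exists_compat is a hypothetical name and $LMOD_CMD is assumed to point at the lmod script:

```shell
# Hypothetical helper (not taken from EasyBuild or Lmod): treat a module as
# missing if "show" exits non-zero OR prints the "Failed to find" message
# seen in the output above. Assumes $LMOD_CMD points at the lmod script.
module_exists_compat() {
    rc=0
    # Capture stderr only: 2>&1 first points stderr at the capture pipe,
    # then stdout (the generated shell code) is sent to /dev/null.
    err=$("$LMOD_CMD" bash show "$1" 2>&1 >/dev/null) || rc=$?
    [ "$rc" -eq 0 ] || return 1
    case $err in
        *"Failed to find the following module(s)"*) return 1 ;;
    esac
    return 0
}
```

This depends on the message text staying stable; if the message is suppressed or reworded, only the exit status remains as a signal.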

@rtmclay
Member

rtmclay commented Jan 27, 2025

Closing this issue. If easybuild has an issue then please feel free to re-open. Thanks again for the bug-report.

@boegel
Contributor

boegel commented Feb 9, 2025

Sorry for not answering here yet...

While I support this change (a non-zero exit code for module show on a non-existing module file makes perfect sense), this is a backwards-incompatible change (introduced in a bugfix release of Lmod, so semantic versioning-wise that's quite annoying).
EasyBuild users will run into trouble very easily after upgrading Lmod to 8.7.56, see also easybuilders/easybuild-framework#4759

While the problem is already being fixed via easybuilders/easybuild-framework#4739 (thanks @xdelaruelle!), this essentially makes all current EasyBuild releases incompatible with Lmod 8.7.56+, which seems quite harsh...
The changes in easybuilders/easybuild-framework#4739 make sense, but will only be integrated in the upcoming EasyBuild v5.0 release (there will be no further EasyBuild 4.x releases), which makes this situation even more painful.

Is there something we can do to prevent this?

EasyBuild has set $LMOD_QUIET to 1 since June 2014 (EasyBuild v1.14.0).
Can the exit code of module show be kept to always being zero in that case, to avoid breaking compatibility between existing (pre-5.0) EasyBuild releases and recent Lmod versions?
Not sure if it helps, but if you want to make that exception even more specific to EasyBuild, it could only be done when Lmod is being called with python as a shell (since EasyBuild actually runs $LMOD_CMD python show, not module show).

@rtmclay Given the impact of this breaking change, can we re-open this issue to see if we can come up with something to avoid the problems we see with Lmod 8.7.56 in EasyBuild?

@rtmclay rtmclay reopened this Feb 15, 2025
rtmclay pushed a commit that referenced this issue Feb 19, 2025
@rtmclay
Member

rtmclay commented Feb 19, 2025

I have uploaded the changes you requested to branch IS743-show. Namely, module show noSuchModule generates an error unless $LMOD_QUIET is set. In that case Lmod calls LmodWarning() instead, but since LMOD_QUIET is set, no warnings are produced.

@boegel Please test this branch to see if it works for you.
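
The behavior described above can be checked by comparing the exit status of show with and without LMOD_QUIET. A small sketch, assuming $LMOD_CMD points at the lmod script; show_status and compare_quiet are hypothetical helper names, not part of Lmod:

```shell
# Hypothetical helpers for checking the behavior described above.
# Assumes $LMOD_CMD points at the lmod script.

# Print the exit status of "show" for one module, discarding all output.
show_status() {
    status=0
    "$LMOD_CMD" bash show "$1" >/dev/null 2>&1 || status=$?
    echo "$status"
}

# Print the status without LMOD_QUIET and with LMOD_QUIET=1.
# Each run happens in a command-substitution subshell, so the caller's
# environment is left untouched.
compare_quiet() {
    default_rc=$(unset LMOD_QUIET; show_status "$1")
    quiet_rc=$(LMOD_QUIET=1; export LMOD_QUIET; show_status "$1")
    echo "default=$default_rc quiet=$quiet_rc"
}
```

On a build that behaves as described, compare_quiet noSuchModule would be expected to report a non-zero default status and quiet=0.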

@boegel
Contributor

boegel commented Feb 19, 2025

@rtmclay Works like a charm!

$ ml --version

Modules based on Lua: Version 8.7.56 (8.7.56-2-g61daab167) [branch: IS743-show] 2025-02-19 08:10
    by Robert McLay [email protected]
$ module show binutils/2.37
Lmod has detected the following error:  Failed to find the following module(s): "binutils/2.37" in your MODULEPATH
Try:

    $ module spider binutils/2.37

to see if the module(s) are available across all compilers and MPI implementations.

$ echo $?
1
$ LMOD_QUIET=1 module show binutils/2.37

$ echo $?
0

I've also verified that it fixes the incompatibility issue with EasyBuild.

So will this become Lmod 8.7.57?
