Summary
Enhance error handling and logging to improve debugging and user experience.
Proposed Improvements
Better Error Messages
- Improve plugstack.conf error message (main.cpp:341-353)
  - Add examples of correct configuration
  - Suggest fixes for common mistakes
  - Include link to documentation
- Add context to error messages
    slurm_error("singularity-exec: Failed to set container name '%s'. "
                "The option was already set to '%s'. "
                "Check for duplicate options in job script or command line.",
                optarg, s_container_name.c_str());
Error Propagation from Wrapper Script
- Capture singularity error messages and forward to Slurm logs
- Add exit code mapping for common errors
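    # In wrapper script, immediately after the singularity exec call.
    # Save $? first: the [ test and echo would otherwise overwrite it.
    rc=$?
    if [ "$rc" -ne 0 ]; then
        echo "Error: Singularity exec failed with exit code $rc" >&2
        exit "$rc"
    fi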
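The exit code mapping could look roughly like the sketch below. The helper names `describe_exit_code` and `report_failure` are hypothetical and not part of the current wrapper. The meanings of 126, 127 and 128+N follow POSIX shell conventions; treating 255 as a fatal error from the container runtime is an assumption that should be checked against the installed Singularity version.

```shell
# Hypothetical helper: turn an exit status into a short hint.
# 126, 127 and 128+N are POSIX shell conventions; treating 255 as a
# fatal error from the container runtime is an assumption.
describe_exit_code() {
    case "$1" in
        0)   echo "success" ;;
        126) echo "command found but not executable (check permissions)" ;;
        127) echo "command not found (is singularity in PATH on the compute node?)" ;;
        255) echo "fatal error reported by the container runtime" ;;
        *)
            if [ "$1" -gt 128 ] 2>/dev/null; then
                echo "terminated by signal $(( $1 - 128 ))"
            else
                echo "command failed"
            fi
            ;;
    esac
}

# Hypothetical helper: print the status and its hint on stderr,
# then hand the original status back to the caller.
report_failure() {
    rc=$1
    if [ "$rc" -ne 0 ]; then
        echo "singularity-exec: wrapper failed with exit code $rc: $(describe_exit_code "$rc")" >&2
    fi
    return "$rc"
}
```

In the wrapper this would be called directly after the singularity call, for example `report_failure $? || exit $?`, so the original exit status still reaches Slurm.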
Structured Logging Levels
- Document current logging levels (error, debug, verbose)
- Add more granular debug levels
- Consider environment variable: SLURM_SINGULARITY_LOG_LEVEL
- Log configuration summary at debug level
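As a rough sketch of how `SLURM_SINGULARITY_LOG_LEVEL` might be honored on the wrapper side: the functions `log_level_num` and `log_msg` are hypothetical, and the ordering error < verbose < debug is an assumption borrowed from Slurm's own level ordering. Unknown or unset values fall back to `error`.

```shell
# Hypothetical: map a level name to a number (higher = more detail).
# The ordering error < verbose < debug is an assumption.
log_level_num() {
    case "$1" in
        error)   echo 0 ;;
        verbose) echo 1 ;;
        debug)   echo 2 ;;
        *)       echo 0 ;;  # unknown names are treated like "error"
    esac
}

# Hypothetical: print a message on stderr only if its level is at or
# below the threshold taken from SLURM_SINGULARITY_LOG_LEVEL.
log_msg() {
    level=$1
    shift
    threshold=$(log_level_num "${SLURM_SINGULARITY_LOG_LEVEL:-error}")
    if [ "$(log_level_num "$level")" -le "$threshold" ]; then
        echo "singularity-exec [$level]: $*" >&2
    fi
}
```

With this in place, a configuration summary could be emitted as `log_msg debug "effective script path: $script"` and would stay silent unless the variable is set to `debug`.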
Validation Error Messages
- When container file not found, suggest checking:
  - File path spelling
  - File permissions
  - Whether path is accessible from compute nodes
- When script not found, provide:
  - Expected script location
  - How to configure custom script path
  - Link to installation documentation
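The container checks above could be sketched as follows. `check_container` is a hypothetical function name, and the message wording is only a suggestion. The return codes 1 and 2 are arbitrary choices for this example.

```shell
# Hypothetical helper: validate a container path and print targeted
# hints on stderr. Returns 0 if usable, 1 if missing, 2 if unreadable.
check_container() {
    container=$1
    if [ ! -e "$container" ]; then
        echo "singularity-exec: container file '$container' not found." >&2
        echo "  Check the spelling of the path." >&2
        echo "  Check that the path is accessible from compute node $(uname -n)." >&2
        return 1
    fi
    if [ ! -r "$container" ]; then
        echo "singularity-exec: container file '$container' is not readable." >&2
        echo "  Check the file permissions for the job user." >&2
        return 2
    fi
    return 0
}
```

Separating the "missing" and "unreadable" cases keeps each message short and points the user at one likely cause at a time.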
Logging Enhancements
- Add timestamps to debug output (optional)
- Log plugin version on initialization
- Log effective configuration (merged from defaults + CLI + env vars)
- Add option to redirect debug output to separate log file
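A minimal sketch of optional timestamps and a separate debug log, written for the wrapper script. The function `debug_log` and the variables `SLURM_SINGULARITY_LOG_TIMESTAMPS` and `SLURM_SINGULARITY_DEBUG_LOG` are hypothetical names chosen for illustration; neither exists in the plugin today.

```shell
# Hypothetical helper: write one debug line, optionally timestamped,
# either to a separate log file or to stderr.
debug_log() {
    prefix="singularity-exec:"
    # Timestamps are opt-in so existing output stays unchanged.
    if [ "${SLURM_SINGULARITY_LOG_TIMESTAMPS:-0}" = "1" ]; then
        prefix="$(date '+%Y-%m-%dT%H:%M:%S') $prefix"
    fi
    if [ -n "${SLURM_SINGULARITY_DEBUG_LOG:-}" ]; then
        # Append so that several job steps can share one file.
        printf '%s %s\n' "$prefix" "$*" >>"$SLURM_SINGULARITY_DEBUG_LOG"
    else
        printf '%s %s\n' "$prefix" "$*" >&2
    fi
}
```

The same function could log the plugin version and the effective configuration once at startup, for example `debug_log "effective bind mounts: $bind"`.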
Example Improved Error Message
Before:
    singularity-exec plugin: argument in plugstack.conf is invalid: 'foo'
After:
    singularity-exec plugin: Invalid argument in plugstack.conf: 'foo'
    Supported arguments:
      default=<path>     Path to default container (can be empty)
      script=<path>      Path to wrapper script (default: /usr/lib/slurm/slurm-singularity-wrapper.sh)
      bind=<spec>        Default bind mounts (e.g., bind=/data,/scratch)
      global=<options>   Global singularity options (e.g., global=--silent)
      args="<args>"      Singularity exec arguments (quotes required, e.g., args="--no-home")
      args=disabled      Disable --singularity-args option
    Example configuration:
      required /usr/lib64/slurm/singularity-exec.so default=/opt/containers/default.sif script=/usr/libexec/slurm-singularity-wrapper.sh bind=/data global=--silent args=""
    Documentation: https://github.com/GSI-HPC/slurm-singularity-exec#configuration
Benefits
- Reduced time debugging configuration issues
- Better user experience
- Easier troubleshooting for admins