munge cannot start in compute nodes #51

Open
gustavoberman opened this issue Jun 25, 2022 · 1 comment

@gustavoberman

Hello there!
I'm using 2.0-release.
On the compute nodes, the munge directories have the wrong user/group, so munge cannot start.
I don't understand where the wrong ownership is coming from.
It is already wrong in the chroot:

[root@headnode CRI_XCBC]# ls -dl /opt/ohpc/admin/images/rocky8-compute/etc/munge/
drwx------. 2 polkitd ssh_keys 6 abr 12  2021 /opt/ohpc/admin/images/rocky8-compute/etc/munge/
[root@headnode CRI_XCBC]# ls -dl /opt/ohpc/admin/images/rocky8-compute/var/log/munge/
drwx------. 2 polkitd ssh_keys 6 abr 12  2021 /opt/ohpc/admin/images/rocky8-compute/var/log/munge/
[root@headnode CRI_XCBC]# ls -dl /opt/ohpc/admin/images/rocky8-compute/var/lib/munge/
drwx------. 2 polkitd ssh_keys 6 abr 12  2021 /opt/ohpc/admin/images/rocky8-compute/var/lib/munge/

So munge cannot start on the nodes:

[root@compute-0 ~]# systemctl status munge
● munge.service - MUNGE authentication service
   Loaded: loaded (/usr/lib/systemd/system/munge.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Sat 2022-06-25 16:25:05 -03; 14min ago
     Docs: man:munged(8)

Jun 25 16:25:05 compute-0 systemd[1]: Starting MUNGE authentication service...
Jun 25 16:25:05 compute-0 munged[1224]: munged: Error: Failed to check logfile "/var/log/munge/munged.log": Permission denied
Jun 25 16:25:05 compute-0 systemd[1]: munge.service: Control process exited, code=exited status=1
Jun 25 16:25:05 compute-0 systemd[1]: munge.service: Failed with result 'exit-code'.
Jun 25 16:25:05 compute-0 systemd[1]: Failed to start MUNGE authentication service.
[root@compute-0 ~]# ls -dl  /etc/munge/
drwx------ 2 polkitd ssh_keys 60 Jun 25 16:23 /etc/munge/
[root@compute-0 ~]# ls -dl /var/log/munge/
drwx------ 2 polkitd ssh_keys 40 Apr 12  2021 /var/log/munge/
[root@compute-0 ~]# ls -dl /var/lib/munge/
drwx------ 2 polkitd ssh_keys 40 Apr 12  2021 /var/lib/munge/

Please help!
Thanks!

@gustavoberman

Found it:
The compute_build_vnfs role does not install the munge package.
The task needs to be updated to add munge, like this:

   - name: dnf install into the image chroot
     dnf:
       name: [
               "chrony",
               "kernel",
               "lmod-ohpc",
               "grub2",
               "freeipmi",
               "ipmitool",
               "ohpc-slurm-client",
               "munge",               
             ]
       state: present
       installroot: "{{ compute_chroot_loc }}"
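
For anyone hitting the same symptom, it may help to compare numeric IDs rather than names after rebuilding the image, since names such as polkitd and ssh_keys are only how the IDs on those directories resolve in whichever passwd/group database is in use. Below is a minimal sketch, assuming GNU stat and awk are available on the head node; owner_matches is a hypothetical helper written for illustration, not part of CRI_XCBC, OpenHPC, or munge.

```shell
#!/bin/sh
# Hypothetical helper: compare a directory's numeric owner with the UID
# that a given passwd file assigns to a user name.
# usage: owner_matches DIR PASSWD_FILE USER
owner_matches() {
    dir=$1 passwd_file=$2 user=$3
    # Numeric UID that owns the directory (GNU stat syntax).
    dir_uid=$(stat -c '%u' "$dir") || return 2
    # UID recorded for the user in the passwd file (third field).
    want_uid=$(awk -F: -v u="$user" '$1 == u { print $3; exit }' "$passwd_file")
    if [ -z "$want_uid" ]; then
        echo "no '$user' account in $passwd_file"
        return 1
    fi
    if [ "$dir_uid" = "$want_uid" ]; then
        echo "ok: $dir is owned by UID $dir_uid ($user)"
    else
        echo "mismatch: $dir is owned by UID $dir_uid, but $user is UID $want_uid"
        return 1
    fi
}

# Example call against the chroot from this issue (not run here):
#   CHROOT=/opt/ohpc/admin/images/rocky8-compute
#   owner_matches "$CHROOT/var/log/munge" "$CHROOT/etc/passwd" munge
```

Running the same check for /etc/munge and /var/lib/munge inside the chroot should show whether the munge account exists there and whether its UID matches the directory owner. The function reads the chroot's own passwd file instead of calling getent, so the head node's accounts do not affect the result.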
