cannot heal file with correct md5sum, size, ... - no split-brain? #4534

Open
@mlechner

I have a replica 3 cluster. Some files are present in the raw gluster brick directory, but I cannot access them through the mount (socket not connected). The example file has the same md5sum and size on all bricks:

$ for i in loc1 loc2 loc3 ; do ssh $i md5sum /var/glusterfs/testfile1.csv ; done                   
0989e3e21519239ceaff890363626d79  /var/glusterfs/testfile1.csv
0989e3e21519239ceaff890363626d79  /var/glusterfs/testfile1.csv
0989e3e21519239ceaff890363626d79  /var/glusterfs/testfile1.csv
$ for i in loc1 loc2 loc3 ; do ssh $i ls -ahln /var/glusterfs/testfile1.csv ; done
-rw-rw-r-- 2 500 1000 21K 27. Mär 02:16 /var/glusterfs/testfile1.csv
-rw-rw-r-- 2 500 1000 21K 27. Mär 02:16 /var/glusterfs/testfile1.csv
-rw-rw-r-- 2 500 1000 21K 27. Mär 02:16 /var/glusterfs/testfile1.csv

But the extended attributes differ and accessing the files through the mount-point fails:

# getfattr on mounted fs
$ for i in loc1 loc2 loc3 ; do ssh $i sudo LC_ALL=POSIX getfattr -m ^ -d -R -- /data/glusterfs/testfile1.csv ; done 
getfattr: /data/glusterfs/testfile1.csv: Transport endpoint is not connected
getfattr: /data/glusterfs/testfile1.csv: Transport endpoint is not connected
getfattr: /data/glusterfs/testfile1.csv: Transport endpoint is not connected
# getfattr on raw glusterfs dir on each brick
$ for i in loc1 loc2 loc3 ; do ssh $i sudo getfattr -m ^ -d -R -- /var/glusterfs/testfile1.csv ; done
# file: var/glusterfs/testfile1.csv
trusted.afr.dirty=0sAAAAAAAAAAAAAAAA
trusted.afr.my_replica-client-0=0sAAAAAgAAAAEAAAAA
trusted.gfid=0s/r0/qkhMRG+Bk/S5oAVmRw==
trusted.gfid2path.a5d90a0fc8fcd6d1="686e017c-e69e-459e-ba14-12222e934fc4/testfile1.csv"
trusted.glusterfs.mdata=0sAQAAAAAAAAAAAAAAAGfkptIAAAAAMJfMHQAAAABn5KbSAAAAADCXzB0AAAAAZ+Sl+QAAAAABSqfb
getfattr: Removing leading '/' from absolute path names
# file: var/glusterfs/testfile1.csv
trusted.gfid=0s/r0/qkhMRG+Bk/S5oAVmRw==
getfattr: Removing leading '/' from absolute path names
# file: var/glusterfs/testfile1.csv
trusted.afr.dirty=0sAAAAAAAAAAAAAAAA
trusted.afr.my_replica-client-0=0sAAAAAQAAAAAAAAAA
trusted.gfid=0sckqXO8BCR3mfkeVcTXUk4g==
trusted.gfid2path.a5d90a0fc8fcd6d1="686e017c-e69e-459e-ba14-12222e934fc4/testfile1.csv"
trusted.glusterfs.mdata=0sAQAAAAAAAAAAAAAAAGfkptIAAAAAMJfMHQAAAABn5KbSAAAAADCXzB0AAAAAZ+Sm0gAAAAAmTjLv
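
For reading the values above: getfattr prints binary attributes in base64 with a 0s prefix. A minimal sketch for decoding them, assuming GNU coreutils (base64, od, tr, cut) and the usual AFR layout of three 32-bit big-endian pending counters (data, metadata, entry) in each trusted.afr.* value. The function names decode_xattr and afr_counters are illustrative only and not part of any Gluster tool:

```shell
#!/bin/sh
# Hypothetical helpers, not part of Gluster or attr.

# Decode a getfattr value printed in base64 form ("0s" prefix) to lowercase hex.
decode_xattr() {
    printf '%s' "${1#0s}" | base64 -d | od -An -v -tx1 | tr -d ' \n'
}

# Split a 12-byte trusted.afr.* value into its three 32-bit big-endian
# counters (assumed order: data, metadata, entry) and print them in decimal.
afr_counters() {
    hex=$(decode_xattr "$1")
    data=$(printf '%s' "$hex" | cut -c1-8)
    meta=$(printf '%s' "$hex" | cut -c9-16)
    entry=$(printf '%s' "$hex" | cut -c17-24)
    printf 'data=%d metadata=%d entry=%d\n' "0x$data" "0x$meta" "0x$entry"
}

# Values copied from the getfattr output above.
afr_counters 0sAAAAAgAAAAEAAAAA                 # first brick
afr_counters 0sAAAAAQAAAAAAAAAA                 # third brick
decode_xattr '0s/r0/qkhMRG+Bk/S5oAVmRw=='; echo # trusted.gfid, first and second brick
decode_xattr '0sckqXO8BCR3mfkeVcTXUk4g=='; echo # trusted.gfid, third brick
```

Decoded this way, the trusted.afr value reads data=2 metadata=1 entry=0 on the first brick and data=1 metadata=0 entry=0 on the third, and the trusted.gfid on the third brick decodes to a different value than on the first two. Passing -e hex to getfattr should print the same bytes directly.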

The file cannot be healed using gluster volume heal my_replica full. Healing with the split-brain options (bigger or mtime) also fails, reporting that the file is not in split-brain.

How can I get out of this? It seems clear that a valid, identical file exists on all bricks, so what stops the SHD from healing it? The troubleshooting hints in the documentation did not help me (maybe I did not understand them well enough).

Regards
Marco
