Troubleshooting
This chapter provides some of the GlusterFS troubleshooting methods.
Identifying locked files and clearing locks
You can use the statedump command to list the locks held on files. The statedump output also provides information on each lock, such as its range, basename, and the PID of the application holding the lock. You can analyze the output to find the locks whose owner application is no longer running or no longer interested in that lock. After ensuring that no application is using the file, you can clear the lock using the following clear-locks command:
# gluster volume clear-locks VOLNAME path kind {blocked | granted | all}{inode range | entry basename | posix range}
For more information on performing statedump, see Performing Statedump on a Volume.
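When a dump file contains many entries, a short filter can help with scanning it for locks. The following is a minimal sketch, not part of GlusterFS: the function name list_locks is hypothetical, and it assumes the dump file holds one key=value entry per line, in the form shown by the samples in this section. It prints the path, lock kind, state, and PID for every entry, inode, and POSIX lock it finds.

```shell
# list_locks [DUMPFILE...]
# Summarize lock entries from statedump output (files or stdin).
# Output columns: path kind state pid
list_locks() {
    awk '
        # Remember the most recent path= line; lock lines that follow belong to it.
        /^path=/ { path = substr($0, 6) }

        # Lock lines look like: posixlk.posixlk[0](ACTIVE)=type=WRITE, ... pid = 23848, ...
        /(entrylk|inodelk|posixlk)\[[0-9]+\]\((ACTIVE|BLOCKED)\)=/ {
            state = ($0 ~ /\(ACTIVE\)/) ? "granted" : "blocked"
            kind = "posix"
            if ($0 ~ /entrylk\[/) kind = "entry"
            else if ($0 ~ /inodelk\[/) kind = "inode"
            pid = "?"
            if (match($0, /pid = [0-9]+/)) pid = substr($0, RSTART + 6, RLENGTH - 6)
            print path, kind, state, pid
        }
    ' "$@"
}
```

The kind and state columns correspond to the kind arguments of clear-locks (granted or blocked; entry, inode, or posix). The filter only summarizes the dump; deciding whether a lock is stale still requires checking that no application is using the file.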
To identify locked files and clear locks:
-
Perform statedump on the volume to view the files that are locked, using the following command:
# gluster volume statedump VOLNAME
For example, to display statedump of test-volume:
# gluster volume statedump test-volume
Volume statedump successful
The statedump files are created on the brick servers in the `/tmp` directory or in the directory set using the server.statedump-path volume option. The naming convention of the dump file is brick-path.brick-pid.dump.
-
Clear the entry lock using the following command:
# gluster volume clear-locks VOLNAME path kind granted entry basename
The following are the sample contents of the statedump file indicating an entry lock (entrylk). Ensure that those are stale locks and no resources own them.
[xlator.features.locks.vol-locks.inode]
path=/
mandatory=0
entrylk-count=1
lock-dump.domain.domain=vol-replicate-0
xlator.feature.locks.lock-dump.domain.entrylk.entrylk[0](ACTIVE)=type=ENTRYLK_WRLCK on basename=file1, pid = 714782904, owner=ffffff2a3c7f0000, transport=0x20e0670, , granted at Mon Feb 27 16:01:01 2012
conn.2.bound_xl./rhgs/brick1.hashsize=14057
conn.2.bound_xl./rhgs/brick1.name=/gfs/brick1/inode
conn.2.bound_xl./rhgs/brick1.lru_limit=16384
conn.2.bound_xl./rhgs/brick1.active_size=2
conn.2.bound_xl./rhgs/brick1.lru_size=0
conn.2.bound_xl./rhgs/brick1.purge_size=0
For example, to clear the entry lock on file1 of test-volume:
# gluster volume clear-locks test-volume / kind granted entry file1
Volume clear-locks successful
test-volume-locks: entry blocked locks=0 granted locks=1
-
Clear the inode lock using the following command:
# gluster volume clear-locks VOLNAME path kind granted inode range
The following are the sample contents of the statedump file indicating an inode lock (inodelk). Ensure that those are stale locks and no resources own them.
[conn.2.bound_xl./rhgs/brick1.active.1]
gfid=538a3d4a-01b0-4d03-9dc9-843cd8704d07
nlookup=1
ref=2
ia_type=1
[xlator.features.locks.vol-locks.inode]
path=/file1
mandatory=0
inodelk-count=1
lock-dump.domain.domain=vol-replicate-0
inodelk.inodelk[0](ACTIVE)=type=WRITE, whence=0, start=0, len=0, pid = 714787072, owner=00ffff2a3c7f0000, transport=0x20e0670, , granted at Mon Feb 27 16:01:01 2012
For example, to clear the inode lock on file1 of test-volume:
# gluster volume clear-locks test-volume /file1 kind granted inode 0,0-0
Volume clear-locks successful
test-volume-locks: inode blocked locks=0 granted locks=1
-
Clear the granted POSIX lock using the following command:
# gluster volume clear-locks VOLNAME path kind granted posix range
The following are the sample contents of the statedump file indicating a granted POSIX lock. Ensure that those are stale locks and no resources own them.
[xlator.features.locks.vol1-locks.inode]
path=/file1
mandatory=0
posixlk-count=15
posixlk.posixlk[0](ACTIVE)=type=WRITE, whence=0, start=8, len=1, pid = 23848, owner=d824f04c60c3c73c, transport=0x120b370, , blocked at Mon Feb 27 16:01:01 2012 , granted at Mon Feb 27 16:01:01 2012
posixlk.posixlk[1](ACTIVE)=type=WRITE, whence=0, start=7, len=1, pid = 1, owner=30404152462d436c-69656e7431, transport=0x11eb4f0, , granted at Mon Feb 27 16:01:01 2012
posixlk.posixlk[2](BLOCKED)=type=WRITE, whence=0, start=8, len=1, pid = 1, owner=30404152462d436c-69656e7431, transport=0x11eb4f0, , blocked at Mon Feb 27 16:01:01 2012
posixlk.posixlk[3](ACTIVE)=type=WRITE, whence=0, start=6, len=1, pid = 12776, owner=a36bb0aea0258969, transport=0x120a4e0, , granted at Mon Feb 27 16:01:01 2012
...
For example, to clear the granted POSIX lock on file1 of test-volume:
# gluster volume clear-locks test-volume /file1 kind granted posix 0,8-1
Volume clear-locks successful
test-volume-locks: posix blocked locks=0 granted locks=1
test-volume-locks: posix blocked locks=0 granted locks=1
test-volume-locks: posix blocked locks=0 granted locks=1
-
Clear the blocked POSIX lock using the following command:
# gluster volume clear-locks VOLNAME path kind blocked posix range
The following are the sample contents of the statedump file indicating a blocked POSIX lock. Ensure that those are stale locks and no resources own them.
[xlator.features.locks.vol1-locks.inode]
path=/file1
mandatory=0
posixlk-count=30
posixlk.posixlk[0](ACTIVE)=type=WRITE, whence=0, start=0, len=1, pid = 23848, owner=d824f04c60c3c73c, transport=0x120b370, , blocked at Mon Feb 27 16:01:01 2012 , granted at Mon Feb 27 16:01:01
posixlk.posixlk[1](BLOCKED)=type=WRITE, whence=0, start=0, len=1, pid = 1, owner=30404146522d436c-69656e7432, transport=0x1206980, , blocked at Mon Feb 27 16:01:01 2012
posixlk.posixlk[2](BLOCKED)=type=WRITE, whence=0, start=0, len=1, pid = 1, owner=30404146522d436c-69656e7432, transport=0x1206980, , blocked at Mon Feb 27 16:01:01 2012
posixlk.posixlk[3](BLOCKED)=type=WRITE, whence=0, start=0, len=1, pid = 1, owner=30404146522d436c-69656e7432, transport=0x1206980, , blocked at Mon Feb 27 16:01:01 2012
posixlk.posixlk[4](BLOCKED)=type=WRITE, whence=0, start=0, len=1, pid = 1, owner=30404146522d436c-69656e7432, transport=0x1206980, , blocked at Mon Feb 27 16:01:01 2012
...
For example, to clear the blocked POSIX lock on file1 of test-volume:
# gluster volume clear-locks test-volume /file1 kind blocked posix 0,0-1
Volume clear-locks successful
test-volume-locks: posix blocked locks=28 granted locks=0
test-volume-locks: posix blocked locks=1 granted locks=0
No locks cleared.
-
Clear all POSIX locks using the following command:
# gluster volume clear-locks VOLNAME path kind all posix range
The following are the sample contents of the statedump file indicating POSIX locks. Ensure that those are stale locks and no resources own them.
[xlator.features.locks.vol1-locks.inode]
path=/file1
mandatory=0
posixlk-count=11
posixlk.posixlk[0](ACTIVE)=type=WRITE, whence=0, start=8, len=1, pid = 12776, owner=a36bb0aea0258969, transport=0x120a4e0, , blocked at Mon Feb 27 16:01:01 2012 , granted at Mon Feb 27 16:01:01 2012
posixlk.posixlk[1](ACTIVE)=type=WRITE, whence=0, start=0, len=1, pid = 12776, owner=a36bb0aea0258969, transport=0x120a4e0, , granted at Mon Feb 27 16:01:01 2012
posixlk.posixlk[2](ACTIVE)=type=WRITE, whence=0, start=7, len=1, pid = 23848, owner=d824f04c60c3c73c, transport=0x120b370, , granted at Mon Feb 27 16:01:01 2012
posixlk.posixlk[3](ACTIVE)=type=WRITE, whence=0, start=6, len=1, pid = 1, owner=30404152462d436c-69656e7431, transport=0x11eb4f0, , granted at Mon Feb 27 16:01:01 2012
posixlk.posixlk[4](BLOCKED)=type=WRITE, whence=0, start=8, len=1, pid = 23848, owner=d824f04c60c3c73c, transport=0x120b370, , blocked at Mon Feb 27 16:01:01 2012
...
For example, to clear all POSIX locks on file1 of test-volume:
# gluster volume clear-locks test-volume /file1 kind all posix 0,0-1
Volume clear-locks successful
test-volume-locks: posix blocked locks=1 granted locks=0
No locks cleared.
test-volume-locks: posix blocked locks=4 granted locks=1
You can perform statedump on test-volume again to verify that all the above locks are cleared.
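In the inode and POSIX examples above, the range argument passed to clear-locks (0,0-0, 0,8-1, 0,0-1) lines up with the whence, start, and len fields of the corresponding lock line, in the form whence,start-len. Assuming that reading of the examples is correct, the argument can be derived from a lock line as sketched below. The function name lock_range is hypothetical and not a gluster command.

```shell
# lock_range
# Read statedump lock lines on stdin and print "whence,start-len" for each
# line that carries those three fields (inode and POSIX locks).
# Assumption: the range format is inferred from the examples in this section.
lock_range() {
    sed -nE 's/.*whence=([0-9]+), start=([0-9]+), len=([0-9]+),.*/\1,\2-\3/p'
}
```

For instance, feeding it the line for posixlk[0] from the granted POSIX lock sample (whence=0, start=8, len=1) prints 0,8-1, which is the range used in that example. Entry locks carry a basename instead of a range, so entry lock lines produce no output.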
Retrieving File Path from the Gluster Volume
The heal info command lists the GFIDs of the files that need to be healed. If you want to find the path of the files associated with the GFIDs, use the getfattr utility. The getfattr utility enables you to locate a file residing on a gluster volume brick. You can retrieve the path of a file even if the filename is unknown.
Retrieving Known File Name
To retrieve a file path when the file name is known, execute the following command in the Fuse mount directory:
# getfattr -n trusted.glusterfs.pathinfo -e text <path_to_fuse_mount/filename>
Where,
path_to_fuse_mount: The fuse mount where the gluster volume is mounted.
filename: The name of the file for which the path information is to be retrieved.
For example:
# getfattr -n trusted.glusterfs.pathinfo -e text /mnt/fuse_mnt/File1
getfattr: Removing leading '/' from absolute path names
# file: mnt/fuse_mnt/File1
trusted.glusterfs.pathinfo="(<DISTRIBUTE:testvol-dht> (<REPLICATE:testvol-replicate-0> <POSIX(/rhgs/brick1):tuxpad:/rhgs/brick1/File1> <POSIX(/rhgs/brick2):tuxpad:/rhgs/brick2/File1>))"
The command output displays the brick pathinfo under the <POSIX> tag. In this example output, two paths are displayed as the file is replicated twice and resides on a two-way replicated volume.
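If the brick locations need to be passed on to another command, the <POSIX> entries can be pulled out of the getfattr output with standard text tools. The following is a sketch only: the function name pathinfo_bricks is hypothetical, and it assumes the pathinfo value has the shape shown in the example above and that grep supports the -o option.

```shell
# pathinfo_bricks
# Read getfattr pathinfo output on stdin and print one "host:/brick/path/file"
# entry per <POSIX(...)> tag.
pathinfo_bricks() {
    grep -oE '<POSIX\([^)]*\):[^>]*>' |
        sed -E 's/^<POSIX\([^)]*\)://; s/>$//'
}
```

Applied to the example output above, this prints tuxpad:/rhgs/brick1/File1 and tuxpad:/rhgs/brick2/File1, one per line.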
Retrieving Unknown File Name
You can retrieve the file path of an unknown file using its gfid string. The gfid string is the hyphenated version of the trusted.gfid attribute. For example, if the gfid is 80b0b1642ea4478ba4cda9f76c1e6efd, then the gfid string will be 80b0b164-2ea4-478b-a4cd-a9f76c1e6efd.
Note
To obtain the gfid of a file, run the following command:
# getfattr -d -m. -e hex /path/to/file/on/the/brick
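The conversion from the hexadecimal trusted.gfid value to the gfid string only inserts hyphens in the 8-4-4-4-12 pattern. A minimal sketch of that step follows. The function name gfid_to_string is hypothetical, and the optional 0x prefix is accepted on the assumption that the value is copied from hex-encoded getfattr output.

```shell
# gfid_to_string GFID
# Convert a 32-digit hexadecimal gfid (with or without a leading 0x)
# into the hyphenated gfid string.
gfid_to_string() (
    hex=${1#0x}    # drop a leading 0x if present
    if ! printf '%s\n' "$hex" | grep -Eq '^[0-9a-fA-F]{32}$'; then
        echo "expected 32 hex digits, got: $1" >&2
        exit 1
    fi
    # Group the digits as 8-4-4-4-12.
    printf '%s\n' "$hex" |
        sed -E 's/^(.{8})(.{4})(.{4})(.{4})(.{12})$/\1-\2-\3-\4-\5/'
)
```

For example, gfid_to_string 80b0b1642ea4478ba4cda9f76c1e6efd prints 80b0b164-2ea4-478b-a4cd-a9f76c1e6efd, matching the gfid string given above.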
Retrieving File Path using gfid String
To retrieve the file path using the gfid string, follow these steps:
-
Fuse mount the volume with the aux-gfid-mount option enabled.
# mount -t glusterfs -o aux-gfid-mount hostname:volume-name <path_to_fuse_mount>
Where,
path_to_fuse_mount: The fuse mount where the gluster volume is mounted.
For example:
# mount -t glusterfs -o aux-gfid-mount 127.0.0.2:testvol /mnt/aux_mount
-
After mounting the volume, execute the following command:
# getfattr -n trusted.glusterfs.pathinfo -e text <path_to_fuse_mount>/.gfid/<GFID string>
Where,
path_to_fuse_mount: The fuse mount where the gluster volume is mounted.
GFID string: The GFID string.
For example:
# getfattr -n trusted.glusterfs.pathinfo -e text /mnt/aux_mount/.gfid/80b0b164-2ea4-478b-a4cd-a9f76c1e6efd
getfattr: Removing leading '/' from absolute path names
# file: mnt/aux_mount/.gfid/80b0b164-2ea4-478b-a4cd-a9f76c1e6efd
trusted.glusterfs.pathinfo="(<DISTRIBUTE:testvol-dht> (<REPLICATE:testvol-replicate-0> <POSIX(/rhgs/brick2):tuxpad:/rhgs/brick2/File1> <POSIX(/rhgs/brick1):tuxpad:/rhgs/brick1/File1>))"
The command output displays the brick pathinfo under the <POSIX> tag. In this example output, two paths are displayed as the file is replicated twice and resides on a two-way replicated volume.