I have CentOS 6.2 installed, and I'm seeing frequent NFS hangs. I haven't found any reports of a general issue, so I'm at a loss where to go...
My /etc/exports file is ridiculously simple:
[font=Courier]$ cat /etc/exports
/ftproot/evidence *(rw,all_squash,anonuid=10101,anongid=10)[/font]
I'm mounting the exported /ftproot/evidence locally on /evidence, have it listed in fstab:
[font=Courier]$ grep evidence /etc/fstab
UUID=82ae4073-a720-4a28-a14c-1c91172dc540 /ftproot/evidence ext4 defaults 1 2
localhost:/ftproot/evidence /evidence nfs intr 0 0
[/font]
There is nothing in dmesg or /var/log/messages to suggest any problem. I can mount everything just fine, but after a few minutes all attempts to access the share via nfs hang, and don't ever come back. Any attempt to 'ls' exiles processes into the dreaded D state, never to return...
The same file system is shared via samba, and samba access continues without issue, and I can always get to the data under the /ftproot/evidence mount point. So clearly, SOMETHING is going on with NFS - but what??
Thanks for any help!
Frequent NFS hangs?
Without any info in /var/log/messages, it is difficult to troubleshoot. Try enabling debugging with:
[code]
echo 1 > /proc/sys/sunrpc/nfs_debug
[/code]
and see if something turns up.
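Since the hang may sit in the RPC layer rather than in NFS itself, it can help to switch on the sunrpc debug flags next to the NFS ones. A minimal sketch, assuming the usual nfs_debug and rpc_debug files under /proc/sys/sunrpc; the function name and the SUNRPC_DIR variable are made up for illustration, and the value is treated as a bitmask, so 1 only enables the first flag:

```shell
#!/bin/sh
# Directory holding the sunrpc debug switches. Overridable (hypothetical
# SUNRPC_DIR variable) so the function can be tried on a scratch directory.
SUNRPC_DIR="${SUNRPC_DIR:-/proc/sys/sunrpc}"

# set_sunrpc_debug MASK (hypothetical helper):
# write MASK to the NFS client and RPC debug files, skipping any that
# are missing or not writable.
set_sunrpc_debug() {
    mask="$1"
    for name in nfs_debug rpc_debug; do
        if [ -w "$SUNRPC_DIR/$name" ]; then
            echo "$mask" > "$SUNRPC_DIR/$name"
        else
            echo "skipping $name (not writable; are you root?)" >&2
        fi
    done
}

# Usage, as root:
#   set_sunrpc_debug 1    # same value as the echo above
#   set_sunrpc_debug 0    # switch debugging off again
```

The extra output lands in the kernel log, so it should show up in dmesg and /var/log/messages if anything is logged at all. Remember to set it back to 0 afterwards, because the logging is very noisy.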
Re: Frequent NFS hangs?
I rebooted & added debug, and for a while things were good. An 'ls' of my mount would generate a lot of info in /var/log/messages, so I sat back to wait. Sure enough, I tail -f /var/log/messages and issue an ls, and the ls hangs...but there's absolutely nothing going into messages. Le sigh.
Re: Frequent NFS hangs?
Check other network-related things. Can you still access the server by other means? Any possibility of an IP address conflict?
Re: Frequent NFS hangs?
No, the network is small. IPv6 is disabled, there are no signs of any network related issues. Samba access continues to work just fine, only NFS seems affected.
There are 4 processes in ?? state:
[code]
1430 1427 2 c1b08ab0 ?? 0.0 5228 1736 bash
1515 1 2 c18ee570 ?? 0.0 2912 1324 rpc.mountd
13254 13187 2 c1fc0030 ?? 0.0 4496 788 ls
26916 28898 2 f316bab0 ?? 0.0 4496 808 ls
[/code]
They all go into nfs3_rpc_wrapper.clone.0 and get stuck in __wait_on_bit at c082f7c2:
[font=Courier]
crash> bt 13254
PID: 13254 TASK: c1fc0030 CPU: 2 COMMAND: "ls"
#0 [f312fd28] schedule at c082e833
#1 [f312fdec] rpc_wait_bit_killable at f87498ff [sunrpc]
#2 [f312fdf0] __wait_on_bit at c082f7c2
#3 [f312fe08] out_of_line_wait_on_bit at c082f853
#4 [f312fe3c] __rpc_execute at f8749dc7 [sunrpc]
#5 [ec5ade6c] rpc_run_task at f87437cc [sunrpc]
#6 [ec5ade78] rpc_call_sync at f87438e4 [sunrpc]
#7 [ec5adea0] nfs3_rpc_wrapper.clone.0 at f89cd816 [nfs]
[/font]
Some got there through __nfs_revalidate_inode, some through nfs3_proc_access.
Any suggestions for tracking down what they're waiting on?
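For anyone wanting to spot the same symptom on a live system without crash, the blocked tasks and the kernel function they are sleeping in can be listed with ps. This is only a sketch: the function names are hypothetical, and it assumes the procps ps that ships with CentOS, where the wchan column shows the wait channel.

```shell
#!/bin/sh
# filter_dstate (hypothetical helper): keep only rows whose third column
# (STAT) starts with "D", i.e. tasks in uninterruptible sleep.
filter_dstate() {
    awk '$3 ~ /^D/'
}

# list_dstate (hypothetical helper): print PID, PPID, state, wait channel
# and command name for every blocked task.
list_dstate() {
    ps -eo pid,ppid,stat,wchan:32,comm | filter_dstate
}

# Usage:
#   list_dstate
```

If the hung ls processes show an RPC wait function in the WCHAN column, that matches the backtrace above: the client has sent a request and is waiting for a reply that never arrives.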