How to avoid rebooting Linux NFS-client when NFS-server is not available?

On Wed, 24 Jul 2013 11:18:58 +0200
Peter Funk <pf@artcom-gmbh.de> wrote:

> Hello all,
> 
> We've researched this question for quite a while now and nobody here
> found a solution to the following problem:
> 
>  1: A Linux computer is NFS client of some other Linux NFS server
>     and has some active mounts and some processes working with files 
>     on that NFS server.  
> 
>  2: Now the NFS server becomes unavailable and a system administrator 
>     wants to clean up the situation on the NFS client computer without 
>     having to reboot this client computer.
> 
> Is this possible?  And if so, how exactly?
> 
> Best Regards and many thanks in advance, 
> Peter Funk
> P.S.: umount -f -l did not work  
>       System hangs for a long time in shutdown and shutdown 
>       only succeeds without hard reset after reconnecting the
>       NFS server.

The problem is likely that the lookup phase in the umount() syscall is
trying to revalidate the root of the mount. Since that server is down,
it's getting stuck.

Does this patch help at all? I'm hoping to get this into 3.12, and some
extra confirmation that it works would be helpful. The description talks
about the mount being stale, but the patch may also help in the situation
where the server is unavailable:

-----------------------[snip]-------------------------------

[PATCH] vfs: allow umount to handle mountpoints without revalidating them

Christopher reported a regression where he was unable to unmount an NFS
filesystem where the root had gone stale. The problem is that
d_revalidate handles the root of the filesystem differently from other
dentries, but d_weak_revalidate does not. We could simply fix this by
making d_weak_revalidate return success on IS_ROOT dentries, but there
are cases where we do want to revalidate the root of the fs.

A umount is really a special case. We generally aren't interested in
anything but the dentry and vfsmount that's attached at that point. If
the inode turns out to be stale we just don't care since the intent is
to stop using it anyway.

Try to handle this situation better by treating umount as a special
case in the lookup code. Have it resolve the parent using normal
means, and then do a lookup of the final dentry without revalidating
it. In most cases, the final lookup will come out of the dcache, but
the case where there's a trailing symlink or !LAST_NORM entry on the
end complicates things a bit.

Cc: Neil Brown <neilb@suse.de>
Reported-by: Christopher T Vogan <cvogan@us.ibm.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
---
 fs/namei.c            | 182 ++++++++++++++++++++++++++++++++++++++++++++++++++
 fs/namespace.c        |   2 +-
 include/linux/namei.h |   1 +
 3 files changed, 184 insertions(+), 1 deletion(-)
