More fun with unmounting ESTALE directories.

Message ID 20130212113813.427b8e05@notabene.brown (mailing list archive)
State New, archived
Commit Message

NeilBrown Feb. 12, 2013, 12:38 a.m. UTC
I've been exploring difficulties with unmounting stale directories and
discovered another bug.

If I:

SERVER:  mkdir /foo/bar  # and make sure it is exported
CLIENT:  mount -o vers=4 server:/foo/bar /mnt
SERVER:  rm -r /foo
CLIENT:  > /mnt/baz # gets an error of course
CLIENT:  ls -l /mnt # error again
CLIENT:  umount /mnt

The result of that last command is:

/mnt was not found in /proc/mounts
/mnt was not found in /proc/mounts

Strange?

cat /proc/mounts

.....
10.0.2.2://foo/bar /mnt\040(deleted) nfs4 rw,relatime,vers=4,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.0.2.15,minorversion=0,local_lock=none,addr=10.0.2.2 0 0
....

Notice the "\040(deleted)".

NFS has unhashed that directory because it is obviously bad, and d_path()
notices and adds " (deleted)".
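A minimal userspace sketch of why the lookup for "/mnt" fails, assuming a tool that decodes the mount-point field of /proc/mounts and compares it with the path it was given. The helper names unescape_mount_field() and mount_field_matches() are hypothetical and are not taken from umount or any library; the only fact relied on is that /proc/mounts writes a space as the octal escape \040.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Decode the \NNN octal escapes used in /proc/mounts fields
 * (for example "\040" for a space). Writes at most dstlen-1 bytes
 * plus a terminating NUL and returns the decoded length.
 */
static size_t unescape_mount_field(const char *src, char *dst, size_t dstlen)
{
	size_t n = 0;

	assert(src != NULL && dst != NULL && dstlen > 0);

	while (*src != '\0' && n + 1 < dstlen) {
		if (src[0] == '\\' &&
		    src[1] >= '0' && src[1] <= '3' &&
		    src[2] >= '0' && src[2] <= '7' &&
		    src[3] >= '0' && src[3] <= '7') {
			/* Three octal digits encode one byte. */
			dst[n++] = (char)(((src[1] - '0') << 6) |
					  ((src[2] - '0') << 3) |
					  (src[3] - '0'));
			src += 4;
		} else {
			dst[n++] = *src++;
		}
	}
	dst[n] = '\0';
	return n;
}

/* Hypothetical helper: does a raw mount-point field name `target`? */
static int mount_field_matches(const char *field, const char *target)
{
	char decoded[4096];

	unescape_mount_field(field, decoded, sizeof(decoded));
	return strcmp(decoded, target) == 0;
}
```

With the field from the output above, mount_field_matches("/mnt\\040(deleted)", "/mnt") is false, because the field decodes to "/mnt (deleted)".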

Now I might be able to argue that NFS shouldn't be unhashing a directory that
is a mountpoint - it certainly seems strange behaviour.

But I think I can more strongly argue that /proc/mounts shouldn't be showing
the mounted directory, but instead the directory that it is mounted on.
Obviously these both have the same name so it shouldn't matter ... except
that here is a case where it does.

I "fixed" it with the patch below, though I suspect that isn't safe and
needs some locking.

Probably both should be fixed:  NFS should not invalidate any mounted
directory, and show_vfsmnt() should report the mountpoint, not the mounted
directory.
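The difference between the two choices can be illustrated with a toy userspace model. This is only a sketch under stated assumptions: struct toy_dentry, struct toy_mount and toy_d_path() are made up for illustration and are not the kernel's struct dentry, struct mount or d_path(). The model only mimics the behaviour described above, where an unhashed dentry is reported with " (deleted)" appended.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Toy stand-in for a dentry; NOT the kernel's struct dentry. */
struct toy_dentry {
	const char *path;  /* full path, precomputed for simplicity */
	int unhashed;      /* set once the filesystem has dropped the dentry */
};

/* Toy stand-in for a mount: the directory mounted on, and the root mounted there. */
struct toy_mount {
	struct toy_dentry *mountpoint;  /* dentry in the parent filesystem */
	struct toy_dentry *root;        /* root dentry of the mounted filesystem */
};

/* Mimics the observed behaviour: unhashed dentries get " (deleted)". */
static void toy_d_path(const struct toy_dentry *d, char *buf, size_t len)
{
	assert(d != NULL && buf != NULL && len > 0);
	snprintf(buf, len, "%s%s", d->path, d->unhashed ? " (deleted)" : "");
}
```

If the root dentry is unhashed and the mountpoint dentry is not, rendering m.root gives "/mnt (deleted)" while rendering m.mountpoint gives "/mnt", which is the name a umount of /mnt would look for.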

I can't figure out any way to get NFS to not invalidate the mounted directory.
I think it happens in nfs_lookup_revalidate() when it calls d_drop(), but I
don't know how to tell if a given dentry is a mnt_root for any mountpoint.

Suggestions?  Thoughts?

Thanks,
NeilBrown

Comments

Jeff Layton Feb. 14, 2013, 3:42 p.m. UTC | #1
On Tue, 12 Feb 2013 11:38:13 +1100
NeilBrown <neilb@suse.de> wrote:

> 
> I've been exploring difficulties with unmounting stale directories and
> discovered another bug.
> 
> If I:
> 
> SERVER:  mkdir /foo/bar  #and make sure it is exported
> CLIENT:  mount -o vers=4 server:/foo/bar /mnt
> SERVER:  rm -r /foo
> CLIENT:  > /mnt/baz # gets an error of course
> CLIENT:  ls -l /mnt # error again
> CLIENT:  umount /mnt
> 
> The result of that last command is:
> 
> /mnt was not found in /proc/mounts
> /mnt was not found in /proc/mounts
> 
> Strange?
> 
> cat /proc/mounts
> 
> .....
> 10.0.2.2://foo/bar /mnt\040(deleted) nfs4 rw,relatime,vers=4,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.0.2.15,minorversion=0,local_lock=none,addr=10.0.2.2 0 0
> ....
> 
> Notice the "\040(deleted)".
> 
> NFS has unhashed that directory because it is obviously bad, and d_path()
> notices and adds " (deleted)".
> 
> Now I might be able to argue that NFS shouldn't be unhashing a directory that
> is a mountpoint - it certainly seems strange behaviour.
> 
> But I think I can more strongly argue that /proc/mounts shouldn't be showing
> the mounted directory, but instead the directory that it is mounted on.
> Obviously these both have the same name so it shouldn't matter ... except
> that here is a case where it does.
> 
> I "fixed" it with
> 
> --- a/fs/proc_namespace.c
> +++ b/fs/proc_namespace.c
> @@ -93,7 +93,7 @@ static int show_vfsmnt(struct seq_file *m, struct vfsmount *mnt)
>  {
>  	struct mount *r = real_mount(mnt);
>  	int err = 0;
> -	struct path mnt_path = { .dentry = mnt->mnt_root, .mnt = mnt };
> +	struct path mnt_path = { .dentry = r->mnt_mountpoint, .mnt = &(r->mnt_parent)->mnt };
>  	struct super_block *sb = mnt_path.dentry->d_sb;
>  
>  	if (sb->s_op->show_devname) {
> 
> though I suspect that isn't safe and needs some locking.
> 
> Probably both should be fixed:  NFS should not invalidate any mounted
> directory, and show_vfsmnt() should report the mountpoint, not the mounted
> directory.
> 
> I can't figure out any way to get NFS to not invalidate the mounted directory.
> I think it happens in nfs_lookup_revalidate() when it calls d_drop(), but I
> don't know how to tell if a given dentry is a mnt_root for any mountpoint.
> 
> Suggestions?  Thoughts?
> 
> Thanks,
> NeilBrown
> 

I've also been looking at some weird ESTALE problems. Here's another
fun one that doesn't involve mountpoints. Assume here that we're
working in the same exported directory on client and server:

    server# mkdir a
    client# cd a
    server# mv a a.bak
    client# sleep 30  # (or whatever the dir attrcache timeout is)
    client# stat .
    stat: cannot stat ‘.’: Stale NFS file handle

Obviously, "." should not be stale. It got renamed, but the inode still
exists on the server.

If you sniff on the wire, you'll see that the server doesn't ever send
an ESTALE here. What happens is that due to FS_REVAL_DOT being set, we
end up trying to revalidate the dentry that "." refers to. We find that
the parent changed (obviously) and then try to redo the lookup of "a".
At that point we notice that it doesn't exist and turn it into ESTALE.

I don't really understand the point of FS_REVAL_DOT. What does that
actually buy us? I wonder if removing it would also help your testcase?
Patch

--- a/fs/proc_namespace.c
+++ b/fs/proc_namespace.c
@@ -93,7 +93,7 @@  static int show_vfsmnt(struct seq_file *m, struct vfsmount *mnt)
 {
 	struct mount *r = real_mount(mnt);
 	int err = 0;
-	struct path mnt_path = { .dentry = mnt->mnt_root, .mnt = mnt };
+	struct path mnt_path = { .dentry = r->mnt_mountpoint, .mnt = &(r->mnt_parent)->mnt };
 	struct super_block *sb = mnt_path.dentry->d_sb;
 
 	if (sb->s_op->show_devname) {