diff mbox

[4/6,v9] fs: New helper legitimize_mntget() for getting a legitimize mnt

Message ID 55D2DD17.7050504@gmail.com (mailing list archive)
State New, archived
Headers show

Commit Message

Kinglong Mee Aug. 18, 2015, 7:21 a.m. UTC
New helper legitimize_mntget for getting a mnt without setting
MNT_SYNC_UMOUNT | MNT_UMOUNT | MNT_DOOMED, otherwise return NULL.

v9, Update using read_seqbegin_or_lock instead read_seqlock_excl

Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
---
 fs/namespace.c        | 30 ++++++++++++++++++++++++++++++
 include/linux/mount.h |  1 +
 2 files changed, 31 insertions(+)

Comments

NeilBrown Aug. 19, 2015, 4:56 a.m. UTC | #1
On Tue, 18 Aug 2015 15:21:59 +0800 Kinglong Mee <kinglongmee@gmail.com>
wrote:

> New helper legitimize_mntget for getting a mnt without setting
> MNT_SYNC_UMOUNT | MNT_UMOUNT | MNT_DOOMED, otherwise return NULL.
> 
> v9, Update using read_seqbegin_or_lock instead read_seqlock_excl
> 
> Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
> ---
>  fs/namespace.c        | 30 ++++++++++++++++++++++++++++++
>  include/linux/mount.h |  1 +
>  2 files changed, 31 insertions(+)
> 
> diff --git a/fs/namespace.c b/fs/namespace.c
> index 2b8aa15..bf4a9f5 100644
> --- a/fs/namespace.c
> +++ b/fs/namespace.c
> @@ -1153,6 +1153,36 @@ struct vfsmount *mntget(struct vfsmount *mnt)
>  }
>  EXPORT_SYMBOL(mntget);
>  
> +struct vfsmount *legitimize_mntget(struct vfsmount *vfsmnt)
> +{
> +	struct mount *mnt;
> +	unsigned seq = 0;
> +
> +	if (vfsmnt == NULL)
> +		return NULL;
> +retry:
> +	read_seqbegin_or_lock(&mount_lock, &seq);
> +
> +	if (vfsmnt->mnt_flags & (MNT_SYNC_UMOUNT | MNT_UMOUNT | MNT_DOOMED))
> +		vfsmnt = NULL;
> +	else if (need_seqretry(&mount_lock, seq))
> +		goto retry;
> +	else {
> +		mnt = real_mount(vfsmnt);
> +		mnt_add_count(mnt, 1);
> +		if (need_seqretry(&mount_lock, seq) ||
> +		    vfsmnt->mnt_flags & (MNT_SYNC_UMOUNT | MNT_UMOUNT |
> +					 MNT_DOOMED)) {
> +			mnt_add_count(mnt, -1);
> +			goto retry;
> +		}
> +	}
> +
> +	done_seqretry(&mount_lock, seq);
> +	return vfsmnt;
> +}
> +EXPORT_SYMBOL(legitimize_mntget);

I'm struggling with this.  I know I wrote the above, but I'm no longer
sure it is correct .. at all.

All three of those flags are only set while mount_lock is held for
writing, so if the second need_seqretry() returns zero, then we can
be certain that none of the flags have changed yet, so there is no point
testing them again.

Ahh...  looking back at __legitimize_mnt I think I see the important
point now - or at least I now see that point which I saw before more
clearly.

After the mnt_add_count, if need_seqretry() returns zero, we are done.
If not, we need to clean up and try again.
To do this we need to check MNT_SYNC_UMOUNT.
If that is clear it is safe and best to call mntput() on the mnt.
If it is set, then we just do the mnt_add_count(mnt, -1) and give up
completely.

Also, the usage for read_seqbegin_or_lock is all wrong.  I'm sure I
copied something but I got it badly wrong.
When you retry, you need to set 'seq' to 1, otherwise it never locks.
I thought it would be more "clever" than at.

So maybe:

retry:
  read_seqbegin_or_lock(&mount_lock, &seq);

  if (vfsmnt->mnt_flags & (....))
        vfs_mnt = NULL;
  else {
        mnt = real_mount(vfsmnt);
        mnt_add_count(mnt, 1);
        if (need_seqretry(&mount_lock, seq)) {
             /* lost the race, need to try again */
             if (vfsmnt->mnt_flags & MNT_SYNC_UMOUNT) {
                  /* no point trying... */
                  mnt_add_count(mnt, -1):
                  vfsmnt = NULL;
             } else {
                  mntput(vfsmnt);
                  seq = 1;
                  goto retry;
             }
        }
  }
  done_seqretry(&mount_lock, seq);
  return vfsmnt;


NeilBrown

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Al Viro Aug. 19, 2015, 5:02 a.m. UTC | #2
On Wed, Aug 19, 2015 at 02:56:50PM +1000, NeilBrown wrote:
> If not, we need to clean up and try again.
> To do this we need to check MNT_SYNC_UMOUNT.
> If that is clear it is safe and best to call mntput() on the mnt.
> If it is set, then we just do the mnt_add_count(mnt, -1) and give up
> completely.

It's more subtle, actually.  See my reply upthread for details, but
basically we rely on rcu_read_lock() held since before that MNT_SYNC_UMOUNT
had been set, making synchronize_rcu() in namespace_unlock() a guaranteed
delay of the final mntput until after we have returned.
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Kinglong Mee Aug. 27, 2015, 11:02 p.m. UTC | #3
On 8/19/2015 13:02, Al Viro wrote:
> On Wed, Aug 19, 2015 at 02:56:50PM +1000, NeilBrown wrote:
>> If not, we need to clean up and try again.
>> To do this we need to check MNT_SYNC_UMOUNT.
>> If that is clear it is safe and best to call mntput() on the mnt.
>> If it is set, then we just do the mnt_add_count(mnt, -1) and give up
>> completely.
> 
> It's more subtle, actually.  See my reply upthread for details, but
> basically we rely on rcu_read_lock() held since before that MNT_SYNC_UMOUNT
> had been set, making synchronize_rcu() in namespace_unlock() a guaranteed
> delay of the final mntput until after we have returned.
> 

Thanks for the comments.
I will update this patch according to Neil's comments and __legitimize_mnt.

thanks,
Kinglong Mee
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/fs/namespace.c b/fs/namespace.c
index 2b8aa15..bf4a9f5 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -1153,6 +1153,36 @@  struct vfsmount *mntget(struct vfsmount *mnt)
 }
 EXPORT_SYMBOL(mntget);
 
+struct vfsmount *legitimize_mntget(struct vfsmount *vfsmnt)
+{
+	struct mount *mnt;
+	unsigned seq = 0;
+
+	if (vfsmnt == NULL)
+		return NULL;
+retry:
+	read_seqbegin_or_lock(&mount_lock, &seq);
+
+	if (vfsmnt->mnt_flags & (MNT_SYNC_UMOUNT | MNT_UMOUNT | MNT_DOOMED))
+		vfsmnt = NULL;
+	else if (need_seqretry(&mount_lock, seq))
+		goto retry;
+	else {
+		mnt = real_mount(vfsmnt);
+		mnt_add_count(mnt, 1);
+		if (need_seqretry(&mount_lock, seq) ||
+		    vfsmnt->mnt_flags & (MNT_SYNC_UMOUNT | MNT_UMOUNT |
+					 MNT_DOOMED)) {
+			mnt_add_count(mnt, -1);
+			goto retry;
+		}
+	}
+
+	done_seqretry(&mount_lock, seq);
+	return vfsmnt;
+}
+EXPORT_SYMBOL(legitimize_mntget);
+
 struct vfsmount *mnt_clone_internal(struct path *path)
 {
 	struct mount *p;
diff --git a/include/linux/mount.h b/include/linux/mount.h
index f822c3c..8ae9dc0 100644
--- a/include/linux/mount.h
+++ b/include/linux/mount.h
@@ -79,6 +79,7 @@  extern void mnt_drop_write(struct vfsmount *mnt);
 extern void mnt_drop_write_file(struct file *file);
 extern void mntput(struct vfsmount *mnt);
 extern struct vfsmount *mntget(struct vfsmount *mnt);
+extern struct vfsmount *legitimize_mntget(struct vfsmount *vfsmnt);
 extern struct vfsmount *mnt_clone_internal(struct path *path);
 extern int __mnt_is_readonly(struct vfsmount *mnt);