diff mbox series

[01/15] md: remove the code to flush an old instance in md_open

Message ID 20210330161727.2297292-2-hch@lst.de (mailing list archive)
State New, archived
Headers show
Series [01/15] md: remove the code to flush an old instance in md_open | expand

Commit Message

Christoph Hellwig March 30, 2021, 4:17 p.m. UTC
Due to the flush_workqueue() call in md_alloc no previous instance of
mddev can still be around at this point.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/md/md.c | 35 +++++++----------------------------
 1 file changed, 7 insertions(+), 28 deletions(-)

Comments

heming.zhao@suse.com March 31, 2021, 3:29 a.m. UTC | #1
On 3/31/21 12:17 AM, Christoph Hellwig wrote:
> Due to the flush_workqueue() call in md_alloc no previous instance of
> mddev can still be around at this point.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>   drivers/md/md.c | 35 +++++++----------------------------
>   1 file changed, 7 insertions(+), 28 deletions(-)
> 
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index 368cad6cd53a6e..cd2d825dd4f881 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -7807,43 +7807,22 @@ static int md_open(struct block_device *bdev, fmode_t mode)
>   	 * Succeed if we can lock the mddev, which confirms that
>   	 * it isn't being stopped right now.
>   	 */
> -	struct mddev *mddev = mddev_find(bdev->bd_dev);
> +	struct mddev *mddev = bdev->bd_disk->private_data;
>   	int err;
>   
> -	if (!mddev)
> -		return -ENODEV;
> -
> -	if (mddev->gendisk != bdev->bd_disk) {
> -		/* we are racing with mddev_put which is discarding this
> -		 * bd_disk.
> -		 */
> -		mddev_put(mddev);
> -		/* Wait until bdev->bd_disk is definitely gone */
> -		if (work_pending(&mddev->del_work))
> -			flush_workqueue(md_misc_wq);
> -		/* Then retry the open from the top */
> -		return -ERESTARTSYS;
> -	}
> -	BUG_ON(mddev != bdev->bd_disk->private_data);
> -
> -	if ((err = mutex_lock_interruptible(&mddev->open_mutex)))
> -		goto out;
> -
> +	err = mutex_lock_interruptible(&mddev->open_mutex);
> +	if (err)
> +		return err;
>   	if (test_bit(MD_CLOSING, &mddev->flags)) {
>   		mutex_unlock(&mddev->open_mutex);
> -		err = -ENODEV;
> -		goto out;
> +		return -ENODEV;
>   	}
> -
> -	err = 0;
> +	mddev_get(mddev);
>   	atomic_inc(&mddev->openers);
>   	mutex_unlock(&mddev->open_mutex);
>   
>   	bdev_check_media_change(bdev);
> - out:
> -	if (err)
> -		mddev_put(mddev);
> -	return err;
> +	return 0;
>   }
>   
>   static void md_release(struct gendisk *disk, fmode_t mode)
> 

Hello Christoph,

After applying your patch, the md_open() will be:
```
static int md_open(struct block_device *bdev, fmode_t mode)
{
     /* ...  */
     struct mddev *mddev = bdev->bd_disk->private_data;
     int err;

     err = mutex_lock_interruptible(&mddev->open_mutex);
     if (err)
         return err;

     if (test_bit(MD_CLOSING, &mddev->flags)) {
         mutex_unlock(&mddev->open_mutex);
         return -ENODEV;
     }

     mddev_get(mddev);
     atomic_inc(&mddev->openers);
     mutex_unlock(&mddev->open_mutex);

     bdev_check_media_change(bdev);
     return 0;
}
```

in clean path, MD_CLOSING only lives a very short time, then be cleaned in md_clean:
```
ioctl
  + test_and_set_bit(MD_CLOSING, &mddev->flags)
  + do_md_stop //case STOP_ARRAY
     md_clean
      mddev->flags = 0;
```

when userspace "mdadm -Ss" finish (the ioctl STOP_ARRAY returns),
mddev->flags will be zero. and you can see my patch email (date: 2021-3-30).
At this time, userspace will execute "mdadm --monitor" to scan the
closing md device. the md_open will trigger very soon. at this time,
bdev->bd_disk->private_data is only a skeleton, your shouldn't trust & use it.

So mddev with MD_CLOSING protection, the md_open is not safety.

PS:
Neil Brown legacy commit d3374825ce57ba2214d37502397 also describes this condition.

Thanks,
heming
Christoph Hellwig March 31, 2021, 6:53 a.m. UTC | #2
On Wed, Mar 31, 2021 at 11:29:39AM +0800, heming.zhao@suse.com wrote:
> when userspace "mdadm -Ss" finish (the ioctl STOP_ARRAY returns),
> mddev->flags will be zero. and you can see my patch email (date: 2021-3-30).
> At this time, userspace will execute "mdadm --monitor" to scan the
> closing md device. the md_open will trigger very soon. at this time,
> bdev->bd_disk->private_data is only a skeleton, your shouldn't trust & use it.

Ermm, the block layer rules require the device to be fully set up
when add_disk is called.  So if that is not the case (and I'd like
to see hints how) we need to fix this properly instead of using a hack
in ->open.
diff mbox series

Patch

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 368cad6cd53a6e..cd2d825dd4f881 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -7807,43 +7807,22 @@  static int md_open(struct block_device *bdev, fmode_t mode)
 	 * Succeed if we can lock the mddev, which confirms that
 	 * it isn't being stopped right now.
 	 */
-	struct mddev *mddev = mddev_find(bdev->bd_dev);
+	struct mddev *mddev = bdev->bd_disk->private_data;
 	int err;
 
-	if (!mddev)
-		return -ENODEV;
-
-	if (mddev->gendisk != bdev->bd_disk) {
-		/* we are racing with mddev_put which is discarding this
-		 * bd_disk.
-		 */
-		mddev_put(mddev);
-		/* Wait until bdev->bd_disk is definitely gone */
-		if (work_pending(&mddev->del_work))
-			flush_workqueue(md_misc_wq);
-		/* Then retry the open from the top */
-		return -ERESTARTSYS;
-	}
-	BUG_ON(mddev != bdev->bd_disk->private_data);
-
-	if ((err = mutex_lock_interruptible(&mddev->open_mutex)))
-		goto out;
-
+	err = mutex_lock_interruptible(&mddev->open_mutex);
+	if (err)
+		return err;
 	if (test_bit(MD_CLOSING, &mddev->flags)) {
 		mutex_unlock(&mddev->open_mutex);
-		err = -ENODEV;
-		goto out;
+		return -ENODEV;
 	}
-
-	err = 0;
+	mddev_get(mddev);
 	atomic_inc(&mddev->openers);
 	mutex_unlock(&mddev->open_mutex);
 
 	bdev_check_media_change(bdev);
- out:
-	if (err)
-		mddev_put(mddev);
-	return err;
+	return 0;
 }
 
 static void md_release(struct gendisk *disk, fmode_t mode)