diff mbox

[RESEND] btrfs: delete function btrfs_close_extra_devices()

Message ID 20180215050224.21158-1-anand.jain@oracle.com (mailing list archive)
State New, archived
Headers show

Commit Message

Anand Jain Feb. 15, 2018, 5:02 a.m. UTC
btrfs_close_extra_devices() is not exactly about just closing the opened
devices, but its also about free-ing the stale devices which may have
scanned into the btrfs_fs_devices::dev_list. The way it picks devices to
be freed is by going through the btrfs_fs_devices::dev_list and its seed
devices, and finding for devices which do not have the flag
BTRFS_DEV_STATE_IN_FS_METADATA nor if it is part of the replace target.

However, in the first place the way devices are scanned and added to the
btrfs_fs_devices::dev_list have changed for a long time now. During scan
when it finds matching fsid+uuid+devid it would add the device to
btrfs_fs_devices::dev_list. A matched device with higher generation number
overwrites the device with lower generation number during.

Further, the stale devices containing the stale fsid are removed at the
time of the scan itself.

So there isn't any opportunity that btrfs_close_extra_devices() can free
the stale device within the fsid which is being mounted.

Further about the btrfs_fs_devices::latest_bdev that
the btrfs_close_extra_devices() function assigns, is already assigned by
the function __btrfs_open_devices().

So as this function has no effect, delete it.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
 fs/btrfs/disk-io.c | 13 -----------
 fs/btrfs/volumes.c | 63 ------------------------------------------------------
 fs/btrfs/volumes.h |  1 -
 3 files changed, 77 deletions(-)

Comments

David Sterba Feb. 19, 2018, 4:53 p.m. UTC | #1
On Thu, Feb 15, 2018 at 01:02:24PM +0800, Anand Jain wrote:
> btrfs_close_extra_devices() is not exactly about just closing the opened
> devices, but its also about free-ing the stale devices which may have
> scanned into the btrfs_fs_devices::dev_list. The way it picks devices to
> be freed is by going through the btrfs_fs_devices::dev_list and its seed
> devices, and finding for devices which do not have the flag
> BTRFS_DEV_STATE_IN_FS_METADATA nor if it is part of the replace target.
> 
> However, in the first place the way devices are scanned and added to the
> btrfs_fs_devices::dev_list have changed for a long time now. During scan
> when it finds matching fsid+uuid+devid it would add the device to
> btrfs_fs_devices::dev_list. A matched device with higher generation number
> overwrites the device with lower generation number during.
> 
> Further, the stale devices containing the stale fsid are removed at the
> time of the scan itself.
> 
> So there isn't any opportunity that btrfs_close_extra_devices() can free
> the stale device within the fsid which is being mounted.
> 
> Further about the btrfs_fs_devices::latest_bdev that
> the btrfs_close_extra_devices() function assigns, is already assigned by
> the function __btrfs_open_devices().
> 
> So as this function has no effect, delete it.

I think this is correct. Freeing stale devices as a side effect of mount
does not seem right anyway. I'm still not able to convince myself that
there's not an unexpected interaction of dev scan and dev replace, as
it relies on the state bits and other locks. If you have ideas were to
put some asserts or extra checks, please suggest.

I found only one place where an assert could go:

> -	/*
> -	 * keep the device that is marked to be the target device for the
> -	 * dev_replace procedure
> -	 */
> -	btrfs_close_extra_devices(fs_devices, 0);
> -
> -	if (!fs_devices->latest_bdev) {
> -		btrfs_err(fs_info, "failed to read devices");
> -		goto fail_tree_roots;
> -	}

	ASSERT(fs_devices->latest_bdev);

as latest_bdev could have been touched even by the no-op
btrfs_close_extra_devices. I'll add that.

Reviewed-by: David Sterba <dsterba@suse.com>
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Anand Jain Feb. 20, 2018, 1 p.m. UTC | #2
On 02/20/2018 12:53 AM, David Sterba wrote:
> On Thu, Feb 15, 2018 at 01:02:24PM +0800, Anand Jain wrote:
>> btrfs_close_extra_devices() is not exactly about just closing the opened
>> devices, but its also about free-ing the stale devices which may have
>> scanned into the btrfs_fs_devices::dev_list. The way it picks devices to
>> be freed is by going through the btrfs_fs_devices::dev_list and its seed
>> devices, and finding for devices which do not have the flag
>> BTRFS_DEV_STATE_IN_FS_METADATA nor if it is part of the replace target.
>>
>> However, in the first place the way devices are scanned and added to the
>> btrfs_fs_devices::dev_list have changed for a long time now. During scan
>> when it finds matching fsid+uuid+devid it would add the device to
>> btrfs_fs_devices::dev_list. A matched device with higher generation number
>> overwrites the device with lower generation number during.
>>
>> Further, the stale devices containing the stale fsid are removed at the
>> time of the scan itself.
>>
>> So there isn't any opportunity that btrfs_close_extra_devices() can free
>> the stale device within the fsid which is being mounted.
>>
>> Further about the btrfs_fs_devices::latest_bdev that
>> the btrfs_close_extra_devices() function assigns, is already assigned by
>> the function __btrfs_open_devices().
>>
>> So as this function has no effect, delete it.
> 
> I think this is correct. Freeing stale devices as a side effect of mount
> does not seem right anyway. I'm still not able to convince myself that
> there's not an unexpected interaction of dev scan and dev replace, as
> it relies on the state bits and other locks. If you have ideas were to
> put some asserts or extra checks, please suggest.


  Thanks for comments. I took a fresh look after a long weekend.
  Here I found something new. Following test case [1] can simulate stale
  SB on the missing device which got deleted. Now if we do the fresh
  dev scan and mount the deleted device should not be kept in the
  device_list after the mount because that's stale. And as IN_FS_METADATA
  flag is not set on this device this function helps to clean up those
  devices. The dev scan free_stale can't catch this condition because
  we won't go deeper than the SB read. Sorry, my bad pls ignore this
  patch this is a very corner case or I had gone nuts when I investigated
  this.

[1]
   mkfs.btrfs -fq -mraid1 -draid1 /dev/sde /dev/sdf /dev/sdg
   modprobe -r btrfs
   mount -o degraded,device=/dev/sde /dev/sdf /btrfs
   btrfs dev del 3 /btrfs
   umount /btrfs
   btrfs dev scan /dev/sde /dev/sdf /dev/sdg
   mount /dev/sde /btrfs
   wipefs -a /dev/sdg <-- this should work as sdg is not part of
                          fsid mounted anymore.

  I am thinking to rename this function to btrfs_free_extra_devices()
  and add comments.

  Also about the fs_devices::latest_bdev we certainly assign it in
  __btrfs_open_devices(), but there we don't care if it picks the
  replacing target. And suppose if replacing_target has the highest
  (or equal) generation number compared to other we shall read the sys
  and chunk tree from it. So during the mount context we use and
  reconstruct using the replacing target, but at the end of mount
  context, we use the non-replace target as the latest_bdev.

Thanks, Anand


--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index c412ad1f103b..74ee8a250b8a 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2756,17 +2756,6 @@  int open_ctree(struct super_block *sb,
 		goto fail_tree_roots;
 	}
 
-	/*
-	 * keep the device that is marked to be the target device for the
-	 * dev_replace procedure
-	 */
-	btrfs_close_extra_devices(fs_devices, 0);
-
-	if (!fs_devices->latest_bdev) {
-		btrfs_err(fs_info, "failed to read devices");
-		goto fail_tree_roots;
-	}
-
 retry_root_backup:
 	generation = btrfs_super_generation(disk_super);
 
@@ -2823,8 +2812,6 @@  int open_ctree(struct super_block *sb,
 		goto fail_block_groups;
 	}
 
-	btrfs_close_extra_devices(fs_devices, 1);
-
 	ret = btrfs_sysfs_add_fsid(fs_devices, NULL);
 	if (ret) {
 		btrfs_err(fs_info, "failed to init sysfs fsid interface: %d",
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 9bdc1cd849f2..23658876497c 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -896,69 +896,6 @@  static struct btrfs_fs_devices *clone_fs_devices(struct btrfs_fs_devices *orig)
 	return ERR_PTR(-ENOMEM);
 }
 
-void btrfs_close_extra_devices(struct btrfs_fs_devices *fs_devices, int step)
-{
-	struct btrfs_device *device, *next;
-	struct btrfs_device *latest_dev = NULL;
-
-	mutex_lock(&uuid_mutex);
-again:
-	/* This is the initialized path, it is safe to release the devices. */
-	list_for_each_entry_safe(device, next, &fs_devices->devices, dev_list) {
-		if (test_bit(BTRFS_DEV_STATE_IN_FS_METADATA,
-							&device->dev_state)) {
-			if (!test_bit(BTRFS_DEV_STATE_REPLACE_TGT,
-			     &device->dev_state) &&
-			     (!latest_dev ||
-			      device->generation > latest_dev->generation)) {
-				latest_dev = device;
-			}
-			continue;
-		}
-
-		if (device->devid == BTRFS_DEV_REPLACE_DEVID) {
-			/*
-			 * In the first step, keep the device which has
-			 * the correct fsid and the devid that is used
-			 * for the dev_replace procedure.
-			 * In the second step, the dev_replace state is
-			 * read from the device tree and it is known
-			 * whether the procedure is really active or
-			 * not, which means whether this device is
-			 * used or whether it should be removed.
-			 */
-			if (step == 0 || test_bit(BTRFS_DEV_STATE_REPLACE_TGT,
-						  &device->dev_state)) {
-				continue;
-			}
-		}
-		if (device->bdev) {
-			blkdev_put(device->bdev, device->mode);
-			device->bdev = NULL;
-			fs_devices->open_devices--;
-		}
-		if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state)) {
-			list_del_init(&device->dev_alloc_list);
-			clear_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state);
-			if (!test_bit(BTRFS_DEV_STATE_REPLACE_TGT,
-				      &device->dev_state))
-				fs_devices->rw_devices--;
-		}
-		list_del_init(&device->dev_list);
-		fs_devices->num_devices--;
-		free_device(device);
-	}
-
-	if (fs_devices->seed) {
-		fs_devices = fs_devices->seed;
-		goto again;
-	}
-
-	fs_devices->latest_bdev = latest_dev->bdev;
-
-	mutex_unlock(&uuid_mutex);
-}
-
 static void free_device_rcu(struct rcu_head *head)
 {
 	struct btrfs_device *device;
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index 182bfdce22b9..c8d74589426c 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -429,7 +429,6 @@  int btrfs_open_devices(struct btrfs_fs_devices *fs_devices,
 int btrfs_scan_one_device(const char *path, fmode_t flags, void *holder,
 			  struct btrfs_fs_devices **fs_devices_ret);
 int btrfs_close_devices(struct btrfs_fs_devices *fs_devices);
-void btrfs_close_extra_devices(struct btrfs_fs_devices *fs_devices, int step);
 void btrfs_assign_next_active_device(struct btrfs_fs_info *fs_info,
 		struct btrfs_device *device, struct btrfs_device *this_dev);
 int btrfs_find_device_missing_or_by_path(struct btrfs_fs_info *fs_info,