diff mbox series

btrfs: init device stats for seed devices

Message ID 20200803192308.17977-1-josef@toxicpanda.com (mailing list archive)
State New, archived
Headers show
Series btrfs: init device stats for seed devices | expand

Commit Message

Josef Bacik Aug. 3, 2020, 7:23 p.m. UTC
We recently started recording device stats across the fleet, and noticed
a large increase in messages such as this

BTRFS warning (device dm-0): get dev_stats failed, not yet valid

on our tiers that use seed devices for their root devices.  This is
because we do not initialize the device stats for any seed devices if we
have a sprout device and mount using that sprout device.  The basic
steps for reproducing are

mkfs seed device
mount seed device
fill seed device
umount seed device
btrfstune -S 1 seed device
mount seed device
btrfs device add -f sprout device /mnt/wherever
umount /mnt/wherever
mount sprout device /mnt/wherever
btrfs device stats /mnt/wherever

This will fail with the above message in dmesg.

Fix this by iterating over the fs_devices->seed if they exist in
btrfs_init_dev_stats.  This fixed the problem and properly reports the
stats for both devices.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/volumes.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

Comments

Nikolay Borisov Aug. 5, 2020, 7:45 a.m. UTC | #1
On 3.08.20 г. 22:23 ч., Josef Bacik wrote:
> We recently started recording device stats across the fleet, and noticed
> a large increase in messages such as this
> 
> BTRFS warning (device dm-0): get dev_stats failed, not yet valid
> 
> on our tiers that use seed devices for their root devices.  This is
> because we do not initialize the device stats for any seed devices if we
> have a sprout device and mount using that sprout device.  The basic
> steps for reproducing are
> 
> mkfs seed device
> mount seed device
> fill seed device
> umount seed device
> btrfstune -S 1 seed device
> mount seed device
> btrfs device add -f sprout device /mnt/wherever
> umount /mnt/wherever
> mount sprout device /mnt/wherever
> btrfs device stats /mnt/wherever
> 
> This will fail with the above message in dmesg.
> 
> Fix this by iterating over the fs_devices->seed if they exist in
> btrfs_init_dev_stats.  This fixed the problem and properly reports the
> stats for both devices.

All devices which are anchored at fs_devices->seed are really private to
that fs_info->fs_devices created during chunk tree read which happens
before init_dev_stats. So that's correct, however I'd like this to be
based on the new pattern of dealing with seed devices  - using the
list() apis. And likely the init function would need to be refactored
into 2 functions:

1. Iterating over all fs_devices->seed and the main fs_devices
2. Doing the init work on every passed fs_devices.



> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
>  fs/btrfs/volumes.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index d7670e2a9f39..dab295880117 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -7225,7 +7225,7 @@ int btrfs_init_dev_stats(struct btrfs_fs_info *fs_info)
>  	path = btrfs_alloc_path();
>  	if (!path)
>  		return -ENOMEM;
> -
> +again:
>  	mutex_lock(&fs_devices->device_list_mutex);
>  	list_for_each_entry(device, &fs_devices->devices, dev_list) {
>  		int item_size;
> @@ -7263,6 +7263,12 @@ int btrfs_init_dev_stats(struct btrfs_fs_info *fs_info)
>  	}
>  	mutex_unlock(&fs_devices->device_list_mutex);
>  
> +	/* If we have seed devices we need to init those stats as well. */
> +	if (fs_devices->seed) {
> +		fs_devices = fs_devices->seed;
> +		goto again;
> +	}
> +
>  	btrfs_free_path(path);
>  	return ret < 0 ? ret : 0;
>  }
>
Anand Jain Aug. 5, 2020, 4:43 p.m. UTC | #2
On 4/8/20 3:23 am, Josef Bacik wrote:
> We recently started recording device stats across the fleet, and noticed
> a large increase in messages such as this
> 
> BTRFS warning (device dm-0): get dev_stats failed, not yet valid
> 
> on our tiers that use seed devices for their root devices.  This is
> because we do not initialize the device stats for any seed devices if we
> have a sprout device and mount using that sprout device.

  Thanks for spotting and the fix.

>  The basic
> steps for reproducing are
> 
> mkfs seed device
> mount seed device
> fill seed device
> umount seed device
> btrfstune -S 1 seed device
> mount seed device
> btrfs device add -f sprout device /mnt/wherever
> umount /mnt/wherever
> mount sprout device /mnt/wherever
> btrfs device stats /mnt/wherever
> 
> This will fail with the above message in dmesg.
> 
> Fix this by iterating over the fs_devices->seed if they exist in
> btrfs_init_dev_stats.  This fixed the problem and properly reports the
> stats for both devices.
> 

  Also btrfs_run_dev_stats() should be updated to write seed's dev stat
  to ondisk.

> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
>   fs/btrfs/volumes.c | 8 +++++++-
>   1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index d7670e2a9f39..dab295880117 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -7225,7 +7225,7 @@ int btrfs_init_dev_stats(struct btrfs_fs_info *fs_info)
>   	path = btrfs_alloc_path();
>   	if (!path)
>   		return -ENOMEM;
> -
> +again:
>   	mutex_lock(&fs_devices->device_list_mutex);
>   	list_for_each_entry(device, &fs_devices->devices, dev_list) {
>   		int item_size;
> @@ -7263,6 +7263,12 @@ int btrfs_init_dev_stats(struct btrfs_fs_info *fs_info)
>   	}
>   	mutex_unlock(&fs_devices->device_list_mutex);

  As we loop through the seed fs_devices. %fs_devices which was
  pointing to the sprout fs_devices above, shall point to the seed
  fs_devices and so will be holding the seed::device_list_mutex. But
  our threads which update the seed device such as replace and delete
  are using the sprout::device_list_mutex.

  So it should be sprout::device_list_mutex
  (fs_info->fs_devices->device_list_mutex).

  The loops in btrfs_init_devices_late() and __reada_start_machine()
  which are read only, also holds the seed device_list_mutex, I am
  sending a patch to fix, after which there isn't any thread which
  shall be holding the seed::device_list_mutex when sprout is mounted.

Thanks, Anand


>   
> +	/* If we have seed devices we need to init those stats as well. */
> +	if (fs_devices->seed) {
> +		fs_devices = fs_devices->seed;
> +		goto again;
> +	}
> +
>   	btrfs_free_path(path);
>   	return ret < 0 ? ret : 0;
>   }
>
David Sterba Sept. 17, 2020, 3:31 p.m. UTC | #3
On Mon, Aug 03, 2020 at 03:23:08PM -0400, Josef Bacik wrote:
> We recently started recording device stats across the fleet, and noticed
> a large increase in messages such as this
> 
> BTRFS warning (device dm-0): get dev_stats failed, not yet valid
> 
> on our tiers that use seed devices for their root devices.  This is
> because we do not initialize the device stats for any seed devices if we
> have a sprout device and mount using that sprout device.  The basic
> steps for reproducing are
> 
> mkfs seed device
> mount seed device
> fill seed device
> umount seed device
> btrfstune -S 1 seed device
> mount seed device
> btrfs device add -f sprout device /mnt/wherever
> umount /mnt/wherever
> mount sprout device /mnt/wherever
> btrfs device stats /mnt/wherever
> 
> This will fail with the above message in dmesg.
> 
> Fix this by iterating over the fs_devices->seed if they exist in
> btrfs_init_dev_stats.  This fixed the problem and properly reports the
> stats for both devices.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Please update the patch after the device list api changes, thanks.
diff mbox series

Patch

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index d7670e2a9f39..dab295880117 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -7225,7 +7225,7 @@  int btrfs_init_dev_stats(struct btrfs_fs_info *fs_info)
 	path = btrfs_alloc_path();
 	if (!path)
 		return -ENOMEM;
-
+again:
 	mutex_lock(&fs_devices->device_list_mutex);
 	list_for_each_entry(device, &fs_devices->devices, dev_list) {
 		int item_size;
@@ -7263,6 +7263,12 @@  int btrfs_init_dev_stats(struct btrfs_fs_info *fs_info)
 	}
 	mutex_unlock(&fs_devices->device_list_mutex);
 
+	/* If we have seed devices we need to init those stats as well. */
+	if (fs_devices->seed) {
+		fs_devices = fs_devices->seed;
+		goto again;
+	}
+
 	btrfs_free_path(path);
 	return ret < 0 ? ret : 0;
 }