Message ID | 20200803192308.17977-1-josef@toxicpanda.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | btrfs: init device stats for seed devices | expand |
On 3.08.20 г. 22:23 ч., Josef Bacik wrote: > We recently started recording device stats across the fleet, and noticed > a large increase in messages such as this > > BTRFS warning (device dm-0): get dev_stats failed, not yet valid > > on our tiers that use seed devices for their root devices. This is > because we do not initialize the device stats for any seed devices if we > have a sprout device and mount using that sprout device. The basic > steps for reproducing are > > mkfs seed device > mount seed device > fill seed device > umount seed device > btrfstune -S 1 seed device > mount seed device > btrfs device add -f sprout device /mnt/wherever > umount /mnt/wherever > mount sprout device /mnt/wherever > btrfs device stats /mnt/wherever > > This will fail with the above message in dmesg. > > Fix this by iterating over the fs_devices->seed if they exist in > btrfs_init_dev_stats. This fixed the problem and properly reports the > stats for both devices. All devices which are anchored at fs_devices->seed are really private to that fs_info->fs_devices created during chunk tree read which happens before init_dev_stats. So that's correct, however I'd like this to be based on the new pattern of dealing with seed devices - using the list() apis. And likely the init function would need to be refactored into 2 functions: 1. Iterating over all fs_devices->seed and the main fs_devices 2. Doing the init work on every passed fs_devices. > Signed-off-by: Josef Bacik <josef@toxicpanda.com> > --- > fs/btrfs/volumes.c | 8 +++++++- > 1 file changed, 7 insertions(+), 1 deletion(-) > > diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c > index d7670e2a9f39..dab295880117 100644 > --- a/fs/btrfs/volumes.c > +++ b/fs/btrfs/volumes.c > @@ -7225,7 +7225,7 @@ int btrfs_init_dev_stats(struct btrfs_fs_info *fs_info) > path = btrfs_alloc_path(); > if (!path) > return -ENOMEM; > - > +again: > mutex_lock(&fs_devices->device_list_mutex); > list_for_each_entry(device, &fs_devices->devices, dev_list) { > int item_size; > @@ -7263,6 +7263,12 @@ int btrfs_init_dev_stats(struct btrfs_fs_info *fs_info) > } > mutex_unlock(&fs_devices->device_list_mutex); > > + /* If we have seed devices we need to init those stats as well. */ > + if (fs_devices->seed) { > + fs_devices = fs_devices->seed; > + goto again; > + } > + > btrfs_free_path(path); > return ret < 0 ? ret : 0; > } >
On 4/8/20 3:23 am, Josef Bacik wrote: > We recently started recording device stats across the fleet, and noticed > a large increase in messages such as this > > BTRFS warning (device dm-0): get dev_stats failed, not yet valid > > on our tiers that use seed devices for their root devices. This is > because we do not initialize the device stats for any seed devices if we > have a sprout device and mount using that sprout device. Thanks for spotting and the fix. > The basic > steps for reproducing are > > mkfs seed device > mount seed device > fill seed device > umount seed device > btrfstune -S 1 seed device > mount seed device > btrfs device add -f sprout device /mnt/wherever > umount /mnt/wherever > mount sprout device /mnt/wherever > btrfs device stats /mnt/wherever > > This will fail with the above message in dmesg. > > Fix this by iterating over the fs_devices->seed if they exist in > btrfs_init_dev_stats. This fixed the problem and properly reports the > stats for both devices. > Also btrfs_run_dev_stats() should be updated to write seed's dev stat to ondisk. > Signed-off-by: Josef Bacik <josef@toxicpanda.com> > --- > fs/btrfs/volumes.c | 8 +++++++- > 1 file changed, 7 insertions(+), 1 deletion(-) > > diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c > index d7670e2a9f39..dab295880117 100644 > --- a/fs/btrfs/volumes.c > +++ b/fs/btrfs/volumes.c > @@ -7225,7 +7225,7 @@ int btrfs_init_dev_stats(struct btrfs_fs_info *fs_info) > path = btrfs_alloc_path(); > if (!path) > return -ENOMEM; > - > +again: > mutex_lock(&fs_devices->device_list_mutex); > list_for_each_entry(device, &fs_devices->devices, dev_list) { > int item_size; > @@ -7263,6 +7263,12 @@ int btrfs_init_dev_stats(struct btrfs_fs_info *fs_info) > } > mutex_unlock(&fs_devices->device_list_mutex); As we loop through the seed fs_devices. %fs_devices which was pointing to the sprout fs_devices above, shall point to the seed fs_devices and so will be holding the seed::device_list_mutex. But our threads which update the seed device such as replace and delete are using the sprout::device_list_mutex. So it should be sprout::device_list_mutex (fs_info->fs_devices->device_list_mutex). The loops in btrfs_init_devices_late() and __reada_start_machine() which are read only, also holds the seed device_list_mutex, I am sending a patch to fix, after which there isn't any thread which shall be holding the seed::device_list_mutex when sprout is mounted. Thanks, Anand > > + /* If we have seed devices we need to init those stats as well. */ > + if (fs_devices->seed) { > + fs_devices = fs_devices->seed; > + goto again; > + } > + > btrfs_free_path(path); > return ret < 0 ? ret : 0; > } >
On Mon, Aug 03, 2020 at 03:23:08PM -0400, Josef Bacik wrote: > We recently started recording device stats across the fleet, and noticed > a large increase in messages such as this > > BTRFS warning (device dm-0): get dev_stats failed, not yet valid > > on our tiers that use seed devices for their root devices. This is > because we do not initialize the device stats for any seed devices if we > have a sprout device and mount using that sprout device. The basic > steps for reproducing are > > mkfs seed device > mount seed device > fill seed device > umount seed device > btrfstune -S 1 seed device > mount seed device > btrfs device add -f sprout device /mnt/wherever > umount /mnt/wherever > mount sprout device /mnt/wherever > btrfs device stats /mnt/wherever > > This will fail with the above message in dmesg. > > Fix this by iterating over the fs_devices->seed if they exist in > btrfs_init_dev_stats. This fixed the problem and properly reports the > stats for both devices. > > Signed-off-by: Josef Bacik <josef@toxicpanda.com> Please update the patch after the device list api changes, thanks.
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index d7670e2a9f39..dab295880117 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -7225,7 +7225,7 @@ int btrfs_init_dev_stats(struct btrfs_fs_info *fs_info) path = btrfs_alloc_path(); if (!path) return -ENOMEM; - +again: mutex_lock(&fs_devices->device_list_mutex); list_for_each_entry(device, &fs_devices->devices, dev_list) { int item_size; @@ -7263,6 +7263,12 @@ int btrfs_init_dev_stats(struct btrfs_fs_info *fs_info) } mutex_unlock(&fs_devices->device_list_mutex); + /* If we have seed devices we need to init those stats as well. */ + if (fs_devices->seed) { + fs_devices = fs_devices->seed; + goto again; + } + btrfs_free_path(path); return ret < 0 ? ret : 0; }
We recently started recording device stats across the fleet, and noticed a large increase in messages such as this BTRFS warning (device dm-0): get dev_stats failed, not yet valid on our tiers that use seed devices for their root devices. This is because we do not initialize the device stats for any seed devices if we have a sprout device and mount using that sprout device. The basic steps for reproducing are mkfs seed device mount seed device fill seed device umount seed device btrfstune -S 1 seed device mount seed device btrfs device add -f sprout device /mnt/wherever umount /mnt/wherever mount sprout device /mnt/wherever btrfs device stats /mnt/wherever This will fail with the above message in dmesg. Fix this by iterating over the fs_devices->seed if they exist in btrfs_init_dev_stats. This fixed the problem and properly reports the stats for both devices. Signed-off-by: Josef Bacik <josef@toxicpanda.com> --- fs/btrfs/volumes.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)