[1/4] btrfs: Use rcu when iterating devices in btrfs_init_new_device
diff mbox series

Message ID 20200722080925.6802-2-nborisov@suse.com
State New
Headers show
Series
  • Misc cleanups around device addition
Related show

Commit Message

Nikolay Borisov July 22, 2020, 8:09 a.m. UTC
When adding a new device there's a mandatory check to see if a device is
being duplicated to the filesystem it's added to. Since this is a
read-only operations not necessary to take device_list_mutex and can simply
make do with an rcu-readlock. No semantics changes.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
---
 fs/btrfs/volumes.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

Comments

Johannes Thumshirn July 22, 2020, 9:17 a.m. UTC | #1
On 22/07/2020 10:09, Nikolay Borisov wrote:
> When adding a new device there's a mandatory check to see if a device is
> being duplicated to the filesystem it's added to. Since this is a
> read-only operations not necessary to take device_list_mutex and can simply
> make do with an rcu-readlock. No semantics changes.

Hmm what are the actual locking rules for the device list? For instance looking
at add_missing_dev() in volumes.c addition to the list is unprotected (both from
read_one_chunk() and read_one_dev()). Others (i.e. btrfs_init_new_device()) do
a list_add_rcu(). clone_fs_devices() takes the device_list_mutex and so on.

Thanks,
	Johannes
Nikolay Borisov July 22, 2020, 10:36 a.m. UTC | #2
On 22.07.20 г. 12:17 ч., Johannes Thumshirn wrote:
> On 22/07/2020 10:09, Nikolay Borisov wrote:
>> When adding a new device there's a mandatory check to see if a device is
>> being duplicated to the filesystem it's added to. Since this is a
>> read-only operations not necessary to take device_list_mutex and can simply
>> make do with an rcu-readlock. No semantics changes.
> 
> Hmm what are the actual locking rules for the device list? For instance looking
> at add_missing_dev() in volumes.c addition to the list is unprotected (both from
> read_one_chunk() and read_one_dev()). Others (i.e. btrfs_init_new_device()) do
> a list_add_rcu(). clone_fs_devices() takes the device_list_mutex and so on.

Bummer, the locking rules are supposed to be those documented atop
volume.c, namely:

 * fs_devices::device_list_mutex (per-fs, with RCU)

    18  * ------------------------------------------------

    17  * protects updates to fs_devices::devices, ie. adding and
deleting
    16  *

    15  * simple list traversal with read-only actions can be done with
RCU protection
    14  *

    13  * may be used to exclude some operations from running
concurrently without any
    12  * modifications to the list (see write_all_supers)



However, your observations seem correct and rather annoying. Let me go
and try and figure this out...

> 
> Thanks,
> 	Johannes
>
David Sterba July 22, 2020, 2:47 p.m. UTC | #3
On Wed, Jul 22, 2020 at 01:36:15PM +0300, Nikolay Borisov wrote:
> 
> 
> On 22.07.20 г. 12:17 ч., Johannes Thumshirn wrote:
> > On 22/07/2020 10:09, Nikolay Borisov wrote:
> >> When adding a new device there's a mandatory check to see if a device is
> >> being duplicated to the filesystem it's added to. Since this is a
> >> read-only operations not necessary to take device_list_mutex and can simply
> >> make do with an rcu-readlock. No semantics changes.
> > 
> > Hmm what are the actual locking rules for the device list? For instance looking
> > at add_missing_dev() in volumes.c addition to the list is unprotected (both from
> > read_one_chunk() and read_one_dev()). Others (i.e. btrfs_init_new_device()) do
> > a list_add_rcu(). clone_fs_devices() takes the device_list_mutex and so on.
> 
> Bummer, the locking rules are supposed to be those documented atop
> volume.c, namely:
> 
>  * fs_devices::device_list_mutex (per-fs, with RCU)
> 
>     18  * ------------------------------------------------
> 
>     17  * protects updates to fs_devices::devices, ie. adding and
> deleting
>     16  *
> 
>     15  * simple list traversal with read-only actions can be done with
> RCU protection
>     14  *
> 
>     13  * may be used to exclude some operations from running
> concurrently without any
>     12  * modifications to the list (see write_all_supers)
> 
> 
> 
> However, your observations seem correct and rather annoying. Let me go
> and try and figure this out...

For device list it's important to know from which context is the list
used.

On unmoutned filesystems, the devices can be added by scanning ioctl.

On mounted filesystem, before the mount is finished, the devices cannot
be concurrently used (single-threaded context) and uuid_mutex is
temporarily protecting the devices agains scan ioctl.

On finished mount device list must be protected by device_list_mutex
from the same filesystem changes (ioctls, superblock write). Device
scanning is a no-op here, but still needs to use the uuid_mutex.

Enter RCU. For performance reasons we don't want to take
device_list_mutex for read-only operations like show_devname or lookups
of device id, where it's not critical that the returned information is
stale.

The mentioned helpers are used by various functions and the context is
not obvious or documented, but it should be clear once the caller chain
is known.

I can turn that into comments but please let me know if this is
sufficient as explanation or if you need more.
Johannes Thumshirn July 22, 2020, 2:53 p.m. UTC | #4
On 22/07/2020 16:48, David Sterba wrote:
> On Wed, Jul 22, 2020 at 01:36:15PM +0300, Nikolay Borisov wrote:
>>
>>
>> On 22.07.20 г. 12:17 ч., Johannes Thumshirn wrote:
>>> On 22/07/2020 10:09, Nikolay Borisov wrote:
>>>> When adding a new device there's a mandatory check to see if a device is
>>>> being duplicated to the filesystem it's added to. Since this is a
>>>> read-only operations not necessary to take device_list_mutex and can simply
>>>> make do with an rcu-readlock. No semantics changes.
>>>
>>> Hmm what are the actual locking rules for the device list? For instance looking
>>> at add_missing_dev() in volumes.c addition to the list is unprotected (both from
>>> read_one_chunk() and read_one_dev()). Others (i.e. btrfs_init_new_device()) do
>>> a list_add_rcu(). clone_fs_devices() takes the device_list_mutex and so on.
>>
>> Bummer, the locking rules are supposed to be those documented atop
>> volume.c, namely:
>>
>>  * fs_devices::device_list_mutex (per-fs, with RCU)
>>
>>     18  * ------------------------------------------------
>>
>>     17  * protects updates to fs_devices::devices, ie. adding and
>> deleting
>>     16  *
>>
>>     15  * simple list traversal with read-only actions can be done with
>> RCU protection
>>     14  *
>>
>>     13  * may be used to exclude some operations from running
>> concurrently without any
>>     12  * modifications to the list (see write_all_supers)
>>
>>
>>
>> However, your observations seem correct and rather annoying. Let me go
>> and try and figure this out...
> 
> For device list it's important to know from which context is the list
> used.
> 
> On unmoutned filesystems, the devices can be added by scanning ioctl.
> 
> On mounted filesystem, before the mount is finished, the devices cannot
> be concurrently used (single-threaded context) and uuid_mutex is
> temporarily protecting the devices agains scan ioctl.
> 
> On finished mount device list must be protected by device_list_mutex
> from the same filesystem changes (ioctls, superblock write). Device
> scanning is a no-op here, but still needs to use the uuid_mutex.
> 
> Enter RCU. For performance reasons we don't want to take
> device_list_mutex for read-only operations like show_devname or lookups
> of device id, where it's not critical that the returned information is
> stale.

OK I got confused by a) the mix of unlocked, device_list_mutex and 
uuid_mutex use and b) sometimes we use plain list_* operations and
sometimes rcu list ones.

 
> The mentioned helpers are used by various functions and the context is
> not obvious or documented, but it should be clear once the caller chain
> is known.
> 
> I can turn that into comments but please let me know if this is
> sufficient as explanation or if you need more.
> 

This would be great yes.
Nikolay Borisov July 23, 2020, 7:45 a.m. UTC | #5
On 22.07.20 г. 17:47 ч., David Sterba wrote:
> On Wed, Jul 22, 2020 at 01:36:15PM +0300, Nikolay Borisov wrote:
>>
>>
>> On 22.07.20 г. 12:17 ч., Johannes Thumshirn wrote:
>>> On 22/07/2020 10:09, Nikolay Borisov wrote:
>>>> When adding a new device there's a mandatory check to see if a device is
>>>> being duplicated to the filesystem it's added to. Since this is a
>>>> read-only operations not necessary to take device_list_mutex and can simply
>>>> make do with an rcu-readlock. No semantics changes.
>>>
>>> Hmm what are the actual locking rules for the device list? For instance looking
>>> at add_missing_dev() in volumes.c addition to the list is unprotected (both from
>>> read_one_chunk() and read_one_dev()). Others (i.e. btrfs_init_new_device()) do
>>> a list_add_rcu(). clone_fs_devices() takes the device_list_mutex and so on.
>>
>> Bummer, the locking rules are supposed to be those documented atop
>> volume.c, namely:
>>
>>  * fs_devices::device_list_mutex (per-fs, with RCU)
>>
>>     18  * ------------------------------------------------
>>
>>     17  * protects updates to fs_devices::devices, ie. adding and
>> deleting
>>     16  *
>>
>>     15  * simple list traversal with read-only actions can be done with
>> RCU protection
>>     14  *
>>
>>     13  * may be used to exclude some operations from running
>> concurrently without any
>>     12  * modifications to the list (see write_all_supers)
>>
>>
>>
>> However, your observations seem correct and rather annoying. Let me go
>> and try and figure this out...
> 
> For device list it's important to know from which context is the list
> used.
> 
> On unmoutned filesystems, the devices can be added by scanning ioctl.
> 
> On mounted filesystem, before the mount is finished, the devices cannot
> be concurrently used (single-threaded context) and uuid_mutex is
> temporarily protecting the devices agains scan ioctl.
> 
> On finished mount device list must be protected by device_list_mutex
> from the same filesystem changes (ioctls, superblock write). Device
> scanning is a no-op here, but still needs to use the uuid_mutex.
> 
> Enter RCU. For performance reasons we don't want to take
> device_list_mutex for read-only operations like show_devname or lookups
> of device id, where it's not critical that the returned information is
> stale.
> 
> The mentioned helpers are used by various functions and the context is
> not obvious or documented, but it should be clear once the caller chain
> is known.
> 
> I can turn that into comments but please let me know if this is
> sufficient as explanation or if you need more.
> 

Yes having those different contexts spelled out is really beneficial.

Patch
diff mbox series

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 9f338d9db51b..0795ab511f8b 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -2502,16 +2502,15 @@  int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path
 
 	filemap_write_and_wait(bdev->bd_inode->i_mapping);
 
-	mutex_lock(&fs_devices->device_list_mutex);
-	list_for_each_entry(device, &fs_devices->devices, dev_list) {
+	rcu_read_lock();
+	list_for_each_entry_rcu(device, &fs_devices->devices, dev_list) {
 		if (device->bdev == bdev) {
 			ret = -EEXIST;
-			mutex_unlock(
-				&fs_devices->device_list_mutex);
+			rcu_read_unlock();
 			goto error;
 		}
 	}
-	mutex_unlock(&fs_devices->device_list_mutex);
+	rcu_read_unlock();
 
 	device = btrfs_alloc_device(fs_info, NULL, NULL);
 	if (IS_ERR(device)) {