Message ID | 5f3dc05b0b44803e9d20498970f259ead2bfe7de.1509471604.git.dsterba@suse.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 31.10.2017 19:44, David Sterba wrote: > Overview of the main locks protecting various device-related structures. > > Signed-off-by: David Sterba <dsterba@suse.com> > --- > fs/btrfs/volumes.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 66 insertions(+) > > diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c > index 75aed8ec64bd..098affc58361 100644 > --- a/fs/btrfs/volumes.c > +++ b/fs/btrfs/volumes.c > @@ -145,6 +145,72 @@ static int __btrfs_map_block(struct btrfs_fs_info *fs_info, > struct btrfs_bio **bbio_ret, > int mirror_num, int need_raid_map); > > +/* > + * Device locking > + * ============== > + * > + * There are several mutexes that protect manipulation of devices and low-level > + * structures like chunks but not block groups, extents or files This sentence suggests you are describing the various mutexes but it seems you are also describing data structure below i.e. global::fs_devs/btrfs_device::name > + * > + * - uuid_mutex (global lock) > + * > + * protects the fs_uuids list that tracks all per-fs fs_devices, resulting > + * from the SCAN_DEV ioctl registration or from mount either implicitly > + * (the first device) or requested by the device= mount option > + * > + * the mutex can be very coarse and can cover long-running operations > + * > + * protects: updates to fs_devices counters like missing devices, rw devices, > + * seeding, structure cloning, openning/closing devices at mount/umount time Perhaps move the uuid_mutex description after btrfs_device::name. That way mutexes are grouped together and data structures are grouped as well. > + * > + * global::fs_devs - add, remove, updates to the global list You seem to be describing a data structure here rather than a mutex > + * > + * does not protect: manipulation of the fs_devices::devices list! but this sentence is written as if you've just described a mutex. The end result is that it's not clear what mutex you are referring to here. > + * > + * btrfs_device::name - renames (write side), read is RCU It's not clear how the write side is protected. > + * > + * - fs_devices::device_list_mutex (per-fs, with RCU) > + * > + * protects updates to fs_devices::devices, ie. adding and deleting > + * > + * simple list traversal with read-only actions can be done with RCU > + * protection > + * > + * may be used to exclude some operations from running concurrently without > + * any modifications to the list (see write_all_supers) > + * > + * - volume_mutex > + * > + * coarse lock owned by a mounted filesystem; used to exclude some operations > + * that cannot run in parallel and affect the higher-level properties of the > + * filesystem like: device add/deleting/resize/replace, or balance > + * > + * - balance_mutex > + * > + * protects balance structures (status, state) and context accessed from > + * several places (internally, ioctl) > + * > + * - chunk_mutex > + * > + * protects chunks, adding or removing during allocation, trim or when > + * a new device is added/removed > + * > + * - cleaner_mutex > + * > + * a big lock that is held by the cleaner thread and prevents running > + * subvolume cleaning together with relocation or delayed iputs > + * > + * > + * Lock nesting > + * ------------ > + * > + * uuid_mutex > + * volume_mutex > + * device_list_mutex > + * chunk_mutex > + * balance_mutex > + * > + */ > DEFINE_MUTEX(uuid_mutex); > static LIST_HEAD(fs_uuids); > struct list_head *btrfs_get_fs_uuids(void) > -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Thanks for writing this. > + * - fs_devices::device_list_mutex (per-fs, with RCU) > + * > + * protects updates to fs_devices::devices, ie. adding and deleting > + * > + * simple list traversal with read-only actions can be done with RCU > + * protection > + * > + * may be used to exclude some operations from running concurrently without > + * any modifications to the list (see write_all_supers) > + * - volume_mutex > + * > + * coarse lock owned by a mounted filesystem; used to exclude some operations > + * that cannot run in parallel and affect the higher-level properties of the > + * filesystem like: device add/deleting/resize/replace, or balance > + * - chunk_mutex > + * > + * protects chunks, adding or removing during allocation, trim or when > + * a new device is added/removed :: > + * Lock nesting > + * ------------ > + * > + * uuid_mutex > + * volume_mutex > + * device_list_mutex > + * chunk_mutex > + * balance_mutex If we have a list of operations that would consume these locks then we can map it accordingly for better clarity. To me it looks like we have too many locks. - we don't have to differentiate the mounted and unmounted context for device locks. - Two lock would be sufficient, one for the device list (add/rm,replace,..) and another for device property changes (resize, trim,..). Thanks, Anand -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 11/03/2017 07:13 PM, Anand Jain wrote: > > > Thanks for writing this. > >> + * - fs_devices::device_list_mutex (per-fs, with RCU) >> + * >> + * protects updates to fs_devices::devices, ie. adding and deleting >> + * >> + * simple list traversal with read-only actions can be done with RCU >> + * protection >> + * >> + * may be used to exclude some operations from running concurrently >> without >> + * any modifications to the list (see write_all_supers) > >> + * - volume_mutex >> + * >> + * coarse lock owned by a mounted filesystem; used to exclude some >> operations >> + * that cannot run in parallel and affect the higher-level >> properties of the >> + * filesystem like: device add/deleting/resize/replace, or balance > >> + * - chunk_mutex >> + * >> + * protects chunks, adding or removing during allocation, trim or when >> + * a new device is added/removed > > :: > >> + * Lock nesting >> + * ------------ >> + * >> + * uuid_mutex >> + * volume_mutex I think it should be nested like this, but as of now its not. Ref [1] [1] btrfs: move volume_mutex into the btrfs_rm_device() Thanks, Anand >> + * device_list_mutex >> + * chunk_mutex >> + * balance_mutex > > > If we have a list of operations that would consume these locks then we > can map it accordingly for better clarity. > > To me it looks like we have too many locks. > - we don't have to differentiate the mounted and unmounted context > for device locks. > - Two lock would be sufficient, one for the device list > (add/rm,replace,..) and another for device property changes > (resize, trim,..). > > Thanks, Anand > > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Nov 03, 2017 at 07:13:01PM +0800, Anand Jain wrote: > > > Thanks for writing this. > > > + * - fs_devices::device_list_mutex (per-fs, with RCU) > > + * > > + * protects updates to fs_devices::devices, ie. adding and deleting > > + * > > + * simple list traversal with read-only actions can be done with RCU > > + * protection > > + * > > + * may be used to exclude some operations from running concurrently without > > + * any modifications to the list (see write_all_supers) > > > + * - volume_mutex > > + * > > + * coarse lock owned by a mounted filesystem; used to exclude some operations > > + * that cannot run in parallel and affect the higher-level properties of the > > + * filesystem like: device add/deleting/resize/replace, or balance > > > + * - chunk_mutex > > + * > > + * protects chunks, adding or removing during allocation, trim or when > > + * a new device is added/removed > > :: > > > + * Lock nesting > > + * ------------ > > + * > > + * uuid_mutex > > + * volume_mutex > > + * device_list_mutex > > + * chunk_mutex > > + * balance_mutex > > > If we have a list of operations that would consume these locks then we > can map it accordingly for better clarity. > > To me it looks like we have too many locks. I agree. Trivially we can remove the volume_mutex, because it copies setting of BTRFS_FS_EXCL_OP, but this still needs some preparatory work in the mount path. I have patches for that but I wanted to send out the documentation first, that is supposed to reflect the current state. > - we don't have to differentiate the mounted and unmounted context > for device locks. I'm not sure about that, the device registering happens outside of mounted filesystems yet we touch the same structures on mounted filesystem. > - Two lock would be sufficient, one for the device list > (add/rm,replace,..) and another for device property changes > (resize, trim,..). I think this will be sorted once the volume_mutex is gone. The nature of add/remove and trim is different, as trim does not change the high-level status of devices, only locks the bloc groups. Resize and add/remove are mutually exclusive so they need to be protected by a common lock. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Nov 06, 2017 at 10:32:48AM +0800, Anand Jain wrote: > >> + * Lock nesting > >> + * ------------ > >> + * > >> + * uuid_mutex > >> + * volume_mutex > > I think it should be nested like this, but as of now its not. Ref [1] > [1] > btrfs: move volume_mutex into the btrfs_rm_device() Right, this violates the documented rules and I think that's because of the ambiguous semantic of the uuid_mutex. Here we're using the "on a mounted filesystem" part so we lock only in the context where this applies. Again, once volume_mutex is gone, this will be fixed. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Nov 02, 2017 at 12:29:17PM +0200, Nikolay Borisov wrote: > On 31.10.2017 19:44, David Sterba wrote: > > Overview of the main locks protecting various device-related structures. > > > > Signed-off-by: David Sterba <dsterba@suse.com> > > --- > > fs/btrfs/volumes.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++++++++++ > > 1 file changed, 66 insertions(+) > > > > diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c > > index 75aed8ec64bd..098affc58361 100644 > > --- a/fs/btrfs/volumes.c > > +++ b/fs/btrfs/volumes.c > > @@ -145,6 +145,72 @@ static int __btrfs_map_block(struct btrfs_fs_info *fs_info, > > struct btrfs_bio **bbio_ret, > > int mirror_num, int need_raid_map); > > > > +/* > > + * Device locking > > + * ============== > > + * > > + * There are several mutexes that protect manipulation of devices and low-level > > + * structures like chunks but not block groups, extents or files > > This sentence suggests you are describing the various mutexes but it > seems you are also describing data structure below i.e. > global::fs_devs/btrfs_device::name In my understanding, documenting a mutex means defining the context and data that are protected in the context. So I can start with the mutex, or the data itself, but the mutex makes IMO more sense to start. > > + * - uuid_mutex (global lock) > > + * > > + * protects the fs_uuids list that tracks all per-fs fs_devices, resulting > > + * from the SCAN_DEV ioctl registration or from mount either implicitly > > + * (the first device) or requested by the device= mount option > > + * > > + * the mutex can be very coarse and can cover long-running operations > > + * > > + * protects: updates to fs_devices counters like missing devices, rw devices, > > + * seeding, structure cloning, openning/closing devices at mount/umount time > > Perhaps move the uuid_mutex description after btrfs_device::name. That > way mutexes are grouped together and data structures are grouped as well. The grouping is by mutex. > > + * > > + * global::fs_devs - add, remove, updates to the global list > > You seem to be describing a data structure here rather than a mutex As I read it, this means "if this mutex is taken, it protects global::fs_devs in this way". > > + * does not protect: manipulation of the fs_devices::devices list! > > but this sentence is written as if you've just described a mutex. The > end result is that it's not clear what mutex you are referring to here. That's the mutex in the section above "- uuid_mutex (global lock)". > > + * btrfs_device::name - renames (write side), read is RCU > > It's not clear how the write side is protected. same answer, it's under the 'uuid_mutex', so the write side is protected by that, with exception of reads protected by RCU. I wonder what kind of documentation structure do you expect as it seems to me you missed it completely. Which is a valuable feedback to me, but I don't know how to fix it, other than make the sections more visible with underlining or indenting or more whitespace. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 6.11.2017 15:51, David Sterba wrote: > On Thu, Nov 02, 2017 at 12:29:17PM +0200, Nikolay Borisov wrote: >> On 31.10.2017 19:44, David Sterba wrote: >>> Overview of the main locks protecting various device-related structures. >>> >>> Signed-off-by: David Sterba <dsterba@suse.com> >>> --- >>> fs/btrfs/volumes.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++++++++++ >>> 1 file changed, 66 insertions(+) >>> >>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c >>> index 75aed8ec64bd..098affc58361 100644 >>> --- a/fs/btrfs/volumes.c >>> +++ b/fs/btrfs/volumes.c >>> @@ -145,6 +145,72 @@ static int __btrfs_map_block(struct btrfs_fs_info *fs_info, >>> struct btrfs_bio **bbio_ret, >>> int mirror_num, int need_raid_map); >>> >>> +/* >>> + * Device locking >>> + * ============== >>> + * >>> + * There are several mutexes that protect manipulation of devices and low-level >>> + * structures like chunks but not block groups, extents or files >> >> This sentence suggests you are describing the various mutexes but it >> seems you are also describing data structure below i.e. >> global::fs_devs/btrfs_device::name > > In my understanding, documenting a mutex means defining the context and > data that are protected in the context. So I can start with the mutex, > or the data itself, but the mutex makes IMO more sense to start. > > >>> + * - uuid_mutex (global lock) >>> + * >>> + * protects the fs_uuids list that tracks all per-fs fs_devices, resulting >>> + * from the SCAN_DEV ioctl registration or from mount either implicitly >>> + * (the first device) or requested by the device= mount option >>> + * >>> + * the mutex can be very coarse and can cover long-running operations >>> + * >>> + * protects: updates to fs_devices counters like missing devices, rw devices, >>> + * seeding, structure cloning, openning/closing devices at mount/umount time >> >> Perhaps move the uuid_mutex description after btrfs_device::name. That >> way mutexes are grouped together and data structures are grouped as well. > > The grouping is by mutex. > >>> + * >>> + * global::fs_devs - add, remove, updates to the global list >> >> You seem to be describing a data structure here rather than a mutex > > As I read it, this means "if this mutex is taken, it protects > global::fs_devs in this way". > >>> + * does not protect: manipulation of the fs_devices::devices list! >> >> but this sentence is written as if you've just described a mutex. The >> end result is that it's not clear what mutex you are referring to here. > > That's the mutex in the section above "- uuid_mutex (global lock)". > >>> + * btrfs_device::name - renames (write side), read is RCU >> >> It's not clear how the write side is protected. > > same answer, it's under the 'uuid_mutex', so the write side is protected > by that, with exception of reads protected by RCU. > > I wonder what kind of documentation structure do you expect as it seems > to me you missed it completely. Which is a valuable feedback to me, but > I don't know how to fix it, other than make the sections more visible > with underlining or indenting or more whitespace. Right, I did a fresh read and indeed the structure now makes sense. I have completely missed the nesting though. Maybe you can nest more agressively, now every sentence is aligned to the header, perhaps we want to have it one level deeper? Or use more than one - at the mutex level ? I don't really have a clear-cut answer for you. > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Nov 06, 2017 at 05:02:21PM +0200, Nikolay Borisov wrote: > Right, I did a fresh read and indeed the structure now makes sense. I > have completely missed the nesting though. Maybe you can nest more > agressively, now every sentence is aligned to the header, perhaps we > want to have it one level deeper? Or use more than one - at the mutex > level ? I don't really have a clear-cut answer for you. How about that: /* * Device locking * ============== * * There are several mutexes that protect manipulation of devices and low-level * structures like chunks but not block groups, extents or files * * uuid_mutex (global lock) * ------------------------ * protects the fs_uuids list that tracks all per-fs fs_devices, resulting from * the SCAN_DEV ioctl registration or from mount either implicitly (the first * device) or requested by the device= mount option * * the mutex can be very coarse and can cover long-running operations * * protects: updates to fs_devices counters like missing devices, rw devices, * seeding, structure cloning, openning/closing devices at mount/umount time * * global::fs_devs - add, remove, updates to the global list * * does not protect: manipulation of the fs_devices::devices list! * * btrfs_device::name - renames (write side), read is RCU * * fs_devices::device_list_mutex (per-fs, with RCU) * ------------------------------------------------ * protects updates to fs_devices::devices, ie. adding and deleting * * simple list traversal with read-only actions can be done with RCU protection * * may be used to exclude some operations from running concurrently without any * modifications to the list (see write_all_supers) * * volume_mutex * ------------ * coarse lock owned by a mounted filesystem; used to exclude some operations * that cannot run in parallel and affect the higher-level properties of the * filesystem like: device add/deleting/resize/replace, or balance * * balance_mutex * ------------- * protects balance structures (status, state) and context accessed from * several places (internally, ioctl) * * chunk_mutex * ----------- * protects chunks, adding or removing during allocation, trim or when a new * device is added/removed * * cleaner_mutex * ------------- * a big lock that is held by the cleaner thread and prevents running subvolume * cleaning together with relocation or delayed iputs * * * Lock nesting * ============ * * uuid_mutex * volume_mutex * device_list_mutex * chunk_mutex * balance_mutex */ -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 75aed8ec64bd..098affc58361 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -145,6 +145,72 @@ static int __btrfs_map_block(struct btrfs_fs_info *fs_info, struct btrfs_bio **bbio_ret, int mirror_num, int need_raid_map); +/* + * Device locking + * ============== + * + * There are several mutexes that protect manipulation of devices and low-level + * structures like chunks but not block groups, extents or files + * + * - uuid_mutex (global lock) + * + * protects the fs_uuids list that tracks all per-fs fs_devices, resulting + * from the SCAN_DEV ioctl registration or from mount either implicitly + * (the first device) or requested by the device= mount option + * + * the mutex can be very coarse and can cover long-running operations + * + * protects: updates to fs_devices counters like missing devices, rw devices, + * seeding, structure cloning, openning/closing devices at mount/umount time + * + * global::fs_devs - add, remove, updates to the global list + * + * does not protect: manipulation of the fs_devices::devices list! + * + * btrfs_device::name - renames (write side), read is RCU + * + * - fs_devices::device_list_mutex (per-fs, with RCU) + * + * protects updates to fs_devices::devices, ie. adding and deleting + * + * simple list traversal with read-only actions can be done with RCU + * protection + * + * may be used to exclude some operations from running concurrently without + * any modifications to the list (see write_all_supers) + * + * - volume_mutex + * + * coarse lock owned by a mounted filesystem; used to exclude some operations + * that cannot run in parallel and affect the higher-level properties of the + * filesystem like: device add/deleting/resize/replace, or balance + * + * - balance_mutex + * + * protects balance structures (status, state) and context accessed from + * several places (internally, ioctl) + * + * - chunk_mutex + * + * protects chunks, adding or removing during allocation, trim or when + * a new device is added/removed + * + * - cleaner_mutex + * + * a big lock that is held by the cleaner thread and prevents running + * subvolume cleaning together with relocation or delayed iputs + * + * + * Lock nesting + * ------------ + * + * uuid_mutex + * volume_mutex + * device_list_mutex + * chunk_mutex + * balance_mutex + * + */ DEFINE_MUTEX(uuid_mutex); static LIST_HEAD(fs_uuids); struct list_head *btrfs_get_fs_uuids(void)
Overview of the main locks protecting various device-related structures. Signed-off-by: David Sterba <dsterba@suse.com> --- fs/btrfs/volumes.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 66 insertions(+)