mbox series

[0/5] Convert seed devices to proper list API

Message ID 20200715104850.19071-1-nborisov@suse.com (mailing list archive)
Headers show
Series Convert seed devices to proper list API | expand

Message

Nikolay Borisov July 15, 2020, 10:48 a.m. UTC
This series converts the existing seed devices pointer in btrfs_fs_devices to
proper list api. First 4 patches are cleanups preparing the code for the switch.
Patch 5 has more information about the required transformations, it might seem
lengthy but he logic everywhere is pretty much the same so shouldn't be that
cumbersome to review.

This series survived xfstest runs and even caught 2 bugs while developing it.

Nikolay Borisov (5):
  btrfs: Factor out reada loop in __reada_start_machine
  btrfs: Factor out loop logic from btrfs_free_extra_devids
  btrfs: Make close_fs_devices return void
  btrfs: Simplify setting/clearing fs_info to btrfs_fs_devices
  btrfs: Switch seed device to list api

 fs/btrfs/disk-io.c |  39 +++++------
 fs/btrfs/reada.c   |  26 ++++---
 fs/btrfs/sysfs.c   |   4 --
 fs/btrfs/volumes.c | 166 ++++++++++++++++++++-------------------------
 fs/btrfs/volumes.h |   6 +-
 5 files changed, 112 insertions(+), 129 deletions(-)

--
2.17.1

Comments

David Sterba July 22, 2020, 2:26 p.m. UTC | #1
On Wed, Jul 15, 2020 at 01:48:45PM +0300, Nikolay Borisov wrote:
> This series converts the existing seed devices pointer in btrfs_fs_devices to
> proper list api. First 4 patches are cleanups preparing the code for the switch.

> Patch 5 has more information about the required transformations, it might seem
> lengthy but he logic everywhere is pretty much the same so shouldn't be that
> cumbersome to review.

The first 3 patches are clear, I start to have doubts in 4 and 5 was a
stop for me, but I'm not saying it's wrong. Just that I always thought
the seed devices + the sprout are close to a tree structure but you're
switching that to a linear list.

I'll add the first three patches now, but would appreciate some help
with 4 and 5.
Nikolay Borisov July 23, 2020, 8:02 a.m. UTC | #2
On 22.07.20 г. 17:26 ч., David Sterba wrote:
> On Wed, Jul 15, 2020 at 01:48:45PM +0300, Nikolay Borisov wrote:
>> This series converts the existing seed devices pointer in btrfs_fs_devices to
>> proper list api. First 4 patches are cleanups preparing the code for the switch.
> 
>> Patch 5 has more information about the required transformations, it might seem
>> lengthy but he logic everywhere is pretty much the same so shouldn't be that
>> cumbersome to review.
> 
> The first 3 patches are clear, I start to have doubts in 4 and 5 was a
> stop for me, but I'm not saying it's wrong. Just that I always thought
> the seed devices + the sprout are close to a tree structure but you're
> switching that to a linear list.
> 
> I'll add the first three patches now, but would appreciate some help
> with 4 and 5.
> 

Ok, let's take it one at a time, patch 4:

Originally fs_info for fs_devices is set by btrfs_set_fs_info_ptr called
by btrfs_sysfs_add_mounted. THe latter is called in open_ctree before
block_groups are read i.e on before the fs is exposed. My patch instead
changes this so that fs_info is set in btrfs_init_devices_late, called
from btrfs_read_roots, called by init_tree_roots. The last function is
called before btrfs_sysfs_add_mounted. This means as far as setting
fs_devices is concerned its fs_info is being set earlier. And this is
correct.

Now, let's think about clearing it, the old code cleared it in :

btrfs_sysfs_remove_mounted which is called from:

1. fail_sysfs label in open_ctree
2. close_ctree
3. failure in btrfs_sysfs_add_mounted which does goto fail_fsdev_sysfs

So how are those 3 situation handled by the new code :

1 and 3 - the last function called in open_ctree is btrfs_close_devices
in case of errors and both failure labels are before it, so we are
guaranteed that on failure in open_ctree fs_devices will have it's
fs_info being NULL.

2. Again, last function called in close_ctree is btrfs_close_devices so
we are guaranteed on unmount fs_devices will have fs_info being NULL.

Thus the patch doesn't introduce any functional changes but IMO makes
the code more coherent.


Regarding patch 5 - I don't know what made you think there is a
tree-like structure involved. Simply looking at the old way seeds are
iterated:

	while (seed_devices) {
		fs_devices = seed_devices;
		seed_devices = fs_devices->seed;
 		close_fs_devices(fs_devices);
 		free_fs_devices(fs_devices);
 	}


There's no conditional deciding if we should go left/right of the tree.
Or let's take for example deleting from a list of seed devices:

		tmp_fs_devices = fs_info->fs_devices;
		while (tmp_fs_devices) {
			if (tmp_fs_devices->seed == fs_devices) {
				tmp_fs_devices->seed = fs_devices->seed;
				break;
			}
			tmp_fs_devices = tmp_fs_devices->seed;
		}

Here a simple linear search is performed and when a member of the seed
list matches our fs_devices it simply pointed 1 past our fs_devices

When the if inside the loop matches we get the following situation:

[tmp_fs_devices]->[fs_devices]->[fs_devices->seeds]

Then we perform [tmp_fs_devices] -> [fs_devices->seed]

So a simple deletion, just the fact you were confused shows the old code
is rather wonky.
Nikolay Borisov Aug. 17, 2020, 7:19 p.m. UTC | #3
On 15.07.20 г. 13:48 ч., Nikolay Borisov wrote:
> This series converts the existing seed devices pointer in btrfs_fs_devices to
> proper list api. First 4 patches are cleanups preparing the code for the switch.
> Patch 5 has more information about the required transformations, it might seem
> lengthy but he logic everywhere is pretty much the same so shouldn't be that
> cumbersome to review.
> 
> This series survived xfstest runs and even caught 2 bugs while developing it.
> 
> Nikolay Borisov (5):
>   btrfs: Factor out reada loop in __reada_start_machine
>   btrfs: Factor out loop logic from btrfs_free_extra_devids
>   btrfs: Make close_fs_devices return void
>   btrfs: Simplify setting/clearing fs_info to btrfs_fs_devices
>   btrfs: Switch seed device to list api
> 
>  fs/btrfs/disk-io.c |  39 +++++------
>  fs/btrfs/reada.c   |  26 ++++---
>  fs/btrfs/sysfs.c   |   4 --
>  fs/btrfs/volumes.c | 166 ++++++++++++++++++++-------------------------
>  fs/btrfs/volumes.h |   6 +-
>  5 files changed, 112 insertions(+), 129 deletions(-)
> 
> --
> 2.17.1
> 

Gentle ping
David Sterba Aug. 21, 2020, 2:33 p.m. UTC | #4
On Thu, Jul 23, 2020 at 11:02:14AM +0300, Nikolay Borisov wrote:
> Regarding patch 5 - I don't know what made you think there is a
> tree-like structure involved. Simply looking at the old way seeds are
> iterated:
> 
> 	while (seed_devices) {
> 		fs_devices = seed_devices;
> 		seed_devices = fs_devices->seed;
>  		close_fs_devices(fs_devices);
>  		free_fs_devices(fs_devices);
>  	}
> 
> There's no conditional deciding if we should go left/right of the tree.

A tree structure that has only one direction arrows, from leaves to the
root. Where the root is the seed fs and the leaves are the sprouts.

	 fs1 ---> seed
		  ^ ^
	 fs2 -----| |
                    |
	 fs3 -------|

The while loop in each fs goes by the pointers up to the seeding device
and always removes it's references.

I'm not sure about a deeper tree structure here, so the fs3 -> fs2 would
be possible.

> Or let's take for example deleting from a list of seed devices:
> 
> 		tmp_fs_devices = fs_info->fs_devices;
> 		while (tmp_fs_devices) {
> 			if (tmp_fs_devices->seed == fs_devices) {
> 				tmp_fs_devices->seed = fs_devices->seed;
> 				break;
> 			}
> 			tmp_fs_devices = tmp_fs_devices->seed;
> 		}
> 
> Here a simple linear search is performed and when a member of the seed
> list matches our fs_devices it simply pointed 1 past our fs_devices
> 
> When the if inside the loop matches we get the following situation:
> 
> [tmp_fs_devices]->[fs_devices]->[fs_devices->seeds]
> 
> Then we perform [tmp_fs_devices] -> [fs_devices->seed]
> 
> So a simple deletion, just the fact you were confused shows the old code
> is rather wonky.

Well, I still am after reading the above, the missing part is about the
device cloning and that each fs has it's own copy. The tree structure
implies sharing and would need reference counting, but this does not
seem to be so.

I'll add the series to for-next so we have a base for the other seeding
patches but I haven't reviewed it to the point where I'm convinced it's
correct.