diff mbox series

[v2,1/2] btrfs: harden identification of the stale device

Message ID 612eac6f9309cbee107afbbd4817c0a628207438.1639155519.git.anand.jain@oracle.com (mailing list archive)
State New, archived
Headers show
Series btrfs: match device by dev_t | expand

Commit Message

Anand Jain Dec. 10, 2021, 6:15 p.m. UTC
Identifying and removing the stale device from the fs_uuids list is done
by the function btrfs_free_stale_devices().
btrfs_free_stale_devices() in turn depends on the function
device_path_matched() to check if the device repeats in more than one
btrfs_device structure.

The matching of the device happens by its path, the device path. However,
when dm mapper is in use, the dm device paths are nothing but a link to
the actual block device, which leads to the device_path_matched() failing
to match.

Fix this by matching the dev_t as provided by lookup_bdev() instead of
plain strcmp() the device paths.

Reported-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Anand Jain <anand.jain@oracle.com>
---

v2: Fix 
     sparse: warning: incorrect type in argument 1 (different address spaces)
     For using device->name->str

    Fix Josef suggestion to pass dev_t instead of device-path in the
     patch 2/2.

 fs/btrfs/volumes.c | 41 ++++++++++++++++++++++++++++++++++++-----
 1 file changed, 36 insertions(+), 5 deletions(-)

Comments

Nikolay Borisov Dec. 13, 2021, 3:04 p.m. UTC | #1
On 10.12.21 г. 20:15, Anand Jain wrote:
> Identifying and removing the stale device from the fs_uuids list is done
> by the function btrfs_free_stale_devices().
> btrfs_free_stale_devices() in turn depends on the function
> device_path_matched() to check if the device repeats in more than one
> btrfs_device structure.
> 
> The matching of the device happens by its path, the device path. However,
> when dm mapper is in use, the dm device paths are nothing but a link to
> the actual block device, which leads to the device_path_matched() failing
> to match.
> 
> Fix this by matching the dev_t as provided by lookup_bdev() instead of
> plain strcmp() the device paths.
> 
> Reported-by: Josef Bacik <josef@toxicpanda.com>
> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> ---
> 
> v2: Fix 
>      sparse: warning: incorrect type in argument 1 (different address spaces)
>      For using device->name->str
> 
>     Fix Josef suggestion to pass dev_t instead of device-path in the
>      patch 2/2.
> 
>  fs/btrfs/volumes.c | 41 ++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 36 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 1b02c03a882c..559fdb0c4a0e 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -534,15 +534,46 @@ btrfs_get_bdev_and_sb(const char *device_path, fmode_t flags, void *holder,
>  	return ret;
>  }
>  
> -static bool device_path_matched(const char *path, struct btrfs_device *device)
> +/*
> + * Check if the device in the 'path' matches with the device in the given
> + * struct btrfs_device '*device'.
> + * Returns:
> + *	0	If it is the same device.
> + *	1	If it is not the same device.
> + *	-errno	For error.

This convention is somewhat confusing. This function returns a boolean
meaniing if a device matched or not, yet the retval follows strcmp
convention of return values. That is make 1 mean "device matched" and
"0" mean device not matched. Because ultimately that's what we care for.

Furthermore you give it the ability to return an error which not
consumed at all. Simply make the function boolean and return false if an
error is encountered by some of the internal calls.

> + */
> +static int device_matched(struct btrfs_device *device, const char *path)
>  {
> -	int found;
> +	char *device_name;
> +	dev_t dev_old;
> +	dev_t dev_new;
> +	int ret;
> +
> +	device_name = kzalloc(BTRFS_PATH_NAME_MAX, GFP_KERNEL);
> +	if (!device_name)
> +		return -ENOMEM;
>  
>  	rcu_read_lock();
> -	found = strcmp(rcu_str_deref(device->name), path);
> +	ret = sprintf(device_name, "%s", rcu_str_deref(device->name));
>  	rcu_read_unlock();
> +	if (!ret) {
> +		kfree(device_name);
> +		return -EINVAL;
> +	}
>  
> -	return found == 0;
> +	ret = lookup_bdev(device_name, &dev_old);

Instead of allocating memory for storing device->name and freeing it,
AFAICS lookup_bdev can be called under rcu read section so you can
simply call lookup_bdev under rcu_read_lock which simplifies memory
management.


In the end this function really boils down to making 2 calls to
lookup_bdev and comparing their values for equality, no need for
anything more fancy than that.


> +	kfree(device_name);
> +	if (ret)
> +		return ret;
> +
> +	ret = lookup_bdev(path, &dev_new);
> +	if (ret)
> +		return ret;
> +
> +	if (dev_old == dev_new)
> +		return 0;
> +
> +	return 1;
>  }
>  
>  /*
> @@ -577,7 +608,7 @@ static int btrfs_free_stale_devices(const char *path,

What's more lookinng at the body of free_stale_device I find the name of
the function somewhat confusing. What it does is really delete all
devices from all fs_devices which match a particular criterion for a
device path i.e the function's body doesn't deal with the concept of
"stale" at all. As such I think it should be renamed and given a more
generic name like btrfs_free_specific_device or something along those
lines.

>  				continue;
>  			if (path && !device->name)
>  				continue;
> -			if (path && !device_path_matched(path, device))
> +			if (path && device_matched(device, path) != 0)
>  				continue;
>  			if (fs_devices->opened) {
>  				/* for an already deleted device return 0 */
>
Nikolay Borisov Dec. 13, 2021, 3:14 p.m. UTC | #2
On 13.12.21 г. 17:04, Nikolay Borisov wrote:
> 
> 
> On 10.12.21 г. 20:15, Anand Jain wrote:
>> Identifying and removing the stale device from the fs_uuids list is done
>> by the function btrfs_free_stale_devices().
>> btrfs_free_stale_devices() in turn depends on the function
>> device_path_matched() to check if the device repeats in more than one
>> btrfs_device structure.
>>
>> The matching of the device happens by its path, the device path. However,
>> when dm mapper is in use, the dm device paths are nothing but a link to
>> the actual block device, which leads to the device_path_matched() failing
>> to match.
>>
>> Fix this by matching the dev_t as provided by lookup_bdev() instead of
>> plain strcmp() the device paths.
>>
>> Reported-by: Josef Bacik <josef@toxicpanda.com>
>> Signed-off-by: Anand Jain <anand.jain@oracle.com>
>> ---
>>
>> v2: Fix 
>>      sparse: warning: incorrect type in argument 1 (different address spaces)
>>      For using device->name->str
>>
>>     Fix Josef suggestion to pass dev_t instead of device-path in the
>>      patch 2/2.
>>
>>  fs/btrfs/volumes.c | 41 ++++++++++++++++++++++++++++++++++++-----
>>  1 file changed, 36 insertions(+), 5 deletions(-)
>>
>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>> index 1b02c03a882c..559fdb0c4a0e 100644
>> --- a/fs/btrfs/volumes.c
>> +++ b/fs/btrfs/volumes.c
>> @@ -534,15 +534,46 @@ btrfs_get_bdev_and_sb(const char *device_path, fmode_t flags, void *holder,
>>  	return ret;
>>  }
>>  
>> -static bool device_path_matched(const char *path, struct btrfs_device *device)
>> +/*
>> + * Check if the device in the 'path' matches with the device in the given
>> + * struct btrfs_device '*device'.
>> + * Returns:
>> + *	0	If it is the same device.
>> + *	1	If it is not the same device.
>> + *	-errno	For error.
> 
> This convention is somewhat confusing. This function returns a boolean
> meaniing if a device matched or not, yet the retval follows strcmp
> convention of return values. That is make 1 mean "device matched" and
> "0" mean device not matched. Because ultimately that's what we care for.
> 
> Furthermore you give it the ability to return an error which not
> consumed at all. Simply make the function boolean and return false if an
> error is encountered by some of the internal calls.
> 
>> + */
>> +static int device_matched(struct btrfs_device *device, const char *path)
>>  {
>> -	int found;
>> +	char *device_name;
>> +	dev_t dev_old;
>> +	dev_t dev_new;
>> +	int ret;
>> +
>> +	device_name = kzalloc(BTRFS_PATH_NAME_MAX, GFP_KERNEL);
>> +	if (!device_name)
>> +		return -ENOMEM;
>>  
>>  	rcu_read_lock();
>> -	found = strcmp(rcu_str_deref(device->name), path);
>> +	ret = sprintf(device_name, "%s", rcu_str_deref(device->name));
>>  	rcu_read_unlock();
>> +	if (!ret) {
>> +		kfree(device_name);
>> +		return -EINVAL;
>> +	}
>>  
>> -	return found == 0;
>> +	ret = lookup_bdev(device_name, &dev_old);
> 
> Instead of allocating memory for storing device->name and freeing it,
> AFAICS lookup_bdev can be called under rcu read section so you can
> simply call lookup_bdev under rcu_read_lock which simplifies memory
> management.

lookup_bdev calls kern_path->filejame_lookup which does an initial try
to lookup the name via an RCU but if it gets a ESTALE/ECHILD it will
fallback to a full path resolution and that *might* sleep so actually
doing the dynamic memory allocation is necessary... Bummer.

> 
> 
> In the end this function really boils down to making 2 calls to
> lookup_bdev and comparing their values for equality, no need for
> anything more fancy than that.
> 
> 
>> +	kfree(device_name);
>> +	if (ret)
>> +		return ret;
>> +
>> +	ret = lookup_bdev(path, &dev_new);
>> +	if (ret)
>> +		return ret;
>> +
>> +	if (dev_old == dev_new)
>> +		return 0;
>> +
>> +	return 1;
>>  }
>>  
>>  /*
>> @@ -577,7 +608,7 @@ static int btrfs_free_stale_devices(const char *path,
> 
> What's more lookinng at the body of free_stale_device I find the name of
> the function somewhat confusing. What it does is really delete all
> devices from all fs_devices which match a particular criterion for a
> device path i.e the function's body doesn't deal with the concept of
> "stale" at all. As such I think it should be renamed and given a more
> generic name like btrfs_free_specific_device or something along those
> lines.
> 
>>  				continue;
>>  			if (path && !device->name)
>>  				continue;
>> -			if (path && !device_path_matched(path, device))
>> +			if (path && device_matched(device, path) != 0)
>>  				continue;
>>  			if (fs_devices->opened) {
>>  				/* for an already deleted device return 0 */
>>
>
Anand Jain Dec. 14, 2021, 2:27 p.m. UTC | #3
On 13/12/2021 23:14, Nikolay Borisov wrote:
> 
> 
> On 13.12.21 г. 17:04, Nikolay Borisov wrote:
>>
>>
>> On 10.12.21 г. 20:15, Anand Jain wrote:
>>> Identifying and removing the stale device from the fs_uuids list is done
>>> by the function btrfs_free_stale_devices().
>>> btrfs_free_stale_devices() in turn depends on the function
>>> device_path_matched() to check if the device repeats in more than one
>>> btrfs_device structure.
>>>
>>> The matching of the device happens by its path, the device path. However,
>>> when dm mapper is in use, the dm device paths are nothing but a link to
>>> the actual block device, which leads to the device_path_matched() failing
>>> to match.
>>>
>>> Fix this by matching the dev_t as provided by lookup_bdev() instead of
>>> plain strcmp() the device paths.
>>>
>>> Reported-by: Josef Bacik <josef@toxicpanda.com>
>>> Signed-off-by: Anand Jain <anand.jain@oracle.com>
>>> ---
>>>
>>> v2: Fix
>>>       sparse: warning: incorrect type in argument 1 (different address spaces)
>>>       For using device->name->str
>>>
>>>      Fix Josef suggestion to pass dev_t instead of device-path in the
>>>       patch 2/2.
>>>
>>>   fs/btrfs/volumes.c | 41 ++++++++++++++++++++++++++++++++++++-----
>>>   1 file changed, 36 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>>> index 1b02c03a882c..559fdb0c4a0e 100644
>>> --- a/fs/btrfs/volumes.c
>>> +++ b/fs/btrfs/volumes.c
>>> @@ -534,15 +534,46 @@ btrfs_get_bdev_and_sb(const char *device_path, fmode_t flags, void *holder,
>>>   	return ret;
>>>   }
>>>   
>>> -static bool device_path_matched(const char *path, struct btrfs_device *device)
>>> +/*
>>> + * Check if the device in the 'path' matches with the device in the given
>>> + * struct btrfs_device '*device'.
>>> + * Returns:
>>> + *	0	If it is the same device.
>>> + *	1	If it is not the same device.
>>> + *	-errno	For error.
>>
>> This convention is somewhat confusing. This function returns a boolean
>> meaniing if a device matched or not, yet the retval follows strcmp
>> convention of return values. That is make 1 mean "device matched" and
>> "0" mean device not matched. Because ultimately that's what we care for.
>>
>> Furthermore you give it the ability to return an error which not
>> consumed at all. Simply make the function boolean and return false if an
>> error is encountered by some of the internal calls.
>>
>>> + */
>>> +static int device_matched(struct btrfs_device *device, const char *path)
>>>   {
>>> -	int found;
>>> +	char *device_name;
>>> +	dev_t dev_old;
>>> +	dev_t dev_new;
>>> +	int ret;
>>> +
>>> +	device_name = kzalloc(BTRFS_PATH_NAME_MAX, GFP_KERNEL);
>>> +	if (!device_name)
>>> +		return -ENOMEM;
>>>   
>>>   	rcu_read_lock();
>>> -	found = strcmp(rcu_str_deref(device->name), path);
>>> +	ret = sprintf(device_name, "%s", rcu_str_deref(device->name));
>>>   	rcu_read_unlock();
>>> +	if (!ret) {
>>> +		kfree(device_name);
>>> +		return -EINVAL;
>>> +	}
>>>   
>>> -	return found == 0;
>>> +	ret = lookup_bdev(device_name, &dev_old);
>>
>> Instead of allocating memory for storing device->name and freeing it,
>> AFAICS lookup_bdev can be called under rcu read section so you can
>> simply call lookup_bdev under rcu_read_lock which simplifies memory
>> management.
> 
> lookup_bdev calls kern_path->filejame_lookup which does an initial try
> to lookup the name via an RCU but if it gets a ESTALE/ECHILD it will
> fallback to a full path resolution and that *might* sleep so actually
> doing the dynamic memory allocation is necessary... Bummer.
> 

Yep.
Also, the device_matched() function might go away in the long run, as I 
found it is a good idea to keep the dev_t in the struct btrfs_device 
when we open it.

Thanks, Anand

>>
>>
>> In the end this function really boils down to making 2 calls to
>> lookup_bdev and comparing their values for equality, no need for
>> anything more fancy than that.
>>
>>
>>> +	kfree(device_name);
>>> +	if (ret)
>>> +		return ret;
>>> +
>>> +	ret = lookup_bdev(path, &dev_new);
>>> +	if (ret)
>>> +		return ret;
>>> +
>>> +	if (dev_old == dev_new)
>>> +		return 0;
>>> +
>>> +	return 1;
>>>   }
>>>   
>>>   /*
>>> @@ -577,7 +608,7 @@ static int btrfs_free_stale_devices(const char *path,
>>
>> What's more lookinng at the body of free_stale_device I find the name of
>> the function somewhat confusing. What it does is really delete all
>> devices from all fs_devices which match a particular criterion for a
>> device path i.e the function's body doesn't deal with the concept of
>> "stale" at all. As such I think it should be renamed and given a more
>> generic name like btrfs_free_specific_device or something along those
>> lines.
>>
>>>   				continue;
>>>   			if (path && !device->name)
>>>   				continue;
>>> -			if (path && !device_path_matched(path, device))
>>> +			if (path && device_matched(device, path) != 0)
>>>   				continue;
>>>   			if (fs_devices->opened) {
>>>   				/* for an already deleted device return 0 */
>>>
>>
David Sterba Jan. 4, 2022, 6:56 p.m. UTC | #4
On Sat, Dec 11, 2021 at 02:15:29AM +0800, Anand Jain wrote:
> Identifying and removing the stale device from the fs_uuids list is done
> by the function btrfs_free_stale_devices().
> btrfs_free_stale_devices() in turn depends on the function
> device_path_matched() to check if the device repeats in more than one
> btrfs_device structure.
> 
> The matching of the device happens by its path, the device path. However,
> when dm mapper is in use, the dm device paths are nothing but a link to
> the actual block device, which leads to the device_path_matched() failing
> to match.
> 
> Fix this by matching the dev_t as provided by lookup_bdev() instead of
> plain strcmp() the device paths.
> 
> Reported-by: Josef Bacik <josef@toxicpanda.com>
> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> ---
> 
> v2: Fix 
>      sparse: warning: incorrect type in argument 1 (different address spaces)
>      For using device->name->str
> 
>     Fix Josef suggestion to pass dev_t instead of device-path in the
>      patch 2/2.
> 
>  fs/btrfs/volumes.c | 41 ++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 36 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 1b02c03a882c..559fdb0c4a0e 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -534,15 +534,46 @@ btrfs_get_bdev_and_sb(const char *device_path, fmode_t flags, void *holder,
>  	return ret;
>  }
>  
> -static bool device_path_matched(const char *path, struct btrfs_device *device)
> +/*
> + * Check if the device in the 'path' matches with the device in the given
> + * struct btrfs_device '*device'.
> + * Returns:
> + *	0	If it is the same device.
> + *	1	If it is not the same device.
> + *	-errno	For error.
> + */
> +static int device_matched(struct btrfs_device *device, const char *path)
>  {
> -	int found;
> +	char *device_name;
> +	dev_t dev_old;
> +	dev_t dev_new;
> +	int ret;
> +
> +	device_name = kzalloc(BTRFS_PATH_NAME_MAX, GFP_KERNEL);
> +	if (!device_name)
> +		return -ENOMEM;
>  
>  	rcu_read_lock();
> -	found = strcmp(rcu_str_deref(device->name), path);
> +	ret = sprintf(device_name, "%s", rcu_str_deref(device->name));

I wonder if the temporary allocation could be avoided, as it's the
exactly same path of the device, so it could be passed to lookup_bdev.

>  	rcu_read_unlock();
> +	if (!ret) {
> +		kfree(device_name);
> +		return -EINVAL;
> +	}
>  
> -	return found == 0;
> +	ret = lookup_bdev(device_name, &dev_old);
> +	kfree(device_name);
> +	if (ret)
> +		return ret;
> +
> +	ret = lookup_bdev(path, &dev_new);
> +	if (ret)
> +		return ret;
> +
> +	if (dev_old == dev_new)
> +		return 0;
> +
> +	return 1;
>  }
>  
>  /*
> @@ -577,7 +608,7 @@ static int btrfs_free_stale_devices(const char *path,
>  				continue;
>  			if (path && !device->name)
>  				continue;
> -			if (path && !device_path_matched(path, device))
> +			if (path && device_matched(device, path) != 0)

You've changed the fuction to return errors but it's not checked,
besides the memory allocation failure, the lookup functions could fail
for various reasons so I don't think it's safe to ignore the errors.

>  				continue;
>  			if (fs_devices->opened) {
>  				/* for an already deleted device return 0 */
> -- 
> 2.33.1
Anand Jain Jan. 5, 2022, 11:31 a.m. UTC | #5
On 05/01/2022 02:56, David Sterba wrote:
> On Sat, Dec 11, 2021 at 02:15:29AM +0800, Anand Jain wrote:
>> Identifying and removing the stale device from the fs_uuids list is done
>> by the function btrfs_free_stale_devices().
>> btrfs_free_stale_devices() in turn depends on the function
>> device_path_matched() to check if the device repeats in more than one
>> btrfs_device structure.
>>
>> The matching of the device happens by its path, the device path. However,
>> when dm mapper is in use, the dm device paths are nothing but a link to
>> the actual block device, which leads to the device_path_matched() failing
>> to match.
>>
>> Fix this by matching the dev_t as provided by lookup_bdev() instead of
>> plain strcmp() the device paths.
>>
>> Reported-by: Josef Bacik <josef@toxicpanda.com>
>> Signed-off-by: Anand Jain <anand.jain@oracle.com>
>> ---
>>
>> v2: Fix
>>       sparse: warning: incorrect type in argument 1 (different address spaces)
>>       For using device->name->str
>>
>>      Fix Josef suggestion to pass dev_t instead of device-path in the
>>       patch 2/2.
>>
>>   fs/btrfs/volumes.c | 41 ++++++++++++++++++++++++++++++++++++-----
>>   1 file changed, 36 insertions(+), 5 deletions(-)
>>
>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>> index 1b02c03a882c..559fdb0c4a0e 100644
>> --- a/fs/btrfs/volumes.c
>> +++ b/fs/btrfs/volumes.c
>> @@ -534,15 +534,46 @@ btrfs_get_bdev_and_sb(const char *device_path, fmode_t flags, void *holder,
>>   	return ret;
>>   }
>>   
>> -static bool device_path_matched(const char *path, struct btrfs_device *device)
>> +/*
>> + * Check if the device in the 'path' matches with the device in the given
>> + * struct btrfs_device '*device'.
>> + * Returns:
>> + *	0	If it is the same device.
>> + *	1	If it is not the same device.
>> + *	-errno	For error.
>> + */
>> +static int device_matched(struct btrfs_device *device, const char *path)
>>   {
>> -	int found;
>> +	char *device_name;
>> +	dev_t dev_old;
>> +	dev_t dev_new;
>> +	int ret;
>> +
>> +	device_name = kzalloc(BTRFS_PATH_NAME_MAX, GFP_KERNEL);
>> +	if (!device_name)
>> +		return -ENOMEM;
>>   
>>   	rcu_read_lock();
>> -	found = strcmp(rcu_str_deref(device->name), path);
>> +	ret = sprintf(device_name, "%s", rcu_str_deref(device->name));
> 
> I wonder if the temporary allocation could be avoided, as it's the
> exactly same path of the device, so it could be passed to lookup_bdev.

  Yeah, I tried but to no avail. Unless I drop the rcu read lock and
  use the str directly as below.

  lookup_bdev(device->name->str, &dev_old);

  We do skip rcu access for device->name at a few locations.

  Also, pls note lookup_bdev() can't be called within rcu_read_lock(),
  (calling sleep function warning).


>>   	rcu_read_unlock();
>> +	if (!ret) {
>> +		kfree(device_name);
>> +		return -EINVAL;
>> +	}
>>   
>> -	return found == 0;
>> +	ret = lookup_bdev(device_name, &dev_old);
>> +	kfree(device_name);
>> +	if (ret)
>> +		return ret;
>> +
>> +	ret = lookup_bdev(path, &dev_new);
>> +	if (ret)
>> +		return ret;
>> +
>> +	if (dev_old == dev_new)
>> +		return 0;
>> +
>> +	return 1;
>>   }
>>   
>>   /*
>> @@ -577,7 +608,7 @@ static int btrfs_free_stale_devices(const char *path,
>>   				continue;
>>   			if (path && !device->name)
>>   				continue;
>> -			if (path && !device_path_matched(path, device))
>> +			if (path && device_matched(device, path) != 0)
> 
> You've changed the fuction to return errors but it's not checked,
> besides the memory allocation failure, the lookup functions could fail
> for various reasons so I don't think it's safe to ignore the errors.

IMO there isn't much that the parent function should do even if the
device_matched() returns an error for the reasons you stated.

Mainly because btrfs_free_stale_devices() OR btrfs_forget_devices()
is used as a cleanup ops in the primary task functions such as
btrfs_scan_one_device() and btrfs_init_new_device(). Even if we don't
remove the stale there is no harm.

Further btrfs_forget_devices() is called from btrfs_control_ioctl()
which is a userland call for forget devices. So as we traverse the
device list, if a device lookup fails IMO, it is ok to skip to the next
device in the list and return the status of the device match.

Even more, IMO we should save the dev_t in the struct btrfs_device,
upon which the device_matched() will go away altogether. This change
is outside of the bug that we intended to fix here. I will clean that
up separately.

Thanks, Anand

>>   				continue;
>>   			if (fs_devices->opened) {
>>   				/* for an already deleted device return 0 */
>> -- 
>> 2.33.1
Su Yue Jan. 6, 2022, 6:05 a.m. UTC | #6
On Wed 05 Jan 2022 at 19:31, Anand Jain <anand.jain@oracle.com> 
wrote:

> On 05/01/2022 02:56, David Sterba wrote:
>> On Sat, Dec 11, 2021 at 02:15:29AM +0800, Anand Jain wrote:
>>> Identifying and removing the stale device from the fs_uuids 
>>> list is done
>>> by the function btrfs_free_stale_devices().
>>> btrfs_free_stale_devices() in turn depends on the function
>>> device_path_matched() to check if the device repeats in more 
>>> than one
>>> btrfs_device structure.
>>>
>>> The matching of the device happens by its path, the device 
>>> path. However,
>>> when dm mapper is in use, the dm device paths are nothing but 
>>> a link to
>>> the actual block device, which leads to the 
>>> device_path_matched() failing
>>> to match.
>>>
>>> Fix this by matching the dev_t as provided by lookup_bdev() 
>>> instead of
>>> plain strcmp() the device paths.
>>>
>>> Reported-by: Josef Bacik <josef@toxicpanda.com>
>>> Signed-off-by: Anand Jain <anand.jain@oracle.com>
>>> ---
>>>
>>> v2: Fix
>>>       sparse: warning: incorrect type in argument 1 (different 
>>>       address spaces)
>>>       For using device->name->str
>>>
>>>      Fix Josef suggestion to pass dev_t instead of device-path 
>>>      in the
>>>       patch 2/2.
>>>
>>>   fs/btrfs/volumes.c | 41 
>>>   ++++++++++++++++++++++++++++++++++++-----
>>>   1 file changed, 36 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>>> index 1b02c03a882c..559fdb0c4a0e 100644
>>> --- a/fs/btrfs/volumes.c
>>> +++ b/fs/btrfs/volumes.c
>>> @@ -534,15 +534,46 @@ btrfs_get_bdev_and_sb(const char 
>>> *device_path, fmode_t flags, void *holder,
>>>   	return ret;
>>>   }
>>>   -static bool device_path_matched(const char *path, struct 
>>>   btrfs_device
>>> *device)
>>> +/*
>>> + * Check if the device in the 'path' matches with the device 
>>> in the given
>>> + * struct btrfs_device '*device'.
>>> + * Returns:
>>> + *	0	If it is the same device.
>>> + *	1	If it is not the same device.
>>> + *	-errno	For error.
>>> + */
>>> +static int device_matched(struct btrfs_device *device, const 
>>> char *path)
>>>   {
>>> -	int found;
>>> +	char *device_name;
>>> +	dev_t dev_old;
>>> +	dev_t dev_new;
>>> +	int ret;
>>> +
>>> +	device_name = kzalloc(BTRFS_PATH_NAME_MAX, GFP_KERNEL);
>>> +	if (!device_name)
>>> +		return -ENOMEM;
>>>     	rcu_read_lock();
>>> -	found = strcmp(rcu_str_deref(device->name), path);
>>> +	ret = sprintf(device_name, "%s", 
>>> rcu_str_deref(device->name));
>> I wonder if the temporary allocation could be avoided, as it's 
>> the
>> exactly same path of the device, so it could be passed to 
>> lookup_bdev.
>
>  Yeah, I tried but to no avail. Unless I drop the rcu read lock 
>  and
>  use the str directly as below.
>
>  lookup_bdev(device->name->str, &dev_old);
>
>  We do skip rcu access for device->name at a few locations.
>
>  Also, pls note lookup_bdev() can't be called within 
>  rcu_read_lock(),
>  (calling sleep function warning).
>

Got it. And another evil and dirty solution is that open code the 
logic in
the only user btrfs_free_stale_devices(). Do the memory allocation 
and
lookup_bdev(path) before the big big loop so we won't be disturbed
error handling and save some times of lookup_bdev ;)

--
Su
>
>>>   	rcu_read_unlock();
>>> +	if (!ret) {
>>> +		kfree(device_name);
>>> +		return -EINVAL;
>>> +	}
>>>   -	return found == 0;
>>> +	ret = lookup_bdev(device_name, &dev_old);
>>> +	kfree(device_name);
>>> +	if (ret)
>>> +		return ret;
>>> +
>>> +	ret = lookup_bdev(path, &dev_new);
>>> +	if (ret)
>>> +		return ret;
>>> +
>>> +	if (dev_old == dev_new)
>>> +		return 0;
>>> +
>>> +	return 1;
>>>   }
>>>     /*
>>> @@ -577,7 +608,7 @@ static int btrfs_free_stale_devices(const 
>>> char *path,
>>>   				continue;
>>>   			if (path && !device->name)
>>>   				continue;
>>> -			if (path && !device_path_matched(path, 
>>> device))
>>> +			if (path && device_matched(device, path) 
>>> != 0)
>> You've changed the fuction to return errors but it's not 
>> checked,
>> besides the memory allocation failure, the lookup functions 
>> could fail
>> for various reasons so I don't think it's safe to ignore the 
>> errors.
>
> IMO there isn't much that the parent function should do even if 
> the
> device_matched() returns an error for the reasons you stated.
>
> Mainly because btrfs_free_stale_devices() OR 
> btrfs_forget_devices()
> is used as a cleanup ops in the primary task functions such as
> btrfs_scan_one_device() and btrfs_init_new_device(). Even if we 
> don't
> remove the stale there is no harm.
>
> Further btrfs_forget_devices() is called from 
> btrfs_control_ioctl()
> which is a userland call for forget devices. So as we traverse 
> the
> device list, if a device lookup fails IMO, it is ok to skip to 
> the next
> device in the list and return the status of the device match.
>
> Even more, IMO we should save the dev_t in the struct 
> btrfs_device,
> upon which the device_matched() will go away altogether. This 
> change
> is outside of the bug that we intended to fix here. I will clean 
> that
> up separately.
>
> Thanks, Anand
>
>>>   				continue;
>>>   			if (fs_devices->opened) {
>>>   				/* for an already deleted device 
>>>   return 0 */
>>> -- 2.33.1
David Sterba Jan. 7, 2022, 2:48 p.m. UTC | #7
On Wed, Jan 05, 2022 at 07:31:31PM +0800, Anand Jain wrote:
> 
> 
> On 05/01/2022 02:56, David Sterba wrote:
> > On Sat, Dec 11, 2021 at 02:15:29AM +0800, Anand Jain wrote:
> >> Identifying and removing the stale device from the fs_uuids list is done
> >> by the function btrfs_free_stale_devices().
> >> btrfs_free_stale_devices() in turn depends on the function
> >> device_path_matched() to check if the device repeats in more than one
> >> btrfs_device structure.
> >>
> >> The matching of the device happens by its path, the device path. However,
> >> when dm mapper is in use, the dm device paths are nothing but a link to
> >> the actual block device, which leads to the device_path_matched() failing
> >> to match.
> >>
> >> Fix this by matching the dev_t as provided by lookup_bdev() instead of
> >> plain strcmp() the device paths.
> >>
> >> Reported-by: Josef Bacik <josef@toxicpanda.com>
> >> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> >> ---
> >>
> >> v2: Fix
> >>       sparse: warning: incorrect type in argument 1 (different address spaces)
> >>       For using device->name->str
> >>
> >>      Fix Josef suggestion to pass dev_t instead of device-path in the
> >>       patch 2/2.
> >>
> >>   fs/btrfs/volumes.c | 41 ++++++++++++++++++++++++++++++++++++-----
> >>   1 file changed, 36 insertions(+), 5 deletions(-)
> >>
> >> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> >> index 1b02c03a882c..559fdb0c4a0e 100644
> >> --- a/fs/btrfs/volumes.c
> >> +++ b/fs/btrfs/volumes.c
> >> @@ -534,15 +534,46 @@ btrfs_get_bdev_and_sb(const char *device_path, fmode_t flags, void *holder,
> >>   	return ret;
> >>   }
> >>   
> >> -static bool device_path_matched(const char *path, struct btrfs_device *device)
> >> +/*
> >> + * Check if the device in the 'path' matches with the device in the given
> >> + * struct btrfs_device '*device'.
> >> + * Returns:
> >> + *	0	If it is the same device.
> >> + *	1	If it is not the same device.
> >> + *	-errno	For error.
> >> + */
> >> +static int device_matched(struct btrfs_device *device, const char *path)
> >>   {
> >> -	int found;
> >> +	char *device_name;
> >> +	dev_t dev_old;
> >> +	dev_t dev_new;
> >> +	int ret;
> >> +
> >> +	device_name = kzalloc(BTRFS_PATH_NAME_MAX, GFP_KERNEL);
> >> +	if (!device_name)
> >> +		return -ENOMEM;
> >>   
> >>   	rcu_read_lock();
> >> -	found = strcmp(rcu_str_deref(device->name), path);
> >> +	ret = sprintf(device_name, "%s", rcu_str_deref(device->name));
> > 
> > I wonder if the temporary allocation could be avoided, as it's the
> > exactly same path of the device, so it could be passed to lookup_bdev.
> 
>   Yeah, I tried but to no avail. Unless I drop the rcu read lock and
>   use the str directly as below.
> 
>   lookup_bdev(device->name->str, &dev_old);
> 
>   We do skip rcu access for device->name at a few locations.
> 
>   Also, pls note lookup_bdev() can't be called within rcu_read_lock(),
>   (calling sleep function warning).

I see, thank for checking it.

> 
> 
> >>   	rcu_read_unlock();
> >> +	if (!ret) {
> >> +		kfree(device_name);
> >> +		return -EINVAL;
> >> +	}
> >>   
> >> -	return found == 0;
> >> +	ret = lookup_bdev(device_name, &dev_old);
> >> +	kfree(device_name);
> >> +	if (ret)
> >> +		return ret;
> >> +
> >> +	ret = lookup_bdev(path, &dev_new);
> >> +	if (ret)
> >> +		return ret;
> >> +
> >> +	if (dev_old == dev_new)
> >> +		return 0;
> >> +
> >> +	return 1;
> >>   }
> >>   
> >>   /*
> >> @@ -577,7 +608,7 @@ static int btrfs_free_stale_devices(const char *path,
> >>   				continue;
> >>   			if (path && !device->name)
> >>   				continue;
> >> -			if (path && !device_path_matched(path, device))
> >> +			if (path && device_matched(device, path) != 0)
> > 
> > You've changed the fuction to return errors but it's not checked,
> > besides the memory allocation failure, the lookup functions could fail
> > for various reasons so I don't think it's safe to ignore the errors.
> 
> IMO there isn't much that the parent function should do even if the
> device_matched() returns an error for the reasons you stated.
> 
> Mainly because btrfs_free_stale_devices() OR btrfs_forget_devices()
> is used as a cleanup ops in the primary task functions such as
> btrfs_scan_one_device() and btrfs_init_new_device(). Even if we don't
> remove the stale there is no harm.

Right, so a comment explaining why it's ok to ignore errors should be
sufficient.

> Further btrfs_forget_devices() is called from btrfs_control_ioctl()
> which is a userland call for forget devices. So as we traverse the
> device list, if a device lookup fails IMO, it is ok to skip to the next
> device in the list and return the status of the device match.
> 
> Even more, IMO we should save the dev_t in the struct btrfs_device,
> upon which the device_matched() will go away altogether. This change
> is outside of the bug that we intended to fix here. I will clean that
> up separately.
diff mbox series

Patch

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 1b02c03a882c..559fdb0c4a0e 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -534,15 +534,46 @@  btrfs_get_bdev_and_sb(const char *device_path, fmode_t flags, void *holder,
 	return ret;
 }
 
-static bool device_path_matched(const char *path, struct btrfs_device *device)
+/*
+ * Check if the device in the 'path' matches with the device in the given
+ * struct btrfs_device '*device'.
+ * Returns:
+ *	0	If it is the same device.
+ *	1	If it is not the same device.
+ *	-errno	For error.
+ */
+static int device_matched(struct btrfs_device *device, const char *path)
 {
-	int found;
+	char *device_name;
+	dev_t dev_old;
+	dev_t dev_new;
+	int ret;
+
+	device_name = kzalloc(BTRFS_PATH_NAME_MAX, GFP_KERNEL);
+	if (!device_name)
+		return -ENOMEM;
 
 	rcu_read_lock();
-	found = strcmp(rcu_str_deref(device->name), path);
+	ret = sprintf(device_name, "%s", rcu_str_deref(device->name));
 	rcu_read_unlock();
+	if (!ret) {
+		kfree(device_name);
+		return -EINVAL;
+	}
 
-	return found == 0;
+	ret = lookup_bdev(device_name, &dev_old);
+	kfree(device_name);
+	if (ret)
+		return ret;
+
+	ret = lookup_bdev(path, &dev_new);
+	if (ret)
+		return ret;
+
+	if (dev_old == dev_new)
+		return 0;
+
+	return 1;
 }
 
 /*
@@ -577,7 +608,7 @@  static int btrfs_free_stale_devices(const char *path,
 				continue;
 			if (path && !device->name)
 				continue;
-			if (path && !device_path_matched(path, device))
+			if (path && device_matched(device, path) != 0)
 				continue;
 			if (fs_devices->opened) {
 				/* for an already deleted device return 0 */