diff mbox series

btrfs: drop unused memparse() parameter

Message ID 20231205111329.6652-1-ddiss@suse.de (mailing list archive)
State New, archived
Headers show
Series btrfs: drop unused memparse() parameter | expand

Commit Message

David Disseldorp Dec. 5, 2023, 11:13 a.m. UTC
The @retptr parameter for memparse() is optional.
btrfs_devinfo_scrub_speed_max_store() doesn't use it for any input
validation, so the parameter can be dropped.

Signed-off-by: David Disseldorp <ddiss@suse.de>
---
 fs/btrfs/sysfs.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

Comments

Johannes Thumshirn Dec. 5, 2023, 12:48 p.m. UTC | #1
Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
David Sterba Dec. 5, 2023, 2:22 p.m. UTC | #2
On Tue, Dec 05, 2023 at 10:13:29PM +1100, David Disseldorp wrote:
> The @retptr parameter for memparse() is optional.
> btrfs_devinfo_scrub_speed_max_store() doesn't use it for any input
> validation, so the parameter can be dropped.

Or should it be used for validation? memparse is also used in
btrfs_chunk_size_store() that accepts whitespace as trailing characters
(namely '\n' if the value is from echo).
David Disseldorp Dec. 6, 2023, 12:21 a.m. UTC | #3
On Tue, 5 Dec 2023 15:22:53 +0100, David Sterba wrote:

> On Tue, Dec 05, 2023 at 10:13:29PM +1100, David Disseldorp wrote:
> > The @retptr parameter for memparse() is optional.
> > btrfs_devinfo_scrub_speed_max_store() doesn't use it for any input
> > validation, so the parameter can be dropped.  
> 
> Or should it be used for validation? memparse is also used in
> btrfs_chunk_size_store() that accepts whitespace as trailing characters
> (namely '\n' if the value is from echo).

It probably should have been used for validation when originally added,
but the current behaviour is now part of the sysfs scrub_speed_max API.
Failing on invalid input would break scripts which do things like
  echo clear > /sys/fs/btrfs/UUID/devinfo/1/scrub_speed_max

Cheers, David
David Sterba Dec. 6, 2023, 6:53 p.m. UTC | #4
On Wed, Dec 06, 2023 at 11:21:43AM +1100, David Disseldorp wrote:
> On Tue, 5 Dec 2023 15:22:53 +0100, David Sterba wrote:
> 
> > On Tue, Dec 05, 2023 at 10:13:29PM +1100, David Disseldorp wrote:
> > > The @retptr parameter for memparse() is optional.
> > > btrfs_devinfo_scrub_speed_max_store() doesn't use it for any input
> > > validation, so the parameter can be dropped.  
> > 
> > Or should it be used for validation? memparse is also used in
> > btrfs_chunk_size_store() that accepts whitespace as trailing characters
> > (namely '\n' if the value is from echo).
> 
> It probably should have been used for validation when originally added,
> but the current behaviour is now part of the sysfs scrub_speed_max API.
> Failing on invalid input would break scripts which do things like
>   echo clear > /sys/fs/btrfs/UUID/devinfo/1/scrub_speed_max

I'm not sure the 'part of the API' is a valid agrument here. It's
documented that the value is in bytes and that suffixes can be passed
for convenience. How come anybody would use 'clear' in the first place
and expect it to work with undefined meaning?
David Disseldorp Dec. 6, 2023, 11:52 p.m. UTC | #5
On Wed, 6 Dec 2023 19:53:31 +0100, David Sterba wrote:

> On Wed, Dec 06, 2023 at 11:21:43AM +1100, David Disseldorp wrote:
> > On Tue, 5 Dec 2023 15:22:53 +0100, David Sterba wrote:
> >   
> > > On Tue, Dec 05, 2023 at 10:13:29PM +1100, David Disseldorp wrote:  
> > > > The @retptr parameter for memparse() is optional.
> > > > btrfs_devinfo_scrub_speed_max_store() doesn't use it for any input
> > > > validation, so the parameter can be dropped.    
> > > 
> > > Or should it be used for validation? memparse is also used in
> > > btrfs_chunk_size_store() that accepts whitespace as trailing characters
> > > (namely '\n' if the value is from echo).  
> > 
> > It probably should have been used for validation when originally added,
> > but the current behaviour is now part of the sysfs scrub_speed_max API.
> > Failing on invalid input would break scripts which do things like
> >   echo clear > /sys/fs/btrfs/UUID/devinfo/1/scrub_speed_max  
> 
> I'm not sure the 'part of the API' is a valid agrument here. It's
> documented that the value is in bytes and that suffixes can be passed
> for convenience. How come anybody would use 'clear' in the first place
> and expect it to work with undefined meaning?

Most people don't read documentation :). If there's a willingness to
accept any fallout from adding the validation then I'm happy to do that.
Will send a v2.

Cheers, David
Qu Wenruo Dec. 7, 2023, 2:31 a.m. UTC | #6
On 2023/12/5 21:43, David Disseldorp wrote:
> The @retptr parameter for memparse() is optional.
> btrfs_devinfo_scrub_speed_max_store() doesn't use it for any input
> validation, so the parameter can be dropped.

To me, I believe it's better to completely get rid of memparse().

As you already found out, some suffix, especially "e|E" can screw up the 
result.
E.g. "25e" would be interpreted as 25 with "e" as suffix, which is fine 
according to the rule. (without prefix, the base is 10, so only "25" is 
valid. Then the remaining part is interpreted as suffix).

And since btrfs is not going to do pretty size output for sysfs (as most 
sysfs is not directly for end users, and we need accurate output), to be 
consistent there isn't much need for suffix handling either.

So can't we just replace memparse() with kstrtoull()?

Thanks,
Qu


> 
> Signed-off-by: David Disseldorp <ddiss@suse.de>
> ---
>   fs/btrfs/sysfs.c | 3 +--
>   1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
> index e6b51fb3ddc1e..75f3b774a4d83 100644
> --- a/fs/btrfs/sysfs.c
> +++ b/fs/btrfs/sysfs.c
> @@ -1779,10 +1779,9 @@ static ssize_t btrfs_devinfo_scrub_speed_max_store(struct kobject *kobj,
>   {
>   	struct btrfs_device *device = container_of(kobj, struct btrfs_device,
>   						   devid_kobj);
> -	char *endptr;
>   	unsigned long long limit;
>   
> -	limit = memparse(buf, &endptr);
> +	limit = memparse(buf, NULL);
>   	WRITE_ONCE(device->scrub_speed_max, limit);
>   	return len;
>   }
David Sterba Dec. 7, 2023, 12:15 p.m. UTC | #7
On Thu, Dec 07, 2023 at 01:01:50PM +1030, Qu Wenruo wrote:
> 
> 
> On 2023/12/5 21:43, David Disseldorp wrote:
> > The @retptr parameter for memparse() is optional.
> > btrfs_devinfo_scrub_speed_max_store() doesn't use it for any input
> > validation, so the parameter can be dropped.
> 
> To me, I believe it's better to completely get rid of memparse().
> 
> As you already found out, some suffix, especially "e|E" can screw up the 
> result.
> E.g. "25e" would be interpreted as 25 with "e" as suffix, which is fine 
> according to the rule. (without prefix, the base is 10, so only "25" is 
> valid. Then the remaining part is interpreted as suffix).
> 
> And since btrfs is not going to do pretty size output for sysfs (as most 
> sysfs is not directly for end users, and we need accurate output), to be 
> consistent there isn't much need for suffix handling either.
> 
> So can't we just replace memparse() with kstrtoull()?

The value that can be read from the sysfs file is in bytes and it's so
that applications do not need to interpret it, like multiplying with
1024. We'll probably never return the pretty values with suffixes in
sysfs files.

However, on the input side the suffixes are a convenience, setting to
limit the throughput as '32m' is better than typing '32000000' and
counting zeros or $((32*1024*1024)) or 33554432.

This is why memparse is there and kstrtoull does not do that.
David Sterba Dec. 7, 2023, 1:55 p.m. UTC | #8
On Thu, Dec 07, 2023 at 10:52:55AM +1100, David Disseldorp wrote:
> On Wed, 6 Dec 2023 19:53:31 +0100, David Sterba wrote:
> 
> > On Wed, Dec 06, 2023 at 11:21:43AM +1100, David Disseldorp wrote:
> > > On Tue, 5 Dec 2023 15:22:53 +0100, David Sterba wrote:
> > >   
> > > > On Tue, Dec 05, 2023 at 10:13:29PM +1100, David Disseldorp wrote:  
> > > > > The @retptr parameter for memparse() is optional.
> > > > > btrfs_devinfo_scrub_speed_max_store() doesn't use it for any input
> > > > > validation, so the parameter can be dropped.    
> > > > 
> > > > Or should it be used for validation? memparse is also used in
> > > > btrfs_chunk_size_store() that accepts whitespace as trailing characters
> > > > (namely '\n' if the value is from echo).  
> > > 
> > > It probably should have been used for validation when originally added,
> > > but the current behaviour is now part of the sysfs scrub_speed_max API.
> > > Failing on invalid input would break scripts which do things like
> > >   echo clear > /sys/fs/btrfs/UUID/devinfo/1/scrub_speed_max  
> > 
> > I'm not sure the 'part of the API' is a valid agrument here. It's
> > documented that the value is in bytes and that suffixes can be passed
> > for convenience. How come anybody would use 'clear' in the first place
> > and expect it to work with undefined meaning?
> 
> Most people don't read documentation :).

Unfortunatelly true. Here the expected practice is that size values on
input can have a suffix so it should be consistent and understandable
even without reading documentation.

> If there's a willingness to
> accept any fallout from adding the validation then I'm happy to do that.
> Will send a v2.

Yes, I think the simple fix is to add the handling as
btrfs_chunk_size_store() does and this will be the pattern for any
future memparsed values in sysfs.

There's one outlier discard/kbps_limit that takes KiB directly so that's
a plain integer. Eventually discard/max_discard_size can also use memparse.
Qu Wenruo Dec. 7, 2023, 7:56 p.m. UTC | #9
On 2023/12/7 22:45, David Sterba wrote:
> On Thu, Dec 07, 2023 at 01:01:50PM +1030, Qu Wenruo wrote:
>>
>>
>> On 2023/12/5 21:43, David Disseldorp wrote:
>>> The @retptr parameter for memparse() is optional.
>>> btrfs_devinfo_scrub_speed_max_store() doesn't use it for any input
>>> validation, so the parameter can be dropped.
>>
>> To me, I believe it's better to completely get rid of memparse().
>>
>> As you already found out, some suffix, especially "e|E" can screw up the
>> result.
>> E.g. "25e" would be interpreted as 25 with "e" as suffix, which is fine
>> according to the rule. (without prefix, the base is 10, so only "25" is
>> valid. Then the remaining part is interpreted as suffix).
>>
>> And since btrfs is not going to do pretty size output for sysfs (as most
>> sysfs is not directly for end users, and we need accurate output), to be
>> consistent there isn't much need for suffix handling either.
>>
>> So can't we just replace memparse() with kstrtoull()?
>
> The value that can be read from the sysfs file is in bytes and it's so
> that applications do not need to interpret it, like multiplying with
> 1024. We'll probably never return the pretty values with suffixes in
> sysfs files.
>
> However, on the input side the suffixes are a convenience, setting to
> limit the throughput as '32m' is better than typing '32000000' and
> counting zeros or $((32*1024*1024)) or 33554432.
>
> This is why memparse is there and kstrtoull does not do that.

That suffix is causing confusion already, just check my "25e" case.
(It does not only lead to huge number, but also lead to incorrect value
even if we treat "e" as a suffix)

Furthermore, the convenience argument is not that strong, you won't
expect end users to do the change for a fs every time.
Thus it's mostly managed by a small script or some other tool.

In that case I don't think doing extra bash calculation is a big deal
anyway.

Thanks,
Qu
David Sterba Dec. 13, 2023, 11:15 p.m. UTC | #10
On Fri, Dec 08, 2023 at 06:26:50AM +1030, Qu Wenruo wrote:
> > The value that can be read from the sysfs file is in bytes and it's so
> > that applications do not need to interpret it, like multiplying with
> > 1024. We'll probably never return the pretty values with suffixes in
> > sysfs files.
> >
> > However, on the input side the suffixes are a convenience, setting to
> > limit the throughput as '32m' is better than typing '32000000' and
> > counting zeros or $((32*1024*1024)) or 33554432.
> >
> > This is why memparse is there and kstrtoull does not do that.
> 
> That suffix is causing confusion already, just check my "25e" case.
> (It does not only lead to huge number, but also lead to incorrect value
> even if we treat "e" as a suffix)
> 
> Furthermore, the convenience argument is not that strong, you won't
> expect end users to do the change for a fs every time.
> Thus it's mostly managed by a small script or some other tool.
> 
> In that case I don't think doing extra bash calculation is a big deal
> anyway.

This is a nice study of convenience vs extra work "that's no big deal".
The sysfs files can be used by scripts, often times it's the only access
to some settings (like /usr/bin/rescan-scsi-bus.sh/usr/bin/rescan-scsi-bus.sh).
The value format, naming of the sysfs files is inconsistent and
navigating to a particular directory and setting some magic value is I
think common. Another example are trace points, or dynamic printk.

There are tools abstracting that but sometimes it's a person that needs
to type it and the shortcuts like the suffix are convenient because
there's no extra need to type exact syntax and numbers for a $((...))
expression. Ask 3 people if they'd prefer to type "4m" or
"$((4*1024*1024))".

If you ask me I'm all for allowing "4m" and I did that several times
when testing things like the discard tunables, chunk size or the scrub
limits. I consider the sysfs files an interface for human interaction so
even if it's used like that in 10% of time it's still saving typing or
allowing one to focus on the real prolem instead of shell sytax.
Qu Wenruo Dec. 13, 2023, 11:46 p.m. UTC | #11
On 2023/12/14 09:45, David Sterba wrote:
> On Fri, Dec 08, 2023 at 06:26:50AM +1030, Qu Wenruo wrote:
>>> The value that can be read from the sysfs file is in bytes and it's so
>>> that applications do not need to interpret it, like multiplying with
>>> 1024. We'll probably never return the pretty values with suffixes in
>>> sysfs files.
>>>
>>> However, on the input side the suffixes are a convenience, setting to
>>> limit the throughput as '32m' is better than typing '32000000' and
>>> counting zeros or $((32*1024*1024)) or 33554432.
>>>
>>> This is why memparse is there and kstrtoull does not do that.
>>
>> That suffix is causing confusion already, just check my "25e" case.
>> (It does not only lead to huge number, but also lead to incorrect value
>> even if we treat "e" as a suffix)
>>
>> Furthermore, the convenience argument is not that strong, you won't
>> expect end users to do the change for a fs every time.
>> Thus it's mostly managed by a small script or some other tool.
>>
>> In that case I don't think doing extra bash calculation is a big deal
>> anyway.
>
> This is a nice study of convenience vs extra work "that's no big deal".
> The sysfs files can be used by scripts, often times it's the only access
> to some settings (like /usr/bin/rescan-scsi-bus.sh/usr/bin/rescan-scsi-bus.sh).
> The value format, naming of the sysfs files is inconsistent and
> navigating to a particular directory and setting some magic value is I
> think common. Another example are trace points, or dynamic printk.
>
> There are tools abstracting that but sometimes it's a person that needs
> to type it and the shortcuts like the suffix are convenient because
> there's no extra need to type exact syntax and numbers for a $((...))
> expression. Ask 3 people if they'd prefer to type "4m" or
> "$((4*1024*1024))".
>
> If you ask me I'm all for allowing "4m" and I did that several times
> when testing things like the discard tunables, chunk size or the scrub
> limits. I consider the sysfs files an interface for human interaction so
> even if it's used like that in 10% of time it's still saving typing or
> allowing one to focus on the real prolem instead of shell sytax.
>

OK, then let me jump into the rabbit hole of stroxll family for my
holiday projects, to make the memparse() itself more reliable and get
rid of the caveats.

Hopefully allowing callers to select which suffixes they want to use.

(Well, at least I hope "25e" can give us a correct value after the fix)

Thanks,
Qu
diff mbox series

Patch

diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index e6b51fb3ddc1e..75f3b774a4d83 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -1779,10 +1779,9 @@  static ssize_t btrfs_devinfo_scrub_speed_max_store(struct kobject *kobj,
 {
 	struct btrfs_device *device = container_of(kobj, struct btrfs_device,
 						   devid_kobj);
-	char *endptr;
 	unsigned long long limit;
 
-	limit = memparse(buf, &endptr);
+	limit = memparse(buf, NULL);
 	WRITE_ONCE(device->scrub_speed_max, limit);
 	return len;
 }