Message ID | 20231205111329.6652-1-ddiss@suse.de (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | btrfs: drop unused memparse() parameter | expand |
Looks good,
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
On Tue, Dec 05, 2023 at 10:13:29PM +1100, David Disseldorp wrote: > The @retptr parameter for memparse() is optional. > btrfs_devinfo_scrub_speed_max_store() doesn't use it for any input > validation, so the parameter can be dropped. Or should it be used for validation? memparse is also used in btrfs_chunk_size_store() that accepts whitespace as trailing characters (namely '\n' if the value is from echo).
On Tue, 5 Dec 2023 15:22:53 +0100, David Sterba wrote: > On Tue, Dec 05, 2023 at 10:13:29PM +1100, David Disseldorp wrote: > > The @retptr parameter for memparse() is optional. > > btrfs_devinfo_scrub_speed_max_store() doesn't use it for any input > > validation, so the parameter can be dropped. > > Or should it be used for validation? memparse is also used in > btrfs_chunk_size_store() that accepts whitespace as trailing characters > (namely '\n' if the value is from echo). It probably should have been used for validation when originally added, but the current behaviour is now part of the sysfs scrub_speed_max API. Failing on invalid input would break scripts which do things like echo clear > /sys/fs/btrfs/UUID/devinfo/1/scrub_speed_max Cheers, David
On Wed, Dec 06, 2023 at 11:21:43AM +1100, David Disseldorp wrote: > On Tue, 5 Dec 2023 15:22:53 +0100, David Sterba wrote: > > > On Tue, Dec 05, 2023 at 10:13:29PM +1100, David Disseldorp wrote: > > > The @retptr parameter for memparse() is optional. > > > btrfs_devinfo_scrub_speed_max_store() doesn't use it for any input > > > validation, so the parameter can be dropped. > > > > Or should it be used for validation? memparse is also used in > > btrfs_chunk_size_store() that accepts whitespace as trailing characters > > (namely '\n' if the value is from echo). > > It probably should have been used for validation when originally added, > but the current behaviour is now part of the sysfs scrub_speed_max API. > Failing on invalid input would break scripts which do things like > echo clear > /sys/fs/btrfs/UUID/devinfo/1/scrub_speed_max I'm not sure the 'part of the API' is a valid agrument here. It's documented that the value is in bytes and that suffixes can be passed for convenience. How come anybody would use 'clear' in the first place and expect it to work with undefined meaning?
On Wed, 6 Dec 2023 19:53:31 +0100, David Sterba wrote: > On Wed, Dec 06, 2023 at 11:21:43AM +1100, David Disseldorp wrote: > > On Tue, 5 Dec 2023 15:22:53 +0100, David Sterba wrote: > > > > > On Tue, Dec 05, 2023 at 10:13:29PM +1100, David Disseldorp wrote: > > > > The @retptr parameter for memparse() is optional. > > > > btrfs_devinfo_scrub_speed_max_store() doesn't use it for any input > > > > validation, so the parameter can be dropped. > > > > > > Or should it be used for validation? memparse is also used in > > > btrfs_chunk_size_store() that accepts whitespace as trailing characters > > > (namely '\n' if the value is from echo). > > > > It probably should have been used for validation when originally added, > > but the current behaviour is now part of the sysfs scrub_speed_max API. > > Failing on invalid input would break scripts which do things like > > echo clear > /sys/fs/btrfs/UUID/devinfo/1/scrub_speed_max > > I'm not sure the 'part of the API' is a valid agrument here. It's > documented that the value is in bytes and that suffixes can be passed > for convenience. How come anybody would use 'clear' in the first place > and expect it to work with undefined meaning? Most people don't read documentation :). If there's a willingness to accept any fallout from adding the validation then I'm happy to do that. Will send a v2. Cheers, David
On 2023/12/5 21:43, David Disseldorp wrote: > The @retptr parameter for memparse() is optional. > btrfs_devinfo_scrub_speed_max_store() doesn't use it for any input > validation, so the parameter can be dropped. To me, I believe it's better to completely get rid of memparse(). As you already found out, some suffix, especially "e|E" can screw up the result. E.g. "25e" would be interpreted as 25 with "e" as suffix, which is fine according to the rule. (without prefix, the base is 10, so only "25" is valid. Then the remaining part is interpreted as suffix). And since btrfs is not going to do pretty size output for sysfs (as most sysfs is not directly for end users, and we need accurate output), to be consistent there isn't much need for suffix handling either. So can't we just replace memparse() with kstrtoull()? Thanks, Qu > > Signed-off-by: David Disseldorp <ddiss@suse.de> > --- > fs/btrfs/sysfs.c | 3 +-- > 1 file changed, 1 insertion(+), 2 deletions(-) > > diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c > index e6b51fb3ddc1e..75f3b774a4d83 100644 > --- a/fs/btrfs/sysfs.c > +++ b/fs/btrfs/sysfs.c > @@ -1779,10 +1779,9 @@ static ssize_t btrfs_devinfo_scrub_speed_max_store(struct kobject *kobj, > { > struct btrfs_device *device = container_of(kobj, struct btrfs_device, > devid_kobj); > - char *endptr; > unsigned long long limit; > > - limit = memparse(buf, &endptr); > + limit = memparse(buf, NULL); > WRITE_ONCE(device->scrub_speed_max, limit); > return len; > }
On Thu, Dec 07, 2023 at 01:01:50PM +1030, Qu Wenruo wrote: > > > On 2023/12/5 21:43, David Disseldorp wrote: > > The @retptr parameter for memparse() is optional. > > btrfs_devinfo_scrub_speed_max_store() doesn't use it for any input > > validation, so the parameter can be dropped. > > To me, I believe it's better to completely get rid of memparse(). > > As you already found out, some suffix, especially "e|E" can screw up the > result. > E.g. "25e" would be interpreted as 25 with "e" as suffix, which is fine > according to the rule. (without prefix, the base is 10, so only "25" is > valid. Then the remaining part is interpreted as suffix). > > And since btrfs is not going to do pretty size output for sysfs (as most > sysfs is not directly for end users, and we need accurate output), to be > consistent there isn't much need for suffix handling either. > > So can't we just replace memparse() with kstrtoull()? The value that can be read from the sysfs file is in bytes and it's so that applications do not need to interpret it, like multiplying with 1024. We'll probably never return the pretty values with suffixes in sysfs files. However, on the input side the suffixes are a convenience, setting to limit the throughput as '32m' is better than typing '32000000' and counting zeros or $((32*1024*1024)) or 33554432. This is why memparse is there and kstrtoull does not do that.
On Thu, Dec 07, 2023 at 10:52:55AM +1100, David Disseldorp wrote: > On Wed, 6 Dec 2023 19:53:31 +0100, David Sterba wrote: > > > On Wed, Dec 06, 2023 at 11:21:43AM +1100, David Disseldorp wrote: > > > On Tue, 5 Dec 2023 15:22:53 +0100, David Sterba wrote: > > > > > > > On Tue, Dec 05, 2023 at 10:13:29PM +1100, David Disseldorp wrote: > > > > > The @retptr parameter for memparse() is optional. > > > > > btrfs_devinfo_scrub_speed_max_store() doesn't use it for any input > > > > > validation, so the parameter can be dropped. > > > > > > > > Or should it be used for validation? memparse is also used in > > > > btrfs_chunk_size_store() that accepts whitespace as trailing characters > > > > (namely '\n' if the value is from echo). > > > > > > It probably should have been used for validation when originally added, > > > but the current behaviour is now part of the sysfs scrub_speed_max API. > > > Failing on invalid input would break scripts which do things like > > > echo clear > /sys/fs/btrfs/UUID/devinfo/1/scrub_speed_max > > > > I'm not sure the 'part of the API' is a valid agrument here. It's > > documented that the value is in bytes and that suffixes can be passed > > for convenience. How come anybody would use 'clear' in the first place > > and expect it to work with undefined meaning? > > Most people don't read documentation :). Unfortunatelly true. Here the expected practice is that size values on input can have a suffix so it should be consistent and understandable even without reading documentation. > If there's a willingness to > accept any fallout from adding the validation then I'm happy to do that. > Will send a v2. Yes, I think the simple fix is to add the handling as btrfs_chunk_size_store() does and this will be the pattern for any future memparsed values in sysfs. There's one outlier discard/kbps_limit that takes KiB directly so that's a plain integer. Eventually discard/max_discard_size can also use memparse.
On 2023/12/7 22:45, David Sterba wrote: > On Thu, Dec 07, 2023 at 01:01:50PM +1030, Qu Wenruo wrote: >> >> >> On 2023/12/5 21:43, David Disseldorp wrote: >>> The @retptr parameter for memparse() is optional. >>> btrfs_devinfo_scrub_speed_max_store() doesn't use it for any input >>> validation, so the parameter can be dropped. >> >> To me, I believe it's better to completely get rid of memparse(). >> >> As you already found out, some suffix, especially "e|E" can screw up the >> result. >> E.g. "25e" would be interpreted as 25 with "e" as suffix, which is fine >> according to the rule. (without prefix, the base is 10, so only "25" is >> valid. Then the remaining part is interpreted as suffix). >> >> And since btrfs is not going to do pretty size output for sysfs (as most >> sysfs is not directly for end users, and we need accurate output), to be >> consistent there isn't much need for suffix handling either. >> >> So can't we just replace memparse() with kstrtoull()? > > The value that can be read from the sysfs file is in bytes and it's so > that applications do not need to interpret it, like multiplying with > 1024. We'll probably never return the pretty values with suffixes in > sysfs files. > > However, on the input side the suffixes are a convenience, setting to > limit the throughput as '32m' is better than typing '32000000' and > counting zeros or $((32*1024*1024)) or 33554432. > > This is why memparse is there and kstrtoull does not do that. That suffix is causing confusion already, just check my "25e" case. (It does not only lead to huge number, but also lead to incorrect value even if we treat "e" as a suffix) Furthermore, the convenience argument is not that strong, you won't expect end users to do the change for a fs every time. Thus it's mostly managed by a small script or some other tool. In that case I don't think doing extra bash calculation is a big deal anyway. Thanks, Qu
On Fri, Dec 08, 2023 at 06:26:50AM +1030, Qu Wenruo wrote: > > The value that can be read from the sysfs file is in bytes and it's so > > that applications do not need to interpret it, like multiplying with > > 1024. We'll probably never return the pretty values with suffixes in > > sysfs files. > > > > However, on the input side the suffixes are a convenience, setting to > > limit the throughput as '32m' is better than typing '32000000' and > > counting zeros or $((32*1024*1024)) or 33554432. > > > > This is why memparse is there and kstrtoull does not do that. > > That suffix is causing confusion already, just check my "25e" case. > (It does not only lead to huge number, but also lead to incorrect value > even if we treat "e" as a suffix) > > Furthermore, the convenience argument is not that strong, you won't > expect end users to do the change for a fs every time. > Thus it's mostly managed by a small script or some other tool. > > In that case I don't think doing extra bash calculation is a big deal > anyway. This is a nice study of convenience vs extra work "that's no big deal". The sysfs files can be used by scripts, often times it's the only access to some settings (like /usr/bin/rescan-scsi-bus.sh/usr/bin/rescan-scsi-bus.sh). The value format, naming of the sysfs files is inconsistent and navigating to a particular directory and setting some magic value is I think common. Another example are trace points, or dynamic printk. There are tools abstracting that but sometimes it's a person that needs to type it and the shortcuts like the suffix are convenient because there's no extra need to type exact syntax and numbers for a $((...)) expression. Ask 3 people if they'd prefer to type "4m" or "$((4*1024*1024))". If you ask me I'm all for allowing "4m" and I did that several times when testing things like the discard tunables, chunk size or the scrub limits. I consider the sysfs files an interface for human interaction so even if it's used like that in 10% of time it's still saving typing or allowing one to focus on the real prolem instead of shell sytax.
On 2023/12/14 09:45, David Sterba wrote: > On Fri, Dec 08, 2023 at 06:26:50AM +1030, Qu Wenruo wrote: >>> The value that can be read from the sysfs file is in bytes and it's so >>> that applications do not need to interpret it, like multiplying with >>> 1024. We'll probably never return the pretty values with suffixes in >>> sysfs files. >>> >>> However, on the input side the suffixes are a convenience, setting to >>> limit the throughput as '32m' is better than typing '32000000' and >>> counting zeros or $((32*1024*1024)) or 33554432. >>> >>> This is why memparse is there and kstrtoull does not do that. >> >> That suffix is causing confusion already, just check my "25e" case. >> (It does not only lead to huge number, but also lead to incorrect value >> even if we treat "e" as a suffix) >> >> Furthermore, the convenience argument is not that strong, you won't >> expect end users to do the change for a fs every time. >> Thus it's mostly managed by a small script or some other tool. >> >> In that case I don't think doing extra bash calculation is a big deal >> anyway. > > This is a nice study of convenience vs extra work "that's no big deal". > The sysfs files can be used by scripts, often times it's the only access > to some settings (like /usr/bin/rescan-scsi-bus.sh/usr/bin/rescan-scsi-bus.sh). > The value format, naming of the sysfs files is inconsistent and > navigating to a particular directory and setting some magic value is I > think common. Another example are trace points, or dynamic printk. > > There are tools abstracting that but sometimes it's a person that needs > to type it and the shortcuts like the suffix are convenient because > there's no extra need to type exact syntax and numbers for a $((...)) > expression. Ask 3 people if they'd prefer to type "4m" or > "$((4*1024*1024))". > > If you ask me I'm all for allowing "4m" and I did that several times > when testing things like the discard tunables, chunk size or the scrub > limits. I consider the sysfs files an interface for human interaction so > even if it's used like that in 10% of time it's still saving typing or > allowing one to focus on the real prolem instead of shell sytax. > OK, then let me jump into the rabbit hole of stroxll family for my holiday projects, to make the memparse() itself more reliable and get rid of the caveats. Hopefully allowing callers to select which suffixes they want to use. (Well, at least I hope "25e" can give us a correct value after the fix) Thanks, Qu
diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c index e6b51fb3ddc1e..75f3b774a4d83 100644 --- a/fs/btrfs/sysfs.c +++ b/fs/btrfs/sysfs.c @@ -1779,10 +1779,9 @@ static ssize_t btrfs_devinfo_scrub_speed_max_store(struct kobject *kobj, { struct btrfs_device *device = container_of(kobj, struct btrfs_device, devid_kobj); - char *endptr; unsigned long long limit; - limit = memparse(buf, &endptr); + limit = memparse(buf, NULL); WRITE_ONCE(device->scrub_speed_max, limit); return len; }
The @retptr parameter for memparse() is optional. btrfs_devinfo_scrub_speed_max_store() doesn't use it for any input validation, so the parameter can be dropped. Signed-off-by: David Disseldorp <ddiss@suse.de> --- fs/btrfs/sysfs.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)