Message ID | 4c2f12dc417a192f4acfd804831401aadeeb9c42.1705135055.git.wqu@suse.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | btrfs-progs: allow tree-checker to be triggered more frequently for btrfs-convert | expand |
On Sat, Jan 13, 2024 at 07:15:29PM +1030, Qu Wenruo wrote: > Since we got a recent bug report about tree-checker triggered for large > fs conversion, we need a properly way to trigger the problem for test > case purpose. > > To trigger that bug, we need to meet several conditions: > > - We need to read some tree blocks which has half-backed inodes > - We need a large enough enough fs to generate more tree blocks than > our cache. > > For our existing test cases, firstly the fs is not that large, thus > we may even go just one transaction to generate all the inodes. > > Secondly we have a global cache for tree blocks, which means a lot of > written tree blocks are still in the cache, thus won't trigger > tree-checker. > > To make the problem much easier for our existing test case to expose, > this patch would introduce a debug environment variable: > BTRFS_PROGS_DEBUG_BLOCKS_USED_THRESHOLD. > > This would affects the threshold for the transaction size, setting it to > a much smaller value would make the bug much easier to trigger. > > Signed-off-by: Qu Wenruo <wqu@suse.com> > --- > common/utils.c | 62 +++++++++++++++++++++++++++++++++++++++++++ > common/utils.h | 1 + > convert/source-ext2.c | 9 ++++++- > 3 files changed, 71 insertions(+), 1 deletion(-) > > diff --git a/common/utils.c b/common/utils.c > index 62f0e3f48b39..e6070791f5cc 100644 > --- a/common/utils.c > +++ b/common/utils.c > @@ -956,6 +956,68 @@ u8 rand_u8(void) > return (u8)(rand_u32()); > } > > +/* > + * Parse a u64 value from an environment variable. > + * > + * Supports unit suffixes "KMGTP", the suffix is always 2 ** 10 based. > + * With proper overflow detection. > + * > + * The string must end with '\0', anything unexpected non-suffix string, > + * including space, would lead to -EINVAL and no value updated. > + */ > +int get_env_u64(const char *env_name, u64 *value_ret) There already is a function for parsing sizes parse_size_from_string() in common/parse-utils.c.
On 2024/1/17 05:01, David Sterba wrote: > On Sat, Jan 13, 2024 at 07:15:29PM +1030, Qu Wenruo wrote: >> Since we got a recent bug report about tree-checker triggered for large >> fs conversion, we need a properly way to trigger the problem for test >> case purpose. >> >> To trigger that bug, we need to meet several conditions: >> >> - We need to read some tree blocks which has half-backed inodes >> - We need a large enough enough fs to generate more tree blocks than >> our cache. >> >> For our existing test cases, firstly the fs is not that large, thus >> we may even go just one transaction to generate all the inodes. >> >> Secondly we have a global cache for tree blocks, which means a lot of >> written tree blocks are still in the cache, thus won't trigger >> tree-checker. >> >> To make the problem much easier for our existing test case to expose, >> this patch would introduce a debug environment variable: >> BTRFS_PROGS_DEBUG_BLOCKS_USED_THRESHOLD. >> >> This would affects the threshold for the transaction size, setting it to >> a much smaller value would make the bug much easier to trigger. >> >> Signed-off-by: Qu Wenruo <wqu@suse.com> >> --- >> common/utils.c | 62 +++++++++++++++++++++++++++++++++++++++++++ >> common/utils.h | 1 + >> convert/source-ext2.c | 9 ++++++- >> 3 files changed, 71 insertions(+), 1 deletion(-) >> >> diff --git a/common/utils.c b/common/utils.c >> index 62f0e3f48b39..e6070791f5cc 100644 >> --- a/common/utils.c >> +++ b/common/utils.c >> @@ -956,6 +956,68 @@ u8 rand_u8(void) >> return (u8)(rand_u32()); >> } >> >> +/* >> + * Parse a u64 value from an environment variable. >> + * >> + * Supports unit suffixes "KMGTP", the suffix is always 2 ** 10 based. >> + * With proper overflow detection. >> + * >> + * The string must end with '\0', anything unexpected non-suffix string, >> + * including space, would lead to -EINVAL and no value updated. >> + */ >> +int get_env_u64(const char *env_name, u64 *value_ret) > > There already is a function for parsing sizes parse_size_from_string() > in common/parse-utils.c. Unfortunately that's not suitable. We don't want a invalid string to fully blow up the program, as the existing parser would call exit(1), especially we only need it for a debug environmental variable. Should I change the existing one to provide better error handling? (Which means around 16 call sites update), or is there some better solution? Thanks, Qu
On Wed, Jan 17, 2024 at 06:50:11AM +1030, Qu Wenruo wrote: > On 2024/1/17 05:01, David Sterba wrote: > > On Sat, Jan 13, 2024 at 07:15:29PM +1030, Qu Wenruo wrote: > >> Since we got a recent bug report about tree-checker triggered for large > >> fs conversion, we need a properly way to trigger the problem for test > >> case purpose. > >> > >> To trigger that bug, we need to meet several conditions: > >> > >> - We need to read some tree blocks which has half-backed inodes > >> - We need a large enough enough fs to generate more tree blocks than > >> our cache. > >> > >> For our existing test cases, firstly the fs is not that large, thus > >> we may even go just one transaction to generate all the inodes. > >> > >> Secondly we have a global cache for tree blocks, which means a lot of > >> written tree blocks are still in the cache, thus won't trigger > >> tree-checker. > >> > >> To make the problem much easier for our existing test case to expose, > >> this patch would introduce a debug environment variable: > >> BTRFS_PROGS_DEBUG_BLOCKS_USED_THRESHOLD. > >> > >> This would affects the threshold for the transaction size, setting it to > >> a much smaller value would make the bug much easier to trigger. > >> > >> Signed-off-by: Qu Wenruo <wqu@suse.com> > >> --- > >> common/utils.c | 62 +++++++++++++++++++++++++++++++++++++++++++ > >> common/utils.h | 1 + > >> convert/source-ext2.c | 9 ++++++- > >> 3 files changed, 71 insertions(+), 1 deletion(-) > >> > >> diff --git a/common/utils.c b/common/utils.c > >> index 62f0e3f48b39..e6070791f5cc 100644 > >> --- a/common/utils.c > >> +++ b/common/utils.c > >> @@ -956,6 +956,68 @@ u8 rand_u8(void) > >> return (u8)(rand_u32()); > >> } > >> > >> +/* > >> + * Parse a u64 value from an environment variable. > >> + * > >> + * Supports unit suffixes "KMGTP", the suffix is always 2 ** 10 based. > >> + * With proper overflow detection. > >> + * > >> + * The string must end with '\0', anything unexpected non-suffix string, > >> + * including space, would lead to -EINVAL and no value updated. > >> + */ > >> +int get_env_u64(const char *env_name, u64 *value_ret) > > > > There already is a function for parsing sizes parse_size_from_string() > > in common/parse-utils.c. > > Unfortunately that's not suitable. > > We don't want a invalid string to fully blow up the program, as the > existing parser would call exit(1), especially we only need it for a > debug environmental variable. I see. > Should I change the existing one to provide better error handling? > (Which means around 16 call sites update), or is there some better solution? Yeah, the parsers used for command line arguments are fine to exit as error handling but generic helper should not do that. There's arg_strotu64, so I'd keep the arg_ prefix for helpers that will exit on error. Looking at parse_size_from_string, it should return int or bool and the value will be in parameter. This is cleaner than using errno for that.
On 2024/1/17 11:38, David Sterba wrote: > On Wed, Jan 17, 2024 at 06:50:11AM +1030, Qu Wenruo wrote: >> On 2024/1/17 05:01, David Sterba wrote: >>> On Sat, Jan 13, 2024 at 07:15:29PM +1030, Qu Wenruo wrote: >>>> Since we got a recent bug report about tree-checker triggered for large >>>> fs conversion, we need a properly way to trigger the problem for test >>>> case purpose. >>>> >>>> To trigger that bug, we need to meet several conditions: >>>> >>>> - We need to read some tree blocks which has half-backed inodes >>>> - We need a large enough enough fs to generate more tree blocks than >>>> our cache. >>>> >>>> For our existing test cases, firstly the fs is not that large, thus >>>> we may even go just one transaction to generate all the inodes. >>>> >>>> Secondly we have a global cache for tree blocks, which means a lot of >>>> written tree blocks are still in the cache, thus won't trigger >>>> tree-checker. >>>> >>>> To make the problem much easier for our existing test case to expose, >>>> this patch would introduce a debug environment variable: >>>> BTRFS_PROGS_DEBUG_BLOCKS_USED_THRESHOLD. >>>> >>>> This would affects the threshold for the transaction size, setting it to >>>> a much smaller value would make the bug much easier to trigger. >>>> >>>> Signed-off-by: Qu Wenruo <wqu@suse.com> >>>> --- >>>> common/utils.c | 62 +++++++++++++++++++++++++++++++++++++++++++ >>>> common/utils.h | 1 + >>>> convert/source-ext2.c | 9 ++++++- >>>> 3 files changed, 71 insertions(+), 1 deletion(-) >>>> >>>> diff --git a/common/utils.c b/common/utils.c >>>> index 62f0e3f48b39..e6070791f5cc 100644 >>>> --- a/common/utils.c >>>> +++ b/common/utils.c >>>> @@ -956,6 +956,68 @@ u8 rand_u8(void) >>>> return (u8)(rand_u32()); >>>> } >>>> >>>> +/* >>>> + * Parse a u64 value from an environment variable. >>>> + * >>>> + * Supports unit suffixes "KMGTP", the suffix is always 2 ** 10 based. >>>> + * With proper overflow detection. >>>> + * >>>> + * The string must end with '\0', anything unexpected non-suffix string, >>>> + * including space, would lead to -EINVAL and no value updated. >>>> + */ >>>> +int get_env_u64(const char *env_name, u64 *value_ret) >>> >>> There already is a function for parsing sizes parse_size_from_string() >>> in common/parse-utils.c. >> >> Unfortunately that's not suitable. >> >> We don't want a invalid string to fully blow up the program, as the >> existing parser would call exit(1), especially we only need it for a >> debug environmental variable. > > I see. > >> Should I change the existing one to provide better error handling? >> (Which means around 16 call sites update), or is there some better solution? > > Yeah, the parsers used for command line arguments are fine to exit as > error handling but generic helper should not do that. There's > arg_strotu64, so I'd keep the arg_ prefix for helpers that will exit on > error. > > Looking at parse_size_from_string, it should return int or bool and the > value will be in parameter. This is cleaner than using errno for that. > And I can take it one step further, make arg_* helper to utilize the proper parser version (which would return an int to indicate error), and call exit(1) to error out. By this, we can share the code and still keep the behavior. Although it would mean I have to change the call sites. I'll craft a proper series for the conversion/cleanup. Thanks, Qu
diff --git a/common/utils.c b/common/utils.c index 62f0e3f48b39..e6070791f5cc 100644 --- a/common/utils.c +++ b/common/utils.c @@ -956,6 +956,68 @@ u8 rand_u8(void) return (u8)(rand_u32()); } +/* + * Parse a u64 value from an environment variable. + * + * Supports unit suffixes "KMGTP", the suffix is always 2 ** 10 based. + * With proper overflow detection. + * + * The string must end with '\0', anything unexpected non-suffix string, + * including space, would lead to -EINVAL and no value updated. + */ +int get_env_u64(const char *env_name, u64 *value_ret) +{ + char *env_value_str; + char *retptr = NULL; + int shift = 0; + u64 value; + + env_value_str = getenv(env_name); + if (!env_value_str) + return -ENOENT; + value = strtoull(env_value_str, &retptr, 0); + /* No numeric string found. */ + if (retptr == env_value_str) + return -EINVAL; + if (value == ULLONG_MAX && errno == ERANGE) + return -ERANGE; + /* memparse_safe() like suffix detection. */ + switch (*retptr) { + case 'K': + case 'k': + shift = 10; + break; + case 'M': + case 'm': + shift = 20; + break; + case 'G': + case 'g': + shift = 30; + break; + case 'T': + case 't': + shift = 40; + break; + case 'P': + case 'p': + shift = 50; + break; + } + if (shift) { + retptr++; + if (value >> (64 - shift)) + return -ERANGE; + value <<= shift; + } + if (*retptr != '\0') + return -EINVAL; + pr_verbose(LOG_VERBOSE, "received env \"%s\" value %llu\n", + env_name, value); + *value_ret = value; + return 0; +} + void btrfs_config_init(void) { bconf.output_format = CMD_FORMAT_TEXT; diff --git a/common/utils.h b/common/utils.h index dcd817144f9c..30c75339b05b 100644 --- a/common/utils.h +++ b/common/utils.h @@ -132,6 +132,7 @@ u32 rand_u32(void); u64 rand_u64(void); unsigned int rand_range(unsigned int upper); void init_rand_seed(u64 seed); +int get_env_u64(const char *env_name, u64 *value_ret); char *btrfs_test_for_multiple_profiles(int fd); int btrfs_warn_multiple_profiles(int fd); diff --git a/convert/source-ext2.c b/convert/source-ext2.c index f56d79734715..e5f85111a711 100644 --- a/convert/source-ext2.c +++ b/convert/source-ext2.c @@ -30,6 +30,7 @@ #include "kernel-shared/file-item.h" #include "common/extent-cache.h" #include "common/messages.h" +#include "common/utils.h" #include "convert/common.h" #include "convert/source-fs.h" #include "convert/source-ext2.h" @@ -974,6 +975,11 @@ static int ext2_copy_inodes(struct btrfs_convert_context *cctx, ext2_ino_t ext2_ino; u64 objectid; struct btrfs_trans_handle *trans; + /* The hint on when to commit the transaction. */ + u64 blocks_used_threshold = SZ_2M; + + get_env_u64("BTRFS_PROGS_DEBUG_BLOCKS_USED_THRESHOLD", + &blocks_used_threshold); trans = btrfs_start_transaction(root, 1); if (IS_ERR(trans)) @@ -1014,7 +1020,8 @@ static int ext2_copy_inodes(struct btrfs_convert_context *cctx, * large enough to contain over 300 inlined files or * around 26k file extents. Which should be good enough. */ - if (trans->blocks_used >= SZ_2M / root->fs_info->nodesize) { + if (trans->blocks_used >= + (blocks_used_threshold / root->fs_info->nodesize)) { ret = btrfs_commit_transaction(trans, root); if (ret < 0) { errno = -ret;
Since we got a recent bug report about tree-checker triggered for large fs conversion, we need a properly way to trigger the problem for test case purpose. To trigger that bug, we need to meet several conditions: - We need to read some tree blocks which has half-backed inodes - We need a large enough enough fs to generate more tree blocks than our cache. For our existing test cases, firstly the fs is not that large, thus we may even go just one transaction to generate all the inodes. Secondly we have a global cache for tree blocks, which means a lot of written tree blocks are still in the cache, thus won't trigger tree-checker. To make the problem much easier for our existing test case to expose, this patch would introduce a debug environment variable: BTRFS_PROGS_DEBUG_BLOCKS_USED_THRESHOLD. This would affects the threshold for the transaction size, setting it to a much smaller value would make the bug much easier to trigger. Signed-off-by: Qu Wenruo <wqu@suse.com> --- common/utils.c | 62 +++++++++++++++++++++++++++++++++++++++++++ common/utils.h | 1 + convert/source-ext2.c | 9 ++++++- 3 files changed, 71 insertions(+), 1 deletion(-)