Message ID | patch-02.11-4049362e9b4-20220713T131601Z-avarab@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | submodule--helper: fix memory leaks | expand |
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: > Use the less verbose { 0 }-initialization syntax rather than memset() > in builtin/submodule--helper.c, this doesn't make a difference in > terms of behavior, but as we're about to modify adjacent code makes > this more consistent, and lets us avoid worrying about when the > memset() happens v.s. a "goto cleanup". Ok. I wonder if we could reduce this kind of churn in the future by adding this to CodingGuidelines, e.g. "always use { 0 } for stack variables". Tangentially, do we require { NULL } when the first element is a pointer? (I'm not sure because this isn't in CodingGuidelines either AFAICT.) > Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> > --- > builtin/submodule--helper.c | 6 ++---- > 1 file changed, 2 insertions(+), 4 deletions(-) > > diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c > index fac52ade5e1..73717be957c 100644 > --- a/builtin/submodule--helper.c > +++ b/builtin/submodule--helper.c > @@ -1744,7 +1744,7 @@ static int module_clone(int argc, const char **argv, const char *prefix) > { > int dissociate = 0, quiet = 0, progress = 0, require_init = 0; > struct module_clone_data clone_data = MODULE_CLONE_DATA_INIT; > - struct list_objects_filter_options filter_options; > + struct list_objects_filter_options filter_options = { 0 }; > > struct option module_clone_options[] = { > OPT_STRING(0, "prefix", &clone_data.prefix, > @@ -1786,7 +1786,6 @@ static int module_clone(int argc, const char **argv, const char *prefix) > NULL > }; > > - memset(&filter_options, 0, sizeof(filter_options)); > argc = parse_options(argc, argv, prefix, module_clone_options, > git_submodule_helper_usage, 0); > > @@ -2563,7 +2562,7 @@ static int module_update(int argc, const char **argv, const char *prefix) > { > struct pathspec pathspec; > struct update_data opt = UPDATE_DATA_INIT; > - struct list_objects_filter_options filter_options; > + struct list_objects_filter_options filter_options = { 0 }; > int ret; > > struct option module_update_options[] = { > @@ -2623,7 +2622,6 @@ static int module_update(int argc, const char **argv, const char *prefix) > update_clone_config_from_gitmodules(&opt.max_jobs); > git_config(git_update_clone_config, &opt.max_jobs); > > - memset(&filter_options, 0, sizeof(filter_options)); > argc = parse_options(argc, argv, prefix, module_update_options, > git_submodule_helper_usage, 0); > > -- > 2.37.0.932.g7b7031e73bc
Glen Choo <chooglen@google.com> writes: > Ok. I wonder if we could reduce this kind of churn in the future by > adding this to CodingGuidelines, e.g. "always use { 0 } for stack > variables". Tangentially, do we require { NULL } when the first element > is a pointer? (I'm not sure because this isn't in CodingGuidelines > either AFAICT.) A valiable can legitimately be left uninitialized, or initialized to a known value that is not zero using designated initializers. So saying something like When zero-initializing an auto variable that is a struct or union in its definition, use "{ 0 }", whether the first member in the struct is of a number, a pointer, or a compound type. may be OK. I do not think we would want to say "always use X", as the world is not that simple.. We do favor designated initializers over traditional initialization in the order of members these days, so something like When declaring a struct/union variable or an array with initial value to some members or elements, consider using designated initializers, instead of listing the values in the order of members in the definition of the struct. would also be good. Thanks.
Junio C Hamano <gitster@pobox.com> writes: > Glen Choo <chooglen@google.com> writes: > >> Ok. I wonder if we could reduce this kind of churn in the future by >> adding this to CodingGuidelines, e.g. "always use { 0 } for stack >> variables". Tangentially, do we require { NULL } when the first element >> is a pointer? (I'm not sure because this isn't in CodingGuidelines >> either AFAICT.) > > A valiable can legitimately be left uninitialized, or initialized to > a known value that is not zero using designated initializers. So > saying something like > > When zero-initializing an auto variable that is a struct or > union in its definition, use "{ 0 }", whether the first member > in the struct is of a number, a pointer, or a compound type. > > may be OK. I do not think we would want to say "always use X", as > the world is not that simple.. Thanks. Also, I made a typo here, I meant to be more specific with regards to memset(), like: When zero-initializing an auto variable that is a struct or union in its definition, use "{ 0 }", whether the first member in the struct is of a number, a pointer, or a compound type. _Do not use memset()._ Unless there really is a legitimate reason to use memset(&my_auto_var, 0 sizeof(my_autovar)) that I've missed? > We do favor designated initializers over traditional initialization > in the order of members these days, so something like > > When declaring a struct/union variable or an array with initial > value to some members or elements, consider using designated > initializers, instead of listing the values in the order of > members in the definition of the struct. > > would also be good. Thanks. If I have time I'll send that proposal to CodingGuidelines, or someone else can send it (I don't mind either way).
On Wed, Jul 13 2022, Glen Choo wrote: > Junio C Hamano <gitster@pobox.com> writes: > >> Glen Choo <chooglen@google.com> writes: >> >>> Ok. I wonder if we could reduce this kind of churn in the future by >>> adding this to CodingGuidelines, e.g. "always use { 0 } for stack >>> variables". Tangentially, do we require { NULL } when the first element >>> is a pointer? (I'm not sure because this isn't in CodingGuidelines >>> either AFAICT.) >> >> A valiable can legitimately be left uninitialized, or initialized to >> a known value that is not zero using designated initializers. So >> saying something like >> >> When zero-initializing an auto variable that is a struct or >> union in its definition, use "{ 0 }", whether the first member >> in the struct is of a number, a pointer, or a compound type. >> >> may be OK. I do not think we would want to say "always use X", as >> the world is not that simple.. > > Thanks. Also, I made a typo here, I meant to be more specific with > regards to memset(), like: > > When zero-initializing an auto variable that is a struct or > union in its definition, use "{ 0 }", whether the first member > in the struct is of a number, a pointer, or a compound type. _Do not > use memset()._ It's curious that the { 0 } v.s. { NULL } form jumps out at people, but seemingly not that you don't memset(&x, NULL, ...). I.e. that we're already dealing with a form where C's "0 is NULL in pointer context" rules kick in :) So I wonder if we should say anything about the first member at all. On the other hand this is quite an odd bit of C syntax, and reveals a bit of on edge case or wart that I'm not happy with. It's why I believe GCC has a syntax extension so you can just do: struct foo x = {}; I.e. 6.7.8.10 and 6.7.8.21 disuss the semantics of this, basically that if there are fewer initializers than elements or members that the structure is initialized as though it were "static". But for the first member we rely on 0 and NULL being the same in pointer context (even though NULL my not be (void*)NULL!). So it's a tad nasty, it's basically doing the same as: const char *foo = 0; /* not NULL */ As an aside I wonder if we should say "C says this is undefined, but c'mon, let's just assume '#define NULL (void*)0'". I think in practice it's like the recently standardized 2's compliment, in that almost nobody can even remember a platform where it isn't true, and none are in current use (but maybe I'm wrong...). Anyway, I was curious about this so I tried the following locally: @@ type T; identifier I; @@ - T I; + T I = { 0 }; ... when strict when != \( I \| &I \) ( - memset(&I, 0, sizeof(I)); | - memset(&I, 0, sizeof(T)); ) Which aside from whitespace issues (that I've asked the cocci ML about) yields a sane result. What *doesn't* yield a sane result is getting rid of the "when strict" there, i.e. we must not do this in xdiff/xhistogram.c's histogram_diff(), as we "goto" to the memset() to re-zero-out the variable. But removing the "when strict" yields a change to 45 more files, the initial one with "when strict" is 76 files. With the exception of that histogram_diff() case (and I manually skimmed the code for the rest) it passes all tests at least... :) > Unless there really is a legitimate reason to use memset(&my_auto_var, 0 > sizeof(my_autovar)) that I've missed? Basically no, except when you need to do the init N times, as noted above. >> We do favor designated initializers over traditional initialization >> in the order of members these days, so something like >> >> When declaring a struct/union variable or an array with initial >> value to some members or elements, consider using designated >> initializers, instead of listing the values in the order of >> members in the definition of the struct. >> >> would also be good. > > Thanks. If I have time I'll send that proposal to CodingGuidelines, or > someone else can send it (I don't mind either way). Sounds good!
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: > It's curious that the { 0 } v.s. { NULL } form jumps out at people, but > seemingly not that you don't memset(&x, NULL, ...). I.e. that we're > already dealing with a form where C's "0 is NULL in pointer context" > rules kick in :) Strictly speaking there are different mechanisms at play here. Literal "0" spelled in source code and assigned to a pointer variable assigns a NULL to the variable even on architectures where the representation of NULL is *not* "all 0-bit filling the width of the variable" and C language is what guarantees "NULL pointer is spelled as integer 0 in the source code". Also, the language, not runtime and pointer representation, is how second and subsequent members of a struct that are pointers are initialized to NULL (not necessarily to "all 0-bit filling the width of the member"). memset(&ptr, '\0', sizeof(ptr)), where ptr is a pointer variable or a struct/union with a pointer member, on the other hand, is unaware of how a NULL pointer is represented on the platform. All it can (and should) do is to fill the thing "ptr" with all 0-bit. On an exotic architecture, where NULL is not "all 0-bit filling the width of the variable", I do not think it would work. So from that point of view, using memset() as a replacement for zero initialization is simply wrong, but in practice people do not work on such a platform that the distinction matters anymore, hopefully? > So I wonder if we should say anything about the first member at all. The mention of the first member of the struct is historically important only because some checkers like sparse used to complain about struct { TYPE *member0; ... } var = { 0 }; the same way as char *string; string = 0; before they got fixed so that they know about "{ 0 }" convention to be silent about the former, while still warning about "even though C language says literal 0 is a valid way to spell NULL, you shouldn't do that" to the latter. These days, we can safely write "{ 0 }" without having to worry about the type of the first member.
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: > On Wed, Jul 13 2022, Glen Choo wrote: > Anyway, I was curious about this so I tried the following locally: > > @@ > type T; > identifier I; > @@ > - T I; > + T I = { 0 }; > ... when strict > when != \( I \| &I \) > ( > - memset(&I, 0, sizeof(I)); > | > - memset(&I, 0, sizeof(T)); > ) > > > Which aside from whitespace issues (that I've asked the cocci ML about) > yields a sane result... >> ....... If I have time I'll send that proposal to CodingGuidelines, or >> someone else can send it (I don't mind either way). Adding an extra cocci check sounds like a great addition alongside the CodingGuidelines change (automatically checking the rule is way better than doing it manually of course). The fact that { 0 } is wrapped around two lines is annoying, e.g. - struct update_callback_data data; + struct update_callback_data data = { + 0 + }; struct rev_info rev; - memset(&data, 0, sizeof(data)); data.flags = flags; but since it goes away with "make style", I'm tempted to say that this check is worth having. Is it too confusing to have coccinelle recommend something that we expect users to fix afterwards?
diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c index fac52ade5e1..73717be957c 100644 --- a/builtin/submodule--helper.c +++ b/builtin/submodule--helper.c @@ -1744,7 +1744,7 @@ static int module_clone(int argc, const char **argv, const char *prefix) { int dissociate = 0, quiet = 0, progress = 0, require_init = 0; struct module_clone_data clone_data = MODULE_CLONE_DATA_INIT; - struct list_objects_filter_options filter_options; + struct list_objects_filter_options filter_options = { 0 }; struct option module_clone_options[] = { OPT_STRING(0, "prefix", &clone_data.prefix, @@ -1786,7 +1786,6 @@ static int module_clone(int argc, const char **argv, const char *prefix) NULL }; - memset(&filter_options, 0, sizeof(filter_options)); argc = parse_options(argc, argv, prefix, module_clone_options, git_submodule_helper_usage, 0); @@ -2563,7 +2562,7 @@ static int module_update(int argc, const char **argv, const char *prefix) { struct pathspec pathspec; struct update_data opt = UPDATE_DATA_INIT; - struct list_objects_filter_options filter_options; + struct list_objects_filter_options filter_options = { 0 }; int ret; struct option module_update_options[] = { @@ -2623,7 +2622,6 @@ static int module_update(int argc, const char **argv, const char *prefix) update_clone_config_from_gitmodules(&opt.max_jobs); git_config(git_update_clone_config, &opt.max_jobs); - memset(&filter_options, 0, sizeof(filter_options)); argc = parse_options(argc, argv, prefix, module_update_options, git_submodule_helper_usage, 0);
Use the less verbose { 0 }-initialization syntax rather than memset() in builtin/submodule--helper.c, this doesn't make a difference in terms of behavior, but as we're about to modify adjacent code makes this more consistent, and lets us avoid worrying about when the memset() happens v.s. a "goto cleanup". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> --- builtin/submodule--helper.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)