Message ID | f1f7fc97fe2fe5079365bb91c71fb7033378995d.1645320592.git.gitgitgadget@gmail.com (mailing list archive) |
---|---|
State | Accepted |
Commit | 8d60e9d2010d34f8c8ca65967ed02a2b06d74dc5 |
Headers | show |
Series | Fix a couple small leaks in merge-ort | expand |
On Sun, Feb 20, 2022 at 01:29:50AM +0000, Elijah Newren via GitGitGadget wrote: > merge-ort.c | 15 +++++---------- > 1 file changed, 5 insertions(+), 10 deletions(-) Both versions of this patch look good to me (in fact, I appreciated seeing both v1 and v2, since v1 makes it more obvious what is changing, but v2 does the whole thing in a little bit of a cleaner way). > diff --git a/merge-ort.c b/merge-ort.c > index d85b1cd99e9..3d7f9feb6f7 100644 > --- a/merge-ort.c > +++ b/merge-ort.c > @@ -3086,12 +3086,11 @@ static int detect_and_process_renames(struct merge_options *opt, > struct tree *side1, > struct tree *side2) > { > - struct diff_queue_struct combined; > + struct diff_queue_struct combined = { 0 }; > struct rename_info *renames = &opt->priv->renames; > - int need_dir_renames, s, clean = 1; > + int need_dir_renames, s, i, clean = 1; And this entire patch looks good to me, but I did wonder about why "i" is an int here. Shouldn't it be a size_t instead? Looking at diff_queue_struct's definition, both "alloc" and "nr" are signed ints, when they should almost certainly be unsigned to avoid overflow. Thanks, Taylor
On Sun, Feb 20, 2022 at 6:35 PM Taylor Blau <me@ttaylorr.com> wrote: > > On Sun, Feb 20, 2022 at 01:29:50AM +0000, Elijah Newren via GitGitGadget wrote: > > merge-ort.c | 15 +++++---------- > > 1 file changed, 5 insertions(+), 10 deletions(-) > > Both versions of this patch look good to me (in fact, I appreciated > seeing both v1 and v2, since v1 makes it more obvious what is changing, > but v2 does the whole thing in a little bit of a cleaner way). Thanks for taking a look! > > diff --git a/merge-ort.c b/merge-ort.c > > index d85b1cd99e9..3d7f9feb6f7 100644 > > --- a/merge-ort.c > > +++ b/merge-ort.c > > @@ -3086,12 +3086,11 @@ static int detect_and_process_renames(struct merge_options *opt, > > struct tree *side1, > > struct tree *side2) > > { > > - struct diff_queue_struct combined; > > + struct diff_queue_struct combined = { 0 }; > > struct rename_info *renames = &opt->priv->renames; > > - int need_dir_renames, s, clean = 1; > > + int need_dir_renames, s, i, clean = 1; > > And this entire patch looks good to me, but I did wonder about why "i" > is an int here. Shouldn't it be a size_t instead? Looking at > diff_queue_struct's definition, both "alloc" and "nr" are signed ints, > when they should almost certainly be unsigned to avoid overflow. You may be right, but I'm not sure we're too worried right now about folks having billions of paths involved in renames; such a repo would make even the Microsoft monorepos look miniscule. ;-)
On Sun, Feb 20 2022, Elijah Newren via GitGitGadget wrote: > From: Elijah Newren <newren@gmail.com> > > detect_and_process_renames() detects renames on both sides of history > and then combines these into a single diff_queue_struct. The combined > diff_queue_struct needs to be able to hold the renames found on either > side, and since it knows the (maximum) size it needs, it pre-emptively > grows the array to the appropriate size: > > ALLOC_GROW(combined.queue, > renames->pairs[1].nr + renames->pairs[2].nr, > combined.alloc); > > It then collects the items from each side: > > collect_renames(opt, &combined, MERGE_SIDE1, ...) > collect_renames(opt, &combined, MERGE_SIDE2, ...) > > Note, though, that collect_renames() sometimes determines that some > pairs are unnecessary and does not include them in the combined array. > When it is done, detect_and_process_renames() frees this memory: > > if (combined.nr) { > ... > free(combined.queue); > } > > The problem is that sometimes even when there are pairs, none of them > are necessary. Instead of checking combined.nr, just remove the > if-check; free() knows to skip NULL pointers. This change fixes the > following memory leak, as reported by valgrind: > > ==PID== 192 bytes in 1 blocks are definitely lost in loss record 107 of 134 > ==PID== at 0xADDRESS: malloc > ==PID== by 0xADDRESS: realloc > ==PID== by 0xADDRESS: xrealloc (wrapper.c:126) > ==PID== by 0xADDRESS: detect_and_process_renames (merge-ort.c:3134) > ==PID== by 0xADDRESS: merge_ort_nonrecursive_internal (merge-ort.c:4610) > ==PID== by 0xADDRESS: merge_ort_internal (merge-ort.c:4709) > ==PID== by 0xADDRESS: merge_incore_recursive (merge-ort.c:4760) > ==PID== by 0xADDRESS: merge_ort_recursive (merge-ort-wrappers.c:57) > ==PID== by 0xADDRESS: try_merge_strategy (merge.c:753) > ==PID== by 0xADDRESS: cmd_merge (merge.c:1676) > ==PID== by 0xADDRESS: run_builtin (git.c:461) > ==PID== by 0xADDRESS: handle_builtin (git.c:713) > ==PID== by 0xADDRESS: run_argv (git.c:780) > ==PID== by 0xADDRESS: cmd_main (git.c:911) > ==PID== by 0xADDRESS: main (common-main.c:52) > > Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> > Signed-off-by: Elijah Newren <newren@gmail.com> > --- > merge-ort.c | 15 +++++---------- > 1 file changed, 5 insertions(+), 10 deletions(-) > > diff --git a/merge-ort.c b/merge-ort.c > index d85b1cd99e9..3d7f9feb6f7 100644 > --- a/merge-ort.c > +++ b/merge-ort.c > @@ -3086,12 +3086,11 @@ static int detect_and_process_renames(struct merge_options *opt, > struct tree *side1, > struct tree *side2) > { > - struct diff_queue_struct combined; > + struct diff_queue_struct combined = { 0 }; > struct rename_info *renames = &opt->priv->renames; > - int need_dir_renames, s, clean = 1; > + int need_dir_renames, s, i, clean = 1; > unsigned detection_run = 0; > > - memset(&combined, 0, sizeof(combined)); > if (!possible_renames(renames)) > goto cleanup; > > @@ -3175,13 +3174,9 @@ simple_cleanup: > free(renames->pairs[s].queue); > DIFF_QUEUE_CLEAR(&renames->pairs[s]); > } > - if (combined.nr) { > - int i; > - for (i = 0; i < combined.nr; i++) > - pool_diff_free_filepair(&opt->priv->pool, > - combined.queue[i]); > - free(combined.queue); > - } > + for (i = 0; i < combined.nr; i++) > + pool_diff_free_filepair(&opt->priv->pool, combined.queue[i]); > + free(combined.queue); > > return clean; > }
On Sun, Feb 20 2022, Elijah Newren via GitGitGadget wrote: [I fat-fingered an empty E-Mail reply to this earlier in https://lore.kernel.org/git/220601.86h7541yqj.gmgdl@evledraar.gmail.com/, sorry!] > From: Elijah Newren <newren@gmail.com> > > detect_and_process_renames() detects renames on both sides of history > and then combines these into a single diff_queue_struct. The combined > diff_queue_struct needs to be able to hold the renames found on either > side, and since it knows the (maximum) size it needs, it pre-emptively > grows the array to the appropriate size: > [...] > diff --git a/merge-ort.c b/merge-ort.c > index d85b1cd99e9..3d7f9feb6f7 100644 > --- a/merge-ort.c > +++ b/merge-ort.c > @@ -3086,12 +3086,11 @@ static int detect_and_process_renames(struct merge_options *opt, > struct tree *side1, > struct tree *side2) > { > - struct diff_queue_struct combined; > + struct diff_queue_struct combined = { 0 }; > struct rename_info *renames = &opt->priv->renames; > - int need_dir_renames, s, clean = 1; > + int need_dir_renames, s, i, clean = 1; > unsigned detection_run = 0; > > - memset(&combined, 0, sizeof(combined)); > if (!possible_renames(renames)) > goto cleanup; > > @@ -3175,13 +3174,9 @@ simple_cleanup: > free(renames->pairs[s].queue); > DIFF_QUEUE_CLEAR(&renames->pairs[s]); > } > - if (combined.nr) { > - int i; > - for (i = 0; i < combined.nr; i++) > - pool_diff_free_filepair(&opt->priv->pool, > - combined.queue[i]); > - free(combined.queue); > - } > + for (i = 0; i < combined.nr; i++) > + pool_diff_free_filepair(&opt->priv->pool, combined.queue[i]); > + free(combined.queue); > > return clean; > } I haven't been able to find whether this actually causes anything bad, but when I started digging into SANITIZE=leak failures I found that this change made t6435-merge-sparse.sh flaky in the v2.36.0 release when compiled with SANITIZE=leak. I.e. it "should" fail, but with 8d60e9d2010 (merge-ort: fix small memory leak in detect_and_process_renames(), 2022-02-20) it will sometimes succeed. I bisected it between 2.35.0..2.36.0 with: git bisect run sh -c 'make SANITIZE=leak && (cd t && ! (for i in $(seq 1 20); do ./t6435-merge-sparse.sh >/dev/null && echo $? || echo $?; done | grep -q 0))' I.e. "fail if we succeed" (there's some redundancy in the ad-hoc one-liner). Manually debugging it with: diff --git a/dir.c b/dir.c index d91295f2bcd..3e860769c26 100644 --- a/dir.c +++ b/dir.c @@ -878,6 +878,8 @@ void add_pattern(const char *string, const char *base, unsigned flags; int nowildcardlen; + fprintf(stderr, "adding pattern %s\n", string); + parse_path_pattern(&string, &patternlen, &flags, &nowildcardlen); if (flags & PATTERN_FLAG_MUSTBEDIR) { FLEXPTR_ALLOC_MEM(pattern, pattern, string, patternlen); Turns up this odd case, i.e. we "should" fail at the end of the "setup" in this bit of test code: [...] test_commit_this ours && git config core.sparseCheckout true && echo "/checked-out" >.git/info/sparse-checkout && git reset --hard && test_must_fail git merge their Here we leak in "git reset", so we "should" get a leak like: [...] + git tag ours + git config core.sparseCheckout true + echo /checked-out + git reset --hard adding pattern /checked-out adding pattern /checked-out adding pattern /checked-out HEAD is now at 3dfe889 ours ================================================================= ==25385==ERROR: LeakSanitizer: detected memory leaks Indirect leak of 512 byte(s) in 1 object(s) allocated from: #0 0x7f9652ef50aa in __interceptor_calloc ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:90 #1 0x6ab6dc in xcalloc /home/avar/g/git/wrapper.c:140 #2 0x58e5ac in alloc_table /home/avar/g/git/hashmap.c:79 #3 0x58e8e9 in hashmap_init /home/avar/g/git/hashmap.c:168 #4 0x569808 in add_patterns_from_buffer /home/avar/g/git/dir.c:1136 #5 0x5697a1 in add_patterns /home/avar/g/git/dir.c:1124 #6 0x56996a in add_patterns_from_file_to_list /home/avar/g/git/dir.c:1164 #7 0x56e0b2 in get_sparse_checkout_patterns /home/avar/g/git/dir.c:3273 #8 0x56a240 in init_sparse_checkout_patterns /home/avar/g/git/dir.c:1451 [...] That's consistent with what I see before, but sometimes it'll succeed like this: [...] + git tag ours + git config core.sparseCheckout true + echo /checked-out + git reset --hard adding pattern /checked-out HEAD is now at 3dfe889 ours [...] I.e. for some reason the same "reset --hard" now adds one, not three patterns (which before were all identical). Now, the "flaky success" with SANITIZE=leak does appear to be new in 8d60e9d2010, but before that running this in a loop reveals that the 2nd test sometimes succeeds, so perhaps the underlying "issue" isn't new. I say "issue" because I haven't dug enough to see if this has any impact on anything, and the failure I discovered doesn't per-se matter now. But perhaps this observed indeterminism is pointing the way to some deeper bug here? Or maybe it isn't...
diff --git a/merge-ort.c b/merge-ort.c index d85b1cd99e9..3d7f9feb6f7 100644 --- a/merge-ort.c +++ b/merge-ort.c @@ -3086,12 +3086,11 @@ static int detect_and_process_renames(struct merge_options *opt, struct tree *side1, struct tree *side2) { - struct diff_queue_struct combined; + struct diff_queue_struct combined = { 0 }; struct rename_info *renames = &opt->priv->renames; - int need_dir_renames, s, clean = 1; + int need_dir_renames, s, i, clean = 1; unsigned detection_run = 0; - memset(&combined, 0, sizeof(combined)); if (!possible_renames(renames)) goto cleanup; @@ -3175,13 +3174,9 @@ simple_cleanup: free(renames->pairs[s].queue); DIFF_QUEUE_CLEAR(&renames->pairs[s]); } - if (combined.nr) { - int i; - for (i = 0; i < combined.nr; i++) - pool_diff_free_filepair(&opt->priv->pool, - combined.queue[i]); - free(combined.queue); - } + for (i = 0; i < combined.nr; i++) + pool_diff_free_filepair(&opt->priv->pool, combined.queue[i]); + free(combined.queue); return clean; }