diff mbox series

builtin/init-db: preemptively clear repo_fmt to avoid leaks

Message ID pull.1018.git.git.1620240022594.gitgitgadget@gmail.com (mailing list archive)
State New
Headers show
Series builtin/init-db: preemptively clear repo_fmt to avoid leaks | expand

Commit Message

Andrzej Hunt May 5, 2021, 6:40 p.m. UTC
From: Andrzej Hunt <ajrhunt@google.com>

init_db() could populate various fields on repo_fmt, we add a
clear_repo_fmt() call to guarantee that we won't leak any of repo_fmt's
members.

At this time, we do not allocate any new data into repo_fmt, but that
changes in the following patch that is currently in seen:
  https://lore.kernel.org/git/2fd7cb8c0983501e2af2f195aec81d7c17fb80e1.1618832277.git.gitgitgadget@gmail.com/
Because it adds the following allocation in init_db():
  repo_fmt.ref_storage = xstrdup(ref_storage_format);

The patch linked to above is only exposing a preexisting problem - we
can add clear_repository_format() independently to preemptively avoid
this class of leak entirely.

Leaks in init_db() are of little significance as this function is called
immediately at the end of cmd_init_db(). However a leak in init_db()
will cause all known leak-free tests (t0000+1+5) to start failing when
run with LSAN - preemptively plugging such leaks is therefore helpful to
avoid regressing on those tests, and more significantly helps the ongoing
efforts to get all of t0000-t0099 to run leak-free.

LSAN output from t0000 on seen:

Direct leak of 6 byte(s) in 1 object(s) allocated from:
    #0 0x4868b4 in strdup ../projects/compiler-rt/lib/asan/asan_interceptors.cpp:452:3
    #1 0xaa1218 in xstrdup wrapper.c:29:14
    #2 0x5a4b0a in init_db builtin/init-db.c:442:25
    #3 0x5a7a17 in cmd_init_db builtin/init-db.c:742:9
    #4 0x4ce8fe in run_builtin git.c:461:11
    #5 0x4ccbc8 in handle_builtin git.c:718:3
    #6 0x4cb0cc in run_argv git.c:785:4
    #7 0x4cb0cc in cmd_main git.c:916:19
    #8 0x6bf63d in main common-main.c:52:11
    #9 0x7f3222f5f349 in __libc_start_main (/lib64/libc.so.6+0x24349)

SUMMARY: AddressSanitizer: 6 byte(s) leaked in 1 allocation(s).

Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
---
    builtin/init-db: preemptively plug repo_fmt leak
    
    This patch preemptively plugs a leak which is currently only occuring in
    seen in hn/reftable.
    
    This patch is orthogonal to that series (that series starts allocating
    new memory into an already existing struct - and my patch only adds a
    clear call for that struct - in other words my patch is safe to use
    independent of hn/reftable). But there's also no good reason to use this
    patch independently of that series since we're not leaking prior to that
    series. I'm not sure what the optimal logistics with my patch are, feel
    free to integrate it into that series and/or to fold my change into the
    relevant commit if that's easiest!

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1018%2Fahunt%2Finit_db_leak-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1018/ahunt/init_db_leak-v1
Pull-Request: https://github.com/git/git/pull/1018

 builtin/init-db.c | 1 +
 1 file changed, 1 insertion(+)


base-commit: c3901c8daf043cdc16ffb20d3a410f3a2d5494fd

Comments

Johannes Schindelin May 5, 2021, 7:28 p.m. UTC | #1
Hi Andrzej,

On Wed, 5 May 2021, Andrzej Hunt via GitGitGadget wrote:

> diff --git a/builtin/init-db.c b/builtin/init-db.c
> index 31b718259a64..b25147ebaf59 100644
> --- a/builtin/init-db.c
> +++ b/builtin/init-db.c
> @@ -512,6 +512,7 @@ int init_db(const char *git_dir, const char *real_git_dir,
>  			       git_dir, len && git_dir[len-1] != '/' ? "/" : "");
>  	}
>
> +	clear_repository_format(&repo_fmt);

I am afraid that this might not be correct, as t0410.27 experienced a
segmentation fault (see
https://github.com/git/git/pull/1018/checks?check_run_id=2511749719#step:5:2845
for the full details):

-- snip --
[...]
 + repack_and_check -a bd6a66906d425462f8fbf1e3a61cda48cc4e456c
3251618c5193c77083052fb7d6ed5e8b0b227a94
+ rm -rf repo2
+ cp -r repo repo2
+ git -C repo2 repack -a -d
warning: reflog of 'HEAD' references pruned commits
warning: reflog of 'refs/heads/master' references pruned commits
Segmentation fault (core dumped)
error: last command exited with $?=139
not ok 27 - repack -d does not irreversibly delete promisor objects
-- snap --

When I encountered similar problems with my patches in the past, I found
the `--valgrind-only` option incredibly useful, i.e. something like

	cd t && sh t0410-*.sh -i -v -x --valgrind-only=27

Maybe you can give it a try?

Ciao,
Dscho

>  	free(original_git_dir);
>  	return 0;
>  }
>
> base-commit: c3901c8daf043cdc16ffb20d3a410f3a2d5494fd
> --
> gitgitgadget
>
>
Andrzej Hunt May 6, 2021, 5:40 a.m. UTC | #2
Hi Johannes,

On 05/05/2021 21:28, Johannes Schindelin wrote:
> Hi Andrzej,
> 
> On Wed, 5 May 2021, Andrzej Hunt via GitGitGadget wrote:
> 
>> diff --git a/builtin/init-db.c b/builtin/init-db.c
>> index 31b718259a64..b25147ebaf59 100644
>> --- a/builtin/init-db.c
>> +++ b/builtin/init-db.c
>> @@ -512,6 +512,7 @@ int init_db(const char *git_dir, const char *real_git_dir,
>>   			       git_dir, len && git_dir[len-1] != '/' ? "/" : "");
>>   	}
>>
>> +	clear_repository_format(&repo_fmt);
> 
> I am afraid that this might not be correct, as t0410.27 experienced a
> segmentation fault (see
> https://github.com/git/git/pull/1018/checks?check_run_id=2511749719#step:5:2845
> for the full details):


Thanks for spotting that. On further investigation this looks like a 
preexisting issue on seen (my github PR was based on seen - in hindsight 
that was probably not a good idea) - here's a CI run from seen this 
morning exhibiting the same failure:
   https://github.com/git/git/runs/2515095446?check_suite_focus=true

To gain more confidence, I've rebased my patch onto next, and I no 
longer hit any CI failures:
   https://github.com/ahunt/git/runs/2515309312?check_suite_focus=true
(I ran this on my fork because changing the base branch appears to have 
confused the PR's CI configuration.)


Nevertheless, I was still curious about what might be causing the 
failure in the first place: it appears to only happen in CI with gcc on 
Linux, and I was not able to repro on my machine using (tested with 
gcc-7.5 and gcc-10, with seen from this morning):
   make CC=gcc-10 T=t0410-partial-clone.sh test

I'm guessing that understanding this failure might require the help of 
someone with an Ubuntu install to debug (unless there's some easy way of 
using Github's Actions to run a bisect)?

-- snip --

ATB,

   Andrzej
Johannes Schindelin May 6, 2021, 2 p.m. UTC | #3
Hi Andrzej,

On Thu, 6 May 2021, Andrzej Hunt wrote:

> On 05/05/2021 21:28, Johannes Schindelin wrote:
>
> > On Wed, 5 May 2021, Andrzej Hunt via GitGitGadget wrote:
> >
> > > diff --git a/builtin/init-db.c b/builtin/init-db.c
> > > index 31b718259a64..b25147ebaf59 100644
> > > --- a/builtin/init-db.c
> > > +++ b/builtin/init-db.c
> > > @@ -512,6 +512,7 @@ int init_db(const char *git_dir, const char
> > > *real_git_dir,
> > >    		       git_dir, len && git_dir[len-1] != '/' ? "/" : "");
> > >    }
> > >
> > > +	clear_repository_format(&repo_fmt);
> >
> > I am afraid that this might not be correct, as t0410.27 experienced a
> > segmentation fault (see
> > https://github.com/git/git/pull/1018/checks?check_run_id=2511749719#step:5:2845
> > for the full details):
>
>
> Thanks for spotting that. On further investigation this looks like a
> preexisting issue on seen (my github PR was based on seen - in hindsight that
> was probably not a good idea) - here's a CI run from seen this morning
> exhibiting the same failure:
>   https://github.com/git/git/runs/2515095446?check_suite_focus=true
>
> To gain more confidence, I've rebased my patch onto next, and I no longer hit
> any CI failures:
>   https://github.com/ahunt/git/runs/2515309312?check_suite_focus=true
> (I ran this on my fork because changing the base branch appears to have
> confused the PR's CI configuration.)

Oh, I'm sorry! But it is great news that this is not _your_ patch's
problem (I had a brief look over it and wondered why it would cause this
segfault).

> Nevertheless, I was still curious about what might be causing the failure in
> the first place: it appears to only happen in CI with gcc on Linux, and I was
> not able to repro on my machine using (tested with gcc-7.5 and gcc-10, with
> seen from this morning):
>   make CC=gcc-10 T=t0410-partial-clone.sh test

There are more settings used in the CI run, indeed. So yeah, it would be
good to debug this in an environment where it does fail.

> I'm guessing that understanding this failure might require the help of someone
> with an Ubuntu install to debug (unless there's some easy way of using
> Github's Actions to run a bisect)?

As a matter of fact, there is a neat way in GitHub Actions to debug this:
the `action-tmate` Action. It installs and starts a `tmate` instance,
which is a `tmux` inside kind of a proxied SSH server.

This is what I usually do when I have time to track down such problems: I
edit `.github/workflows/main.yml` and push it into a branch in my fork.
The edits are essentially:

- I delete the `ci-config` job (because I know that I want to run this
  workflow and don't want to wait for that job to be queued and run)

- Obviously, this requires all the `needs: ci-config` and
  `if: needs.ci-config.outputs.enabled == 'yes'` lines to be deleted, too.

- I also delete the jobs, tasks, and matrix elements that are not
  necessary for my debugging session (in your case, that would be e.g. the
  `dockerized` job as well as the vectors in `regular` that are not
  `linux-gcc`).

- Then comes the most important part: after the
  `ci/print-test-failures.sh` task, I insert this task:

    - name: action-tmate
      if: failure()
      uses: mxschmitt/action-tmate@v3
      with:
        limit-access-to-actor: true

The `limit-access-to-actor` thing only applies if you uploaded your public
SSH key to your GitHub profile. Otherwise, anybody who monitors the run
could see the command-line to connect to your debugging instance and
interfere with you.

For details, see https://mxschmitt.github.io/action-tmate/

Thanks,
Dscho
diff mbox series

Patch

diff --git a/builtin/init-db.c b/builtin/init-db.c
index 31b718259a64..b25147ebaf59 100644
--- a/builtin/init-db.c
+++ b/builtin/init-db.c
@@ -512,6 +512,7 @@  int init_db(const char *git_dir, const char *real_git_dir,
 			       git_dir, len && git_dir[len-1] != '/' ? "/" : "");
 	}
 
+	clear_repository_format(&repo_fmt);
 	free(original_git_dir);
 	return 0;
 }