[v2,1/1] maintenance: fix SEGFAULT when no repository

Message ID	20201126204141.1438-2-rafaeloliveira.cs@gmail.com (mailing list archive)
State	Accepted
Commit	e72f7defc4f454cc7ad512d9a16a78b83f2606d8
Headers	show Return-Path: <git-owner@kernel.org> From: Rafael Silva <rafaeloliveira.cs@gmail.com> To: git@vger.kernel.org Cc: Derrick Stolee <dstolee@microsoft.com>, =?utf-8?b?Tmd1eeG7hW4gVGjDoWkg?= =?utf-8?b?Tmfhu41jIER1eQ==?= <pclouds@gmail.com>, Junio C Hamano <gitster@pobox.com>, Eric Sunshine <sunshine@sunshineco.com>, =?utf-8?q?Martin_=C3=85gren?= <martin.agren@gmail.com>, =?utf-8?q?SZEDER_G?= =?utf-8?q?=C3=A1bor?= <szeder.dev@gmail.com>, Rafael Silva <rafaeloliveira.cs@gmail.com> Subject: [PATCH v2 1/1] maintenance: fix SEGFAULT when no repository Date: Thu, 26 Nov 2020 20:41:41 +0000 Message-Id: <20201126204141.1438-2-rafaeloliveira.cs@gmail.com> In-Reply-To: <20201126204141.1438-1-rafaeloliveira.cs@gmail.com> References: <20201124164405.29327-1-rafaeloliveira.cs@gmail.com> <20201126204141.1438-1-rafaeloliveira.cs@gmail.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: bulk
Series	maintenance: Fix SEGFAULT when running outside of a repository \| expand [v2,0/1] maintenance: Fix SEGFAULT when running outside of a repository [v2,1/1] maintenance: fix SEGFAULT when no repository

Rafael Silva Nov. 26, 2020, 8:41 p.m. UTC

The "git maintenance run" and "git maintenance start/stop" commands
holds a file-based lock at the .git/maintenance.lock and
.git/schedule.lock respectively. These locks are used to ensure only
one maintenance process is executed at the time as both operations
involves writing data into the git repository.

The path to the lock file is built using
"the_repository->objects->odb->path" that results in SEGFAULT when we
have no repository available as "the_repository->objects->odb" is
set to NULL.

Let's teach maintenance command to use RUN_SETUP option that will
provide the validation and fail when running outside of a repository.
Hence fixing the SEGFAULT for all three operations and making the
behaviour consistent across all subcommands.

Setting the RUN_SETUP also provides the same protection for all
subcommands given that the "register" and "unregister" also requires to
be executed inside a repository.

Furthermore let's remove the local validation implemented by the
"register" and "unregister" as this will not be required anymore with
the new option.

Signed-off-by: Rafael Silva <rafaeloliveira.cs@gmail.com>
---
 builtin/gc.c           | 7 -------
 git.c                  | 2 +-
 t/t7900-maintenance.sh | 8 ++++++++
 3 files changed, 9 insertions(+), 8 deletions(-)

Derrick Stolee Nov. 27, 2020, 8:43 p.m. UTC | #1

On 11/26/2020 3:41 PM, Rafael Silva wrote:
> The "git maintenance run" and "git maintenance start/stop" commands
> holds a file-based lock at the .git/maintenance.lock and
> .git/schedule.lock respectively. These locks are used to ensure only
> one maintenance process is executed at the time as both operations
> involves writing data into the git repository.
> 
> The path to the lock file is built using
> "the_repository->objects->odb->path" that results in SEGFAULT when we
> have no repository available as "the_repository->objects->odb" is
> set to NULL.
> 
> Let's teach maintenance command to use RUN_SETUP option that will
> provide the validation and fail when running outside of a repository.
> Hence fixing the SEGFAULT for all three operations and making the
> behaviour consistent across all subcommands.
> 
> Setting the RUN_SETUP also provides the same protection for all
> subcommands given that the "register" and "unregister" also requires to
> be executed inside a repository.
> 
> Furthermore let's remove the local validation implemented by the
> "register" and "unregister" as this will not be required anymore with
> the new option.

Thank you for this very clean patch!

Reviewed-by: Derrick Stolee <dstolee@microsoft.com>

Josh Steadmon Dec. 8, 2020, 8:12 p.m. UTC | #2

On 2020.11.26 20:41, Rafael Silva wrote:
> diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
> index d9e68bb2bf..ae5c29b0ff 100755
> --- a/t/t7900-maintenance.sh
> +++ b/t/t7900-maintenance.sh
> @@ -441,4 +441,12 @@ test_expect_success 'register preserves existing strategy' '
>  	test_config maintenance.strategy incremental
>  '
>  
> +test_execpt_success 'fails when running outside of a repository' '
> +	nongit test_must_fail git maintenance run &&
> +	nongit test_must_fail git maintenance stop &&
> +	nongit test_must_fail git maintenance start &&
> +	nongit test_must_fail git maintenance register &&
> +	nongit test_must_fail git maintenance unregister
> +'
> +
>  test_done

Caught a typo here, sending this as a squash patch since it's already in
next:

-- >8 --
Subject: [PATCH] t7900: fix typo: "test_execpt_success"

Signed-off-by: Josh Steadmon <steadmon@google.com>
---
 t/t7900-maintenance.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
index 4ca3617173..8c061197a6 100755
--- a/t/t7900-maintenance.sh
+++ b/t/t7900-maintenance.sh
@@ -441,7 +441,7 @@ test_expect_success 'register preserves existing strategy' '
 	test_config maintenance.strategy incremental
 '
 
-test_execpt_success 'fails when running outside of a repository' '
+test_expect_success 'fails when running outside of a repository' '
 	nongit test_must_fail git maintenance run &&
 	nongit test_must_fail git maintenance stop &&
 	nongit test_must_fail git maintenance start &&

Junio C Hamano Dec. 8, 2020, 9:58 p.m. UTC | #3

Josh Steadmon <steadmon@google.com> writes:

> Caught a typo here, sending this as a squash patch since it's already in
> next:

The breakage and the fix looks obvious to me, but...

How did CI allow 'next' to pass with such a typo, I wonder?
Moreover, my pre-push tests of all the integration branches
I didn't notice this to fail, but I cannot see how it could
have been succeeded.  Puzzled...

Will queue, thanks.

>
> -- >8 --
> Subject: [PATCH] t7900: fix typo: "test_execpt_success"
>
> Signed-off-by: Josh Steadmon <steadmon@google.com>
> ---
>  t/t7900-maintenance.sh | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
> index 4ca3617173..8c061197a6 100755
> --- a/t/t7900-maintenance.sh
> +++ b/t/t7900-maintenance.sh
> @@ -441,7 +441,7 @@ test_expect_success 'register preserves existing strategy' '
>  	test_config maintenance.strategy incremental
>  '
>  
> -test_execpt_success 'fails when running outside of a repository' '
> +test_expect_success 'fails when running outside of a repository' '
>  	nongit test_must_fail git maintenance run &&
>  	nongit test_must_fail git maintenance stop &&
>  	nongit test_must_fail git maintenance start &&

Junio C Hamano Dec. 8, 2020, 10:17 p.m. UTC | #4

Junio C Hamano <gitster@pobox.com> writes:

> Josh Steadmon <steadmon@google.com> writes:
>
>> Caught a typo here, sending this as a squash patch since it's already in
>> next:
>
> The breakage and the fix looks obvious to me, but...
>
> How did CI allow 'next' to pass with such a typo, I wonder?
> Moreover, my pre-push tests of all the integration branches
> I didn't notice this to fail, but I cannot see how it could
> have been succeeded.  Puzzled...

That is because of this:

    $ (sh t7900-maintenance.sh 2>&1; echo $?) | tail -5
    ok 25 - register preserves existing strategy
    t7900-maintenance.sh: line 444: test_execpt_success: command not found
    # passed all 25 test(s)
    1..25
    0

The story is the same with prove.

    $ prove t7900-maintenance-sh
    t7900-maintenance.sh .. 24/? t7900-maintenance.sh: line 444: test_execpt_success: command not found
    t7900-maintenance.sh .. ok
    All tests successful.
    Files=1, Tests=25,  2 wallclock secs ( 0.02 usr  0.01 sys +  0.97 cusr  0.97 csys =  1.97 CPU)
    Result: PASS

Since this typo appeared immediately before test_done, we _could_
improve test_done to pay attention to $? when it starts (and in a
similar fashion, we _could_ also check $? at the beginning of the
test_expect_* for the previous step), but I do not think that is a
good approach that would scale well.  There are legitimate reasons
we have to write things other than test_expect_* at the top level
of the script (e.g. test helper function may have to be defined to
be shared amongst the test pieces in the same script).

I wonder if it is a good direction to go to run the tests with the
"set -e" option on, and accept its peculiarities.

Rafael Silva Dec. 9, 2020, 9:29 a.m. UTC | #5

Josh Steadmon writes:

> On 2020.11.26 20:41, Rafael Silva wrote:
>> diff --git a/t/t7900-maintenance.sh b/t/t7900-maintenance.sh
>> index d9e68bb2bf..ae5c29b0ff 100755
>> --- a/t/t7900-maintenance.sh
>> +++ b/t/t7900-maintenance.sh
>> @@ -441,4 +441,12 @@ test_expect_success 'register preserves existing strategy' '
>>  	test_config maintenance.strategy incremental
>>  '
>>  
>> +test_execpt_success 'fails when running outside of a repository' '
>> +	nongit test_must_fail git maintenance run &&
>> +	nongit test_must_fail git maintenance stop &&
>> +	nongit test_must_fail git maintenance start &&
>> +	nongit test_must_fail git maintenance register &&
>> +	nongit test_must_fail git maintenance unregister
>> +'
>> +
>>  test_done
>
> Caught a typo here, sending this as a squash patch since it's already in
> next:
>

Ufff. For some reason I completely missed the test error message when
working on the v2.

Thank you Josh, for the catch and quick patch.

Apologize guys for such mistake.

Christian Couder Dec. 24, 2020, 8:12 a.m. UTC | #6

On Wed, Dec 9, 2020 at 2:16 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Junio C Hamano <gitster@pobox.com> writes:
>
> > Josh Steadmon <steadmon@google.com> writes:
> >
> >> Caught a typo here, sending this as a squash patch since it's already in
> >> next:
> >
> > The breakage and the fix looks obvious to me, but...
> >
> > How did CI allow 'next' to pass with such a typo, I wonder?
> > Moreover, my pre-push tests of all the integration branches
> > I didn't notice this to fail, but I cannot see how it could
> > have been succeeded.  Puzzled...
>
> That is because of this:
>
>     $ (sh t7900-maintenance.sh 2>&1; echo $?) | tail -5
>     ok 25 - register preserves existing strategy
>     t7900-maintenance.sh: line 444: test_execpt_success: command not found
>     # passed all 25 test(s)
>     1..25
>     0
>
> The story is the same with prove.
>
>     $ prove t7900-maintenance-sh
>     t7900-maintenance.sh .. 24/? t7900-maintenance.sh: line 444: test_execpt_success: command not found
>     t7900-maintenance.sh .. ok
>     All tests successful.
>     Files=1, Tests=25,  2 wallclock secs ( 0.02 usr  0.01 sys +  0.97 cusr  0.97 csys =  1.97 CPU)
>     Result: PASS
>
> Since this typo appeared immediately before test_done, we _could_
> improve test_done to pay attention to $? when it starts (and in a
> similar fashion, we _could_ also check $? at the beginning of the
> test_expect_* for the previous step), but I do not think that is a
> good approach that would scale well.  There are legitimate reasons
> we have to write things other than test_expect_* at the top level
> of the script (e.g. test helper function may have to be defined to
> be shared amongst the test pieces in the same script).
>
> I wonder if it is a good direction to go to run the tests with the
> "set -e" option on, and accept its peculiarities.

Another solution could be to define a command_not_found_handle
function as bourne shells should call that.

By the way it's not the first time we get such an issue, see:

https://lore.kernel.org/git/CAP8UFD15+p+xKwJ=B9WVsrc+2TvLHKmu78SBCLUFZVSYoTtbbg@mail.gmail.com/

Junio C Hamano Dec. 24, 2020, 2:14 p.m. UTC | #7

Christian Couder <christian.couder@gmail.com> writes:

>> I wonder if it is a good direction to go to run the tests with the
>> "set -e" option on, and accept its peculiarities.
>
> Another solution could be to define a command_not_found_handle
> function as bourne shells should call that.

I am reasonably sure it is bash-ism and a rather stale stackexchange
question seems to say that command_not_found_handler function (note
the 'r' at the end) is its equivalent in zsh.

Having said that, we already have other bash-specific debugging
support in our test harness, and it may not be a bad idea to use the
facility to catch these bugs, even if the support is available only
when running the tests under bash and no other shell.

> By the way it's not the first time we get such an issue, see:
>
> https://lore.kernel.org/git/CAP8UFD15+p+xKwJ=B9WVsrc+2TvLHKmu78SBCLUFZVSYoTtbbg@mail.gmail.com/

Excellent memory.  Thanks.

[v2,1/1] maintenance: fix SEGFAULT when no repository

Commit Message

Comments

Patch