diff mbox series

[v2,3/4] commit-graph.c: write non-split graphs as read-only

Message ID 86cf29ce9c1e6dc1fc881458c18850c2893b092a.1588004647.git.me@ttaylorr.com (mailing list archive)
State New, archived
Headers show
Series commit-graph: write non-split graphs as read-only | expand

Commit Message

Taylor Blau April 27, 2020, 4:28 p.m. UTC
In the previous commit, Git learned 'hold_lock_file_for_update_mode' to
allow the caller to specify the permission bits (prior to further
adjustment by the umask and shared repository permissions) used when
acquiring a temporary file.

Use this in the commit-graph machinery for writing a non-split graph to
acquire an opened temporary file with permissions read-only permissions
to match the split behavior. (In the split case, Git uses
git_mkstemp_mode' for each of the commit-graph layers with permission
bits '0444').

One can notice this discrepancy when moving a non-split graph to be part
of a new chain. This causes a commit-graph chain where all layers have
read-only permission bits, except for the base layer, which is writable
for the current user.

Resolve this discrepancy by using the new
'hold_lock_file_for_update_mode' and passing the desired permission
bits.

Doing so causes some test fallout in t5318 and t6600. In t5318, this
occurs in tests that corrupt a commit-graph file by writing into it. For
these, 'chmod u+w'-ing the file beforehand resolves the issue. The
additional spot in 'corrupt_graph_verify' is necessary because of the
extra 'git commit-graph write' beforehand (which *does* rewrite the
commit-graph file). In t6600, this is caused by copying a read-only
commit-graph file into place and then trying to replace it. For these,
make these files writable.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 commit-graph.c          |  3 ++-
 t/t5318-commit-graph.sh | 11 ++++++++++-
 t/t6600-test-reach.sh   |  2 ++
 3 files changed, 14 insertions(+), 2 deletions(-)

Comments

Junio C Hamano April 27, 2020, 11:54 p.m. UTC | #1
Taylor Blau <me@ttaylorr.com> writes:

> In the previous commit, Git learned 'hold_lock_file_for_update_mode' to
> allow the caller to specify the permission bits (prior to further
> adjustment by the umask and shared repository permissions) used when
> acquiring a temporary file.
>
> Use this in the commit-graph machinery for writing a non-split graph to
> acquire an opened temporary file with permissions read-only permissions
> to match the split behavior. (In the split case, Git uses
> git_mkstemp_mode' for each of the commit-graph layers with permission
> bits '0444').
>
> One can notice this discrepancy when moving a non-split graph to be part
> of a new chain. This causes a commit-graph chain where all layers have
> read-only permission bits, except for the base layer, which is writable
> for the current user.
>
> Resolve this discrepancy by using the new
> 'hold_lock_file_for_update_mode' and passing the desired permission
> bits.
>
> Doing so causes some test fallout in t5318 and t6600. In t5318, this
> occurs in tests that corrupt a commit-graph file by writing into it. For
> these, 'chmod u+w'-ing the file beforehand resolves the issue. The
> additional spot in 'corrupt_graph_verify' is necessary because of the
> extra 'git commit-graph write' beforehand (which *does* rewrite the
> commit-graph file). In t6600, this is caused by copying a read-only
> commit-graph file into place and then trying to replace it. For these,
> make these files writable.
>
> Signed-off-by: Taylor Blau <me@ttaylorr.com>
> ---
>  commit-graph.c          |  3 ++-
>  t/t5318-commit-graph.sh | 11 ++++++++++-
>  t/t6600-test-reach.sh   |  2 ++
>  3 files changed, 14 insertions(+), 2 deletions(-)

This step, queued as 3a5c7d70 (commit-graph.c: write non-split
graphs as read-only, 2020-04-27), starts failing 5318#9 at least for
me.  Do we need to loosen umask while running this test to something
not more strict than 022 or something silly like that?

Here is what I'll use as a workaround for today's pushout.

commit f062d1c028bcc839f961e8904c38c476b9deeec3
Author: Junio C Hamano <gitster@pobox.com>
Date:   Mon Apr 27 16:50:30 2020 -0700

    SQUASH??? force known umask if you are going to check the resulting mode bits

diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
index fb0aae61c3..901eb3ecfb 100755
--- a/t/t5318-commit-graph.sh
+++ b/t/t5318-commit-graph.sh
@@ -12,6 +12,10 @@ test_expect_success 'setup full repo' '
 	test_oid_init
 '
 
+test_expect_success POSIXPERM 'tweak umask for modebit tests' '
+	umask 022
+'
+
 test_expect_success 'verify graph with no graph file' '
 	cd "$TRASH_DIRECTORY/full" &&
 	git commit-graph verify
Taylor Blau April 27, 2020, 11:59 p.m. UTC | #2
On Mon, Apr 27, 2020 at 04:54:09PM -0700, Junio C Hamano wrote:
> This step, queued as 3a5c7d70 (commit-graph.c: write non-split
> graphs as read-only, 2020-04-27), starts failing 5318#9 at least for
> me.  Do we need to loosen umask while running this test to something
> not more strict than 022 or something silly like that?
>
> Here is what I'll use as a workaround for today's pushout.
>
> commit f062d1c028bcc839f961e8904c38c476b9deeec3
> Author: Junio C Hamano <gitster@pobox.com>
> Date:   Mon Apr 27 16:50:30 2020 -0700
>
>     SQUASH??? force known umask if you are going to check the resulting mode bits
>
> diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
> index fb0aae61c3..901eb3ecfb 100755
> --- a/t/t5318-commit-graph.sh
> +++ b/t/t5318-commit-graph.sh
> @@ -12,6 +12,10 @@ test_expect_success 'setup full repo' '
>  	test_oid_init
>  '
>
> +test_expect_success POSIXPERM 'tweak umask for modebit tests' '
> +	umask 022
> +'
> +
>  test_expect_success 'verify graph with no graph file' '
>  	cd "$TRASH_DIRECTORY/full" &&
>  	git commit-graph verify

Looks good to me; this is definitely necessary. For what it's worth, it
passes for me, but my system may not have as restrictive a umask as
others.

I'd be happy to re-send this patch, but alternatively if you want to
just squash this in, I'd be just as happy.

Thanks,
Taylor
Junio C Hamano April 28, 2020, 12:25 a.m. UTC | #3
Taylor Blau <me@ttaylorr.com> writes:

> Looks good to me; this is definitely necessary. For what it's worth, it
> passes for me, but my system may not have as restrictive a umask as
> others.

Note that umask is not a system thing, but personal.  When I was
with smaller company, I used to have 002 as my umask, but these days
I use 077.

> I'd be happy to re-send this patch, but alternatively if you want to
> just squash this in, I'd be just as happy.

I'll keep it queued, until we need to replace it with an undated
series.

Thanks.
Jeff King April 28, 2020, 3:34 a.m. UTC | #4
On Mon, Apr 27, 2020 at 05:59:35PM -0600, Taylor Blau wrote:

> > diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
> > index fb0aae61c3..901eb3ecfb 100755
> > --- a/t/t5318-commit-graph.sh
> > +++ b/t/t5318-commit-graph.sh
> > @@ -12,6 +12,10 @@ test_expect_success 'setup full repo' '
> >  	test_oid_init
> >  '
> >
> > +test_expect_success POSIXPERM 'tweak umask for modebit tests' '
> > +	umask 022
> > +'
> > +
> >  test_expect_success 'verify graph with no graph file' '
> >  	cd "$TRASH_DIRECTORY/full" &&
> >  	git commit-graph verify
> 
> Looks good to me; this is definitely necessary. For what it's worth, it
> passes for me, but my system may not have as restrictive a umask as
> others.

If we're just doing this for a single test, perhaps it would be better
to set the umask in that test (perhaps even in a subshell to avoid
touching other tests). I guess that's a little awkward here because the
write and the mode-check happen in separate snippets.

Or going in the opposite direction: if we think that covering the rest
of the test suite with a diversity of umasks isn't worthwhile, we could
just set "umask" in test-lib.sh. That would solve this problem and any
future ones.

I also wondered if it would be simpler to just limit the scope of the
test, like so:

diff --git b/t/t5318-commit-graph.sh a/t/t5318-commit-graph.sh
index fb0aae61c3..5f1c746ad1 100755
--- b/t/t5318-commit-graph.sh
+++ a/t/t5318-commit-graph.sh
@@ -98,9 +98,10 @@ test_expect_success 'write graph' '
 
 test_expect_success POSIXPERM 'write graph has correct permissions' '
 	test_path_is_file $objdir/info/commit-graph &&
-	echo "-r--r--r--" >expect &&
 	test_modebits $objdir/info/commit-graph >actual &&
-	test_cmp expect actual
+	# check only user mode bits, as we do not want to rely on
+	# test environment umask
+	grep ^-r-- actual
 '
 
 graph_git_behavior 'graph exists' full commits/3 commits/1

but maybe there's some value in checking the whole thing.

-Peff
Junio C Hamano April 28, 2020, 4:50 p.m. UTC | #5
Jeff King <peff@peff.net> writes:

> If we're just doing this for a single test, perhaps it would be better
> to set the umask in that test (perhaps even in a subshell to avoid
> touching other tests). I guess that's a little awkward here because the
> write and the mode-check happen in separate snippets.

Yes, and we cannot afford to place the writing side under POSIXPERM
prerequisite.

> Or going in the opposite direction: if we think that covering the rest
> of the test suite with a diversity of umasks isn't worthwhile, we could
> just set "umask" in test-lib.sh. That would solve this problem and any
> future ones.

Seen from the point of view of giving us a stable testing
environment, it certainly feels like an easy and simple thing to do.
I do not offhand see any downsides in that approach.

On the other hand, we use and rely on test-specified umask only in a
few tests (t0001, t1301 and t1304).  Perhaps we should discourage
tests outside these to rely too heavily on exact perm bits?

For example, I wonder if we should have used

	test -r commit-graph && ! test -w commit-graph 

to ensure the file is read-only to the user who is testing, instead
of relying on parsing "ls -l" output, which IIRC has bitten us with
xattr and forced us to revise test_modebits helper in 5610e3b0 (Fix
testcase failure when extended attributes are in use, 2008-10-19).
That would make the test require SANITY instead, though.

> I also wondered if it would be simpler to just limit the scope of the
> test, like so:
> ...
> +	# check only user mode bits, as we do not want to rely on
> +	# test environment umask
> +	grep ^-r-- actual
>  '
> ...
> but maybe there's some value in checking the whole thing.

Yeah, I guess we are wondering about the same thing.

Among various approaches on plate, my preference is to use "umask
022" around the place where we prepare the $TRASH_DIRECTORY and do
so only when POSIXPERM is there in the test-lib.sh.  I do not know
if we should do so before or after creating the $TRASH_DIRECTORY;
my gut feeling is that in the ideal world, we should be able to

 - create trash directory

 - use the directory to automatically figure out POSIXPERM

 - if POSIXPERM is set, use umask 022 and chmod og=rx the trash
   directory

Automatically figuring out POSIXPERM the above approach shoots for
is a much larger change, so I am not in a haste to implement it and
it may be OK to only do "umask 022" after we set POSIXPERM for
everybody but MINGW at least as the initial cut, but that would mean
we would run for quite a long time with the testing user's umask
during the setup process, which is unsatisfying.
Jeff King April 28, 2020, 8:59 p.m. UTC | #6
On Tue, Apr 28, 2020 at 09:50:02AM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > If we're just doing this for a single test, perhaps it would be better
> > to set the umask in that test (perhaps even in a subshell to avoid
> > touching other tests). I guess that's a little awkward here because the
> > write and the mode-check happen in separate snippets.
> 
> Yes, and we cannot afford to place the writing side under POSIXPERM
> prerequisite.

Do we need POSIXPERM just to call umask?

We call it unconditionally in t1304, for example. I could certainly
believe it doesn't do anything useful or predictable on other systems,
but it would not surprise me if it is a silent noop. It might return
non-zero, though (the call in t1304 is not inside a test snippet).

> Among various approaches on plate, my preference is to use "umask
> 022" around the place where we prepare the $TRASH_DIRECTORY and do
> so only when POSIXPERM is there in the test-lib.sh.  I do not know
> if we should do so before or after creating the $TRASH_DIRECTORY;
> my gut feeling is that in the ideal world, we should be able to
> 
>  - create trash directory
> 
>  - use the directory to automatically figure out POSIXPERM
> 
>  - if POSIXPERM is set, use umask 022 and chmod og=rx the trash
>    directory

I don't think we do any actual filesystem tests for POSIXPERM. It's
purely based on "uname -s", and we could check it much earlier. So
unless actually probing the filesystem is worth doing, we could just
punt on that part easily.

That said, I think this does get complicated when interacting with
t1304, for example, which explicitly creates an 077 umask for the trash
directory.

This is looking like a much deeper rabbit hole than it's worth going
down. I think the pragmatic thing is to just stick a "umask 022" near
the new test (or possibly "test_might_fail umask 022" inside the
commit-graph writing test).

-Peff
Junio C Hamano April 28, 2020, 9:05 p.m. UTC | #7
Jeff King <peff@peff.net> writes:

> On Tue, Apr 28, 2020 at 09:50:02AM -0700, Junio C Hamano wrote:
>
>> Jeff King <peff@peff.net> writes:
>> 
>> > If we're just doing this for a single test, perhaps it would be better
>> > to set the umask in that test (perhaps even in a subshell to avoid
>> > touching other tests). I guess that's a little awkward here because the
>> > write and the mode-check happen in separate snippets.
>> 
>> Yes, and we cannot afford to place the writing side under POSIXPERM
>> prerequisite.
>
> Do we need POSIXPERM just to call umask?

I checked "git grep umask t/" hits, and I thought everybody was
using it inside POSIXPERM.

> We call it unconditionally in t1304, for example. I could certainly
> believe it doesn't do anything useful or predictable on other systems,
> but it would not surprise me if it is a silent noop. It might return
> non-zero, though (the call in t1304 is not inside a test snippet).

That is sloppy, but it might be deliberate that it does not check
the result?

> I don't think we do any actual filesystem tests for POSIXPERM. It's
> purely based on "uname -s", and we could check it much earlier. So
> unless actually probing the filesystem is worth doing, we could just
> punt on that part easily.

Yup.

> That said, I think this does get complicated when interacting with
> t1304, for example, which explicitly creates an 077 umask for the trash
> directory.
>
> This is looking like a much deeper rabbit hole than it's worth going
> down. I think the pragmatic thing is to just stick a "umask 022" near
> the new test (or possibly "test_might_fail umask 022" inside the
> commit-graph writing test).

I think the most pragmatic would be to just squash in what is
already there ;-)
Jeff King April 28, 2020, 9:08 p.m. UTC | #8
On Tue, Apr 28, 2020 at 02:05:21PM -0700, Junio C Hamano wrote:

> > This is looking like a much deeper rabbit hole than it's worth going
> > down. I think the pragmatic thing is to just stick a "umask 022" near
> > the new test (or possibly "test_might_fail umask 022" inside the
> > commit-graph writing test).
> 
> I think the most pragmatic would be to just squash in what is
> already there ;-)

That is OK with me. :)

-Peff
Taylor Blau April 28, 2020, 9:44 p.m. UTC | #9
On Tue, Apr 28, 2020 at 05:08:21PM -0400, Jeff King wrote:
> On Tue, Apr 28, 2020 at 02:05:21PM -0700, Junio C Hamano wrote:
>
> > > This is looking like a much deeper rabbit hole than it's worth going
> > > down. I think the pragmatic thing is to just stick a "umask 022" near
> > > the new test (or possibly "test_might_fail umask 022" inside the
> > > commit-graph writing test).
> >
> > I think the most pragmatic would be to just squash in what is
> > already there ;-)
>
> That is OK with me. :)

Thanks for an interesting discussion. I squashed Junio's fix into the
third patch, but the fourth patch suffers from the same problem (so I
stuck another POSIXPERM test to tweak the umask there, too).

What do you want to do about the final patch that I stuck on the end of
this series in [1]? If I don't hear from anybody, I'll send it as 5/5 in
v3 and we can feel free to not apply it if it's controversial.

> -Peff

Thanks,
Taylor

[1]: https://lore.kernel.org/git/20200427172111.GA58509@syl.local/
Jeff King April 28, 2020, 9:58 p.m. UTC | #10
On Tue, Apr 28, 2020 at 03:44:13PM -0600, Taylor Blau wrote:

> What do you want to do about the final patch that I stuck on the end of
> this series in [1]? If I don't hear from anybody, I'll send it as 5/5 in
> v3 and we can feel free to not apply it if it's controversial.

I have to admit I don't care that much either way about it (see my
earlier response on three mental models). I'm happy for you or Junio to
decide. :)

-Peff
Junio C Hamano April 28, 2020, 11:22 p.m. UTC | #11
Jeff King <peff@peff.net> writes:

> On Tue, Apr 28, 2020 at 03:44:13PM -0600, Taylor Blau wrote:
>
>> What do you want to do about the final patch that I stuck on the end of
>> this series in [1]? If I don't hear from anybody, I'll send it as 5/5 in
>> v3 and we can feel free to not apply it if it's controversial.
>
> I have to admit I don't care that much either way about it (see my
> earlier response on three mental models). I'm happy for you or Junio to
> decide. :)

My gut feeling is that our longer term goal (if we had timeperiod
during which the codebase is quiescent enough and infinite energy to
dedicate on code clean-up) among one or your options should be to
consistently create files that are rewritten-and-renamed read-only,
to discourage casual tampering, so I am OK with that 5th patch.

Having said that, I suspect that Derrick and friends are larger
stakeholders in the "chain" file, so I'd prefer to see us basing
the choice on their input.

Thanks.
Derrick Stolee April 29, 2020, 11:52 a.m. UTC | #12
On 4/28/2020 7:22 PM, Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:
> 
>> On Tue, Apr 28, 2020 at 03:44:13PM -0600, Taylor Blau wrote:
>>
>>> What do you want to do about the final patch that I stuck on the end of
>>> this series in [1]? If I don't hear from anybody, I'll send it as 5/5 in
>>> v3 and we can feel free to not apply it if it's controversial.
>>
>> I have to admit I don't care that much either way about it (see my
>> earlier response on three mental models). I'm happy for you or Junio to
>> decide. :)
> 
> My gut feeling is that our longer term goal (if we had timeperiod
> during which the codebase is quiescent enough and infinite energy to
> dedicate on code clean-up) among one or your options should be to
> consistently create files that are rewritten-and-renamed read-only,
> to discourage casual tampering, so I am OK with that 5th patch.
> 
> Having said that, I suspect that Derrick and friends are larger
> stakeholders in the "chain" file, so I'd prefer to see us basing
> the choice on their input.

I'm happy with how this discussion has gone. I'm sure the only reason
for the permissions I wrote was because I found them somewhere else in
the codebase and I copied them from there. Memory is fuzzy, but I can
guarantee the deviation from the norm was not in purpose.

Thanks,
-Stolee
diff mbox series

Patch

diff --git a/commit-graph.c b/commit-graph.c
index f013a84e29..5b5047a7dd 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -1388,7 +1388,8 @@  static int write_commit_graph_file(struct write_commit_graph_context *ctx)
 
 		f = hashfd(fd, ctx->graph_name);
 	} else {
-		hold_lock_file_for_update(&lk, ctx->graph_name, LOCK_DIE_ON_ERROR);
+		hold_lock_file_for_update_mode(&lk, ctx->graph_name,
+					       LOCK_DIE_ON_ERROR, 0444);
 		fd = lk.tempfile->fd;
 		f = hashfd(lk.tempfile->fd, lk.tempfile->filename.buf);
 	}
diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
index 9bf920ae17..fb0aae61c3 100755
--- a/t/t5318-commit-graph.sh
+++ b/t/t5318-commit-graph.sh
@@ -96,6 +96,13 @@  test_expect_success 'write graph' '
 	graph_read_expect "3"
 '
 
+test_expect_success POSIXPERM 'write graph has correct permissions' '
+	test_path_is_file $objdir/info/commit-graph &&
+	echo "-r--r--r--" >expect &&
+	test_modebits $objdir/info/commit-graph >actual &&
+	test_cmp expect actual
+'
+
 graph_git_behavior 'graph exists' full commits/3 commits/1
 
 test_expect_success 'Add more commits' '
@@ -421,7 +428,8 @@  GRAPH_BYTE_FOOTER=$(($GRAPH_OCTOPUS_DATA_OFFSET + 4 * $NUM_OCTOPUS_EDGES))
 corrupt_graph_setup() {
 	cd "$TRASH_DIRECTORY/full" &&
 	test_when_finished mv commit-graph-backup $objdir/info/commit-graph &&
-	cp $objdir/info/commit-graph commit-graph-backup
+	cp $objdir/info/commit-graph commit-graph-backup &&
+	chmod u+w $objdir/info/commit-graph
 }
 
 corrupt_graph_verify() {
@@ -435,6 +443,7 @@  corrupt_graph_verify() {
 	fi &&
 	git status --short &&
 	GIT_TEST_COMMIT_GRAPH_DIE_ON_LOAD=true git commit-graph write &&
+	chmod u+w $objdir/info/commit-graph &&
 	git commit-graph verify
 }
 
diff --git a/t/t6600-test-reach.sh b/t/t6600-test-reach.sh
index b24d850036..475564bee7 100755
--- a/t/t6600-test-reach.sh
+++ b/t/t6600-test-reach.sh
@@ -51,8 +51,10 @@  test_expect_success 'setup' '
 	done &&
 	git commit-graph write --reachable &&
 	mv .git/objects/info/commit-graph commit-graph-full &&
+	chmod u+w commit-graph-full &&
 	git show-ref -s commit-5-5 | git commit-graph write --stdin-commits &&
 	mv .git/objects/info/commit-graph commit-graph-half &&
+	chmod u+w commit-graph-half &&
 	git config core.commitGraph true
 '