diff mbox series

[v2,2/3] safe.directory: normalize the configured path

Message ID 20240723021900.388020-3-gitster@pobox.com (mailing list archive)
State Superseded
Headers show
Series safe.directory clean-up | expand

Commit Message

Junio C Hamano July 23, 2024, 2:18 a.m. UTC
The pathname of a repository comes from getcwd() and it could be a
path aliased via symbolic links, e.g., the real directory may be
/home/u/repository but a symbolic link /home/u/repo may point at it,
and the clone request may come as "git clone file:///home/u/repo/"

A request to check if /home/u/repository is safe would be rejected
if the safe.directory configuration allows /home/u/repo/ but not its
alias /home/u/repository/.  Normalize the paths configured for the
safe.directory configuration variable before comparing them with the
path being checked.

Suggested-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 setup.c                   | 12 +++++++++
 t/t0033-safe-directory.sh | 57 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 69 insertions(+)

Comments

Phillip Wood July 25, 2024, 9:45 a.m. UTC | #1
Hi Junio

On 23/07/2024 03:18, Junio C Hamano wrote:
> The pathname of a repository comes from getcwd() and it could be a
> path aliased via symbolic links, e.g., the real directory may be
> /home/u/repository but a symbolic link /home/u/repo may point at it,
> and the clone request may come as "git clone file:///home/u/repo/"
> 
> A request to check if /home/u/repository is safe would be rejected
> if the safe.directory configuration allows /home/u/repo/ but not its
> alias /home/u/repository/.  Normalize the paths configured for the
> safe.directory configuration variable before comparing them with the
> path being checked.

I think this is sensible if the key is an absolute path, if the key is a 
relative path I think we should ignore it as it is not clear which 
directory the user meant.

Best Wishes

Phillip

> Suggested-by: Phillip Wood <phillip.wood123@gmail.com>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
>   setup.c                   | 12 +++++++++
>   t/t0033-safe-directory.sh | 57 +++++++++++++++++++++++++++++++++++++++
>   2 files changed, 69 insertions(+)
> 
> diff --git a/setup.c b/setup.c
> index 45bbbe329f..29304d7452 100644
> --- a/setup.c
> +++ b/setup.c
> @@ -1236,6 +1236,16 @@ static int safe_directory_cb(const char *key, const char *value,
>   
>   		if (!git_config_pathname(&allowed, key, value)) {
>   			const char *check = allowed ? allowed : value;
> +			char *to_free = real_pathdup(check, 0);
> +
> +			if (!to_free) {
> +				warning(_("safe.directory '%s' cannot be normalized"),
> +					check);
> +				goto next;
> +			} else {
> +				check = to_free;
> +			}
> +
>   			if (ends_with(check, "/*")) {
>   				size_t len = strlen(check);
>   				if (!fspathncmp(check, data->path, len - 1))
> @@ -1243,7 +1253,9 @@ static int safe_directory_cb(const char *key, const char *value,
>   			} else if (!fspathcmp(data->path, check)) {
>   				data->is_safe = 1;
>   			}
> +			free(to_free);
>   		}
> +	next:
>   		if (allowed != value)
>   			free(allowed);
>   	}
> diff --git a/t/t0033-safe-directory.sh b/t/t0033-safe-directory.sh
> index 07ac0f9a01..ea74657255 100755
> --- a/t/t0033-safe-directory.sh
> +++ b/t/t0033-safe-directory.sh
> @@ -176,4 +176,61 @@ test_expect_success SYMLINKS 'checked leading paths are normalized' '
>   	git -C repo/s/.git/ for-each-ref
>   '
>   
> +test_expect_success SYMLINKS 'configured paths are normalized' '
> +	test_when_finished "rm -rf repository; rm -f repo" &&
> +	(
> +		sane_unset GIT_TEST_ASSUME_DIFFERENT_OWNER &&
> +		git config --global --unset-all safe.directory
> +	) &&
> +	git init repository &&
> +	ln -s repository repo &&
> +	(
> +		cd repository &&
> +		sane_unset GIT_TEST_ASSUME_DIFFERENT_OWNER &&
> +		test_commit sample
> +	) &&
> +
> +	(
> +		sane_unset GIT_TEST_ASSUME_DIFFERENT_OWNER &&
> +		git config --global safe.directory "$(pwd)/repo"
> +	) &&
> +	git -C repository for-each-ref &&
> +	git -C repository/ for-each-ref &&
> +	git -C repo for-each-ref &&
> +	git -C repo/ for-each-ref &&
> +	test_must_fail git -C repository/.git for-each-ref &&
> +	test_must_fail git -C repository/.git/ for-each-ref &&
> +	test_must_fail git -C repo/.git for-each-ref &&
> +	test_must_fail git -C repo/.git/ for-each-ref
> +'
> +
> +test_expect_success SYMLINKS 'configured leading paths are normalized' '
> +	test_when_finished "rm -rf repository; rm -f repo" &&
> +	(
> +		sane_unset GIT_TEST_ASSUME_DIFFERENT_OWNER &&
> +		git config --global --unset-all safe.directory
> +	) &&
> +	mkdir -p repository &&
> +	git init repository/s &&
> +	ln -s repository repo &&
> +	(
> +		cd repository/s &&
> +		sane_unset GIT_TEST_ASSUME_DIFFERENT_OWNER &&
> +		test_commit sample
> +	) &&
> +
> +	(
> +		sane_unset GIT_TEST_ASSUME_DIFFERENT_OWNER &&
> +		git config --global safe.directory "$(pwd)/repo/*"
> +	) &&
> +	git -C repository/s for-each-ref &&
> +	git -C repository/s/ for-each-ref &&
> +	git -C repository/s/.git for-each-ref &&
> +	git -C repository/s/.git/ for-each-ref &&
> +	git -C repo/s for-each-ref &&
> +	git -C repo/s/ for-each-ref &&
> +	git -C repo/s/.git for-each-ref &&
> +	git -C repo/s/.git/ for-each-ref
> +'
> +
>   test_done
Junio C Hamano July 25, 2024, 4:11 p.m. UTC | #2
Phillip Wood <phillip.wood123@gmail.com> writes:

> I think this is sensible if the key is an absolute path, if the key is
> a relative path I think we should ignore it as it is not clear which
> directory the user meant.

The only thing that worries me in that proposal is that doing so
would break a configuration that used to work.  I'd rather leave
the tightening of it to future work with its own justification.

Thanks.
Jeff King July 26, 2024, 5:02 a.m. UTC | #3
On Mon, Jul 22, 2024 at 07:18:59PM -0700, Junio C Hamano wrote:

> The pathname of a repository comes from getcwd() and it could be a
> path aliased via symbolic links, e.g., the real directory may be
> /home/u/repository but a symbolic link /home/u/repo may point at it,
> and the clone request may come as "git clone file:///home/u/repo/"
> 
> A request to check if /home/u/repository is safe would be rejected
> if the safe.directory configuration allows /home/u/repo/ but not its
> alias /home/u/repository/.  Normalize the paths configured for the
> safe.directory configuration variable before comparing them with the
> path being checked.

This may be a dumb question, but... will this always work, if the config
option is not necessarily an exact path, and might have globbing
characters in it?

We don't currently treat it like a wildmatch glob, but:

  1. We do allow "/*" on the end. Should we strip that off so it doesn't
     confuse the realpath lookup?

  2. This is shutting the door on using wildmatch or similar in the
     future. Is that OK?

> @@ -1236,6 +1236,16 @@ static int safe_directory_cb(const char *key, const char *value,
>  
>  		if (!git_config_pathname(&allowed, key, value)) {
>  			const char *check = allowed ? allowed : value;
> +			char *to_free = real_pathdup(check, 0);
> +
> +			if (!to_free) {
> +				warning(_("safe.directory '%s' cannot be normalized"),
> +					check);
> +				goto next;
> +			} else {
> +				check = to_free;
> +			}

I'm not sure about this warning. I don't know how people use this config
in the real world, but it seems plausible to me they might have:

  [safe]
  directory = /home/me/this/always/exists
  directory = /home/me/this/might/not/exist

perhaps because they're using a generic config file across multiple
machines. And then they'd get:

  warning: safe.directory '/home/me/this/might/not/exist' cannot be normalized

on every invocation (at least ones that need to consult safe.directory).
Should we be quiet there, and maybe fall back to using the
non-normalized path (though I guess if we could not normalize it, that
probably means it could never match anyway)?

-Peff
Junio C Hamano July 26, 2024, 3:02 p.m. UTC | #4
Jeff King <peff@peff.net> writes:

> This may be a dumb question, but... will this always work, if the config
> option is not necessarily an exact path, and might have globbing
> characters in it?
>
> We don't currently treat it like a wildmatch glob, but:
>
>   1. We do allow "/*" on the end. Should we strip that off so it doesn't
>      confuse the realpath lookup?

This one I wondered, too, but the test seems to show that it is OK ;-)

>   2. This is shutting the door on using wildmatch or similar in the
>      future. Is that OK?

I am inclined to keep this part and any other logic that are meant
to address "security" things simple and stupid.  So in that sense,
I am not so worried that it would be hard to retrofit wildmatch to
this codepath.

> Should we be quiet there, and maybe fall back to using the
> non-normalized path (though I guess if we could not normalize it, that
> probably means it could never match anyway)?

The only reason I did the warning was because it makes me feel a bit
uneasy to have a configuration item that either gets ignored or
triggers a fallback behaviour and not to tell users about it.  Other
than that, I have no strong preference.

Unfortunately this is not a very good fit for the advise mechanism,
as we do not even have a repository yet.
Jeff King July 27, 2024, 10:05 p.m. UTC | #5
On Fri, Jul 26, 2024 at 08:02:02AM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > This may be a dumb question, but... will this always work, if the config
> > option is not necessarily an exact path, and might have globbing
> > characters in it?
> >
> > We don't currently treat it like a wildmatch glob, but:
> >
> >   1. We do allow "/*" on the end. Should we strip that off so it doesn't
> >      confuse the realpath lookup?
> 
> This one I wondered, too, but the test seems to show that it is OK ;-)

Hmm. Would it depend on what's in the filesystem, though? That is, if I
had a file named "*", would it get confused?

That is unlikely in normal use, of course, but can we do something
tricky with it? I think probably no? Accessing "/path/to/*" implies that
the user meant to allow all of "/path/to", which would include that. So
the most an attacker could do is _disable_ safe.directory for something
that should have it on.

> >   2. This is shutting the door on using wildmatch or similar in the
> >      future. Is that OK?
> 
> I am inclined to keep this part and any other logic that are meant
> to address "security" things simple and stupid.  So in that sense,
> I am not so worried that it would be hard to retrofit wildmatch to
> this codepath.

That's my general instinct, too. But I wanted to make sure we were
explicit about the choice.

> > Should we be quiet there, and maybe fall back to using the
> > non-normalized path (though I guess if we could not normalize it, that
> > probably means it could never match anyway)?
> 
> The only reason I did the warning was because it makes me feel a bit
> uneasy to have a configuration item that either gets ignored or
> triggers a fallback behaviour and not to tell users about it.  Other
> than that, I have no strong preference.

Yes, that was my first thought, too, but I noticed the bogus messages
while playing with it. I _think_ if it triggers it means that it points
to something that doesn't exist (or you don't have permission to read),
in which case it would by definition not have matched. So silently
ignoring might not be so bad.

-Peff
Phillip Wood Aug. 14, 2024, 1:20 p.m. UTC | #6
Hi Junio

On 25/07/2024 17:11, Junio C Hamano wrote:
> Phillip Wood <phillip.wood123@gmail.com> writes:
> 
>> I think this is sensible if the key is an absolute path, if the key is
>> a relative path I think we should ignore it as it is not clear which
>> directory the user meant.
> 
> The only thing that worries me in that proposal is that doing so
> would break a configuration that used to work.  I'd rather leave
> the tightening of it to future work with its own justification.

As far as I know the only caller that tried to match a relative 
directory was enter_repo(), all of the other code paths pass an absolute 
path from getcwd(). Before this series git daemon required the relative 
directory "." to be specified in addition to the absolute path so that 
will not be broken by removing support for relative directories. Are 
there other users of enter_repo() that still rely on being able to match 
relative paths?

Best Wishes

Phillip

> Thanks.
> 
>
Junio C Hamano Aug. 14, 2024, 5:15 p.m. UTC | #7
Phillip Wood <phillip.wood123@gmail.com> writes:

> On 25/07/2024 17:11, Junio C Hamano wrote:
>> Phillip Wood <phillip.wood123@gmail.com> writes:
>> 
>>> I think this is sensible if the key is an absolute path, if the key is
>>> a relative path I think we should ignore it as it is not clear which
>>> directory the user meant.
>> The only thing that worries me in that proposal is that doing so
>> would break a configuration that used to work.  I'd rather leave
>> the tightening of it to future work with its own justification.
>
> As far as I know the only caller that tried to match a relative
> directory was enter_repo(), all of the other code paths pass an
> absolute path from getcwd(). Before this series git daemon required
> the relative directory "." to be specified in addition to the absolute
> path so that will not be broken by removing support for relative
> directories. Are there other users of enter_repo() that still rely on
> being able to match relative paths?

Offhand I do not know of any, but no guarantees.  I am tempted to
leave it as a prerequisite task for those who want to tighten this
codepath further.
Phillip Wood Aug. 15, 2024, 9:51 a.m. UTC | #8
On 14/08/2024 18:15, Junio C Hamano wrote:
> Phillip Wood <phillip.wood123@gmail.com> writes:
> 
>> On 25/07/2024 17:11, Junio C Hamano wrote:
>>> Phillip Wood <phillip.wood123@gmail.com> writes:
>>>
>>>> I think this is sensible if the key is an absolute path, if the key is
>>>> a relative path I think we should ignore it as it is not clear which
>>>> directory the user meant.
>>> The only thing that worries me in that proposal is that doing so
>>> would break a configuration that used to work.  I'd rather leave
>>> the tightening of it to future work with its own justification.
>>
>> As far as I know the only caller that tried to match a relative
>> directory was enter_repo(), all of the other code paths pass an
>> absolute path from getcwd(). Before this series git daemon required
>> the relative directory "." to be specified in addition to the absolute
>> path so that will not be broken by removing support for relative
>> directories. Are there other users of enter_repo() that still rely on
>> being able to match relative paths?
> 
> Offhand I do not know of any, but no guarantees.  I am tempted to
> leave it as a prerequisite task for those who want to tighten this
> codepath further.

Fair enough. I will note though that this change makes it much more 
likely that "safe.directory=." will match as it now behaves like "*". 
Previously it only matched when we tried to match it against "." whereas 
now it matches any directory.

Best Wishes

Phillip
Junio C Hamano Aug. 15, 2024, 2:43 p.m. UTC | #9
Phillip Wood <phillip.wood123@gmail.com> writes:

> Fair enough. I will note though that this change makes it much more
> likely that "safe.directory=." will match as it now behaves like
> "*". Previously it only matched when we tried to match it against "."
> whereas now it matches any directory.

Unless you are commenting on my earlier unpublished draft, I doubt
that it is the case.  

If I did the code analysis correctly when I responded to Peff in
https://lore.kernel.org/git/xmqq4j86g948.fsf@gitster.g/, that is.

There are some negative tests added by the series to verify that dot
is really "current directory" and not "any directory" which is what
'*' is for (and as I always tell my contributors, we should really
get in the habit of writing more negative tests, exactly for things
like this where we do not want a "feature" to trigger in cases we do
not want them to).

Ahh, if your e-mail header is correct, you are commenting on the
implementation in v2, which did not care if the input were absolute
or not?  If so, then your confusion is somewhat understandable.

The things changed before the final edition v4 happened and it is
slightly tigher than the version you are commenting on.  We refuse
to match any non-absolute path, except for ".", since v3.  But I
think even with the looser implementation of v2, the negative test
would have made sure that "." was really "current and not any".

Thanks.
diff mbox series

Patch

diff --git a/setup.c b/setup.c
index 45bbbe329f..29304d7452 100644
--- a/setup.c
+++ b/setup.c
@@ -1236,6 +1236,16 @@  static int safe_directory_cb(const char *key, const char *value,
 
 		if (!git_config_pathname(&allowed, key, value)) {
 			const char *check = allowed ? allowed : value;
+			char *to_free = real_pathdup(check, 0);
+
+			if (!to_free) {
+				warning(_("safe.directory '%s' cannot be normalized"),
+					check);
+				goto next;
+			} else {
+				check = to_free;
+			}
+
 			if (ends_with(check, "/*")) {
 				size_t len = strlen(check);
 				if (!fspathncmp(check, data->path, len - 1))
@@ -1243,7 +1253,9 @@  static int safe_directory_cb(const char *key, const char *value,
 			} else if (!fspathcmp(data->path, check)) {
 				data->is_safe = 1;
 			}
+			free(to_free);
 		}
+	next:
 		if (allowed != value)
 			free(allowed);
 	}
diff --git a/t/t0033-safe-directory.sh b/t/t0033-safe-directory.sh
index 07ac0f9a01..ea74657255 100755
--- a/t/t0033-safe-directory.sh
+++ b/t/t0033-safe-directory.sh
@@ -176,4 +176,61 @@  test_expect_success SYMLINKS 'checked leading paths are normalized' '
 	git -C repo/s/.git/ for-each-ref
 '
 
+test_expect_success SYMLINKS 'configured paths are normalized' '
+	test_when_finished "rm -rf repository; rm -f repo" &&
+	(
+		sane_unset GIT_TEST_ASSUME_DIFFERENT_OWNER &&
+		git config --global --unset-all safe.directory
+	) &&
+	git init repository &&
+	ln -s repository repo &&
+	(
+		cd repository &&
+		sane_unset GIT_TEST_ASSUME_DIFFERENT_OWNER &&
+		test_commit sample
+	) &&
+
+	(
+		sane_unset GIT_TEST_ASSUME_DIFFERENT_OWNER &&
+		git config --global safe.directory "$(pwd)/repo"
+	) &&
+	git -C repository for-each-ref &&
+	git -C repository/ for-each-ref &&
+	git -C repo for-each-ref &&
+	git -C repo/ for-each-ref &&
+	test_must_fail git -C repository/.git for-each-ref &&
+	test_must_fail git -C repository/.git/ for-each-ref &&
+	test_must_fail git -C repo/.git for-each-ref &&
+	test_must_fail git -C repo/.git/ for-each-ref
+'
+
+test_expect_success SYMLINKS 'configured leading paths are normalized' '
+	test_when_finished "rm -rf repository; rm -f repo" &&
+	(
+		sane_unset GIT_TEST_ASSUME_DIFFERENT_OWNER &&
+		git config --global --unset-all safe.directory
+	) &&
+	mkdir -p repository &&
+	git init repository/s &&
+	ln -s repository repo &&
+	(
+		cd repository/s &&
+		sane_unset GIT_TEST_ASSUME_DIFFERENT_OWNER &&
+		test_commit sample
+	) &&
+
+	(
+		sane_unset GIT_TEST_ASSUME_DIFFERENT_OWNER &&
+		git config --global safe.directory "$(pwd)/repo/*"
+	) &&
+	git -C repository/s for-each-ref &&
+	git -C repository/s/ for-each-ref &&
+	git -C repository/s/.git for-each-ref &&
+	git -C repository/s/.git/ for-each-ref &&
+	git -C repo/s for-each-ref &&
+	git -C repo/s/ for-each-ref &&
+	git -C repo/s/.git for-each-ref &&
+	git -C repo/s/.git/ for-each-ref
+'
+
 test_done