[v2,2/2] setup: allow Git.pm to do unsafe repo checking

Message ID	20221022011931.43992-3-michael@mcclimon.org (mailing list archive)
State	New, archived
Headers	show Return-Path: <git-owner@kernel.org> Feedback-ID: i9dc944d1:Fastmail From: Michael McClimon <michael@mcclimon.org> To: git@vger.kernel.org Cc: Michael McClimon <michael@mcclimon.org> Subject: [PATCH v2 2/2] setup: allow Git.pm to do unsafe repo checking Date: Fri, 21 Oct 2022 21:19:32 -0400 Message-Id: <20221022011931.43992-3-michael@mcclimon.org> In-Reply-To: <20221022011931.43992-1-michael@mcclimon.org> References: <20221016212236.12453-1-michael@mcclimon.org> <20221022011931.43992-1-michael@mcclimon.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: bulk
Series	Fix behavior of Git.pm in unsafe bare repositories \| expand [v2,0/2] Fix behavior of Git.pm in unsafe bare repositories [v2,1/2] Git.pm: add semicolon after catch statement [v2,2/2] setup: allow Git.pm to do unsafe repo checking

Michael McClimon Oct. 22, 2022, 1:19 a.m. UTC

The previous commit exposes a security flaw: it is possible to bypass
unsafe repository checks by using Git.pm, where before the syntax error
accidentally prohibited it. This problem occurs because Git.pm sets
GIT_DIR explicitly, which bypasses the safe repository checks.

Fix this by introducing a new environment variable,
GIT_PERL_FORCE_OWNERSHIP_CHECK, which we set true in Git.pm. In setup.c,
if that environment variable is true, force ownership checks even if
an explicit GIT_DIR is provided.

Signed-off-by: Michael McClimon <michael@mcclimon.org>
---
 perl/Git.pm         |  1 +
 setup.c             |  3 +++
 t/t9700-perl-git.sh |  4 ++++
 t/t9700/test.pl     | 18 ++++++++++++++++++
 4 files changed, 26 insertions(+)

Junio C Hamano Oct. 22, 2022, 5:29 a.m. UTC | #1

Michael McClimon <michael@mcclimon.org> writes:

> The previous commit exposes a security flaw: it is possible to bypass
> unsafe repository checks by using Git.pm, where before the syntax error
> accidentally prohibited it. This problem occurs because Git.pm sets
> GIT_DIR explicitly, which bypasses the safe repository checks.
>
> Fix this by introducing a new environment variable,
> GIT_PERL_FORCE_OWNERSHIP_CHECK, which we set true in Git.pm. In setup.c,
> if that environment variable is true, force ownership checks even if
> an explicit GIT_DIR is provided.

Drop "PERL_" from the name, as any third-party tool that does its
own repository discovery and sets GIT_DIR before it invokes git
would need a similar "Yes, I am setting GIT_DIR, but I still want
you to check if that passes the usual ownership check", and there
is no reason to expect that these tools are always written in Perl.

How about "GIT_SAFE_DIRECTORY_STRICT" or something along that line,
instead?

I also have to wonder (and this is *not* a suggestion for inventing
an alternative fix for perl/Git.pm) if we were creating perl/Git.pm
from scratch today, we even need to be worried about this issue.  We
may have Git::repo_path() helper but if we call it in a natural way
(read: as if an interactive end-user would type commands), it is
likely that we would run "git rev-parse --git-dir" or something
without setting GIT_DIR, and when we need to run "git" command, say
"git diff", we would also run "git diff" as if the end user would
type from within their interactive session and without setting
GIT_DIR, and everything should work.  IOW, why do we even setting
and exporting the auto-detected value in GIT_DIR?

Also, if the end user had GIT_DIR in the environment _before_
calling something that uses "import Git", what should happen?  In
that case, the end-user is who is telling us that the named
directory is OK without any forced ownership check, so it feels
WRONG that this patch UNCONDITIONALLY exports FORCE_CHECK.  Only
when we did auto-discovery via Git::repo_path() without end-user
supplied GIT_DIR and exported GIT_DIR, we should also export
FORCE_CHECK to tell the underlying "git" that our auto-detection
may be flawed and it needs to double check, no?

> Signed-off-by: Michael McClimon <michael@mcclimon.org>
> ---
>  perl/Git.pm         |  1 +
>  setup.c             |  3 +++
>  t/t9700-perl-git.sh |  4 ++++
>  t/t9700/test.pl     | 18 ++++++++++++++++++
>  4 files changed, 26 insertions(+)
>
> diff --git a/perl/Git.pm b/perl/Git.pm
> index cf15ead6..002c29bb 100644
> --- a/perl/Git.pm
> +++ b/perl/Git.pm
> @@ -1674,6 +1674,7 @@ sub _cmd_exec {
>  sub _setup_git_cmd_env {
>  	my $self = shift;
>  	if ($self) {
> +		$ENV{GIT_PERL_FORCE_OWNERSHIP_CHECK} = 1;
>  		$self->repo_path() and $ENV{'GIT_DIR'} = $self->repo_path();
>  		$self->repo_path() and $self->wc_path()
>  			and $ENV{'GIT_WORK_TREE'} = $self->wc_path();
> diff --git a/setup.c b/setup.c
> index cefd5f63..33d4e6fd 100644
> --- a/setup.c
> +++ b/setup.c
> @@ -1250,6 +1250,9 @@ static enum discovery_result setup_git_directory_gently_1(struct strbuf *dir,
>  	 */
>  	gitdirenv = getenv(GIT_DIR_ENVIRONMENT);
>  	if (gitdirenv) {
> +		if (git_env_bool("GIT_PERL_FORCE_OWNERSHIP_CHECK", 0) &&
> +		    !ensure_valid_ownership(NULL, NULL, gitdirenv, report))
> +			return GIT_DIR_INVALID_OWNERSHIP;
>  		strbuf_addstr(gitdir, gitdirenv);
>  		return GIT_DIR_EXPLICIT;
>  	}
> diff --git a/t/t9700-perl-git.sh b/t/t9700-perl-git.sh
> index 4aa5d90d..b14a40b1 100755
> --- a/t/t9700-perl-git.sh
> +++ b/t/t9700-perl-git.sh
> @@ -45,6 +45,10 @@ test_expect_success \
>       git config --add test.pathmulti bar
>       '
>  
> +test_expect_success 'set up bare repository' '
> +     git init --bare bare.git
> +'
> +
>  test_expect_success 'use t9700/test.pl to test Git.pm' '
>  	"$PERL_PATH" "$TEST_DIRECTORY"/t9700/test.pl 2>stderr &&
>  	test_must_be_empty stderr
> diff --git a/t/t9700/test.pl b/t/t9700/test.pl
> index e046f7db..1c91019f 100755
> --- a/t/t9700/test.pl
> +++ b/t/t9700/test.pl
> @@ -142,6 +142,24 @@ sub adjust_dirsep {
>  		     "abc\"\\ \x07\x08\x09\x0a\x0b\x0c\x0d\x01 ",
>  		     'unquote escape sequences');
>  
> +# safe directory
> +{
> +	local $ENV{GIT_TEST_ASSUME_DIFFERENT_OWNER} = 1;
> +	# Save stderr to a tempfile so we can check the contents
> +	open our $tmpstderr2, ">&STDERR" or die "cannot save STDERR";
> +	my $tmperr = "unsafeerr.tmp";
> +	open STDERR, ">", "$tmperr" or die "cannot redirect STDERR to $tmperr";
> +	my $failed = eval { Git->repository(Directory => "$abs_repo_dir/bare.git") };
> +	ok(!$failed, "reject unsafe repository");
> +	like($@, qr/not a git repository/i, "unsafe error message");
> +	open TEMPFILE, "<", "$tmperr" or die "Can't open $tmperr $!";
> +	my $errcontents;
> +	{ local $/; $errcontents = <TEMPFILE>; }
> +	like($errcontents, qr/dubious ownership/, "dubious ownership message");
> +	close STDERR or die "cannot close temp stderr";
> +	open STDERR, ">&", $tmpstderr2 or die "cannot restore STDERR";
> +}
> +
>  printf "1..%d\n", Test::More->builder->current_test;
>  
>  my $is_passing = eval { Test::More->is_passing };

Ævar Arnfjörð Bjarmason Oct. 22, 2022, 7:45 p.m. UTC | #2

On Fri, Oct 21 2022, Michael McClimon wrote:

> The previous commit exposes a security flaw: it is possible to bypass
> unsafe repository checks by using Git.pm, where before the syntax error
> accidentally prohibited it. This problem occurs because Git.pm sets
> GIT_DIR explicitly, which bypasses the safe repository checks.
>
> Fix this by introducing a new environment variable,
> GIT_PERL_FORCE_OWNERSHIP_CHECK, which we set true in Git.pm. In setup.c,
> if that environment variable is true, force ownership checks even if
> an explicit GIT_DIR is provided.

The 1/2 is unambiguously good, thanks.

As for this step, I think it's good if we want to go in this direction.

As to whether it's the direction we want...

The vulnerability safe.directory was supposed to address was one where
you'd set your fsmonitor hook via a config variable, so running "diff",
"status" etc. would unexpectedly execute arbitrary code.

Especially on Windows where apparently the equivalent of the root of a
shared mounted volume routinely has global write permissions (user's
subdirectories being less permissive).

An alternative I raised on the security list was to narrowly target just
the fsmonitor config, since that was the vulnerability.

But it was decided to cast a wider net, as it wasn't disclosed yet, and
the list members might have missed some other config variable that would
allow shenanigans, fair enough, especially for a time constrained
security fix.

Now that the cat's thoroughly out of the bag I think we'd do well to
reconsider that.

I'm unaware of any other variable(s) that provide a similar vector, and
safe.directory is encouraging users (especially in core.sharedRepository
settings) to mark a dir as "safe", and we'd then later have an exploit
from a user with rw access who'd use the fsmonitor hook vector.

Better have them "safe by default", and start refusing to read the
config when we detect that fsmonitor setting, and insist the user mark
*that* safe.

Anyway, that's all on the general topic, and the above is just my
opinion on it.

But on this much more narrow topic: If we couldn't come up with an issue
other than the fsmonitor config for git's C codebase, I think extending
the same mitigation to the very small part of git that's still in Perl
is thoroughly into the tail wagging the dog territory.

I.e. what we'd presumably get out of this is mitigation against some
exploit via a variable that git-send-email or git-svn use (or the
handful of other more obscure Perl tooling we have).

Can we think of one that could be an issue, and if not is the current
behavior in 1/2 maybe OK as-is?

>  test_expect_success 'use t9700/test.pl to test Git.pm' '
>  	"$PERL_PATH" "$TEST_DIRECTORY"/t9700/test.pl 2>stderr &&
>  	test_must_be_empty stderr
> diff --git a/t/t9700/test.pl b/t/t9700/test.pl
> index e046f7db..1c91019f 100755
> --- a/t/t9700/test.pl
> +++ b/t/t9700/test.pl
> @@ -142,6 +142,24 @@ sub adjust_dirsep {
>  		     "abc\"\\ \x07\x08\x09\x0a\x0b\x0c\x0d\x01 ",
>  		     'unquote escape sequences');
>  
> +# safe directory
> +{
> +	local $ENV{GIT_TEST_ASSUME_DIFFERENT_OWNER} = 1;
> +	# Save stderr to a tempfile so we can check the contents
> +	open our $tmpstderr2, ">&STDERR" or die "cannot save STDERR";
> +	my $tmperr = "unsafeerr.tmp";
> +	open STDERR, ">", "$tmperr" or die "cannot redirect STDERR to $tmperr";
> +	my $failed = eval { Git->repository(Directory => "$abs_repo_dir/bare.git") };
> +	ok(!$failed, "reject unsafe repository");
> +	like($@, qr/not a git repository/i, "unsafe error message");
> +	open TEMPFILE, "<", "$tmperr" or die "Can't open $tmperr $!";
> +	my $errcontents;
> +	{ local $/; $errcontents = <TEMPFILE>; }
> +	like($errcontents, qr/dubious ownership/, "dubious ownership message");
> +	close STDERR or die "cannot close temp stderr";
> +	open STDERR, ">&", $tmpstderr2 or die "cannot restore STDERR";
> +}

This whole "save stderr to a file" etc. seems much more complex than
just writing the same test in our normal *.sh tests, and doing something
like:

	! GIT_TEST_ASSUME_DIFFERENT_OWNER=1 perl -MGit -we 'Git->repository(Directory => shift)' 2>expect &&
	grep "reject unusafe" ...

I.e. you're spending a lot of effort on capturing STDERR within a single
Perl process, then restoring it etc., when we can just run one-off
command and have it die (no eval), and just inspect the exit code &
stderr perl emitted.

Jeff King Oct. 22, 2022, 8:55 p.m. UTC | #3

On Sat, Oct 22, 2022 at 09:45:14PM +0200, Ævar Arnfjörð Bjarmason wrote:

> The vulnerability safe.directory was supposed to address was one where
> you'd set your fsmonitor hook via a config variable, so running "diff",
> "status" etc. would unexpectedly execute arbitrary code.
> 
> Especially on Windows where apparently the equivalent of the root of a
> shared mounted volume routinely has global write permissions (user's
> subdirectories being less permissive).
> 
> An alternative I raised on the security list was to narrowly target just
> the fsmonitor config, since that was the vulnerability.
>
> [...]
>
> I'm unaware of any other variable(s) that provide a similar vector, and
> safe.directory is encouraging users (especially in core.sharedRepository
> settings) to mark a dir as "safe", and we'd then later have an exploit
> from a user with rw access who'd use the fsmonitor hook vector.

Here are a few off the top of my head that you can trigger via git-diff:

  - core.pager will run an arbitrary program

  - pager.diff will run an arbitrary program

  - diff.*.textconv runs an arbitrary program; you need matching
    .gitattributes, but those are under the control of the repository.
    (not diff.*.command, though, as you need to pass --ext-diff)

  - browser/man paths if you run "git diff --help"

And of course as you expand the set of commands there are more options.
E.g., credential helper commands if you do anything that wants auth,
interactive diff-filter for "add -p", hooks for git-commit, git-push,
etc. Those commands are less likely to be run in an untrusted repository
than inspection commands like "status" or "diff", but the boundary is
getting quite fuzzy.

fsmonitor _might_ be the only one that is triggered by git-prompt.sh,
because it has a limited command palette, generally reads (or sends to
/dev/null) the stdout of commands (preventing pager invocation), and
doesn't do text diffs (so no textconv). Even if true, I'm not sure if
that's a good place to draw the line, though. People do tend to run "git
log" themselves.

-Peff

Jeff King Oct. 22, 2022, 9:16 p.m. UTC | #4

On Fri, Oct 21, 2022 at 09:19:32PM -0400, Michael McClimon wrote:

> diff --git a/perl/Git.pm b/perl/Git.pm
> index cf15ead6..002c29bb 100644
> --- a/perl/Git.pm
> +++ b/perl/Git.pm
> @@ -1674,6 +1674,7 @@ sub _cmd_exec {
>  sub _setup_git_cmd_env {
>  	my $self = shift;
>  	if ($self) {
> +		$ENV{GIT_PERL_FORCE_OWNERSHIP_CHECK} = 1;
>  		$self->repo_path() and $ENV{'GIT_DIR'} = $self->repo_path();
>  		$self->repo_path() and $self->wc_path()
>  			and $ENV{'GIT_WORK_TREE'} = $self->wc_path();

I'm not familiar enough with Git.pm to know if this is the right spot.
But we'd not want to break the case where GIT_DIR is set already. I.e.:

  GIT_DIR=/path/to/repo.git perl -MGit -e 'Git->repository'

should continue to work regardless of the ownership of repo.git. Only
the repo-discovery phase would want to force the ownership check.

Again, I'm not too familiar with Git.pm, but it seems it ought to be
asking Git: are we in a valid Git repo, and if so where is it? Something
like:

  my $git_dir = `git rev-parse --absolute-git-dir`;
  $? and die "nope, not in a git repo";

  # later, when we run git commands, we do specify this; the script may
  # have chdir()'d in the meantime, and we want to make sure we are
  # referring to the same repo via the object.
  local $ENV{GIT_DIR} = abs_path($git_dir);
  ...run some git command...

Looking at the code, we even seem to do that first part! But if it
returns an error, then we go on to check for a bare repository
ourselves by looking for refs/, objects/, etc. Which is just...weird.

It feels like this try/catch should just go away:

diff --git a/perl/Git.pm b/perl/Git.pm
index cf15ead664..7a7d8a2987 100644
--- a/perl/Git.pm
+++ b/perl/Git.pm
@@ -177,13 +177,7 @@ sub repository {
 		-d $opts{Directory} or throw Error::Simple("Directory not found: $opts{Directory} $!");

 		my $search = Git->repository(WorkingCopy => $opts{Directory});
-		my $dir;
-		try {
-			$dir = $search->command_oneline(['rev-parse', '--git-dir'],
-			                                STDERR => 0);
-		} catch Git::Error::Command with {
-			$dir = undef;
-		};
+		my $dir = $search->command_oneline(['rev-parse', '--git-dir']);

 		require Cwd;
 		if ($dir) {

And then the code below that to check for bare/not-bare should be using
"git rev-parse --is-bare-repository" or similar. Something like:

diff --git a/perl/Git.pm b/perl/Git.pm
index 7a7d8a2987..280df9cee1 100644
--- a/perl/Git.pm
+++ b/perl/Git.pm
@@ -179,8 +179,14 @@ sub repository {
 		my $search = Git->repository(WorkingCopy => $opts{Directory});
 		my $dir = $search->command_oneline(['rev-parse', '--git-dir']);

+		# could be merged with command above to be more efficient; or
+		# could probably use --show-toplevel to avoid prefix query
+		# below
+		my $bare = $search->command_oneline(['rev-parse', '--is-bare-repository'])
+		             eq 'true';
+
 		require Cwd;
-		if ($dir) {
+		if (!$bare) {
 			require File::Spec;
 			File::Spec->file_name_is_absolute($dir) or $dir = $opts{Directory} . '/' . $dir;
 			$opts{Repository} = Cwd::abs_path($dir);
@@ -198,21 +204,6 @@ sub repository {
 			$opts{WorkingSubdir} = $prefix;

 		} else {
-			# A bare repository? Let's see...
-			$dir = $opts{Directory};
-
-			unless (-d "$dir/refs" and -d "$dir/objects" and -e "$dir/HEAD") {
-				# Mimic git-rev-parse --git-dir error message:
-				throw Error::Simple("fatal: Not a git repository: $dir");
-			}
-			my $search = Git->repository(Repository => $dir);
-			try {
-				$search->command('symbolic-ref', 'HEAD');
-			} catch Git::Error::Command with {
-				# Mimic git-rev-parse --git-dir error message:
-				throw Error::Simple("fatal: Not a git repository: $dir");
-			};
-
 			$opts{Repository} = Cwd::abs_path($dir);
 		}

But given how much more complicated the current code is, I wonder if I
am missing some case. Or perhaps this code is just so old that it used
to do this stuff itself (because rev-parse didn't give us so much help).

-Peff

Jeff King Oct. 22, 2022, 9:18 p.m. UTC | #5

On Fri, Oct 21, 2022 at 10:29:33PM -0700, Junio C Hamano wrote:

> I also have to wonder (and this is *not* a suggestion for inventing
> an alternative fix for perl/Git.pm) if we were creating perl/Git.pm
> from scratch today, we even need to be worried about this issue.  We
> may have Git::repo_path() helper but if we call it in a natural way
> (read: as if an interactive end-user would type commands), it is
> likely that we would run "git rev-parse --git-dir" or something
> without setting GIT_DIR, and when we need to run "git" command, say
> "git diff", we would also run "git diff" as if the end user would
> type from within their interactive session and without setting
> GIT_DIR, and everything should work.  IOW, why do we even setting
> and exporting the auto-detected value in GIT_DIR?

I think it has to in order to avoid surprises. If I do this:

  perl -MGit -e '
    my $r = Git->repository;
    chdir("/somewhere/else");
    $r->git_command(...);
  '

that command ought to run in the repository I opened earlier. So I think
to keep the illusion of a lib-ified object, creating that object has to
lock in the path.

But it really seems like we should be asking rev-parse what that path
is, not trying to do any magic ourselves.

-Peff

Jeff King Oct. 22, 2022, 10:08 p.m. UTC | #6

On Sat, Oct 22, 2022 at 05:16:38PM -0400, Jeff King wrote:

> Again, I'm not too familiar with Git.pm, but it seems it ought to be
> asking Git: are we in a valid Git repo, and if so where is it? Something
> like:
> 
>   my $git_dir = `git rev-parse --absolute-git-dir`;
>   $? and die "nope, not in a git repo";
> 
>   # later, when we run git commands, we do specify this; the script may
>   # have chdir()'d in the meantime, and we want to make sure we are
>   # referring to the same repo via the object.
>   local $ENV{GIT_DIR} = abs_path($git_dir);
>   ...run some git command...
> 
> Looking at the code, we even seem to do that first part! But if it
> returns an error, then we go on to check for a bare repository
> ourselves by looking for refs/, objects/, etc. Which is just...weird.
> 
> It feels like this try/catch should just go away:

It's a little more complicated than that, because presumably people rely
on the error handling for a missing repo to not be noisy. So here's a
polished version of what I showed, along with the tests we were
discussing earlier.

I prepared it on top of your fix in the mm/git-pm-try-catch-syntax-fix
branch. That's not strictly necessary, since my patch deletes the line
you fixed. :) But I think it's nicer to use your fix as the starting
point, since it means the test runs but produces the wrong behavior,
rather than barfing with a syntax error.

-- >8 --
Subject: [PATCH] Git.pm: trust rev-parse to find bare repositories

When initializing a repository object, we run "git rev-parse --git-dir"
to let the C version of Git find the correct directory. But curiously,
if this fails we don't automatically say "not a git repository".
Instead, we do our own pure-perl check to see if we're in a bare
repository.

This makes little sense, as rev-parse will report both bare and non-bare
directories. This logic comes from d5c7721d58 (Git.pm: Add support for
subdirectories inside of working copies, 2006-06-24), but I don't see
any reason given why we can't just rely on rev-parse. Worse, because we
treat any non-error response from rev-parse as a non-bare repository,
we'll erroneously set the object's WorkingCopy, even in a bare
repository.

But it gets worse. Since 8959555cee (setup_git_directory(): add an owner
check for the top-level directory, 2022-03-02), it's actively wrong (and
dangerous). The perl code doesn't implement the same ownership checks.
And worse, after "finding" the bare repository, it sets GIT_DIR in the
environment, which tells any subsequent Git commands that we've
confirmed the directory is OK, and to trust us. I.e., it re-opens the
vulnerability plugged by 8959555cee when using Git.pm's repository
discovery code.

We can fix this by just relying on rev-parse to tell us when we're not
in a repository, which fixes the vulnerability. Furthermore, we'll ask
its --is-bare-repository function to tell us if we're bare or not, and
rely on that.

Signed-off-by: Jeff King <peff@peff.net>
---
I didn't dig into the "oops, we set WorkingCopy" thing beyond manually
verifying that it happens. It doesn't look like its really used beyond
the wc_path() method, so it's not like it would have been breaking git
sub-commands, etc. I guess we could add a test that wc_path() returns
undef in a bare repository, though.

 perl/Git.pm         | 36 ++++++++++++++++--------------------
 t/t9700-perl-git.sh |  4 ++++
 t/t9700/test.pl     | 12 ++++++++++++
 3 files changed, 32 insertions(+), 20 deletions(-)

diff --git a/perl/Git.pm b/perl/Git.pm
index cf15ead664..117765dc73 100644
--- a/perl/Git.pm
+++ b/perl/Git.pm
@@ -177,16 +177,27 @@ sub repository {
 		-d $opts{Directory} or throw Error::Simple("Directory not found: $opts{Directory} $!");
 
 		my $search = Git->repository(WorkingCopy => $opts{Directory});
-		my $dir;
+
+		# This rev-parse will throw an exception if we're not in a
+		# repository, which is what we want, but it's kind of noisy.
+		# Ideally we'd capture stderr and relay it, but doing so is
+		# awkward without depending on it fitting in a pipe buffer. So
+		# we just reproduce a plausible error message ourselves.
+		my $out;
 		try {
-			$dir = $search->command_oneline(['rev-parse', '--git-dir'],
-			                                STDERR => 0);
+		  # Note that "--is-bare-repository" must come first, as
+		  # --git-dir output could contain newlines.
+		  $out = $search->command([qw(rev-parse --is-bare-repository --git-dir)],
+			                  STDERR => 0);
 		} catch Git::Error::Command with {
-			$dir = undef;
+			throw Error::Simple("fatal: not a git repository: $opts{Directory}");
 		};
 
+		chomp $out;
+		my ($bare, $dir) = split /\n/, $out, 2;
+
 		require Cwd;
-		if ($dir) {
+		if ($bare ne 'true') {
 			require File::Spec;
 			File::Spec->file_name_is_absolute($dir) or $dir = $opts{Directory} . '/' . $dir;
 			$opts{Repository} = Cwd::abs_path($dir);
@@ -204,21 +215,6 @@ sub repository {
 			$opts{WorkingSubdir} = $prefix;
 
 		} else {
-			# A bare repository? Let's see...
-			$dir = $opts{Directory};
-
-			unless (-d "$dir/refs" and -d "$dir/objects" and -e "$dir/HEAD") {
-				# Mimic git-rev-parse --git-dir error message:
-				throw Error::Simple("fatal: Not a git repository: $dir");
-			}
-			my $search = Git->repository(Repository => $dir);
-			try {
-				$search->command('symbolic-ref', 'HEAD');
-			} catch Git::Error::Command with {
-				# Mimic git-rev-parse --git-dir error message:
-				throw Error::Simple("fatal: Not a git repository: $dir");
-			};
-
 			$opts{Repository} = Cwd::abs_path($dir);
 		}
 
diff --git a/t/t9700-perl-git.sh b/t/t9700-perl-git.sh
index 4aa5d90d32..b105d6d9d5 100755
--- a/t/t9700-perl-git.sh
+++ b/t/t9700-perl-git.sh
@@ -45,6 +45,10 @@ test_expect_success \
      git config --add test.pathmulti bar
      '
 
+test_expect_success 'set up bare repository' '
+	git init --bare bare.git
+'
+
 test_expect_success 'use t9700/test.pl to test Git.pm' '
 	"$PERL_PATH" "$TEST_DIRECTORY"/t9700/test.pl 2>stderr &&
 	test_must_be_empty stderr
diff --git a/t/t9700/test.pl b/t/t9700/test.pl
index e046f7db76..6d753708d2 100755
--- a/t/t9700/test.pl
+++ b/t/t9700/test.pl
@@ -30,6 +30,18 @@ sub adjust_dirsep {
 # set up
 our $abs_repo_dir = cwd();
 ok(our $r = Git->repository(Directory => "."), "open repository");
+{
+	local $ENV{GIT_TEST_ASSUME_DIFFERENT_OWNER} = 1;
+	my $failed;
+
+	$failed = eval { Git->repository(Directory => $abs_repo_dir) };
+	ok(!$failed, "reject unsafe non-bare repository");
+	like($@, qr/not a git repository/i, "unsafe error message");
+
+	$failed = eval { Git->repository(Directory => "$abs_repo_dir/bare.git") };
+	ok(!$failed, "reject unsafe bare repository");
+	like($@, qr/not a git repository/i, "unsafe error message");
+}
 
 # config
 is($r->config("test.string"), "value", "config scalar: string");

Junio C Hamano Oct. 22, 2022, 11:14 p.m. UTC | #7

Jeff King <peff@peff.net> writes:

> It feels like this try/catch should just go away:
> ...
> And then the code below that to check for bare/not-bare should be using
> "git rev-parse --is-bare-repository" or similar. Something like:

Yeah, exactly.  I very much like the straight-forward way of
thinking to have "git" do the real work.

Thanks.

Junio C Hamano Oct. 22, 2022, 11:17 p.m. UTC | #8

Jeff King <peff@peff.net> writes:

> I think it has to in order to avoid surprises. If I do this:
>
>   perl -MGit -e '
>     my $r = Git->repository;
>     chdir("/somewhere/else");
>     $r->git_command(...);
>   '
>
> that command ought to run in the repository I opened earlier. So I think
> to keep the illusion of a lib-ified object, creating that object has to
> lock in the path.
>
> But it really seems like we should be asking rev-parse what that path
> is, not trying to do any magic ourselves.

Yeah, whichever caller doing the chdir() needs to take the
responsibility of adjusting the future use of git, e.g. going back
to the original before spawning git or whatever.

Or having the original Git->repository call bail out by having "git"
figure out where it is, while honoring safe.directory or any future
protection underlying "git" offers.

Thanks.

Michael McClimon Oct. 22, 2022, 11:19 p.m. UTC | #9

> I prepared it on top of your fix in the mm/git-pm-try-catch-syntax-fix
> branch. That's not strictly necessary, since my patch deletes the line
> you fixed. :) But I think it's nicer to use your fix as the starting
> point, since it means the test runs but produces the wrong behavior,
> rather than barfing with a syntax error.

My vanity thanks you for this, even if it's not strictly necessary. As a
professional programmer with roughly no C chops and a long-time admirer of the
Git project, all I _really_ wanted to do was to fix a thing that was in my
wheelhouse so that I could say I have a commit in the history. (This isn't a
good reason on its own, of course, but I'm happy it was useful even if the
line is immediately deleted!)

> We can fix this by just relying on rev-parse to tell us when we're not
> in a repository, which fixes the vulnerability. Furthermore, we'll ask
> its --is-bare-repository function to tell us if we're bare or not, and
> rely on that.

Your suggested patch seems fine to me, and indeed I think if we were writing
it today we'd just rely on rev-parse to do the heavy lifting. It looks like
the code in question -- and indeed, the syntax error in question --  blames to
d5c7721d (Git.pm: Add support for subdirectories inside of working copies,
2006-06-23), at which point rev-parse did not appear to have any special
handling for bare repositories.

Ævar Arnfjörð Bjarmason Oct. 24, 2022, 10:57 a.m. UTC | #10

On Sat, Oct 22 2022, Jeff King wrote:

> On Sat, Oct 22, 2022 at 09:45:14PM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>> The vulnerability safe.directory was supposed to address was one where
>> you'd set your fsmonitor hook via a config variable, so running "diff",
>> "status" etc. would unexpectedly execute arbitrary code.
>> 
>> Especially on Windows where apparently the equivalent of the root of a
>> shared mounted volume routinely has global write permissions (user's
>> subdirectories being less permissive).
>> 
>> An alternative I raised on the security list was to narrowly target just
>> the fsmonitor config, since that was the vulnerability.
>>
>> [...]
>>
>> I'm unaware of any other variable(s) that provide a similar vector, and
>> safe.directory is encouraging users (especially in core.sharedRepository
>> settings) to mark a dir as "safe", and we'd then later have an exploit
>> from a user with rw access who'd use the fsmonitor hook vector.
>
> Here are a few off the top of my head that you can trigger via git-diff:
>
>   - core.pager will run an arbitrary program
>
>   - pager.diff will run an arbitrary program
>
>   - diff.*.textconv runs an arbitrary program; you need matching
>     .gitattributes, but those are under the control of the repository.
>     (not diff.*.command, though, as you need to pass --ext-diff)
>
>   - browser/man paths if you run "git diff --help"
>
> And of course as you expand the set of commands there are more options.
> E.g., credential helper commands if you do anything that wants auth,
> interactive diff-filter for "add -p", hooks for git-commit, git-push,
> etc. Those commands are less likely to be run in an untrusted repository
> than inspection commands like "status" or "diff", but the boundary is
> getting quite fuzzy.
>
> fsmonitor _might_ be the only one that is triggered by git-prompt.sh,
> because it has a limited command palette, generally reads (or sends to
> /dev/null) the stdout of commands (preventing pager invocation), and
> doesn't do text diffs (so no textconv). Even if true, I'm not sure if
> that's a good place to draw the line, though. People do tend to run "git
> log" themselves.

Right, by "a similar" vector I meant the unexpected execution of
fsmonitor as there was software in the wild that was running "status"
for the user.

These are also a problem, but my understanding of that issue is that if
it wasn't for the fsmonitor issue we'd have put that in the bucket of
not running arbitrary commands on a .git you just copied to somewhere,
i.e. that issue was already known.

The difference being that users might have that implicit "status"
running if they cd'd to /mnt/$USER/subdir and there was a hostile
/mnt/.git, but would be much less likely to run "git diff" or whatever
in such a subdir, unless they'd initialized a .git in say
/mnt/$USER/subdir, at which point we'd ignore the /mnt/.git.

Anyway, that's more into the "would it have been a CVE?" territory, so
let's leave that for now.

The important point/question I have is whether we can think of any such
config variable understood by the code that uses Git.pm.

The only ones I can think are the "sendemail.{to,cc}Cmd" variables.

I'm just pointing out that the reason we ended up with the facility (per
my understand) was among other things:

 * A. There were too many config variables to exhaustively audit on the
   security list and be sure we caught them all, and ...
 * B. ...such a change would probably be larger, which ...
 * C. ...given the embargo & desire to keep security patches minimal
   warranted the more general safe.directory approach.
 * D. You can talk about on the public list, or not have a zero-day, but
   not both :)

Now, we may come up with a reason "E" for extending this at this point,
e.g. maybe just being consistent is reason enough.

But in this case "A" doesn't apply, it's maybe 20-30 config variables,
and it's easy to skim the git-{send-email,svn} docs to see what they
are. "B" might be the case, but taht's OK since we're past "D" here,
ditto "C" not applying.

Jeff King Oct. 24, 2022, 11:33 p.m. UTC | #11

On Sat, Oct 22, 2022 at 07:19:21PM -0400, Michael McClimon wrote:

> > I prepared it on top of your fix in the mm/git-pm-try-catch-syntax-fix
> > branch. That's not strictly necessary, since my patch deletes the line
> > you fixed. :) But I think it's nicer to use your fix as the starting
> > point, since it means the test runs but produces the wrong behavior,
> > rather than barfing with a syntax error.
> 
> My vanity thanks you for this, even if it's not strictly necessary. As a
> professional programmer with roughly no C chops and a long-time admirer of the
> Git project, all I _really_ wanted to do was to fix a thing that was in my
> wheelhouse so that I could say I have a commit in the history. (This isn't a
> good reason on its own, of course, but I'm happy it was useful even if the
> line is immediately deleted!)

:) I think it makes my patch easier to understand, but I'm glad you are
getting some you, too. Finding the original issue and the ensuing
discussion was as much (or more) effort than the actual fix. Thanks for
starting it!

-Peff

Jeff King Oct. 24, 2022, 11:38 p.m. UTC | #12

On Mon, Oct 24, 2022 at 12:57:29PM +0200, Ævar Arnfjörð Bjarmason wrote:

> The important point/question I have is whether we can think of any such
> config variable understood by the code that uses Git.pm.

I don't think that matters. Before the CVE fix, Git.pm scripts were just
as vulnerable as all the other parts of Git. After, they were broken
because of the syntax error. Fixing the syntax error re-opened the bug
there, but as long as we close it again before releasing, we don't have
to care.

You can argue that the CVE wasn't that important for Git.pm, and thus
not that big a deal to re-open. But I think post-CVE we're making the
stronger promise that Git won't discover a repo directory with funky
ownership. And Git.pm is violating that (or would be after the syntax
fix if we don't go further).

> The only ones I can think are the "sendemail.{to,cc}Cmd" variables.

I don't think we can be that exhaustive. It's also any programs that are
called by scripts using Git.pm. But even that is not a closed set, since
we ship Git.pm for people to use in their own scripts. We don't know
what those scripts might be doing.

-Peff

[v2,2/2] setup: allow Git.pm to do unsafe repo checking

Commit Message

Comments

Patch