diff mbox series

[v2,2/3] setup: add discover_git_directory_reason()

Message ID 787af0f9744e2f18c9ab680886650278a9d2f8d0.1692725056.git.gitgitgadget@gmail.com (mailing list archive)
State New, archived
Headers show
Series scalar: two downstream improvements | expand

Commit Message

Derrick Stolee Aug. 22, 2023, 5:24 p.m. UTC
From: Derrick Stolee <derrickstolee@github.com>

There are many reasons why discovering a Git directory may fail. In
particular, 8959555cee7 (setup_git_directory(): add an owner check for
the top-level directory, 2022-03-02) added ownership checks as a
security precaution.

Callers attempting to set up a Git directory may want to inform the user
about the reason for the failure. For that, expose the enum
discovery_result from within setup.c and into cache.h where
discover_git_directory() is defined.

I initially wanted to change the return type of discover_git_directory()
to be this enum, but several callers rely upon the "zero means success".
The two problems with this are:

1. The zero value of the enum is actually GIT_DIR_NONE, so nonpositive
   results are errors.

2. There are multiple successful states; positive results are
   successful.

It is worth noting that GIT_DIR_NONE is not returned, so we remove this
option from the enum. We must be careful to keep the successful reasons
as positive values, so they are given explicit positive values.
Further, a use in scalar.c was previously impossible, so it is removed.

Instead of updating all callers immediately, add a new method,
discover_git_directory_reason(), and convert discover_git_directory() to
be a thin shim on top of it.

One thing that is important to note is that discover_git_directory()
previously returned -1 on error, so let's continue that into the future.
There is only one caller (in scalar.c) that depends on that signedness
instead of a non-zero check, so clean that up, too.

Because there are extra checks that discover_git_directory_reason() does
after setup_git_directory_gently_1(), there are other modes that can be
returned for failure states. Add these modes to the enum, but be sure to
explicitly add them as BUG() states in the switch of
setup_git_directory_gently().

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 scalar.c |  3 ---
 setup.c  | 34 ++++++++++++----------------------
 setup.h  | 35 ++++++++++++++++++++++++++++++++---
 3 files changed, 44 insertions(+), 28 deletions(-)

Comments

Junio C Hamano Aug. 22, 2023, 7:30 p.m. UTC | #1
"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Derrick Stolee <derrickstolee@github.com>
>
> There are many reasons why discovering a Git directory may fail. In
> particular, 8959555cee7 (setup_git_directory(): add an owner check for
> the top-level directory, 2022-03-02) added ownership checks as a
> security precaution.
>
> Callers attempting to set up a Git directory may want to inform the user
> about the reason for the failure. For that, expose the enum
> discovery_result from within setup.c and into cache.h where
> discover_git_directory() is defined.
>
> I initially wanted to change the return type of discover_git_directory()
> to be this enum, but several callers rely upon the "zero means success".
> The two problems with this are:
>
> 1. The zero value of the enum is actually GIT_DIR_NONE, so nonpositive
>    results are errors.
>
> 2. There are multiple successful states; positive results are
>    successful.
>
> It is worth noting that GIT_DIR_NONE is not returned, so we remove this
> option from the enum. We must be careful to keep the successful reasons
> as positive values, so they are given explicit positive values.
> Further, a use in scalar.c was previously impossible, so it is removed.
>
> Instead of updating all callers immediately, add a new method,
> discover_git_directory_reason(), and convert discover_git_directory() to
> be a thin shim on top of it.
>
> One thing that is important to note is that discover_git_directory()
> previously returned -1 on error, so let's continue that into the future.
> There is only one caller (in scalar.c) that depends on that signedness
> instead of a non-zero check, so clean that up, too.
>
> Because there are extra checks that discover_git_directory_reason() does
> after setup_git_directory_gently_1(), there are other modes that can be
> returned for failure states. Add these modes to the enum, but be sure to
> explicitly add them as BUG() states in the switch of
> setup_git_directory_gently().
>
> Signed-off-by: Derrick Stolee <derrickstolee@github.com>
> ---
>  scalar.c |  3 ---
>  setup.c  | 34 ++++++++++++----------------------
>  setup.h  | 35 ++++++++++++++++++++++++++++++++---
>  3 files changed, 44 insertions(+), 28 deletions(-)
>
> diff --git a/scalar.c b/scalar.c
> index 938bb73f3ce..02a38e845e1 100644
> --- a/scalar.c
> +++ b/scalar.c
> @@ -686,9 +686,6 @@ static int cmd_reconfigure(int argc, const char **argv)
>  				warning(_("removing stale scalar.repo '%s'"),
>  					dir);
>  			strbuf_release(&buf);
> -		} else if (discover_git_directory(&commondir, &gitdir) < 0) {
> -			warning_errno(_("git repository gone in '%s'"), dir);
> -			res = -1;
>  		} else {
>  			git_config_clear();
>  

In the original before this series, and also after applying [PATCH
3/3], the reconfiguration is a three-step process:

 - what if we cannot go to that directory?
 - what if the directory is not a usable repository?
 - now we are in a usable repository, let's reconfigure.

and this seems to lose the second step tentatively?  It is
resurrected in a more enhanced form in [PATCH 3/3] so it may not be
a huge deal, but it looks like it is not intended lossage.
Derrick Stolee Aug. 22, 2023, 7:39 p.m. UTC | #2
On 8/22/2023 3:30 PM, Junio C Hamano wrote:
> "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
>> From: Derrick Stolee <derrickstolee@github.com>

>> It is worth noting that GIT_DIR_NONE is not returned, so we remove this
>> option from the enum. We must be careful to keep the successful reasons
>> as positive values, so they are given explicit positive values.
>> Further, a use in scalar.c was previously impossible, so it is removed.

(Relevant bit from the message.)

>> diff --git a/scalar.c b/scalar.c
>> index 938bb73f3ce..02a38e845e1 100644
>> --- a/scalar.c
>> +++ b/scalar.c
>> @@ -686,9 +686,6 @@ static int cmd_reconfigure(int argc, const char **argv)
>>  				warning(_("removing stale scalar.repo '%s'"),
>>  					dir);
>>  			strbuf_release(&buf);
>> -		} else if (discover_git_directory(&commondir, &gitdir) < 0) {
>> -			warning_errno(_("git repository gone in '%s'"), dir);
>> -			res = -1;
>>  		} else {
>>  			git_config_clear();
>>  
> 
> In the original before this series, and also after applying [PATCH
> 3/3], the reconfiguration is a three-step process:
> 
>  - what if we cannot go to that directory?
>  - what if the directory is not a usable repository?
>  - now we are in a usable repository, let's reconfigure.
> 
> and this seems to lose the second step tentatively?  It is
> resurrected in a more enhanced form in [PATCH 3/3] so it may not be
> a huge deal, but it looks like it is not intended lossage.

I know what happened here. At some point during my edits, this line was
changed to "else if (!discover_git_directory(...))" which became an
unreachable case.

So, based on the intermediate patch that did not survive, it made sense,
but you are right that this hunk of this patch is behaving badly. Good
eye!

Thanks,
-Stolee
Oswald Buddenhagen Aug. 23, 2023, 9:58 a.m. UTC | #3
On Tue, Aug 22, 2023 at 05:24:14PM +0000, Derrick Stolee via GitGitGadget wrote:
>From: Derrick Stolee <derrickstolee@github.com>
>
>There are many reasons why discovering a Git directory may fail. In
>particular, 8959555cee7 (setup_git_directory(): add an owner check for
>the top-level directory, 2022-03-02) added ownership checks as a
>security precaution.
>
>Callers attempting to set up a Git directory may want to inform the user
>about the reason for the failure. For that, expose the enum
>discovery_result from within setup.c

>and
>
"by moving it"

>into cache.h where
>discover_git_directory() is defined.
>
>I initially wanted to change the return type of discover_git_directory()
>to be this enum, but several callers rely upon the "zero means success".
>The two problems with this are:
>
>1. The zero value of the enum is actually GIT_DIR_NONE, so nonpositive
>   results are errors.
>
>2. There are multiple successful states; positive results are
>   successful.
>
>It is worth noting that GIT_DIR_NONE is not returned, so we remove this
>option from the enum. We must be careful to keep the successful reasons
>as positive values, so they are given explicit positive values.

>Further, a use in scalar.c was previously impossible, so it is removed.
>
i have no clue wha this means. what is "it"?

>Instead of updating all callers immediately, add a new method,
>discover_git_directory_reason(), and convert discover_git_directory() to
>be a thin shim on top of it.
>
is this really worth it, given that there are just two callers, and the 
adjustment is trivial?
if you insist, note that the function name can be easily misread as 
"discover_git_(directory_reason)", which is unhelpful. i typically use 
an _impl suffix in such "thin wrapper" scenarios.

regards
diff mbox series

Patch

diff --git a/scalar.c b/scalar.c
index 938bb73f3ce..02a38e845e1 100644
--- a/scalar.c
+++ b/scalar.c
@@ -686,9 +686,6 @@  static int cmd_reconfigure(int argc, const char **argv)
 				warning(_("removing stale scalar.repo '%s'"),
 					dir);
 			strbuf_release(&buf);
-		} else if (discover_git_directory(&commondir, &gitdir) < 0) {
-			warning_errno(_("git repository gone in '%s'"), dir);
-			res = -1;
 		} else {
 			git_config_clear();
 
diff --git a/setup.c b/setup.c
index 18927a847b8..2e607632dbd 100644
--- a/setup.c
+++ b/setup.c
@@ -1221,19 +1221,6 @@  static const char *allowed_bare_repo_to_string(
 	return NULL;
 }
 
-enum discovery_result {
-	GIT_DIR_NONE = 0,
-	GIT_DIR_EXPLICIT,
-	GIT_DIR_DISCOVERED,
-	GIT_DIR_BARE,
-	/* these are errors */
-	GIT_DIR_HIT_CEILING = -1,
-	GIT_DIR_HIT_MOUNT_POINT = -2,
-	GIT_DIR_INVALID_GITFILE = -3,
-	GIT_DIR_INVALID_OWNERSHIP = -4,
-	GIT_DIR_DISALLOWED_BARE = -5,
-};
-
 /*
  * We cannot decide in this function whether we are in the work tree or
  * not, since the config can only be read _after_ this function was called.
@@ -1385,21 +1372,23 @@  static enum discovery_result setup_git_directory_gently_1(struct strbuf *dir,
 	}
 }
 
-int discover_git_directory(struct strbuf *commondir,
-			   struct strbuf *gitdir)
+enum discovery_result discover_git_directory_reason(struct strbuf *commondir,
+						    struct strbuf *gitdir)
 {
 	struct strbuf dir = STRBUF_INIT, err = STRBUF_INIT;
 	size_t gitdir_offset = gitdir->len, cwd_len;
 	size_t commondir_offset = commondir->len;
 	struct repository_format candidate = REPOSITORY_FORMAT_INIT;
+	enum discovery_result result;
 
 	if (strbuf_getcwd(&dir))
-		return -1;
+		return GIT_DIR_CWD_FAILURE;
 
 	cwd_len = dir.len;
-	if (setup_git_directory_gently_1(&dir, gitdir, NULL, 0) <= 0) {
+	result = setup_git_directory_gently_1(&dir, gitdir, NULL, 0);
+	if (result <= 0) {
 		strbuf_release(&dir);
-		return -1;
+		return result;
 	}
 
 	/*
@@ -1429,11 +1418,11 @@  int discover_git_directory(struct strbuf *commondir,
 		strbuf_setlen(commondir, commondir_offset);
 		strbuf_setlen(gitdir, gitdir_offset);
 		clear_repository_format(&candidate);
-		return -1;
+		return GIT_DIR_INVALID_FORMAT;
 	}
 
 	clear_repository_format(&candidate);
-	return 0;
+	return result;
 }
 
 const char *setup_git_directory_gently(int *nongit_ok)
@@ -1515,10 +1504,11 @@  const char *setup_git_directory_gently(int *nongit_ok)
 		}
 		*nongit_ok = 1;
 		break;
-	case GIT_DIR_NONE:
+	case GIT_DIR_CWD_FAILURE:
+	case GIT_DIR_INVALID_FORMAT:
 		/*
 		 * As a safeguard against setup_git_directory_gently_1 returning
-		 * this value, fallthrough to BUG. Otherwise it is possible to
+		 * these values, fallthrough to BUG. Otherwise it is possible to
 		 * set startup_info->have_repository to 1 when we did nothing to
 		 * find a repository.
 		 */
diff --git a/setup.h b/setup.h
index 58fd2605dd2..b48cf1c43b5 100644
--- a/setup.h
+++ b/setup.h
@@ -42,16 +42,45 @@  const char *resolve_gitdir_gently(const char *suspect, int *return_error_code);
 #define resolve_gitdir(path) resolve_gitdir_gently((path), NULL)
 
 void setup_work_tree(void);
+
+/*
+ * discover_git_directory_reason() is similar to discover_git_directory(),
+ * except it returns an enum value instead. It is important to note that
+ * a zero-valued return here is actually GIT_DIR_NONE, which is different
+ * from discover_git_directory.
+ */
+enum discovery_result {
+	GIT_DIR_EXPLICIT = 1,
+	GIT_DIR_DISCOVERED = 2,
+	GIT_DIR_BARE = 3,
+	/* these are errors */
+	GIT_DIR_HIT_CEILING = -1,
+	GIT_DIR_HIT_MOUNT_POINT = -2,
+	GIT_DIR_INVALID_GITFILE = -3,
+	GIT_DIR_INVALID_OWNERSHIP = -4,
+	GIT_DIR_DISALLOWED_BARE = -5,
+	GIT_DIR_INVALID_FORMAT = -6,
+	GIT_DIR_CWD_FAILURE = -7,
+};
+enum discovery_result discover_git_directory_reason(struct strbuf *commondir,
+						    struct strbuf *gitdir);
+
 /*
  * Find the commondir and gitdir of the repository that contains the current
  * working directory, without changing the working directory or other global
  * state. The result is appended to commondir and gitdir.  If the discovered
  * gitdir does not correspond to a worktree, then 'commondir' and 'gitdir' will
  * both have the same result appended to the buffer.  The return value is
- * either 0 upon success and non-zero if no repository was found.
+ * either 0 upon success and -1 if no repository was found.
  */
-int discover_git_directory(struct strbuf *commondir,
-			   struct strbuf *gitdir);
+static inline int discover_git_directory(struct strbuf *commondir,
+					 struct strbuf *gitdir)
+{
+	if (discover_git_directory_reason(commondir, gitdir) <= 0)
+		return -1;
+	return 0;
+}
+
 const char *setup_git_directory_gently(int *);
 const char *setup_git_directory(void);
 char *prefix_path(const char *prefix, int len, const char *path);