diff mbox series

[3/3] repository: create read_replace_refs setting

Message ID 481a81a515efb29bc4eb0b1a09b7d1df3f3c074b.1685126618.git.gitgitgadget@gmail.com (mailing list archive)
State Superseded
Headers show
Series Create stronger guard rails on replace refs | expand

Commit Message

Derrick Stolee May 26, 2023, 6:43 p.m. UTC
From: Derrick Stolee <derrickstolee@github.com>

The 'read_replace_refs' global specifies whether or not we should
respect the references of the form 'refs/replace/<oid>' to replace which
object we look up when asking for '<oid>'. This global has caused issues
when it is not initialized properly, such as in b6551feadfd (merge-tree:
load default git config, 2023-05-10).

To make this more robust, move its config-based initialization out of
git_default_config and into prepare_repo_settings(). This provides a
repository-scoped version of the 'read_replace_refs' global.

The global still has its purpose: it is disabled process-wide by the
GIT_NO_REPLACE_OBJECTS environment variable or by a call to
disable_replace_refs() in some specific Git commands.

Since we already encapsulated the use of the constant inside
replace_refs_enabled(), we can perform the initialization inside that
method, if necessary. This solves the problem of forgetting to check the
config, as we will check it before returning this value.

There is an interesting behavior change possible here: we now have a
repository-scoped understanding of this config value. Thus, if there was
a command that recurses into submodules and might follow replace refs,
then it would now respect the core.useReplaceRefs config value in each
repository.

Unfortunately, the existing processes that recurse into submodules do
not appear to follow object IDs to their contents, so this behavior
change is not visible in the current implementation. It is something
valuable for future behavior changes.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 config.c         |  5 -----
 replace-object.c | 13 +++++++++++--
 replace-object.h |  8 --------
 repo-settings.c  |  1 +
 repository.h     |  8 ++++++++
 5 files changed, 20 insertions(+), 15 deletions(-)

Comments

Victoria Dye June 1, 2023, 4:36 p.m. UTC | #1
Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <derrickstolee@github.com>
> 
> The 'read_replace_refs' global specifies whether or not we should
> respect the references of the form 'refs/replace/<oid>' to replace which
> object we look up when asking for '<oid>'. This global has caused issues
> when it is not initialized properly, such as in b6551feadfd (merge-tree:
> load default git config, 2023-05-10).
> 
> To make this more robust, move its config-based initialization out of
> git_default_config and into prepare_repo_settings(). This provides a
> repository-scoped version of the 'read_replace_refs' global.

As you noted in [1], this could be clearer. I think the most confusing part
is referring to it as "a repository-scoped version of the...global" because
it implies that the global and repo-scoped setting do the same thing/take
the same precedence (when, in reality, if replace refs are disabled
globally, the config doesn't do anything). Maybe something like this would
make that clearer?

"This provides a repository-scoped configuration that's only used if replace
refs are not already disabled process-wide with the global
'read_replace_refs'."

[1] https://lore.kernel.org/git/ae89feda-0a76-29d7-14ce-662214414638@github.com/

> 
> The global still has its purpose: it is disabled process-wide by the
> GIT_NO_REPLACE_OBJECTS environment variable or by a call to
> disable_replace_refs() in some specific Git commands.
> 
> Since we already encapsulated the use of the constant inside
> replace_refs_enabled(), we can perform the initialization inside that
> method, if necessary. This solves the problem of forgetting to check the
> config, as we will check it before returning this value.
> 
> There is an interesting behavior change possible here: we now have a
> repository-scoped understanding of this config value. Thus, if there was
> a command that recurses into submodules and might follow replace refs,
> then it would now respect the core.useReplaceRefs config value in each
> repository.
> 
> Unfortunately, the existing processes that recurse into submodules do
> not appear to follow object IDs to their contents, so this behavior
> change is not visible in the current implementation. It is something
> valuable for future behavior changes.

AFAIK, the only '--recurse-submodules' commands that recurse in-process are
'ls-files' and 'grep'. However, 'grep' does call 'parse_object_or_die()',
which (further down in the call stack) calls 'lookup_replace_object()'.
Maybe I'm misreading and the replaced object isn't actually used, but could
'git grep --recurse-submodules' be used to test this?

> @@ -94,5 +94,14 @@ void disable_replace_refs(void)
>  
>  int replace_refs_enabled(struct repository *r)
>  {
> -	return read_replace_refs;
> +	if (!read_replace_refs)
> +		return 0;
> +
> +	if (r->gitdir) {
> +		prepare_repo_settings(r);
> +		return r->settings.read_replace_refs;
> +	}
> +
> +	/* repository has no objects or refs. */
> +	return 0;
>  }

This implementation matches the intent outlined in this patch/the cover
letter:

- if replace refs are disabled process-wide, always return 0
- if the gitdir is present, return the value of 'core.usereplacerefs'
- if there's no gitdir, there's no repository set up (and therefore no
  config to read/objects to replace), so return 0

I was a bit unsure about whether 'r->gitdir' was the right check to make,
but it's consistent with other gates to 'prepare_repo_settings()' (e.g.
those added in 059fda19021 (checkout/fetch/pull/pack-objects: allow `-h`
outside a repository, 2022-02-08)), so I'm happy with it.

> diff --git a/repo-settings.c b/repo-settings.c
> index 1df0320bf33..5a7c990300d 100644
> --- a/repo-settings.c
> +++ b/repo-settings.c
> @@ -68,6 +68,7 @@ void prepare_repo_settings(struct repository *r)
>  	repo_cfg_bool(r, "pack.usebitmapboundarytraversal",
>  		      &r->settings.pack_use_bitmap_boundary_traversal,
>  		      r->settings.pack_use_bitmap_boundary_traversal);
> +	repo_cfg_bool(r, "core.usereplacerefs", &r->settings.read_replace_refs, 1);

This defaults to enabling replace refs, consistent with the (intended)
behavior prior to this series. Good!

>  
>  	/*
>  	 * The GIT_TEST_MULTI_PACK_INDEX variable is special in that
> diff --git a/repository.h b/repository.h
> index c42f7ab6bdc..13fefa540bc 100644
> --- a/repository.h
> +++ b/repository.h
> @@ -39,6 +39,14 @@ struct repo_settings {
>  	int pack_read_reverse_index;
>  	int pack_use_bitmap_boundary_traversal;
>  
> +	/*
> +	 * Do replace refs need to be checked this run?  This variable is
> +	 * initialized to true unless --no-replace-object is used or
> +	 * $GIT_NO_REPLACE_OBJECTS is set, but is set to false by some
> +	 * commands that do not want replace references to be active.
> +	 */
> +	int read_replace_refs;

I don't think this comment is accurate anymore, since the repo-scoped
'read_replace_refs' value is determined *only* by the 'core.usereplacerefs'
config. It's 'replace_refs_enabled()' that makes the overall determination
(taking into account 'GIT_NO_REPLACE_OBJECTS'/'--no-replace-object').

> +
>  	struct fsmonitor_settings *fsmonitor; /* lazily loaded */
>  
>  	int index_version;
Derrick Stolee June 1, 2023, 7:52 p.m. UTC | #2
On 6/1/2023 12:36 PM, Victoria Dye wrote:
> Derrick Stolee via GitGitGadget wrote:
>> From: Derrick Stolee <derrickstolee@github.com>

>> Unfortunately, the existing processes that recurse into submodules do
>> not appear to follow object IDs to their contents, so this behavior
>> change is not visible in the current implementation. It is something
>> valuable for future behavior changes.
> 
> AFAIK, the only '--recurse-submodules' commands that recurse in-process are
> 'ls-files' and 'grep'. However, 'grep' does call 'parse_object_or_die()',
> which (further down in the call stack) calls 'lookup_replace_object()'.
> Maybe I'm misreading and the replaced object isn't actually used, but could
> 'git grep --recurse-submodules' be used to test this?

You're right. I was laser-focused on 'ls-files', but it shouldn't be hard
to construct an example where 'git grep --recurse-submodules' would show
different behavior when this config is toggled.

>> +	/*
>> +	 * Do replace refs need to be checked this run?  This variable is
>> +	 * initialized to true unless --no-replace-object is used or
>> +	 * $GIT_NO_REPLACE_OBJECTS is set, but is set to false by some
>> +	 * commands that do not want replace references to be active.
>> +	 */
>> +	int read_replace_refs;
> 
> I don't think this comment is accurate anymore, since the repo-scoped
> 'read_replace_refs' value is determined *only* by the 'core.usereplacerefs'
> config. It's 'replace_refs_enabled()' that makes the overall determination
> (taking into account 'GIT_NO_REPLACE_OBJECTS'/'--no-replace-object').

Thank you for catching my mistake here. I'll be sure to update it in v2.

Thanks,
-Stolee
diff mbox series

Patch

diff --git a/config.c b/config.c
index 43b0d3fb573..d0ce902af39 100644
--- a/config.c
+++ b/config.c
@@ -1838,11 +1838,6 @@  static int git_default_core_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
-	if (!strcmp(var, "core.usereplacerefs")) {
-		read_replace_refs = git_config_bool(var, value);
-		return 0;
-	}
-
 	/* Add other config variables here and to Documentation/config.txt. */
 	return platform_core_config(var, value, cb);
 }
diff --git a/replace-object.c b/replace-object.c
index cf91c3ba456..c599729a281 100644
--- a/replace-object.c
+++ b/replace-object.c
@@ -64,7 +64,7 @@  void prepare_replace_object(struct repository *r)
  * replacement object's name (replaced recursively, if necessary).
  * The return value is either oid or a pointer to a
  * permanently-allocated value.  This function always respects replace
- * references, regardless of the value of read_replace_refs.
+ * references, regardless of the value of r->settings.read_replace_refs.
  */
 const struct object_id *do_lookup_replace_object(struct repository *r,
 						 const struct object_id *oid)
@@ -94,5 +94,14 @@  void disable_replace_refs(void)
 
 int replace_refs_enabled(struct repository *r)
 {
-	return read_replace_refs;
+	if (!read_replace_refs)
+		return 0;
+
+	if (r->gitdir) {
+		prepare_repo_settings(r);
+		return r->settings.read_replace_refs;
+	}
+
+	/* repository has no objects or refs. */
+	return 0;
 }
diff --git a/replace-object.h b/replace-object.h
index b141075023e..8813a75b96e 100644
--- a/replace-object.h
+++ b/replace-object.h
@@ -5,14 +5,6 @@ 
 #include "repository.h"
 #include "object-store.h"
 
-/*
- * Do replace refs need to be checked this run?  This variable is
- * initialized to true unless --no-replace-object is used or
- * $GIT_NO_REPLACE_OBJECTS is set, but is set to false by some
- * commands that do not want replace references to be active.
- */
-extern int read_replace_refs;
-
 struct replace_object {
 	struct oidmap_entry original;
 	struct object_id replacement;
diff --git a/repo-settings.c b/repo-settings.c
index 1df0320bf33..5a7c990300d 100644
--- a/repo-settings.c
+++ b/repo-settings.c
@@ -68,6 +68,7 @@  void prepare_repo_settings(struct repository *r)
 	repo_cfg_bool(r, "pack.usebitmapboundarytraversal",
 		      &r->settings.pack_use_bitmap_boundary_traversal,
 		      r->settings.pack_use_bitmap_boundary_traversal);
+	repo_cfg_bool(r, "core.usereplacerefs", &r->settings.read_replace_refs, 1);
 
 	/*
 	 * The GIT_TEST_MULTI_PACK_INDEX variable is special in that
diff --git a/repository.h b/repository.h
index c42f7ab6bdc..13fefa540bc 100644
--- a/repository.h
+++ b/repository.h
@@ -39,6 +39,14 @@  struct repo_settings {
 	int pack_read_reverse_index;
 	int pack_use_bitmap_boundary_traversal;
 
+	/*
+	 * Do replace refs need to be checked this run?  This variable is
+	 * initialized to true unless --no-replace-object is used or
+	 * $GIT_NO_REPLACE_OBJECTS is set, but is set to false by some
+	 * commands that do not want replace references to be active.
+	 */
+	int read_replace_refs;
+
 	struct fsmonitor_settings *fsmonitor; /* lazily loaded */
 
 	int index_version;