mbox series

[v2,0/3] Create stronger guard rails on replace refs

Message ID pull.1537.v2.git.1685716157.gitgitgadget@gmail.com (mailing list archive)
Headers show
Series Create stronger guard rails on replace refs | expand

Message

Jean-Noël Avila via GitGitGadget June 2, 2023, 2:29 p.m. UTC
(This series is based on tb/pack-bitmap-traversal-with-boundary due to
wanting to modify prepare_repo_settings() in a similar way.)

The replace-refs can be ignored via the core.useReplaceRefs=false config
setting. This setting is possible to miss in some Git commands if they do
not load default config at the appropriate time. See [1] for a recent
example of this.

[1]
https://lore.kernel.org/git/pull.1530.git.1683745654800.gitgitgadget@gmail.com/

This series aims to avoid this kind of error from happening in the future.
The idea is to encapsulate the setting in such a way that we can guarantee
that config has been checked before using the in-memory value.

Further, we must be careful that some Git commands want to disable replace
refs unconditionally, as if GIT_NO_REPLACE_REFS was enabled in the
environment.

The approach taken here is to split the global into two different sources.
First, read_replace_refs is kept (but moved to replace-objects.c scope) and
reflects whether or not the feature is permitted by the environment and the
current command. Second, a new value is added to repo-settings and this is
checked after using prepare_repo_settings() to guarantee the config has been
read.

This presents a potential behavior change, in that now core.useReplaceRefs
is specific to each in-memory repository instead of applying the
superproject value to all submodules. I could not find a Git command that
has multiple in-memory repositories and follows OIDs to object contents, so
I'm not sure how to demonstrate it in a test.

Here is the breakdown of the series:

 * Patch 1 creates disable_replace_refs() to encapsulate the global
   disabling of the feature.
 * Patch 2 creates replace_refs_enabled() to check if the feature is enabled
   (with respect to a given repository). This is a thin wrapper of the
   global at this point, but does allow us to remove it from environment.h.
 * Patch 3 creates the value in repo-settings as well as ensures that the
   repo settings have been prepared before accessing the value within
   replace_refs_enabled(). A test is added to demonstrate how the config
   value is now scoped on a per-repository basis.


Updates in v2
=============

Thanks for the careful review on v1!

 * disable_replace_refs() now replaces "read_replace_refs = 0" in the exact
   same line to avoid possible behavior change.
 * Stale comments, include headers, and commit messages are updated to
   include the latest status.
 * Patch 3 contains a test of the repo-scoped value using 'git grep'.

Thanks, -Stolee

Derrick Stolee (3):
  repository: create disable_replace_refs()
  replace-objects: create wrapper around setting
  repository: create read_replace_refs setting

 builtin/cat-file.c                 |  2 +-
 builtin/commit-graph.c             |  2 +-
 builtin/fsck.c                     |  2 +-
 builtin/index-pack.c               |  2 +-
 builtin/pack-objects.c             |  2 +-
 builtin/prune.c                    |  2 +-
 builtin/replace.c                  |  2 +-
 builtin/unpack-objects.c           |  2 +-
 builtin/upload-pack.c              |  2 +-
 commit-graph.c                     |  4 +--
 config.c                           |  5 ----
 environment.c                      |  3 +--
 git.c                              |  2 +-
 log-tree.c                         |  2 +-
 replace-object.c                   | 23 ++++++++++++++++-
 replace-object.h                   | 31 ++++++++++++++++-------
 repo-settings.c                    |  1 +
 repository.h                       |  9 +++++++
 t/t7814-grep-recurse-submodules.sh | 40 ++++++++++++++++++++++++++++++
 19 files changed, 107 insertions(+), 31 deletions(-)


base-commit: b0afdce5dab61f224fd66c13768facc36a7f8705
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1537%2Fderrickstolee%2Freplace-refs-safety-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1537/derrickstolee/replace-refs-safety-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/1537

Range-diff vs v1:

 1:  56544abc15d ! 1:  0616fdbf303 repository: create disable_replace_refs()
     @@ Commit message
          transition by abstracting the purpose of these global assignments with a
          method call.
      
     -    We will never scope this to an in-memory repository as we want to make
     -    sure that we never use replace refs throughout the life of the process
     -    if this method is called.
     +    We will need to keep this read_replace_refs global forever, as we want
     +    to make sure that we never use replace refs throughout the life of the
     +    process if this method is called. Future changes may present a
     +    repository-scoped version of the variable to represent that repository's
     +    core.useReplaceRefs config value, but a zero-valued read_replace_refs
     +    will always override such a setting.
      
          Signed-off-by: Derrick Stolee <derrickstolee@github.com>
      
     @@ builtin/cat-file.c: static int batch_objects(struct batch_options *opt)
       		cb.expand = &data;
      
       ## builtin/commit-graph.c ##
     -@@ builtin/commit-graph.c: static int graph_verify(int argc, const char **argv, const char *prefix)
     - 	return ret;
     - }
     - 
     --extern int read_replace_refs;
     - static struct commit_graph_opts write_opts;
     - 
     - static int write_option_parse_split(const struct option *opt, const char *arg,
      @@ builtin/commit-graph.c: int cmd_commit_graph(int argc, const char **argv, const char *prefix)
     - 	struct option *options = parse_options_concat(builtin_commit_graph_options, common_opts);
       
       	git_config(git_default_config, NULL);
     --
     + 
      -	read_replace_refs = 0;
     ++	disable_replace_refs();
       	save_commit_buffer = 0;
       
       	argc = parse_options(argc, argv, prefix, options,
     - 			     builtin_commit_graph_usage, 0);
     - 	FREE_AND_NULL(options);
     - 
     -+	disable_replace_refs();
     -+
     - 	return fn(argc, argv, prefix);
     - }
      
       ## builtin/fsck.c ##
      @@ builtin/fsck.c: int cmd_fsck(int argc, const char **argv, const char *prefix)
     @@ builtin/fsck.c: int cmd_fsck(int argc, const char **argv, const char *prefix)
       
       	errors_found = 0;
      -	read_replace_refs = 0;
     ++	disable_replace_refs();
       	save_commit_buffer = 0;
       
       	argc = parse_options(argc, argv, prefix, fsck_opts, fsck_usage, 0);
     -@@ builtin/fsck.c: int cmd_fsck(int argc, const char **argv, const char *prefix)
     - 
     - 	git_config(git_fsck_config, &fsck_obj_options);
     - 	prepare_repo_settings(the_repository);
     -+	disable_replace_refs();
     - 
     - 	if (connectivity_only) {
     - 		for_each_loose_object(mark_loose_for_connectivity, NULL, 0);
      
       ## builtin/index-pack.c ##
      @@ builtin/index-pack.c: int cmd_index_pack(int argc, const char **argv, const char *prefix)
     @@ builtin/pack-objects.c: int cmd_pack_objects(int argc, const char **argv, const
       		BUG("too many dfs states, increase OE_DFS_STATE_BITS");
       
      -	read_replace_refs = 0;
     --
     - 	sparse = git_env_bool("GIT_TEST_PACK_SPARSE", -1);
      +	disable_replace_refs();
     + 
     + 	sparse = git_env_bool("GIT_TEST_PACK_SPARSE", -1);
       	if (the_repository->gitdir) {
     - 		prepare_repo_settings(the_repository);
     - 		if (sparse < 0)
      
       ## builtin/prune.c ##
      @@ builtin/prune.c: int cmd_prune(int argc, const char **argv, const char *prefix)
     @@ builtin/prune.c: int cmd_prune(int argc, const char **argv, const char *prefix)
       	expire = TIME_MAX;
       	save_commit_buffer = 0;
      -	read_replace_refs = 0;
     ++	disable_replace_refs();
       	repo_init_revisions(the_repository, &revs, prefix);
       
       	argc = parse_options(argc, argv, prefix, options, prune_usage, 0);
     -@@ builtin/prune.c: int cmd_prune(int argc, const char **argv, const char *prefix)
     - 	if (repository_format_precious_objects)
     - 		die(_("cannot prune in a precious-objects repo"));
     - 
     -+	disable_replace_refs();
     -+
     - 	while (argc--) {
     - 		struct object_id oid;
     - 		const char *name = *argv++;
      
       ## builtin/replace.c ##
      @@ builtin/replace.c: int cmd_replace(int argc, const char **argv, const char *prefix)
     @@ builtin/replace.c: int cmd_replace(int argc, const char **argv, const char *pref
       	};
       
      -	read_replace_refs = 0;
     ++	disable_replace_refs();
       	git_config(git_default_config, NULL);
       
       	argc = parse_options(argc, argv, prefix, options, git_replace_usage, 0);
     - 
     -+	disable_replace_refs();
     -+
     - 	if (!cmdmode)
     - 		cmdmode = argc ? MODE_REPLACE : MODE_LIST;
     - 
      
       ## builtin/unpack-objects.c ##
      @@ builtin/unpack-objects.c: int cmd_unpack_objects(int argc, const char **argv, const char *prefix UNUSED)
     @@ builtin/unpack-objects.c: int cmd_unpack_objects(int argc, const char **argv, co
       	struct object_id oid;
       
      -	read_replace_refs = 0;
     --
     - 	git_config(git_default_config, NULL);
      +	disable_replace_refs();
       
     - 	quiet = !isatty(2);
     + 	git_config(git_default_config, NULL);
       
      
       ## builtin/upload-pack.c ##
     @@ builtin/upload-pack.c: int cmd_upload_pack(int argc, const char **argv, const ch
       
       	packet_trace_identity("upload-pack");
      -	read_replace_refs = 0;
     ++	disable_replace_refs();
       
       	argc = parse_options(argc, argv, prefix, options, upload_pack_usage, 0);
       
     -@@ builtin/upload-pack.c: int cmd_upload_pack(int argc, const char **argv, const char *prefix)
     - 	if (!enter_repo(dir, strict))
     - 		die("'%s' does not appear to be a git repository", dir);
     - 
     -+	disable_replace_refs();
     -+
     - 	switch (determine_protocol_version_server()) {
     - 	case protocol_v2:
     - 		if (advertise_refs)
      
       ## environment.c ##
      @@ environment.c: void setup_git_env(const char *git_dir)
     @@ replace-object.h: static inline const struct object_id *lookup_replace_object(st
      +void disable_replace_refs(void);
      +
       #endif /* REPLACE_OBJECT_H */
     -
     - ## repo-settings.c ##
     -@@
     - #include "repository.h"
     - #include "midx.h"
     - #include "compat/fsmonitor/fsm-listen.h"
     -+#include "environment.h"
     - 
     - static void repo_cfg_bool(struct repository *r, const char *key, int *dest,
     - 			  int def)
 2:  5fc2f923d9e = 2:  0831e7f8b5e replace-objects: create wrapper around setting
 3:  481a81a515e ! 3:  4c7dbeb8c6d repository: create read_replace_refs setting
     @@ Commit message
          then it would now respect the core.useReplaceRefs config value in each
          repository.
      
     -    Unfortunately, the existing processes that recurse into submodules do
     -    not appear to follow object IDs to their contents, so this behavior
     -    change is not visible in the current implementation. It is something
     -    valuable for future behavior changes.
     +    'git grep --recurse-submodules' is such a command that recurses into
     +    submodules in-process. We can demonstrate the granularity of this config
     +    value via a test in t7814.
      
          Signed-off-by: Derrick Stolee <derrickstolee@github.com>
      
     @@ repository.h: struct repo_settings {
       	int pack_use_bitmap_boundary_traversal;
       
      +	/*
     -+	 * Do replace refs need to be checked this run?  This variable is
     -+	 * initialized to true unless --no-replace-object is used or
     -+	 * $GIT_NO_REPLACE_OBJECTS is set, but is set to false by some
     -+	 * commands that do not want replace references to be active.
     ++	 * Has this repository have core.useReplaceRefs=true (on by
     ++	 * default)? This provides a repository-scoped version of this
     ++	 * config, though it could be disabled process-wide via some Git
     ++	 * builtins or the --no-replace-objects option. See
     ++	 * replace_refs_enabled() for more details.
      +	 */
      +	int read_replace_refs;
      +
       	struct fsmonitor_settings *fsmonitor; /* lazily loaded */
       
       	int index_version;
     +
     + ## t/t7814-grep-recurse-submodules.sh ##
     +@@ t/t7814-grep-recurse-submodules.sh: test_expect_success 'grep partially-cloned submodule' '
     + 	)
     + '
     + 
     ++test_expect_success 'check scope of core.useReplaceRefs' '
     ++	git init base &&
     ++	git init base/sub &&
     ++
     ++	echo A >base/a &&
     ++	echo B >base/b &&
     ++	echo C >base/sub/c &&
     ++	echo D >base/sub/d &&
     ++
     ++	git -C base/sub add c d &&
     ++	git -C base/sub commit -m "Add files" &&
     ++
     ++	git -C base submodule add ./sub &&
     ++	git -C base add a b sub &&
     ++	git -C base commit -m "Add files and submodule" &&
     ++
     ++	A=$(git -C base rev-parse HEAD:a) &&
     ++	B=$(git -C base rev-parse HEAD:b) &&
     ++	C=$(git -C base/sub rev-parse HEAD:c) &&
     ++	D=$(git -C base/sub rev-parse HEAD:d) &&
     ++
     ++	git -C base replace $A $B &&
     ++	git -C base/sub replace $C $D &&
     ++
     ++	test_must_fail git -C base grep --cached --recurse-submodules A &&
     ++	test_must_fail git -C base grep --cached --recurse-submodules C &&
     ++
     ++	git -C base config core.useReplaceRefs false &&
     ++	git -C base grep --recurse-submodules A &&
     ++	test_must_fail git -C base grep --cached --recurse-submodules C &&
     ++
     ++	git -C base/sub config core.useReplaceRefs false &&
     ++	git -C base grep --cached --recurse-submodules A &&
     ++	git -C base grep --cached --recurse-submodules C &&
     ++
     ++	git -C base config --unset core.useReplaceRefs &&
     ++	test_must_fail git -C base grep --cached --recurse-submodules A &&
     ++	git -C base grep --cached --recurse-submodules C
     ++'
     ++
     + test_done

Comments

Elijah Newren June 3, 2023, 1:52 a.m. UTC | #1
Hi,

On Fri, Jun 2, 2023 at 7:29 AM Derrick Stolee via GitGitGadget
<gitgitgadget@gmail.com> wrote:
[...]
> Updates in v2
> =============
>
> Thanks for the careful review on v1!
>
>  * disable_replace_refs() now replaces "read_replace_refs = 0" in the exact
>    same line to avoid possible behavior change.
>  * Stale comments, include headers, and commit messages are updated to
>    include the latest status.
>  * Patch 3 contains a test of the repo-scoped value using 'git grep'.

This version looks good to me; thanks for working on it!

(As a sidenote, it might be nice to call out Victoria's questions
about possibly moving stuff to environment.h[1], so that if others
have opinions on the matter they can comment on it.  I don't have one
at this time.)

[1] Last paragraph of
https://lore.kernel.org/git/49ea603b-ebbd-4a14-e0dd-07078e56de0a@github.com/