diff mbox series

[v2,2/2] commit-graph.c: no lazy fetch in lookup_commit_in_graph()

Message ID d3a99a5c5ae538b626e04d7069dd2fc316605dfc.1656044659.git.hanxin.hx@bytedance.com (mailing list archive)
State New, archived
Headers show
Series commit-graph.c: no lazy fetch in lookup_commit_in_graph() | expand

Commit Message

Han Xin June 24, 2022, 5:27 a.m. UTC
If a commit is in the commit graph, we would expect the commit to also
be present. So we should use has_object() instead of
repo_has_object_file(), which will help us avoid getting into an endless
loop of lazy fetch.

When we found the commit in the graph in lookup_commit_in_graph(), but
the commit is missing from the repository, we will try
promisor_remote_get_direct() and then enter another loop. While
sometimes it will finally succeed because it cannot fork subprocess,
it has exhausted the local process resources and can be harmful to the
remote service.

Signed-off-by: Han Xin <hanxin.hx@bytedance.com>
---
 commit-graph.c                             |  2 +-
 t/t5329-no-lazy-fetch-with-commit-graph.sh | 47 ++++++++++++++++++++++
 2 files changed, 48 insertions(+), 1 deletion(-)
 create mode 100755 t/t5329-no-lazy-fetch-with-commit-graph.sh

Comments

Junio C Hamano June 24, 2022, 4:56 p.m. UTC | #1
Han Xin <hanxin.hx@bytedance.com> writes:

> If a commit is in the commit graph, we would expect the commit to also
> be present.

Hmph, is that a fundamental requirement, or is that a limitation of
the current implementation?  Naïvely, I do not quite see why we
cannot first partially clone from a remote, access objects that
locally do not exist and lazily fetch them from the promissor, and
then build a reachability graph.  I expect that the resulting commit
graph records the lazily fetched objects at that point.  And then we
should be able to "lose" the lazily fetched objects that we know we
can fetch from the promissor again when we need them in the future.
And we would be in a situation where commits are pruned away, not
locally available in our object store, but can be (re)fetched from
the promisor, no?

> So we should use has_object() instead of
> repo_has_object_file(), which will help us avoid getting into an endless
> loop of lazy fetch.

It all depends on the reason we call lookup_commit_in_graph(), I
think.  Is there an easy way to remember the fact that we are
checking if object X is here with repo_has_object_file(X), so that
an on-demand fetch that happens when X does not locally exist would
not bother checking with lookup_commit_in_graph()?  IOW, temporarily
disable the use of commit-graph when we are lazily fetching?

> When we found the commit in the graph in lookup_commit_in_graph(),
> but the commit is missing from the repository, we will try
> promisor_remote_get_direct() and then enter another loop.  While
> sometimes it will finally succeed because it cannot fork
> subprocess,

Is that a mode of "succeed"-ing?  Or merely a way to exit an endless
loop that does not make any progress with a failure?

> it has exhausted the local process resources and can be harmful to the
> remote service.
>
> Signed-off-by: Han Xin <hanxin.hx@bytedance.com>
> ---

I think the single-liner change in the patch is a good one, but I am
having a hard time to agree with the reasoning above that explains
why it is a good change.

Here is an attempt to express a reasoning I can understand, can
agree with, and (I think) better describes why the change is a good
one.  Does my understanding of the problem and the solution totally
misses the mark?

	The commit-graph is used to opportunistically optimize
	accesses to certain pieces of information on commit objects,
	and lookup_commit_in_graph() tries to say "no" when the
	requested commit does not locally exist by returning NULL,
	in which case the caller can ask for (which may result in
	on-demand fetching from a promisor remote) and parse the
	commit object itself.

	However, it uses a wrong helper, repo_has_object_file(), to
	do so.  This helper not only checks if an object is
	immediately available in the local object store, but also
	tries to fetch from a promisor remote.  But the fetch
	machinery calls lookup_commit_in_graph(), thus causing an
	infinite loop.

	We should make lookup_commit_in_graph() expect that a commit
	given to it can be legitimately missing from the local
	object store, by using the has_object_file() helper instead.
	
> diff --git a/t/t5329-no-lazy-fetch-with-commit-graph.sh b/t/t5329-no-lazy-fetch-with-commit-graph.sh
> new file mode 100755
> index 0000000000..4d25d2c950
> --- /dev/null
> +++ b/t/t5329-no-lazy-fetch-with-commit-graph.sh

Hmph, does this short-test need a completely new file?

> @@ -0,0 +1,47 @@
> +#!/bin/sh
> +
> +test_description='test for no lazy fetch with the commit-graph'
> +
> +. ./test-lib.sh
> +
> +test_expect_success 'setup: prepare a repository with a commit' '
> +	git init with-commit &&
> +	test_commit -C with-commit the-commit &&
> +	oid=$(git -C with-commit rev-parse HEAD)
> +'
> +
> +test_expect_success 'setup: prepare a repository with commit-graph contains the commit' '
> +	git init with-commit-graph &&
> +	echo "$(pwd)/with-commit/.git/objects" \
> +		>with-commit-graph/.git/objects/info/alternates &&
> +	# create a ref that points to the commit in alternates
> +	git -C with-commit-graph update-ref refs/ref_to_the_commit "$oid" &&
> +	# prepare some other objects to commit-graph
> +	test_commit -C with-commit-graph somthing &&

somthing? something?

> +	git -c gc.writeCommitGraph=true -C with-commit-graph gc &&
> +	test_path_is_file with-commit-graph/.git/objects/info/commit-graph
> +'
> +
> +test_expect_success 'setup: change the alternates to what without the commit' '
> +	git init --bare without-commit &&
> +	echo "$(pwd)/without-commit/objects" \
> +		>with-commit-graph/.git/objects/info/alternates &&

Doesn't this deliberately _corrupt_ the with-commit-graph repository
that depended on the object whose name is $oid in with-commit
repository?  Do we require a corrupt repository to trigger the "bug"?

> +	test_must_fail git -C with-commit-graph cat-file -e $oid
> +'
> +
> +test_expect_success 'setup: prepare another commit to fetch' '
> +	test_commit -C with-commit another-commit &&
> +	anycommit=$(git -C with-commit rev-parse HEAD)

anycommit?  another_commit?  Be consistent in naming.

> +'
> +
> +test_expect_success ULIMIT_PROCESSES 'fetch any commit from promisor with the usage of the commit graph' '

So we did all of the above set-up sequences only to skip the most
interesting test, if we were testing with "dash"?  I suspect that it
may be cleaner to put the prerequisite to the whole file with the
"early test_done" trick like t0051 and t3008.

> +	git -C with-commit-graph remote add origin "$(pwd)/with-commit" &&
> +	git -C with-commit-graph config remote.origin.promisor true &&
> +	git -C with-commit-graph config remote.origin.partialclonefilter blob:none &&
> +	GIT_TRACE="$(pwd)/trace" run_with_limited_processses \
> +		git -C with-commit-graph fetch origin $anycommit 2>err &&
> +	test_i18ngrep ! "fatal: promisor-remote: unable to fork off fetch subprocess" err &&
> +	test $(grep "fetch origin" trace | wc -l) -eq 1
> +'
> +
> +test_done

Thanks.
Han Xin June 25, 2022, 2:25 a.m. UTC | #2
On Sat, Jun 25, 2022 at 12:56 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Han Xin <hanxin.hx@bytedance.com> writes:
>
> > If a commit is in the commit graph, we would expect the commit to also
> > be present.
>
> > When we found the commit in the graph in lookup_commit_in_graph(),
> > but the commit is missing from the repository, we will try
> > promisor_remote_get_direct() and then enter another loop.  While
> > sometimes it will finally succeed because it cannot fork
> > subprocess,
>
> Is that a mode of "succeed"-ing?  Or merely a way to exit an endless
> loop that does not make any progress with a failure?

For the user, "fetch-pack" does succeed, because in
deref_without_lazy_fetch(), even if lookup_commit_in_graph() fails to
lazy fetch the lost commit, the following oid_object_info_extended()
will help us complete the previous work.

In a sense, this infinite loop is based on the fact that infinite processes
can be created.

However, your attempt to express the reasoning bellow is clearer.

>
> > it has exhausted the local process resources and can be harmful to the
> > remote service.
> >
> > Signed-off-by: Han Xin <hanxin.hx@bytedance.com>
> > ---
>
> I think the single-liner change in the patch is a good one, but I am
> having a hard time to agree with the reasoning above that explains
> why it is a good change.
>
> Here is an attempt to express a reasoning I can understand, can
> agree with, and (I think) better describes why the change is a good
> one.  Does my understanding of the problem and the solution totally
> misses the mark?
>
>         The commit-graph is used to opportunistically optimize
>         accesses to certain pieces of information on commit objects,
>         and lookup_commit_in_graph() tries to say "no" when the
>         requested commit does not locally exist by returning NULL,
>         in which case the caller can ask for (which may result in
>         on-demand fetching from a promisor remote) and parse the
>         commit object itself.
>
>         However, it uses a wrong helper, repo_has_object_file(), to
>         do so.  This helper not only checks if an object is
>         immediately available in the local object store, but also
>         tries to fetch from a promisor remote.  But the fetch
>         machinery calls lookup_commit_in_graph(), thus causing an
>         infinite loop.
>
>         We should make lookup_commit_in_graph() expect that a commit
>         given to it can be legitimately missing from the local
>         object store, by using the has_object_file() helper instead.
>
> > diff --git a/t/t5329-no-lazy-fetch-with-commit-graph.sh b/t/t5329-no-lazy-fetch-with-commit-graph.sh
> > new file mode 100755
> > index 0000000000..4d25d2c950
> > --- /dev/null
> > +++ b/t/t5329-no-lazy-fetch-with-commit-graph.sh
>
> Hmph, does this short-test need a completely new file?
>
> > @@ -0,0 +1,47 @@
> > +#!/bin/sh
> > +
> > +test_description='test for no lazy fetch with the commit-graph'
> > +
> > +. ./test-lib.sh
> > +
> > +test_expect_success 'setup: prepare a repository with a commit' '
> > +     git init with-commit &&
> > +     test_commit -C with-commit the-commit &&
> > +     oid=$(git -C with-commit rev-parse HEAD)
> > +'
> > +
> > +test_expect_success 'setup: prepare a repository with commit-graph contains the commit' '
> > +     git init with-commit-graph &&
> > +     echo "$(pwd)/with-commit/.git/objects" \
> > +             >with-commit-graph/.git/objects/info/alternates &&
> > +     # create a ref that points to the commit in alternates
> > +     git -C with-commit-graph update-ref refs/ref_to_the_commit "$oid" &&
> > +     # prepare some other objects to commit-graph
> > +     test_commit -C with-commit-graph somthing &&
>
> somthing? something?

Nod.

>
> > +     git -c gc.writeCommitGraph=true -C with-commit-graph gc &&
> > +     test_path_is_file with-commit-graph/.git/objects/info/commit-graph
> > +'
> > +
> > +test_expect_success 'setup: change the alternates to what without the commit' '
> > +     git init --bare without-commit &&
> > +     echo "$(pwd)/without-commit/objects" \
> > +             >with-commit-graph/.git/objects/info/alternates &&
>
> Doesn't this deliberately _corrupt_ the with-commit-graph repository
> that depended on the object whose name is $oid in with-commit
> repository?  Do we require a corrupt repository to trigger the "bug"?
>

The "bug" depends on the commit exist in the commit-graph but
missing in the repository.

I didn't find a better way to make this kind of scene.

This bug was first found when alternates and commit-graph were
both used. Since the promise did not maintain all the references,
I suspect that the "auto gc" during the update process of the promise
caused the loss of the unreachable commits in the promise.

> > +     test_must_fail git -C with-commit-graph cat-file -e $oid
> > +'
> > +
> > +test_expect_success 'setup: prepare another commit to fetch' '
> > +     test_commit -C with-commit another-commit &&
> > +     anycommit=$(git -C with-commit rev-parse HEAD)
>
> anycommit?  another_commit?  Be consistent in naming.
>

Nod.

> > +'
> > +
> > +test_expect_success ULIMIT_PROCESSES 'fetch any commit from promisor with the usage of the commit graph' '
>
> So we did all of the above set-up sequences only to skip the most
> interesting test, if we were testing with "dash"?  I suspect that it
> may be cleaner to put the prerequisite to the whole file with the
> "early test_done" trick like t0051 and t3008.
>

It make sense to me.

Thanks.
-Han Xin
Han Xin June 25, 2022, 2:31 a.m. UTC | #3
On Sat, Jun 25, 2022 at 10:25 AM Han Xin <hanxin.hx@bytedance.com> wrote:
>
> The "bug" depends on the commit exist in the commit-graph but
> missing in the repository.
>
> I didn't find a better way to make this kind of scene.
>
> This bug was first found when alternates and commit-graph were
> both used. Since the promise did not maintain all the references,
> I suspect that the "auto gc" during the update process of the promise
> caused the loss of the unreachable commits in the promise.
>

Some mistakes here:
This bug was first found when alternates and commit-graph were
both used. Since the promise did not maintain all the references,
I suspect that the "auto gc" during the update of the alternates
caused the loss of the unreachable commits.
diff mbox series

Patch

diff --git a/commit-graph.c b/commit-graph.c
index 2b52818731..2dd9bcc7ea 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -907,7 +907,7 @@  struct commit *lookup_commit_in_graph(struct repository *repo, const struct obje
 		return NULL;
 	if (!search_commit_pos_in_graph(id, repo->objects->commit_graph, &pos))
 		return NULL;
-	if (!repo_has_object_file(repo, id))
+	if (!has_object(repo, id, 0))
 		return NULL;
 
 	commit = lookup_commit(repo, id);
diff --git a/t/t5329-no-lazy-fetch-with-commit-graph.sh b/t/t5329-no-lazy-fetch-with-commit-graph.sh
new file mode 100755
index 0000000000..4d25d2c950
--- /dev/null
+++ b/t/t5329-no-lazy-fetch-with-commit-graph.sh
@@ -0,0 +1,47 @@ 
+#!/bin/sh
+
+test_description='test for no lazy fetch with the commit-graph'
+
+. ./test-lib.sh
+
+test_expect_success 'setup: prepare a repository with a commit' '
+	git init with-commit &&
+	test_commit -C with-commit the-commit &&
+	oid=$(git -C with-commit rev-parse HEAD)
+'
+
+test_expect_success 'setup: prepare a repository with commit-graph contains the commit' '
+	git init with-commit-graph &&
+	echo "$(pwd)/with-commit/.git/objects" \
+		>with-commit-graph/.git/objects/info/alternates &&
+	# create a ref that points to the commit in alternates
+	git -C with-commit-graph update-ref refs/ref_to_the_commit "$oid" &&
+	# prepare some other objects to commit-graph
+	test_commit -C with-commit-graph somthing &&
+	git -c gc.writeCommitGraph=true -C with-commit-graph gc &&
+	test_path_is_file with-commit-graph/.git/objects/info/commit-graph
+'
+
+test_expect_success 'setup: change the alternates to what without the commit' '
+	git init --bare without-commit &&
+	echo "$(pwd)/without-commit/objects" \
+		>with-commit-graph/.git/objects/info/alternates &&
+	test_must_fail git -C with-commit-graph cat-file -e $oid
+'
+
+test_expect_success 'setup: prepare another commit to fetch' '
+	test_commit -C with-commit another-commit &&
+	anycommit=$(git -C with-commit rev-parse HEAD)
+'
+
+test_expect_success ULIMIT_PROCESSES 'fetch any commit from promisor with the usage of the commit graph' '
+	git -C with-commit-graph remote add origin "$(pwd)/with-commit" &&
+	git -C with-commit-graph config remote.origin.promisor true &&
+	git -C with-commit-graph config remote.origin.partialclonefilter blob:none &&
+	GIT_TRACE="$(pwd)/trace" run_with_limited_processses \
+		git -C with-commit-graph fetch origin $anycommit 2>err &&
+	test_i18ngrep ! "fatal: promisor-remote: unable to fork off fetch subprocess" err &&
+	test $(grep "fetch origin" trace | wc -l) -eq 1
+'
+
+test_done