diff mbox series

[4/4] promisor-remote: teach lazy-fetch in any repo

Message ID b70a00b9b06e5142bb95891cfc2faa87d708ef0d.1622580781.git.jonathantanmy@google.com (mailing list archive)
State Superseded
Headers show
Series First steps towards partial clone submodules | expand

Commit Message

Jonathan Tan June 1, 2021, 9:34 p.m. UTC
This is one step towards supporting partial clone submodules.

Even after this patch, we will still lack partial clone submodules
support, primarily because a lot of Git code that accesses submodule
objects does so by adding their object stores as alternates, meaning
that any lazy fetches that would occur in the submodule would be done
based on the config of the superproject, not of the submodule. This also
prevents testing of the functionality in this patch by user-facing
commands. So for now, test this mechanism using a test helper.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 Makefile                      |  1 +
 object-file.c                 |  7 ++-----
 promisor-remote.c             | 14 +++++++++-----
 t/helper/test-partial-clone.c | 34 ++++++++++++++++++++++++++++++++++
 t/helper/test-tool.c          |  1 +
 t/helper/test-tool.h          |  1 +
 t/t0410-partial-clone.sh      | 24 ++++++++++++++++++++++++
 7 files changed, 72 insertions(+), 10 deletions(-)
 create mode 100644 t/helper/test-partial-clone.c

Comments

Taylor Blau June 4, 2021, 9:25 p.m. UTC | #1
On Tue, Jun 01, 2021 at 02:34:19PM -0700, Jonathan Tan wrote:
> This is one step towards supporting partial clone submodules.
>
> Even after this patch, we will still lack partial clone submodules
> support, primarily because a lot of Git code that accesses submodule
> objects does so by adding their object stores as alternates, meaning
> that any lazy fetches that would occur in the submodule would be done
> based on the config of the superproject, not of the submodule. This also
> prevents testing of the functionality in this patch by user-facing
> commands. So for now, test this mechanism using a test helper.

OK. Everything you wrote seemed reasonable to me, but I did have a
couple of questions on the test you added:

> diff --git a/t/helper/test-partial-clone.c b/t/helper/test-partial-clone.c
> new file mode 100644
> index 0000000000..e7bc7eb21f
> --- /dev/null
> +++ b/t/helper/test-partial-clone.c
> @@ -0,0 +1,34 @@
> +#include "cache.h"
> +#include "test-tool.h"
> +#include "repository.h"
> +#include "object-store.h"
> +
> +static void object_info(const char *gitdir, const char *oid_hex)
> +{
> +	struct repository r;
> +	struct object_id oid;
> +	unsigned long size;
> +	struct object_info oi = {.sizep = &size};
> +	const char *p;
> +
> +	if (repo_init(&r, gitdir, NULL))
> +		die("could not init repo");
> +	if (parse_oid_hex(oid_hex, &oid, &p))
> +		die("could not parse oid");
> +	if (oid_object_info_extended(&r, &oid, &oi, 0))
> +		die("could not obtain object info");
> +	printf("%d\n", (int) size);
> +}

Hmm. Is there a reason that the same couldn't be implemented by calling "git
cat-file -s" from the partial clone?

> diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
> index 584a039b85..e804d267e6 100755
> --- a/t/t0410-partial-clone.sh
> +++ b/t/t0410-partial-clone.sh
> @@ -604,6 +604,30 @@ test_expect_success 'do not fetch when checking existence of tree we construct o
>  	git -C repo cherry-pick side1
>  '
>
> +test_expect_success 'lazy-fetch when accessing object not in the_repository' '
> +	rm -rf full partial.git &&
> +	test_create_repo full &&
> +	printf 12345 >full/file.txt &&
> +	git -C full add file.txt &&
> +	git -C full commit -m "first commit" &&

This is a stylistic nit, but I think using test_commit is better here
for a non-superficial reason. My guess is that you wanted to avoid
specifying a message and file (which are required positional arguments
to test_commit before you can specify the contents). But I think there
are two good reasons to use test_commit here:

  - It saves three lines of test script here.
  - You don't have to make the expected size a magic number (i.e.,
    because you knew ahead of time that the contents was "12345").

I probably would have expected this test to end with:

  git -C full cat-file -s $FILE_HASH >expect &&
  git -C partial.git cat-file -s $FILE_HASH >actual &&
  test_cmp expect actual

which reads more clearly to me (although I think the much more important
test is that $FILE_HASH doesn't show up in the output of the rev-list
--missing=print that is run in the partial clone).

> +
> +	test_config -C full uploadpack.allowfilter 1 &&
> +	test_config -C full uploadpack.allowanysha1inwant 1 &&
> +	git clone --filter=blob:none --bare "file://$(pwd)/full" partial.git &&
> +	FILE_HASH=$(git hash-object --stdin <full/file.txt) &&

This works for me, although I wouldn't have been sad to see the
sub-shell contain "git -C full rev-parse HEAD:file.txt" instead.

> +	# Sanity check that the file is missing
> +	git -C partial.git rev-list --objects --missing=print HEAD >out &&
> +	grep "[?]$FILE_HASH" out &&
> +
> +	OUT=$(test-tool partial-clone object-info partial.git "$FILE_HASH") &&

Coming back to my point about the utility of the partial-clone helper,
could this be replaced by saying just OUT="$(git -C partial.git cat-file
-s "$FILE_HASH")" instead?

Thanks,
Taylor
Elijah Newren June 4, 2021, 9:35 p.m. UTC | #2
On Tue, Jun 1, 2021 at 2:38 PM Jonathan Tan <jonathantanmy@google.com> wrote:
>
> This is one step towards supporting partial clone submodules.
>
> Even after this patch, we will still lack partial clone submodules
> support, primarily because a lot of Git code that accesses submodule
> objects does so by adding their object stores as alternates, meaning
> that any lazy fetches that would occur in the submodule would be done
> based on the config of the superproject, not of the submodule. This also
> prevents testing of the functionality in this patch by user-facing
> commands. So for now, test this mechanism using a test helper.
>
> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> ---
>  Makefile                      |  1 +
>  object-file.c                 |  7 ++-----
>  promisor-remote.c             | 14 +++++++++-----
>  t/helper/test-partial-clone.c | 34 ++++++++++++++++++++++++++++++++++
>  t/helper/test-tool.c          |  1 +
>  t/helper/test-tool.h          |  1 +
>  t/t0410-partial-clone.sh      | 24 ++++++++++++++++++++++++
>  7 files changed, 72 insertions(+), 10 deletions(-)
>  create mode 100644 t/helper/test-partial-clone.c
>
> diff --git a/Makefile b/Makefile
> index c3565fc0f8..f6653bcd5e 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -725,6 +725,7 @@ TEST_BUILTINS_OBJS += test-oidmap.o
>  TEST_BUILTINS_OBJS += test-online-cpus.o
>  TEST_BUILTINS_OBJS += test-parse-options.o
>  TEST_BUILTINS_OBJS += test-parse-pathspec-file.o
> +TEST_BUILTINS_OBJS += test-partial-clone.o
>  TEST_BUILTINS_OBJS += test-path-utils.o
>  TEST_BUILTINS_OBJS += test-pcre2-config.o
>  TEST_BUILTINS_OBJS += test-pkt-line.o
> diff --git a/object-file.c b/object-file.c
> index f233b440b2..ebf273e9e7 100644
> --- a/object-file.c
> +++ b/object-file.c
> @@ -1570,15 +1570,12 @@ static int do_oid_object_info_extended(struct repository *r,
>                 }
>
>                 /* Check if it is a missing object */
> -               if (fetch_if_missing && has_promisor_remote() &&
> -                   !already_retried && r == the_repository &&
> +               if (fetch_if_missing && repo_has_promisor_remote(r) &&
> +                   !already_retried &&

So here you removed the special check against the_repository while
looking for promisor_remotes.  There are other such special checks in
the code; I also see:

diff.c: if (options->repo == the_repository && has_promisor_remote() &&
diffcore-break.c:       if (r == the_repository && has_promisor_remote()) {
diffcore-rename.c:      if (r == the_repository && has_promisor_remote()) {

and a series I'm planning to submit soon will add another to merge.ort.c.

Do these need to all be fixed as part of the partial clone submodule
support as well?  Do I need to change anything about my series?  I
guess since I'm asking that, I should probably submit it first so you
can actually see it and answer my question.  (And the timing may be
good since the area is fresh in your memory...)

>                     !(flags & OBJECT_INFO_SKIP_FETCH_OBJECT)) {
>                         /*
>                          * TODO Investigate checking promisor_remote_get_direct()
>                          * TODO return value and stopping on error here.
> -                        * TODO Pass a repository struct through
> -                        * promisor_remote_get_direct(), such that arbitrary
> -                        * repositories work.

Odd, it appears that when this comment was added (in commit b14ed5adaf
("Use promisor_remote_get_direct() and has_promisor_remote()",
2019-06-25)), a repository was passed to promisor_remote_get_direct().
Sure, it was just a transliteration of the comment that was there
before when fetch_objects() was the function being called, but since
the code was being changed and the comment being updated, it seems the
TODO should have been removed back then.

Oh, well, good to update it now at least.

>                          */
>                         promisor_remote_get_direct(r, real, 1);
>                         already_retried = 1;
> diff --git a/promisor-remote.c b/promisor-remote.c
> index 5819d2cf28..1601f05d79 100644
> --- a/promisor-remote.c
> +++ b/promisor-remote.c
> @@ -11,7 +11,8 @@ struct promisor_remote_config {
>         struct promisor_remote **promisors_tail;
>  };
>
> -static int fetch_objects(const char *remote_name,
> +static int fetch_objects(struct repository *repo,
> +                        const char *remote_name,
>                          const struct object_id *oids,
>                          int oid_nr)
>  {
> @@ -21,6 +22,11 @@ static int fetch_objects(const char *remote_name,
>
>         child.git_cmd = 1;
>         child.in = -1;
> +       if (repo != the_repository) {
> +               prepare_other_repo_env(&child.env_array);
> +               strvec_pushf(&child.env_array, "%s=%s", GIT_DIR_ENVIRONMENT,
> +                            repo->gitdir);
> +       }
>         strvec_pushl(&child.args, "-c", "fetch.negotiationAlgorithm=noop",
>                      "fetch", remote_name, "--no-tags",
>                      "--no-write-fetch-head", "--recurse-submodules=no",
> @@ -150,7 +156,7 @@ static void promisor_remote_init(struct repository *r)
>                 xcalloc(sizeof(*r->promisor_remote_config), 1);
>         config->promisors_tail = &config->promisors;
>
> -       git_config(promisor_remote_config, config);
> +       repo_config(r, promisor_remote_config, config);
>
>         if (config->repository_format_partial_clone) {
>                 struct promisor_remote *o, *previous;
> @@ -246,10 +252,8 @@ int promisor_remote_get_direct(struct repository *repo,
>
>         promisor_remote_init(repo);
>
> -       if (repo != the_repository)
> -               BUG("only the_repository is supported for now");
>         for (r = repo->promisor_remote_config->promisors; r; r = r->next) {
> -               if (fetch_objects(r->name, remaining_oids, remaining_nr) < 0) {
> +               if (fetch_objects(repo, r->name, remaining_oids, remaining_nr) < 0) {
>                         if (remaining_nr == 1)
>                                 continue;
>                         remaining_nr = remove_fetched_oids(repo, &remaining_oids,
> diff --git a/t/helper/test-partial-clone.c b/t/helper/test-partial-clone.c
> new file mode 100644
> index 0000000000..e7bc7eb21f
> --- /dev/null
> +++ b/t/helper/test-partial-clone.c
> @@ -0,0 +1,34 @@
> +#include "cache.h"
> +#include "test-tool.h"
> +#include "repository.h"
> +#include "object-store.h"
> +
> +static void object_info(const char *gitdir, const char *oid_hex)
> +{
> +       struct repository r;
> +       struct object_id oid;
> +       unsigned long size;
> +       struct object_info oi = {.sizep = &size};
> +       const char *p;
> +
> +       if (repo_init(&r, gitdir, NULL))
> +               die("could not init repo");
> +       if (parse_oid_hex(oid_hex, &oid, &p))
> +               die("could not parse oid");
> +       if (oid_object_info_extended(&r, &oid, &oi, 0))
> +               die("could not obtain object info");
> +       printf("%d\n", (int) size);
> +}
> +
> +int cmd__partial_clone(int argc, const char **argv)
> +{
> +       if (argc < 4)
> +               die("too few arguments");
> +
> +       if (!strcmp(argv[1], "object-info"))
> +               object_info(argv[2], argv[3]);
> +       else
> +               die("invalid argument '%s'", argv[1]);
> +
> +       return 0;
> +}
> diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
> index c5bd0c6d4c..b21e8f1519 100644
> --- a/t/helper/test-tool.c
> +++ b/t/helper/test-tool.c
> @@ -46,6 +46,7 @@ static struct test_cmd cmds[] = {
>         { "online-cpus", cmd__online_cpus },
>         { "parse-options", cmd__parse_options },
>         { "parse-pathspec-file", cmd__parse_pathspec_file },
> +       { "partial-clone", cmd__partial_clone },
>         { "path-utils", cmd__path_utils },
>         { "pcre2-config", cmd__pcre2_config },
>         { "pkt-line", cmd__pkt_line },
> diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
> index e8069a3b22..f845ced4b3 100644
> --- a/t/helper/test-tool.h
> +++ b/t/helper/test-tool.h
> @@ -35,6 +35,7 @@ int cmd__oidmap(int argc, const char **argv);
>  int cmd__online_cpus(int argc, const char **argv);
>  int cmd__parse_options(int argc, const char **argv);
>  int cmd__parse_pathspec_file(int argc, const char** argv);
> +int cmd__partial_clone(int argc, const char **argv);
>  int cmd__path_utils(int argc, const char **argv);
>  int cmd__pcre2_config(int argc, const char **argv);
>  int cmd__pkt_line(int argc, const char **argv);
> diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
> index 584a039b85..e804d267e6 100755
> --- a/t/t0410-partial-clone.sh
> +++ b/t/t0410-partial-clone.sh
> @@ -604,6 +604,30 @@ test_expect_success 'do not fetch when checking existence of tree we construct o
>         git -C repo cherry-pick side1
>  '
>
> +test_expect_success 'lazy-fetch when accessing object not in the_repository' '
> +       rm -rf full partial.git &&
> +       test_create_repo full &&
> +       printf 12345 >full/file.txt &&
> +       git -C full add file.txt &&
> +       git -C full commit -m "first commit" &&
> +
> +       test_config -C full uploadpack.allowfilter 1 &&
> +       test_config -C full uploadpack.allowanysha1inwant 1 &&
> +       git clone --filter=blob:none --bare "file://$(pwd)/full" partial.git &&
> +       FILE_HASH=$(git hash-object --stdin <full/file.txt) &&
> +
> +       # Sanity check that the file is missing
> +       git -C partial.git rev-list --objects --missing=print HEAD >out &&
> +       grep "[?]$FILE_HASH" out &&
> +
> +       OUT=$(test-tool partial-clone object-info partial.git "$FILE_HASH") &&
> +       test "$OUT" -eq 5 &&
> +
> +       # Sanity check that the file is now present
> +       git -C partial.git rev-list --objects --missing=print HEAD >out &&
> +       ! grep "[?]$FILE_HASH" out
> +'
> +
>  . "$TEST_DIRECTORY"/lib-httpd.sh
>  start_httpd
>
> --
> 2.32.0.rc0.204.g9fa02ecfa5-goog

Looks good to me.
Elijah Newren June 5, 2021, 12:22 a.m. UTC | #3
Hi,

On Tue, Jun 1, 2021 at 2:38 PM Jonathan Tan <jonathantanmy@google.com> wrote:
>
> This is one step towards supporting partial clone submodules.
>
> Even after this patch, we will still lack partial clone submodules
> support, primarily because a lot of Git code that accesses submodule
> objects does so by adding their object stores as alternates, meaning
> that any lazy fetches that would occur in the submodule would be done
> based on the config of the superproject, not of the submodule. This also
> prevents testing of the functionality in this patch by user-facing
> commands. So for now, test this mechanism using a test helper.
>
...
> diff --git a/t/helper/test-partial-clone.c b/t/helper/test-partial-clone.c
> new file mode 100644
> index 0000000000..e7bc7eb21f
> --- /dev/null
> +++ b/t/helper/test-partial-clone.c
> @@ -0,0 +1,34 @@
> +#include "cache.h"
> +#include "test-tool.h"
> +#include "repository.h"
> +#include "object-store.h"
> +
> +static void object_info(const char *gitdir, const char *oid_hex)
> +{
> +       struct repository r;
> +       struct object_id oid;
> +       unsigned long size;
> +       struct object_info oi = {.sizep = &size};
> +       const char *p;
> +
> +       if (repo_init(&r, gitdir, NULL))
> +               die("could not init repo");
> +       if (parse_oid_hex(oid_hex, &oid, &p))
> +               die("could not parse oid");
> +       if (oid_object_info_extended(&r, &oid, &oi, 0))
> +               die("could not obtain object info");
> +       printf("%d\n", (int) size);
> +}
> +
> +int cmd__partial_clone(int argc, const char **argv)
> +{
> +       if (argc < 4)
> +               die("too few arguments");
> +
> +       if (!strcmp(argv[1], "object-info"))
> +               object_info(argv[2], argv[3]);
> +       else
> +               die("invalid argument '%s'", argv[1]);
> +
> +       return 0;
> +}
> diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
> index c5bd0c6d4c..b21e8f1519 100644
> --- a/t/helper/test-tool.c
> +++ b/t/helper/test-tool.c
> @@ -46,6 +46,7 @@ static struct test_cmd cmds[] = {
>         { "online-cpus", cmd__online_cpus },
>         { "parse-options", cmd__parse_options },
>         { "parse-pathspec-file", cmd__parse_pathspec_file },
> +       { "partial-clone", cmd__partial_clone },
>         { "path-utils", cmd__path_utils },
>         { "pcre2-config", cmd__pcre2_config },
>         { "pkt-line", cmd__pkt_line },
> diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
> index e8069a3b22..f845ced4b3 100644
> --- a/t/helper/test-tool.h
> +++ b/t/helper/test-tool.h
> @@ -35,6 +35,7 @@ int cmd__oidmap(int argc, const char **argv);
>  int cmd__online_cpus(int argc, const char **argv);
>  int cmd__parse_options(int argc, const char **argv);
>  int cmd__parse_pathspec_file(int argc, const char** argv);
> +int cmd__partial_clone(int argc, const char **argv);
>  int cmd__path_utils(int argc, const char **argv);
>  int cmd__pcre2_config(int argc, const char **argv);
>  int cmd__pkt_line(int argc, const char **argv);
> diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
> index 584a039b85..e804d267e6 100755
> --- a/t/t0410-partial-clone.sh
> +++ b/t/t0410-partial-clone.sh
> @@ -604,6 +604,30 @@ test_expect_success 'do not fetch when checking existence of tree we construct o
>         git -C repo cherry-pick side1
>  '
>
> +test_expect_success 'lazy-fetch when accessing object not in the_repository' '
> +       rm -rf full partial.git &&
> +       test_create_repo full &&
> +       printf 12345 >full/file.txt &&
> +       git -C full add file.txt &&
> +       git -C full commit -m "first commit" &&
> +
> +       test_config -C full uploadpack.allowfilter 1 &&
> +       test_config -C full uploadpack.allowanysha1inwant 1 &&
> +       git clone --filter=blob:none --bare "file://$(pwd)/full" partial.git &&
> +       FILE_HASH=$(git hash-object --stdin <full/file.txt) &&
> +
> +       # Sanity check that the file is missing
> +       git -C partial.git rev-list --objects --missing=print HEAD >out &&
> +       grep "[?]$FILE_HASH" out &&
> +
> +       OUT=$(test-tool partial-clone object-info partial.git "$FILE_HASH") &&
> +       test "$OUT" -eq 5 &&
> +
> +       # Sanity check that the file is now present
> +       git -C partial.git rev-list --objects --missing=print HEAD >out &&
> +       ! grep "[?]$FILE_HASH" out
> +'
> +

Turns out that this test fails under GIT_TEST_DEFAULT_HASH=sha256; output:

error: wrong index v2 file size in /home/newren/floss/git/t/trash
directory.t0410-partial-clone/partial.git/objects/pack/pack-66a15be115d740341216938fb7abb31902e960bd6d464829d85164d1a4a25bec.idx
error: wrong index v2 file size in /home/newren/floss/git/t/trash
directory.t0410-partial-clone/partial.git/objects/pack/pack-66a15be115d740341216938fb7abb31902e960bd6d464829d85164d1a4a25bec.idx
fatal: couldn't find remote ref 74242c6e4a0d89f454d89d3496a1f7cb3f1f39f0
error: wrong index v2 file size in /home/newren/floss/git/t/trash
directory.t0410-partial-clone/partial.git/objects/pack/pack-66a15be115d740341216938fb7abb31902e960bd6d464829d85164d1a4a25bec.idx
error: wrong index v2 file size in /home/newren/floss/git/t/trash
directory.t0410-partial-clone/partial.git/objects/pack/pack-66a15be115d740341216938fb7abb31902e960bd6d464829d85164d1a4a25bec.idx
fatal: could not obtain object info
Jonathan Tan June 5, 2021, 2:11 a.m. UTC | #4
> > +static void object_info(const char *gitdir, const char *oid_hex)
> > +{
> > +	struct repository r;
> > +	struct object_id oid;
> > +	unsigned long size;
> > +	struct object_info oi = {.sizep = &size};
> > +	const char *p;
> > +
> > +	if (repo_init(&r, gitdir, NULL))
> > +		die("could not init repo");
> > +	if (parse_oid_hex(oid_hex, &oid, &p))
> > +		die("could not parse oid");
> > +	if (oid_object_info_extended(&r, &oid, &oi, 0))
> > +		die("could not obtain object info");
> > +	printf("%d\n", (int) size);
> > +}
> 
> Hmm. Is there a reason that the same couldn't be implemented by calling "git
> cat-file -s" from the partial clone?

I don't think "git cat-file" (when run in the superproject) by itself
can access an object from a submodule. "git -C name_of_submodule
cat-file $HASH" would access that object, but I specifically want to
test oid_object_info_extended() on a repo that is not the_repository
(which would not work with -C, because the_repository would then be the
submodule).

> > +test_expect_success 'lazy-fetch when accessing object not in the_repository' '
> > +	rm -rf full partial.git &&
> > +	test_create_repo full &&
> > +	printf 12345 >full/file.txt &&
> > +	git -C full add file.txt &&
> > +	git -C full commit -m "first commit" &&
> 
> This is a stylistic nit, but I think using test_commit is better here
> for a non-superficial reason. My guess is that you wanted to avoid
> specifying a message and file (which are required positional arguments
> to test_commit before you can specify the contents). But I think there
> are two good reasons to use test_commit here:
> 
>   - It saves three lines of test script here.
>   - You don't have to make the expected size a magic number (i.e.,
>     because you knew ahead of time that the contents was "12345").
> 
> I probably would have expected this test to end with:
> 
>   git -C full cat-file -s $FILE_HASH >expect &&
>   git -C partial.git cat-file -s $FILE_HASH >actual &&
>   test_cmp expect actual
> 
> which reads more clearly to me (although I think the much more important
> test is that $FILE_HASH doesn't show up in the output of the rev-list
> --missing=print that is run in the partial clone).

That makes sense.

> > +	test_config -C full uploadpack.allowfilter 1 &&
> > +	test_config -C full uploadpack.allowanysha1inwant 1 &&
> > +	git clone --filter=blob:none --bare "file://$(pwd)/full" partial.git &&
> > +	FILE_HASH=$(git hash-object --stdin <full/file.txt) &&
> 
> This works for me, although I wouldn't have been sad to see the
> sub-shell contain "git -C full rev-parse HEAD:file.txt" instead.

I'll do this too.

> > +	# Sanity check that the file is missing
> > +	git -C partial.git rev-list --objects --missing=print HEAD >out &&
> > +	grep "[?]$FILE_HASH" out &&
> > +
> > +	OUT=$(test-tool partial-clone object-info partial.git "$FILE_HASH") &&
> 
> Coming back to my point about the utility of the partial-clone helper,
> could this be replaced by saying just OUT="$(git -C partial.git cat-file
> -s "$FILE_HASH")" instead?
> 
> Thanks,
> Taylor

Same answer as above - I specifically want to test accessing (and
thereby lazy-fetching) an object when the object is not in
the_repository. I'll add some documentation to explain what it does and
that it's equivalent to using "git -C repo cat-file -s", except that
this specifically tests another code path.
Jonathan Tan June 5, 2021, 2:16 a.m. UTC | #5
> > diff --git a/object-file.c b/object-file.c
> > index f233b440b2..ebf273e9e7 100644
> > --- a/object-file.c
> > +++ b/object-file.c
> > @@ -1570,15 +1570,12 @@ static int do_oid_object_info_extended(struct repository *r,
> >                 }
> >
> >                 /* Check if it is a missing object */
> > -               if (fetch_if_missing && has_promisor_remote() &&
> > -                   !already_retried && r == the_repository &&
> > +               if (fetch_if_missing && repo_has_promisor_remote(r) &&
> > +                   !already_retried &&
> 
> So here you removed the special check against the_repository while
> looking for promisor_remotes.  There are other such special checks in
> the code; I also see:
> 
> diff.c: if (options->repo == the_repository && has_promisor_remote() &&
> diffcore-break.c:       if (r == the_repository && has_promisor_remote()) {
> diffcore-rename.c:      if (r == the_repository && has_promisor_remote()) {
> 
> and a series I'm planning to submit soon will add another to merge.ort.c.
> 
> Do these need to all be fixed as part of the partial clone submodule
> support as well?  Do I need to change anything about my series?  I
> guess since I'm asking that, I should probably submit it first so you
> can actually see it and answer my question.  (And the timing may be
> good since the area is fresh in your memory...)

Thanks for raising this. Looking at the 3 you listed, they all have to
do with prefetching. This is fine both now and later. Now, since partial
clones in submodules still don't work, any fetching of any sort (pre- or
not) will not work. Later, since this prefetching is just an
optimization. (Of course, we should come back and add prefetching for
submodule partial clones, but that is an optimization, not a correctness
issue.)

> >                     !(flags & OBJECT_INFO_SKIP_FETCH_OBJECT)) {
> >                         /*
> >                          * TODO Investigate checking promisor_remote_get_direct()
> >                          * TODO return value and stopping on error here.
> > -                        * TODO Pass a repository struct through
> > -                        * promisor_remote_get_direct(), such that arbitrary
> > -                        * repositories work.
> 
> Odd, it appears that when this comment was added (in commit b14ed5adaf
> ("Use promisor_remote_get_direct() and has_promisor_remote()",
> 2019-06-25)), a repository was passed to promisor_remote_get_direct().
> Sure, it was just a transliteration of the comment that was there
> before when fetch_objects() was the function being called, but since
> the code was being changed and the comment being updated, it seems the
> TODO should have been removed back then.
> 
> Oh, well, good to update it now at least.

Yes - perhaps the comment was emphasizing the "such that arbitrary
repositories work" part. But anyway, yes, it is now removed.

[snip]

> Looks good to me.

Thanks for taking a look.
Jonathan Tan June 5, 2021, 2:16 a.m. UTC | #6
> Turns out that this test fails under GIT_TEST_DEFAULT_HASH=sha256; output:
> 
> error: wrong index v2 file size in /home/newren/floss/git/t/trash
> directory.t0410-partial-clone/partial.git/objects/pack/pack-66a15be115d740341216938fb7abb31902e960bd6d464829d85164d1a4a25bec.idx
> error: wrong index v2 file size in /home/newren/floss/git/t/trash
> directory.t0410-partial-clone/partial.git/objects/pack/pack-66a15be115d740341216938fb7abb31902e960bd6d464829d85164d1a4a25bec.idx
> fatal: couldn't find remote ref 74242c6e4a0d89f454d89d3496a1f7cb3f1f39f0
> error: wrong index v2 file size in /home/newren/floss/git/t/trash
> directory.t0410-partial-clone/partial.git/objects/pack/pack-66a15be115d740341216938fb7abb31902e960bd6d464829d85164d1a4a25bec.idx
> error: wrong index v2 file size in /home/newren/floss/git/t/trash
> directory.t0410-partial-clone/partial.git/objects/pack/pack-66a15be115d740341216938fb7abb31902e960bd6d464829d85164d1a4a25bec.idx
> fatal: could not obtain object info

Thanks for noticing this. I'll take a look.
Elijah Newren June 5, 2021, 3:48 a.m. UTC | #7
A quick update...

On Fri, Jun 4, 2021 at 2:35 PM Elijah Newren <newren@gmail.com> wrote:
>
> On Tue, Jun 1, 2021 at 2:38 PM Jonathan Tan <jonathantanmy@google.com> wrote:
> >
...
> > diff --git a/object-file.c b/object-file.c
> > index f233b440b2..ebf273e9e7 100644
> > --- a/object-file.c
> > +++ b/object-file.c
> > @@ -1570,15 +1570,12 @@ static int do_oid_object_info_extended(struct repository *r,
> >                 }
> >
> >                 /* Check if it is a missing object */
> > -               if (fetch_if_missing && has_promisor_remote() &&
> > -                   !already_retried && r == the_repository &&
> > +               if (fetch_if_missing && repo_has_promisor_remote(r) &&
> > +                   !already_retried &&
>
> So here you removed the special check against the_repository while
> looking for promisor_remotes.  There are other such special checks in
> the code; I also see:
>
> diff.c: if (options->repo == the_repository && has_promisor_remote() &&
> diffcore-break.c:       if (r == the_repository && has_promisor_remote()) {
> diffcore-rename.c:      if (r == the_repository && has_promisor_remote()) {
>
> and a series I'm planning to submit soon will add another to merge.ort.c.
>
> Do these need to all be fixed as part of the partial clone submodule
> support as well?  Do I need to change anything about my series?  I
> guess since I'm asking that, I should probably submit it first so you
> can actually see it and answer my question.  (And the timing may be
> good since the area is fresh in your memory...)

That other topic is now over here:
https://lore.kernel.org/git/pull.969.git.1622856485.gitgitgadget@gmail.com/T/#t
Emily Shaffer June 8, 2021, 1:41 a.m. UTC | #8
On Tue, Jun 01, 2021 at 02:34:19PM -0700, Jonathan Tan wrote:
> 
> This is one step towards supporting partial clone submodules.
> 
> Even after this patch, we will still lack partial clone submodules
> support, primarily because a lot of Git code that accesses submodule
> objects does so by adding their object stores as alternates, meaning
> that any lazy fetches that would occur in the submodule would be done
> based on the config of the superproject, not of the submodule. This also
> prevents testing of the functionality in this patch by user-facing
> commands. So for now, test this mechanism using a test helper.
> 
> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> ---
>  Makefile                      |  1 +
>  object-file.c                 |  7 ++-----
>  promisor-remote.c             | 14 +++++++++-----
>  t/helper/test-partial-clone.c | 34 ++++++++++++++++++++++++++++++++++
>  t/helper/test-tool.c          |  1 +
>  t/helper/test-tool.h          |  1 +
>  t/t0410-partial-clone.sh      | 24 ++++++++++++++++++++++++
>  7 files changed, 72 insertions(+), 10 deletions(-)
>  create mode 100644 t/helper/test-partial-clone.c
> 
> diff --git a/Makefile b/Makefile
> index c3565fc0f8..f6653bcd5e 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -725,6 +725,7 @@ TEST_BUILTINS_OBJS += test-oidmap.o
>  TEST_BUILTINS_OBJS += test-online-cpus.o
>  TEST_BUILTINS_OBJS += test-parse-options.o
>  TEST_BUILTINS_OBJS += test-parse-pathspec-file.o
> +TEST_BUILTINS_OBJS += test-partial-clone.o
>  TEST_BUILTINS_OBJS += test-path-utils.o
>  TEST_BUILTINS_OBJS += test-pcre2-config.o
>  TEST_BUILTINS_OBJS += test-pkt-line.o
> diff --git a/object-file.c b/object-file.c
> index f233b440b2..ebf273e9e7 100644
> --- a/object-file.c
> +++ b/object-file.c
> @@ -1570,15 +1570,12 @@ static int do_oid_object_info_extended(struct repository *r,
>  		}
>  
>  		/* Check if it is a missing object */
> -		if (fetch_if_missing && has_promisor_remote() &&
> -		    !already_retried && r == the_repository &&
> +		if (fetch_if_missing && repo_has_promisor_remote(r) &&
> +		    !already_retried &&

So we remove the explicit "if we've got promisors and are operating on
the repo we launched in" and instead ask "if the repo we're operating on
has promisors" - definitely a step towards in-process submodule
happiness :)

>  		    !(flags & OBJECT_INFO_SKIP_FETCH_OBJECT)) {
>  			/*
>  			 * TODO Investigate checking promisor_remote_get_direct()
>  			 * TODO return value and stopping on error here.
> -			 * TODO Pass a repository struct through
> -			 * promisor_remote_get_direct(), such that arbitrary
> -			 * repositories work.
>  			 */
>  			promisor_remote_get_direct(r, real, 1);

And this seems like a stale comment, since I see we were already passing
'r' here. But arbitrary repositories still don't just work, right? Or, I
guess your point was "partial clone + submodules don't just work,
because of the alternates thing" - so maybe this part is OK?

> @@ -150,7 +156,7 @@ static void promisor_remote_init(struct repository *r)
>  		xcalloc(sizeof(*r->promisor_remote_config), 1);
>  	config->promisors_tail = &config->promisors;
>  
> -	git_config(promisor_remote_config, config);
> +	repo_config(r, promisor_remote_config, config);

Should this change have happened when we added 'r' to
promisor_remote_init? If r==the_repository then there's no difference
between these two calls, right?

> diff --git a/t/helper/test-partial-clone.c b/t/helper/test-partial-clone.c
> new file mode 100644
> index 0000000000..e7bc7eb21f
> --- /dev/null
> +++ b/t/helper/test-partial-clone.c
> @@ -0,0 +1,34 @@
> +#include "cache.h"
> +#include "test-tool.h"
> +#include "repository.h"
> +#include "object-store.h"
> +
> +static void object_info(const char *gitdir, const char *oid_hex)
> +{
> +	struct repository r;
> +	struct object_id oid;
> +	unsigned long size;
> +	struct object_info oi = {.sizep = &size};
> +	const char *p;
> +
> +	if (repo_init(&r, gitdir, NULL))
> +		die("could not init repo");
> +	if (parse_oid_hex(oid_hex, &oid, &p))
> +		die("could not parse oid");
> +	if (oid_object_info_extended(&r, &oid, &oi, 0))
> +		die("could not obtain object info");
> +	printf("%d\n", (int) size);
> +}
> +
> +int cmd__partial_clone(int argc, const char **argv)
> +{
> +	if (argc < 4)
> +		die("too few arguments");
> +
> +	if (!strcmp(argv[1], "object-info"))
> +		object_info(argv[2], argv[3]);
> +	else
> +		die("invalid argument '%s'", argv[1]);
> +
> +	return 0;
> +}
> diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
> index c5bd0c6d4c..b21e8f1519 100644
> --- a/t/helper/test-tool.c
> +++ b/t/helper/test-tool.c
> @@ -46,6 +46,7 @@ static struct test_cmd cmds[] = {
>  	{ "online-cpus", cmd__online_cpus },
>  	{ "parse-options", cmd__parse_options },
>  	{ "parse-pathspec-file", cmd__parse_pathspec_file },
> +	{ "partial-clone", cmd__partial_clone },
>  	{ "path-utils", cmd__path_utils },
>  	{ "pcre2-config", cmd__pcre2_config },
>  	{ "pkt-line", cmd__pkt_line },
> diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
> index e8069a3b22..f845ced4b3 100644
> --- a/t/helper/test-tool.h
> +++ b/t/helper/test-tool.h
> @@ -35,6 +35,7 @@ int cmd__oidmap(int argc, const char **argv);
>  int cmd__online_cpus(int argc, const char **argv);
>  int cmd__parse_options(int argc, const char **argv);
>  int cmd__parse_pathspec_file(int argc, const char** argv);
> +int cmd__partial_clone(int argc, const char **argv);
>  int cmd__path_utils(int argc, const char **argv);
>  int cmd__pcre2_config(int argc, const char **argv);
>  int cmd__pkt_line(int argc, const char **argv);
> diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
> index 584a039b85..e804d267e6 100755
> --- a/t/t0410-partial-clone.sh
> +++ b/t/t0410-partial-clone.sh
> @@ -604,6 +604,30 @@ test_expect_success 'do not fetch when checking existence of tree we construct o
>  	git -C repo cherry-pick side1
>  '
>  
> +test_expect_success 'lazy-fetch when accessing object not in the_repository' '
> +	rm -rf full partial.git &&
> +	test_create_repo full &&
> +	printf 12345 >full/file.txt &&
> +	git -C full add file.txt &&
> +	git -C full commit -m "first commit" &&
I think there is some test_commit or similar function here that's more
commonly used, right?

> +
> +	test_config -C full uploadpack.allowfilter 1 &&
> +	test_config -C full uploadpack.allowanysha1inwant 1 &&
I wasn't sure what these configs are for, but it looks like .allowfilter
is to allow 'full' to serve as a remote to a partial clone. But what do
you need .allowAnySha1InWant for here? Are we expecting to ask for SHAs
that 'full' doesn't have?

> +	git clone --filter=blob:none --bare "file://$(pwd)/full" partial.git &&
> +	FILE_HASH=$(git hash-object --stdin <full/file.txt) &&
> +
> +	# Sanity check that the file is missing
> +	git -C partial.git rev-list --objects --missing=print HEAD >out &&
> +	grep "[?]$FILE_HASH" out &&
> +
> +	OUT=$(test-tool partial-clone object-info partial.git "$FILE_HASH") &&
> +	test "$OUT" -eq 5 &&

Hm. I guess I am confused about why this fetches the desired object into
partial.git. Maybe the test-helper needs a comment (and maybe here too)
on the line where fetch will be triggered?

> +
> +	# Sanity check that the file is now present
> +	git -C partial.git rev-list --objects --missing=print HEAD >out &&
> +	! grep "[?]$FILE_HASH" out

 - Emily
Jonathan Tan June 9, 2021, 4:52 a.m. UTC | #9
> >  		    !(flags & OBJECT_INFO_SKIP_FETCH_OBJECT)) {
> >  			/*
> >  			 * TODO Investigate checking promisor_remote_get_direct()
> >  			 * TODO return value and stopping on error here.
> > -			 * TODO Pass a repository struct through
> > -			 * promisor_remote_get_direct(), such that arbitrary
> > -			 * repositories work.
> >  			 */
> >  			promisor_remote_get_direct(r, real, 1);
> 
> And this seems like a stale comment, since I see we were already passing
> 'r' here. But arbitrary repositories still don't just work, right? Or, I
> guess your point was "partial clone + submodules don't just work,
> because of the alternates thing" - so maybe this part is OK?

This part is OK (arbitrary repositories work here), yes.

> > @@ -150,7 +156,7 @@ static void promisor_remote_init(struct repository *r)
> >  		xcalloc(sizeof(*r->promisor_remote_config), 1);
> >  	config->promisors_tail = &config->promisors;
> >  
> > -	git_config(promisor_remote_config, config);
> > +	repo_config(r, promisor_remote_config, config);
> 
> Should this change have happened when we added 'r' to
> promisor_remote_init? If r==the_repository then there's no difference
> between these two calls, right?

Good point - yes, it should have. I'll change it.

> > +test_expect_success 'lazy-fetch when accessing object not in the_repository' '
> > +	rm -rf full partial.git &&
> > +	test_create_repo full &&
> > +	printf 12345 >full/file.txt &&
> > +	git -C full add file.txt &&
> > +	git -C full commit -m "first commit" &&
> I think there is some test_commit or similar function here that's more
> commonly used, right?

Taylor Blau suggested a similar thing, and I have changed it in v2.

> 
> > +
> > +	test_config -C full uploadpack.allowfilter 1 &&
> > +	test_config -C full uploadpack.allowanysha1inwant 1 &&
> I wasn't sure what these configs are for, but it looks like .allowfilter
> is to allow 'full' to serve as a remote to a partial clone. But what do
> you need .allowAnySha1InWant for here? Are we expecting to ask for SHAs
> that 'full' doesn't have?

We are expecting to ask for SHAs that 'full' doesn't *advertise*, yes (namely,
the hash of a certain blob).

> > +	git clone --filter=blob:none --bare "file://$(pwd)/full" partial.git &&
> > +	FILE_HASH=$(git hash-object --stdin <full/file.txt) &&
> > +
> > +	# Sanity check that the file is missing
> > +	git -C partial.git rev-list --objects --missing=print HEAD >out &&
> > +	grep "[?]$FILE_HASH" out &&
> > +
> > +	OUT=$(test-tool partial-clone object-info partial.git "$FILE_HASH") &&
> > +	test "$OUT" -eq 5 &&
> 
> Hm. I guess I am confused about why this fetches the desired object into
> partial.git. Maybe the test-helper needs a comment (and maybe here too)
> on the line where fetch will be triggered?

I've added a comment to the test-helper code in v2 - could you take a
look and see if that clarifies things? But in any case, the answer is
that this test-tool invocation attempts to read an object in the
submodule while running as a process in the superproject. The read
attempt is a read of a missing object, so that object is lazily fetched.
diff mbox series

Patch

diff --git a/Makefile b/Makefile
index c3565fc0f8..f6653bcd5e 100644
--- a/Makefile
+++ b/Makefile
@@ -725,6 +725,7 @@  TEST_BUILTINS_OBJS += test-oidmap.o
 TEST_BUILTINS_OBJS += test-online-cpus.o
 TEST_BUILTINS_OBJS += test-parse-options.o
 TEST_BUILTINS_OBJS += test-parse-pathspec-file.o
+TEST_BUILTINS_OBJS += test-partial-clone.o
 TEST_BUILTINS_OBJS += test-path-utils.o
 TEST_BUILTINS_OBJS += test-pcre2-config.o
 TEST_BUILTINS_OBJS += test-pkt-line.o
diff --git a/object-file.c b/object-file.c
index f233b440b2..ebf273e9e7 100644
--- a/object-file.c
+++ b/object-file.c
@@ -1570,15 +1570,12 @@  static int do_oid_object_info_extended(struct repository *r,
 		}
 
 		/* Check if it is a missing object */
-		if (fetch_if_missing && has_promisor_remote() &&
-		    !already_retried && r == the_repository &&
+		if (fetch_if_missing && repo_has_promisor_remote(r) &&
+		    !already_retried &&
 		    !(flags & OBJECT_INFO_SKIP_FETCH_OBJECT)) {
 			/*
 			 * TODO Investigate checking promisor_remote_get_direct()
 			 * TODO return value and stopping on error here.
-			 * TODO Pass a repository struct through
-			 * promisor_remote_get_direct(), such that arbitrary
-			 * repositories work.
 			 */
 			promisor_remote_get_direct(r, real, 1);
 			already_retried = 1;
diff --git a/promisor-remote.c b/promisor-remote.c
index 5819d2cf28..1601f05d79 100644
--- a/promisor-remote.c
+++ b/promisor-remote.c
@@ -11,7 +11,8 @@  struct promisor_remote_config {
 	struct promisor_remote **promisors_tail;
 };
 
-static int fetch_objects(const char *remote_name,
+static int fetch_objects(struct repository *repo,
+			 const char *remote_name,
 			 const struct object_id *oids,
 			 int oid_nr)
 {
@@ -21,6 +22,11 @@  static int fetch_objects(const char *remote_name,
 
 	child.git_cmd = 1;
 	child.in = -1;
+	if (repo != the_repository) {
+		prepare_other_repo_env(&child.env_array);
+		strvec_pushf(&child.env_array, "%s=%s", GIT_DIR_ENVIRONMENT,
+			     repo->gitdir);
+	}
 	strvec_pushl(&child.args, "-c", "fetch.negotiationAlgorithm=noop",
 		     "fetch", remote_name, "--no-tags",
 		     "--no-write-fetch-head", "--recurse-submodules=no",
@@ -150,7 +156,7 @@  static void promisor_remote_init(struct repository *r)
 		xcalloc(sizeof(*r->promisor_remote_config), 1);
 	config->promisors_tail = &config->promisors;
 
-	git_config(promisor_remote_config, config);
+	repo_config(r, promisor_remote_config, config);
 
 	if (config->repository_format_partial_clone) {
 		struct promisor_remote *o, *previous;
@@ -246,10 +252,8 @@  int promisor_remote_get_direct(struct repository *repo,
 
 	promisor_remote_init(repo);
 
-	if (repo != the_repository)
-		BUG("only the_repository is supported for now");
 	for (r = repo->promisor_remote_config->promisors; r; r = r->next) {
-		if (fetch_objects(r->name, remaining_oids, remaining_nr) < 0) {
+		if (fetch_objects(repo, r->name, remaining_oids, remaining_nr) < 0) {
 			if (remaining_nr == 1)
 				continue;
 			remaining_nr = remove_fetched_oids(repo, &remaining_oids,
diff --git a/t/helper/test-partial-clone.c b/t/helper/test-partial-clone.c
new file mode 100644
index 0000000000..e7bc7eb21f
--- /dev/null
+++ b/t/helper/test-partial-clone.c
@@ -0,0 +1,34 @@ 
+#include "cache.h"
+#include "test-tool.h"
+#include "repository.h"
+#include "object-store.h"
+
+static void object_info(const char *gitdir, const char *oid_hex)
+{
+	struct repository r;
+	struct object_id oid;
+	unsigned long size;
+	struct object_info oi = {.sizep = &size};
+	const char *p;
+
+	if (repo_init(&r, gitdir, NULL))
+		die("could not init repo");
+	if (parse_oid_hex(oid_hex, &oid, &p))
+		die("could not parse oid");
+	if (oid_object_info_extended(&r, &oid, &oi, 0))
+		die("could not obtain object info");
+	printf("%d\n", (int) size);
+}
+
+int cmd__partial_clone(int argc, const char **argv)
+{
+	if (argc < 4)
+		die("too few arguments");
+
+	if (!strcmp(argv[1], "object-info"))
+		object_info(argv[2], argv[3]);
+	else
+		die("invalid argument '%s'", argv[1]);
+
+	return 0;
+}
diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
index c5bd0c6d4c..b21e8f1519 100644
--- a/t/helper/test-tool.c
+++ b/t/helper/test-tool.c
@@ -46,6 +46,7 @@  static struct test_cmd cmds[] = {
 	{ "online-cpus", cmd__online_cpus },
 	{ "parse-options", cmd__parse_options },
 	{ "parse-pathspec-file", cmd__parse_pathspec_file },
+	{ "partial-clone", cmd__partial_clone },
 	{ "path-utils", cmd__path_utils },
 	{ "pcre2-config", cmd__pcre2_config },
 	{ "pkt-line", cmd__pkt_line },
diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
index e8069a3b22..f845ced4b3 100644
--- a/t/helper/test-tool.h
+++ b/t/helper/test-tool.h
@@ -35,6 +35,7 @@  int cmd__oidmap(int argc, const char **argv);
 int cmd__online_cpus(int argc, const char **argv);
 int cmd__parse_options(int argc, const char **argv);
 int cmd__parse_pathspec_file(int argc, const char** argv);
+int cmd__partial_clone(int argc, const char **argv);
 int cmd__path_utils(int argc, const char **argv);
 int cmd__pcre2_config(int argc, const char **argv);
 int cmd__pkt_line(int argc, const char **argv);
diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
index 584a039b85..e804d267e6 100755
--- a/t/t0410-partial-clone.sh
+++ b/t/t0410-partial-clone.sh
@@ -604,6 +604,30 @@  test_expect_success 'do not fetch when checking existence of tree we construct o
 	git -C repo cherry-pick side1
 '
 
+test_expect_success 'lazy-fetch when accessing object not in the_repository' '
+	rm -rf full partial.git &&
+	test_create_repo full &&
+	printf 12345 >full/file.txt &&
+	git -C full add file.txt &&
+	git -C full commit -m "first commit" &&
+
+	test_config -C full uploadpack.allowfilter 1 &&
+	test_config -C full uploadpack.allowanysha1inwant 1 &&
+	git clone --filter=blob:none --bare "file://$(pwd)/full" partial.git &&
+	FILE_HASH=$(git hash-object --stdin <full/file.txt) &&
+
+	# Sanity check that the file is missing
+	git -C partial.git rev-list --objects --missing=print HEAD >out &&
+	grep "[?]$FILE_HASH" out &&
+
+	OUT=$(test-tool partial-clone object-info partial.git "$FILE_HASH") &&
+	test "$OUT" -eq 5 &&
+
+	# Sanity check that the file is now present
+	git -C partial.git rev-list --objects --missing=print HEAD >out &&
+	! grep "[?]$FILE_HASH" out
+'
+
 . "$TEST_DIRECTORY"/lib-httpd.sh
 start_httpd