diff mbox series

[11/11] bundle: unbundle promisor packs

Message ID ec51d0a50e6e64ae37795d77f7d33204b9b71ecd.1645638911.git.gitgitgadget@gmail.com (mailing list archive)
State Superseded
Headers show
Series Partial bundles | expand

Commit Message

Derrick Stolee Feb. 23, 2022, 5:55 p.m. UTC
From: Derrick Stolee <derrickstolee@github.com>

In order to have a valid pack-file after unbundling a bundle that has
the 'filter' capability, we need to generate a .promisor file. The
bundle does not promise _where_ the objects can be found, but we can
expect that these bundles will be unbundled in repositories with
appropriate promisor remotes that can find those missing objects.

Use the 'git index-pack --promisor=<message>' option to create this
.promisor file. Add "from-bundle" as the message to help anyone diagnose
issues with these promisor packs.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 bundle.c               | 4 ++++
 t/t6020-bundle-misc.sh | 8 +++++++-
 2 files changed, 11 insertions(+), 1 deletion(-)

Comments

Junio C Hamano March 4, 2022, 11:43 p.m. UTC | #1
"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Derrick Stolee <derrickstolee@github.com>
>
> In order to have a valid pack-file after unbundling a bundle that has
> the 'filter' capability, we need to generate a .promisor file. The
> bundle does not promise _where_ the objects can be found, but we can
> expect that these bundles will be unbundled in repositories with
> appropriate promisor remotes that can find those missing objects.

That sounds like a lot of wishful thinking, but I do not think of a
better way to phrase the idea.  Taking a bundle out of a repository
and unbundling it elsewhere is "git fetch" that could be done to
send objects from the former to the latter repository, so I am OK
with the assumption that the original repository will stay available
for such users who took its contents over sneaker-net instead of
over the wire.

> Use the 'git index-pack --promisor=<message>' option to create this
> .promisor file. Add "from-bundle" as the message to help anyone diagnose
> issues with these promisor packs.
>
> Signed-off-by: Derrick Stolee <derrickstolee@github.com>
> ---
>  bundle.c               | 4 ++++
>  t/t6020-bundle-misc.sh | 8 +++++++-
>  2 files changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/bundle.c b/bundle.c
> index e284ef63062..3d97de40ef0 100644
> --- a/bundle.c
> +++ b/bundle.c
> @@ -631,6 +631,10 @@ int unbundle(struct repository *r, struct bundle_header *header,
>  	struct child_process ip = CHILD_PROCESS_INIT;
>  	strvec_pushl(&ip.args, "index-pack", "--fix-thin", "--stdin", NULL);
>  
> +	/* If there is a filter, then we need to create the promisor pack. */
> +	if (header->filter)
> +		strvec_push(&ip.args, "--promisor=from-bundle");
> +
>  	if (extra_index_pack_args) {
>  		strvec_pushv(&ip.args, extra_index_pack_args->v);
>  		strvec_clear(extra_index_pack_args);
> diff --git a/t/t6020-bundle-misc.sh b/t/t6020-bundle-misc.sh
> index 39cfefafb65..344af34db1e 100755
> --- a/t/t6020-bundle-misc.sh
> +++ b/t/t6020-bundle-misc.sh
> @@ -513,7 +513,13 @@ do
>  		The bundle uses this filter: $filter
>  		The bundle records a complete history.
>  		EOF
> -		test_cmp expect actual
> +		test_cmp expect actual &&
> +
> +		# This creates the first pack-file in the
> +		# .git/objects/pack directory. Look for a .promisor.
> +		git bundle unbundle partial.bdl &&
> +		ls .git/objects/pack/pack-*.promisor >promisor &&
> +		test_line_count = 1 promisor

OK.  Do we also want to inspect the contents of the resulting
repository to make sure that the bundle had the right contents?

One idea to do so would probably be

 - prepare a test repository (you already have it)
 - prepare a partial.bdl (you already do this)

 - clone the test repository into a new repository, with the same
   filter
 - create an empty repository, unbundle the partial.bdl

 - take "for-each-ref" and list of objects available in these two
   "partial copies" from the test repository, and compare
Derrick Stolee March 7, 2022, 2:48 p.m. UTC | #2
On 3/4/2022 6:43 PM, Junio C Hamano wrote:
> "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
>> From: Derrick Stolee <derrickstolee@github.com>
>>
>> In order to have a valid pack-file after unbundling a bundle that has
>> the 'filter' capability, we need to generate a .promisor file. The
>> bundle does not promise _where_ the objects can be found, but we can
>> expect that these bundles will be unbundled in repositories with
>> appropriate promisor remotes that can find those missing objects.
> 
> That sounds like a lot of wishful thinking, but I do not think of a
> better way to phrase the idea.  Taking a bundle out of a repository
> and unbundling it elsewhere is "git fetch" that could be done to
> send objects from the former to the latter repository, so I am OK
> with the assumption that the original repository will stay available
> for such users who took its contents over sneaker-net instead of
> over the wire.

As an aside, I'm also concerned about the existing model of promisor
remotes where it depends on each remote, and isn't a repository-wide
state. In particular, if I do a blobless partial clone of git/git and
then add git-for-windows/git as a remote and fetch it, it will break
because git-for-windows/git isn't set up as a promisor remote and we
expect to have every blob reachable from its pack-file (even though
it was not sent because we advertised a commit that can reach it).

I've been thinking about adjusting the config parsing around promisors
to say "I see one promisor remote, so I will assume all remotes are
promisors." It seems to me that this will fix cases like the above
without further breaking any cases (that are not already broken).

But that's a tangent for another time. :)
 
>> Use the 'git index-pack --promisor=<message>' option to create this
>> .promisor file. Add "from-bundle" as the message to help anyone diagnose
>> issues with these promisor packs.
>>
>> Signed-off-by: Derrick Stolee <derrickstolee@github.com>
>> ---
>>  bundle.c               | 4 ++++
>>  t/t6020-bundle-misc.sh | 8 +++++++-
>>  2 files changed, 11 insertions(+), 1 deletion(-)
>>
>> diff --git a/bundle.c b/bundle.c
>> index e284ef63062..3d97de40ef0 100644
>> --- a/bundle.c
>> +++ b/bundle.c
>> @@ -631,6 +631,10 @@ int unbundle(struct repository *r, struct bundle_header *header,
>>  	struct child_process ip = CHILD_PROCESS_INIT;
>>  	strvec_pushl(&ip.args, "index-pack", "--fix-thin", "--stdin", NULL);
>>  
>> +	/* If there is a filter, then we need to create the promisor pack. */
>> +	if (header->filter)
>> +		strvec_push(&ip.args, "--promisor=from-bundle");
>> +
>>  	if (extra_index_pack_args) {
>>  		strvec_pushv(&ip.args, extra_index_pack_args->v);
>>  		strvec_clear(extra_index_pack_args);
>> diff --git a/t/t6020-bundle-misc.sh b/t/t6020-bundle-misc.sh
>> index 39cfefafb65..344af34db1e 100755
>> --- a/t/t6020-bundle-misc.sh
>> +++ b/t/t6020-bundle-misc.sh
>> @@ -513,7 +513,13 @@ do
>>  		The bundle uses this filter: $filter
>>  		The bundle records a complete history.
>>  		EOF
>> -		test_cmp expect actual
>> +		test_cmp expect actual &&
>> +
>> +		# This creates the first pack-file in the
>> +		# .git/objects/pack directory. Look for a .promisor.
>> +		git bundle unbundle partial.bdl &&
>> +		ls .git/objects/pack/pack-*.promisor >promisor &&
>> +		test_line_count = 1 promisor
> 
> OK.  Do we also want to inspect the contents of the resulting
> repository to make sure that the bundle had the right contents?
> 
> One idea to do so would probably be
> 
>  - prepare a test repository (you already have it)
>  - prepare a partial.bdl (you already do this)
> 
>  - clone the test repository into a new repository, with the same
>    filter
>  - create an empty repository, unbundle the partial.bdl
> 
>  - take "for-each-ref" and list of objects available in these two
>    "partial copies" from the test repository, and compare

Good idea. Thanks!

Of course, looking closer at it... "git bundle unbundle" doesn't
actually store the refs directly in the refspace, but instead only
outputs the refs that it used.

Here is an attempt to verify the refs that are reported match
those in a mirror clone.

--- >8 ---

diff --git a/t/t6020-bundle-misc.sh b/t/t6020-bundle-misc.sh
index 344af34db1e..a228cbfc4e3 100755
--- a/t/t6020-bundle-misc.sh
+++ b/t/t6020-bundle-misc.sh
@@ -490,7 +490,7 @@ test_expect_success 'unfiltered bundle with --objects' '
 for filter in "blob:none" "tree:0" "tree:1" "blob:limit=100"
 do
 	test_expect_success 'filtered bundle: $filter' '
-		test_when_finished rm -rf .git/objects/pack &&
+		test_when_finished rm -rf .git/objects/pack cloned unbundled &&
 		git bundle create partial.bdl \
 			--all \
 			--filter=$filter &&
@@ -515,11 +515,22 @@ do
 		EOF
 		test_cmp expect actual &&
 
-		# This creates the first pack-file in the
-		# .git/objects/pack directory. Look for a .promisor.
-		git bundle unbundle partial.bdl &&
-		ls .git/objects/pack/pack-*.promisor >promisor &&
-		test_line_count = 1 promisor
+		git init unbundled &&
+		(
+			cd unbundled &&
+			# This creates the first pack-file in the
+			# .git/objects/pack directory. Look for a .promisor.
+			git bundle unbundle ../partial.bdl >ref-list.txt &&
+			ls .git/objects/pack/pack-*.promisor >promisor &&
+			test_line_count = 1 promisor
+		) &&
+
+		git clone --filter=blob:none --mirror "file://$(pwd)" cloned &&
+		git -C cloned for-each-ref \
+			--format="%(objectname) %(refname)" >cloned-refs.txt &&
+		echo "$(git -C cloned rev-parse HEAD) HEAD" >>cloned-refs.txt &&
+		test_cmp cloned-refs.txt unbundled/ref-list.txt
 	'
 done
 
--- >8 ---

I also attempted doing a "git clone --bare partial.bdl unbundled.git" to
get the 'git clone' command to actually place the refs. However, 'git clone'
does not set up the repository filter based on the bundle, so it reports
missing blobs (even though there is no checkout). Making this work would
require that "repository global promisor config" idea that I mentioned in
another reply. I'll make note of this as a potential application of that
idea.

Thanks,
-Stolee
Ævar Arnfjörð Bjarmason March 7, 2022, 3:47 p.m. UTC | #3
On Wed, Feb 23 2022, Derrick Stolee via GitGitGadget wrote:

> From: Derrick Stolee <derrickstolee@github.com>
>
> In order to have a valid pack-file after unbundling a bundle that has
> the 'filter' capability, we need to generate a .promisor file. The
> bundle does not promise _where_ the objects can be found, but we can
> expect that these bundles will be unbundled in repositories with
> appropriate promisor remotes that can find those missing objects.
>
> Use the 'git index-pack --promisor=<message>' option to create this
> .promisor file. Add "from-bundle" as the message to help anyone diagnose
> issues with these promisor packs.
>
> Signed-off-by: Derrick Stolee <derrickstolee@github.com>
> ---
>  bundle.c               | 4 ++++
>  t/t6020-bundle-misc.sh | 8 +++++++-
>  2 files changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/bundle.c b/bundle.c
> index e284ef63062..3d97de40ef0 100644
> --- a/bundle.c
> +++ b/bundle.c
> @@ -631,6 +631,10 @@ int unbundle(struct repository *r, struct bundle_header *header,
>  	struct child_process ip = CHILD_PROCESS_INIT;
>  	strvec_pushl(&ip.args, "index-pack", "--fix-thin", "--stdin", NULL);
>  
> +	/* If there is a filter, then we need to create the promisor pack. */
> +	if (header->filter)
> +		strvec_push(&ip.args, "--promisor=from-bundle");
> +
>  	if (extra_index_pack_args) {
>  		strvec_pushv(&ip.args, extra_index_pack_args->v);
>  		strvec_clear(extra_index_pack_args);
> diff --git a/t/t6020-bundle-misc.sh b/t/t6020-bundle-misc.sh
> index 39cfefafb65..344af34db1e 100755
> --- a/t/t6020-bundle-misc.sh
> +++ b/t/t6020-bundle-misc.sh
> @@ -513,7 +513,13 @@ do
>  		The bundle uses this filter: $filter
>  		The bundle records a complete history.
>  		EOF
> -		test_cmp expect actual
> +		test_cmp expect actual &&
> +
> +		# This creates the first pack-file in the
> +		# .git/objects/pack directory. Look for a .promisor.
> +		git bundle unbundle partial.bdl &&
> +		ls .git/objects/pack/pack-*.promisor >promisor &&
> +		test_line_count = 1 promisor
>  	'
>  done

Aside from what Junio mentioned, the preceding commit seems to be
incomplete here. I.e. I'd expect to see this replace a case where we
died or whatever before. What happened if we invoked "unbundle" before?
Derrick Stolee March 7, 2022, 4:10 p.m. UTC | #4
On 3/7/2022 10:47 AM, Ævar Arnfjörð Bjarmason wrote:
> 
> On Wed, Feb 23 2022, Derrick Stolee via GitGitGadget wrote:
> 
>> From: Derrick Stolee <derrickstolee@github.com>
>>
>> In order to have a valid pack-file after unbundling a bundle that has
>> the 'filter' capability, we need to generate a .promisor file. The
>> bundle does not promise _where_ the objects can be found, but we can
>> expect that these bundles will be unbundled in repositories with
>> appropriate promisor remotes that can find those missing objects.
>>
>> Use the 'git index-pack --promisor=<message>' option to create this
>> .promisor file. Add "from-bundle" as the message to help anyone diagnose
>> issues with these promisor packs.
>>
>> Signed-off-by: Derrick Stolee <derrickstolee@github.com>
>> ---
>>  bundle.c               | 4 ++++
>>  t/t6020-bundle-misc.sh | 8 +++++++-
>>  2 files changed, 11 insertions(+), 1 deletion(-)
>>
>> diff --git a/bundle.c b/bundle.c
>> index e284ef63062..3d97de40ef0 100644
>> --- a/bundle.c
>> +++ b/bundle.c
>> @@ -631,6 +631,10 @@ int unbundle(struct repository *r, struct bundle_header *header,
>>  	struct child_process ip = CHILD_PROCESS_INIT;
>>  	strvec_pushl(&ip.args, "index-pack", "--fix-thin", "--stdin", NULL);
>>  
>> +	/* If there is a filter, then we need to create the promisor pack. */
>> +	if (header->filter)
>> +		strvec_push(&ip.args, "--promisor=from-bundle");
>> +
>>  	if (extra_index_pack_args) {
>>  		strvec_pushv(&ip.args, extra_index_pack_args->v);
>>  		strvec_clear(extra_index_pack_args);
>> diff --git a/t/t6020-bundle-misc.sh b/t/t6020-bundle-misc.sh
>> index 39cfefafb65..344af34db1e 100755
>> --- a/t/t6020-bundle-misc.sh
>> +++ b/t/t6020-bundle-misc.sh
>> @@ -513,7 +513,13 @@ do
>>  		The bundle uses this filter: $filter
>>  		The bundle records a complete history.
>>  		EOF
>> -		test_cmp expect actual
>> +		test_cmp expect actual &&
>> +
>> +		# This creates the first pack-file in the
>> +		# .git/objects/pack directory. Look for a .promisor.
>> +		git bundle unbundle partial.bdl &&
>> +		ls .git/objects/pack/pack-*.promisor >promisor &&
>> +		test_line_count = 1 promisor
>>  	'
>>  done
> 
> Aside from what Junio mentioned, the preceding commit seems to be
> incomplete here. I.e. I'd expect to see this replace a case where we
> died or whatever before. What happened if we invoked "unbundle" before?

Looking closely, I think the only difference is that this patch
adds the .promisor file. I can push my expanded test to be
earlier in the series so we can verify this.

Thanks,
-Stolee
Junio C Hamano March 7, 2022, 4:56 p.m. UTC | #5
Derrick Stolee <derrickstolee@github.com> writes:

> Of course, looking closer at it... "git bundle unbundle" doesn't
> actually store the refs directly in the refspace, but instead only
> outputs the refs that it used.

True.  I was more thinking about equivalence between

    cd src_repo
    git clone --no-local --filter=... . ../partial.network.cloned
    git bundle create --filter=... partial.bndl
    git clone partial.bndl ../partial.bundle.cloned

The two resulting repositories should look very similar except for
that the remote.origin.* of the former would expect that it pushes
back to where it was cloned from, while the latter would not.

> +		git init unbundled &&
> +		(
> +			cd unbundled &&
> +			# This creates the first pack-file in the
> +			# .git/objects/pack directory. Look for a .promisor.
> +			git bundle unbundle ../partial.bdl >ref-list.txt &&
> +			ls .git/objects/pack/pack-*.promisor >promisor &&
> +			test_line_count = 1 promisor

And can we enumerate the objects we have in .git/objects, both loose
and packed?

> +		) &&
> +
> +		git clone --filter=blob:none --mirror "file://$(pwd)" cloned &&
> +		git -C cloned for-each-ref \
> +			--format="%(objectname) %(refname)" >cloned-refs.txt &&
> +		echo "$(git -C cloned rev-parse HEAD) HEAD" >>cloned-refs.txt &&
> +		test_cmp cloned-refs.txt unbundled/ref-list.txt

Likewise here?  I think the two should match, and that was what I
was wondering if we should enforce.

>  	'
>  done
>  
> --- >8 ---
>
> I also attempted doing a "git clone --bare partial.bdl unbundled.git" to
> get the 'git clone' command to actually place the refs. However, 'git clone'
> does not set up the repository filter based on the bundle, so it reports
> missing blobs (even though there is no checkout).

Understandable, as cloning from a bundle, if I recall correctly, was
done as yet another special case in "git clone", differently from
the main "over the network" code path.  And from end-user's point of
view, I think updating it is part of introducing the filtered
bundle.

Thanks.
Derrick Stolee March 7, 2022, 6:57 p.m. UTC | #6
On 3/7/2022 11:56 AM, Junio C Hamano wrote:
> Derrick Stolee <derrickstolee@github.com> writes:
> 
>> Of course, looking closer at it... "git bundle unbundle" doesn't
>> actually store the refs directly in the refspace, but instead only
>> outputs the refs that it used.
> 
> True.  I was more thinking about equivalence between
> 
>     cd src_repo
>     git clone --no-local --filter=... . ../partial.network.cloned
>     git bundle create --filter=... partial.bndl
>     git clone partial.bndl ../partial.bundle.cloned
> 
> The two resulting repositories should look very similar except for
> that the remote.origin.* of the former would expect that it pushes
> back to where it was cloned from, while the latter would not.

Makes sense.

The one downside is that you list cloning form a partial bundle,
but that currently does not work, even if we avoid a checkout.
It fails because the clone command is not parsing the filter
and properly setting repo-global promisor information. (Again,
this is a bigger change to make this possible.)

I also had some struggles getting this to work since local clones
were actually ignoring the filter. I didn't think it was worth
setting up an HTTP or SSH server just for this test. See
workaround below.
 
>> +		git init unbundled &&
>> +		(
>> +			cd unbundled &&
>> +			# This creates the first pack-file in the
>> +			# .git/objects/pack directory. Look for a .promisor.
>> +			git bundle unbundle ../partial.bdl >ref-list.txt &&
>> +			ls .git/objects/pack/pack-*.promisor >promisor &&
>> +			test_line_count = 1 promisor
> 
> And can we enumerate the objects we have in .git/objects, both loose
> and packed?

I can enumerate using 'git rev-list --objects' to compare the
unbundled set to the full clone (adding --filter=$filter to the
full clone's run and --missing=allow-any to the unbundled one).

>> +		) &&
>> +
>> +		git clone --filter=blob:none --mirror "file://$(pwd)" cloned &&
>> +		git -C cloned for-each-ref \
>> +			--format="%(objectname) %(refname)" >cloned-refs.txt &&
>> +		echo "$(git -C cloned rev-parse HEAD) HEAD" >>cloned-refs.txt &&
>> +		test_cmp cloned-refs.txt unbundled/ref-list.txt
> 
> Likewise here?  I think the two should match, and that was what I
> was wondering if we should enforce.
> 
>>  	'
>>  done
>>  
>> --- >8 ---
>>
>> I also attempted doing a "git clone --bare partial.bdl unbundled.git" to
>> get the 'git clone' command to actually place the refs. However, 'git clone'
>> does not set up the repository filter based on the bundle, so it reports
>> missing blobs (even though there is no checkout).
> 
> Understandable, as cloning from a bundle, if I recall correctly, was
> done as yet another special case in "git clone", differently from
> the main "over the network" code path.  And from end-user's point of
> view, I think updating it is part of introducing the filtered
> bundle.

The reason I did not include that here is because of the lack of
repository-global promisor/filter config. I do want to loop back
and make those updates, but perhaps for this series we should add
an error condition into 'git clone' to say "Cannot currently clone
from a filtered bundle" to help users understand the issue?

Thanks,
-Stolee
Junio C Hamano March 7, 2022, 7:40 p.m. UTC | #7
Derrick Stolee <derrickstolee@github.com> writes:

> I also had some struggles getting this to work since local clones
> were actually ignoring the filter. I didn't think it was worth
> setting up an HTTP or SSH server just for this test.

Does "clone --no-local $path" work as a workaround?  It should do
the same thing as "ssh" codepath except that it uses "sh" instead as
the process on the other side is running locally.

>>> I also attempted doing a "git clone --bare partial.bdl unbundled.git" to
>>> get the 'git clone' command to actually place the refs. However, 'git clone'
>>> does not set up the repository filter based on the bundle, so it reports
>>> missing blobs (even though there is no checkout).
>> 
>> Understandable, as cloning from a bundle, if I recall correctly, was
>> done as yet another special case in "git clone", differently from
>> the main "over the network" code path.  And from end-user's point of
>> view, I think updating it is part of introducing the filtered
>> bundle.
>
> The reason I did not include that here is because of the lack of
> repository-global promisor/filter config. I do want to loop back
> and make those updates, but perhaps for this series we should add
> an error condition into 'git clone' to say "Cannot currently clone
> from a filtered bundle" to help users understand the issue?

It would be a workable stepwise solution, I would think.  It is not
like we are robbing an existing feature from users---it merely is
that the support of partial cloning over different "transport" is
uneven, which is to be expected, especially in earlier phase of
introducing a new feature.

Thanks.
Derrick Stolee March 7, 2022, 7:49 p.m. UTC | #8
On 3/7/2022 2:40 PM, Junio C Hamano wrote:
> Derrick Stolee <derrickstolee@github.com> writes:
> 
>> I also had some struggles getting this to work since local clones
>> were actually ignoring the filter. I didn't think it was worth
>> setting up an HTTP or SSH server just for this test.
> 
> Does "clone --no-local $path" work as a workaround?  It should do
> the same thing as "ssh" codepath except that it uses "sh" instead as
> the process on the other side is running locally.

I've been trying this:

 git clone --no-local --filter=$filter "file://$(pwd)" cloned &&

which "succeeds" with this in the stderr:

 warning: filtering not recognized by server, ignoring

>>>> I also attempted doing a "git clone --bare partial.bdl unbundled.git" to
>>>> get the 'git clone' command to actually place the refs. However, 'git clone'
>>>> does not set up the repository filter based on the bundle, so it reports
>>>> missing blobs (even though there is no checkout).
>>>
>>> Understandable, as cloning from a bundle, if I recall correctly, was
>>> done as yet another special case in "git clone", differently from
>>> the main "over the network" code path.  And from end-user's point of
>>> view, I think updating it is part of introducing the filtered
>>> bundle.
>>
>> The reason I did not include that here is because of the lack of
>> repository-global promisor/filter config. I do want to loop back
>> and make those updates, but perhaps for this series we should add
>> an error condition into 'git clone' to say "Cannot currently clone
>> from a filtered bundle" to help users understand the issue?
> 
> It would be a workable stepwise solution, I would think.  It is not
> like we are robbing an existing feature from users---it merely is
> that the support of partial cloning over different "transport" is
> uneven, which is to be expected, especially in earlier phase of
> introducing a new feature.

That was my understanding, too.

Thanks,
-Stolee
Junio C Hamano March 7, 2022, 7:54 p.m. UTC | #9
Derrick Stolee <derrickstolee@github.com> writes:

> I've been trying this:
>
>  git clone --no-local --filter=$filter "file://$(pwd)" cloned &&
>
> which "succeeds" with this in the stderr:
>
>  warning: filtering not recognized by server, ignoring

Hmph, and we won't see it when going over ssh to the same
repository?  That is puzzling.

>>> an error condition into 'git clone' to say "Cannot currently clone
>>> from a filtered bundle" to help users understand the issue?
>> 
>> It would be a workable stepwise solution, I would think.  It is not
>> like we are robbing an existing feature from users---it merely is
>> that the support of partial cloning over different "transport" is
>> uneven, which is to be expected, especially in earlier phase of
>> introducing a new feature.
>
> That was my understanding, too.

Good to see us on the same page.  Thanks.
Derrick Stolee March 7, 2022, 8:20 p.m. UTC | #10
On 3/7/2022 2:54 PM, Junio C Hamano wrote:
> Derrick Stolee <derrickstolee@github.com> writes:
> 
>> I've been trying this:
>>
>>  git clone --no-local --filter=$filter "file://$(pwd)" cloned &&
>>
>> which "succeeds" with this in the stderr:
>>
>>  warning: filtering not recognized by server, ignoring
> 
> Hmph, and we won't see it when going over ssh to the same
> repository?  That is puzzling.

Of course, it's not an issue with file://, but an issue with the
defaults. In order to test partial clones, I need to enable these
config options in the server repo:

		test_config uploadpack.allowfilter 1 &&
		test_config uploadpack.allowanysha1inwant 1 &&

Sorry for taking so long to realize this.

Thanks,
-Stolee
Junio C Hamano March 7, 2022, 9:35 p.m. UTC | #11
Derrick Stolee <derrickstolee@github.com> writes:

>> Hmph, and we won't see it when going over ssh to the same
>> repository?  That is puzzling.
>
> Of course, it's not an issue with file://, but an issue with the
> defaults. In order to test partial clones, I need to enable these
> config options in the server repo:
>
> 		test_config uploadpack.allowfilter 1 &&
> 		test_config uploadpack.allowanysha1inwant 1 &&
>
> Sorry for taking so long to realize this.

No, thanks for finding that out---the fact that I left it at
"puzzling" without offering that is a strong evidence that I didn't
think of it, either ;-)
diff mbox series

Patch

diff --git a/bundle.c b/bundle.c
index e284ef63062..3d97de40ef0 100644
--- a/bundle.c
+++ b/bundle.c
@@ -631,6 +631,10 @@  int unbundle(struct repository *r, struct bundle_header *header,
 	struct child_process ip = CHILD_PROCESS_INIT;
 	strvec_pushl(&ip.args, "index-pack", "--fix-thin", "--stdin", NULL);
 
+	/* If there is a filter, then we need to create the promisor pack. */
+	if (header->filter)
+		strvec_push(&ip.args, "--promisor=from-bundle");
+
 	if (extra_index_pack_args) {
 		strvec_pushv(&ip.args, extra_index_pack_args->v);
 		strvec_clear(extra_index_pack_args);
diff --git a/t/t6020-bundle-misc.sh b/t/t6020-bundle-misc.sh
index 39cfefafb65..344af34db1e 100755
--- a/t/t6020-bundle-misc.sh
+++ b/t/t6020-bundle-misc.sh
@@ -513,7 +513,13 @@  do
 		The bundle uses this filter: $filter
 		The bundle records a complete history.
 		EOF
-		test_cmp expect actual
+		test_cmp expect actual &&
+
+		# This creates the first pack-file in the
+		# .git/objects/pack directory. Look for a .promisor.
+		git bundle unbundle partial.bdl &&
+		ls .git/objects/pack/pack-*.promisor >promisor &&
+		test_line_count = 1 promisor
 	'
 done