diff mbox series

[1/2] fetch-pack: remove fetch_if_missing=0

Message ID eb0cbeeeed080596c130f657186894999ae6121b.1587412477.git.gitgitgadget@gmail.com (mailing list archive)
State New, archived
Headers show
Series removed fetch_if_missing global | expand

Commit Message

Linus Arver via GitGitGadget April 20, 2020, 7:54 p.m. UTC
From: Hariom Verma <hariom18599@gmail.com>

Commit 6462d5e ("fetch: remove fetch_if_missing=0", 2019-11-08)
strove to remove the need for fetch_if_missing=0 from the fetching
mechanism, so it is plausible to attempt removing fetch_if_missing=0
from fetch-pack as well.

Signed-off-by: Hariom Verma <hariom18599@gmail.com>
---
 builtin/fetch-pack.c | 2 --
 fetch-pack.c         | 2 +-
 2 files changed, 1 insertion(+), 3 deletions(-)

Comments

Christian Couder May 7, 2020, 1:17 p.m. UTC | #1
On Mon, Apr 20, 2020 at 9:57 PM Hariom Verma via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Hariom Verma <hariom18599@gmail.com>
>
> Commit 6462d5e ("fetch: remove fetch_if_missing=0", 2019-11-08)
> strove to remove the need for fetch_if_missing=0 from the fetching
> mechanism, so it is plausible to attempt removing fetch_if_missing=0
> from fetch-pack as well.

It's ok to refer to a previous commit, but I think it would be better
if you could repeat a bit the reasons why removing the
fetch_if_missing global is a good idea, and not just rely on the
previous commit.

"it is plausible" also doesn't make it very clear that it's what the
patch is actually doing.

Thanks,
Christian.
Junio C Hamano May 7, 2020, 3:02 p.m. UTC | #2
Christian Couder <christian.couder@gmail.com> writes:

> On Mon, Apr 20, 2020 at 9:57 PM Hariom Verma via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
>>
>> From: Hariom Verma <hariom18599@gmail.com>
>>
>> Commit 6462d5e ("fetch: remove fetch_if_missing=0", 2019-11-08)
>> strove to remove the need for fetch_if_missing=0 from the fetching
>> mechanism, so it is plausible to attempt removing fetch_if_missing=0
>> from fetch-pack as well.
>
> It's ok to refer to a previous commit, but I think it would be better
> if you could repeat a bit the reasons why removing the
> fetch_if_missing global is a good idea, and not just rely on the
> previous commit.
>
> "it is plausible" also doesn't make it very clear that it's what the
> patch is actually doing.

I had the same reaction.  You could even write a random gibberish in
your patch and write "it's plausible this set of random changes made
without understanding what is going on in the current code might
have some chance to work" in your log message, and we would not even
want to touch such a patch with 10-foot pole.

The proposed log message above unfortunately makes this patch
indistinguishable from such a trash, unless we follow the codepaths
that are *not* touched by this patch and think about ramifications
of the removal *ourselves*.  In other words, it does nothing to help
the readers to support the change.
Jonathan Tan May 7, 2020, 7:43 p.m. UTC | #3
> From: Hariom Verma <hariom18599@gmail.com>
> 
> Commit 6462d5e ("fetch: remove fetch_if_missing=0", 2019-11-08)
> strove to remove the need for fetch_if_missing=0 from the fetching
> mechanism, so it is plausible to attempt removing fetch_if_missing=0
> from fetch-pack as well.
> 
> Signed-off-by: Hariom Verma <hariom18599@gmail.com>

As Christian said [1], please include tests like in the commit you
mentioned. For a change like this, I think that the test is the most
important part.

Also include a justification for why it's safe to remove
fetch_if_missing=0. You can probably cite the aforementioned commit to
say that it covers the fetch_pack() method, and then go through the rest
of the code to see if any may inadvertently fetch an object.

Also, the fetch-pack and index-pack parts can be sent in separate patch
sets, so you might want to concentrate on one command first.

[1] https://lore.kernel.org/git/CAP8UFD2SNnpKWtYUztZ76OU7zBsrXyYhG_Zds1wi+NqBKCv+Qw@mail.gmail.com/

> diff --git a/fetch-pack.c b/fetch-pack.c
> index 1734a573b01..1ca643f6491 100644
> --- a/fetch-pack.c
> +++ b/fetch-pack.c
> @@ -1649,7 +1649,7 @@ static void update_shallow(struct fetch_pack_args *args,
>  		struct oid_array extra = OID_ARRAY_INIT;
>  		struct object_id *oid = si->shallow->oid;
>  		for (i = 0; i < si->shallow->nr; i++)
> -			if (has_object_file(&oid[i]))
> +			if (has_object_file_with_flags(&oid[i], OBJECT_INFO_SKIP_FETCH_OBJECT))
>  				oid_array_append(&extra, &oid[i]);
>  		if (extra.nr) {
>  			setup_alternate_shallow(&shallow_lock,

Hmm...this triggers when the user requests a clone that is both partial
and shallow, and the server reports a shallow object that it didn't send
back as a packfile; and it causes another fetch to be sent. This is a
separate issue, but Hariom, if you'd like to take a look at this, that
would work out too. You'll need to figure out how to make the server
send back shallow lines referencing objects that are not in the packfile
- one way to do it is to use one-time-perl. (Search the codebase to see
how it is used.) This is probably more complex, though.
Hariom verma May 9, 2020, 5:40 p.m. UTC | #4
On Thu, May 7, 2020 at 6:47 PM Christian Couder
<christian.couder@gmail.com> wrote:
>
> It's ok to refer to a previous commit, but I think it would be better
> if you could repeat a bit the reasons why removing the
> fetch_if_missing global is a good idea, and not just rely on the
> previous commit.

I agree with that.

> "it is plausible" also doesn't make it very clear that it's what the
> patch is actually doing.

Thanks for pointing it out. Will improve.

Regards,
Hariom
Hariom verma May 9, 2020, 5:48 p.m. UTC | #5
On Thu, May 7, 2020 at 8:32 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Christian Couder <christian.couder@gmail.com> writes:
>
> > It's ok to refer to a previous commit, but I think it would be better
> > if you could repeat a bit the reasons why removing the
> > fetch_if_missing global is a good idea, and not just rely on the
> > previous commit.
> >
> > "it is plausible" also doesn't make it very clear that it's what the
> > patch is actually doing.
>
> I had the same reaction.  You could even write a random gibberish in
> your patch and write "it's plausible this set of random changes made
> without understanding what is going on in the current code might
> have some chance to work" in your log message, and we would not even
> want to touch such a patch with 10-foot pole.
>
> The proposed log message above unfortunately makes this patch
> indistinguishable from such a trash, unless we follow the codepaths
> that are *not* touched by this patch and think about ramifications
> of the removal *ourselves*.  In other words, it does nothing to help
> the readers to support the change.
>

I understand it must be too hard for you to deal with such [trash]patches.
My apologies. Will improve in next revision

Thanks,
Hariom
Hariom verma May 9, 2020, 6 p.m. UTC | #6
On Fri, May 8, 2020 at 1:13 AM Jonathan Tan <jonathantanmy@google.com> wrote:
>
> As Christian said [1], please include tests like in the commit you
> mentioned. For a change like this, I think that the test is the most
> important part.
>

I will definitely add tests.

> Also include a justification for why it's safe to remove
> fetch_if_missing=0. You can probably cite the aforementioned commit to
> say that it covers the fetch_pack() method, and then go through the rest
> of the code to see if any may inadvertently fetch an object.
>
> Also, the fetch-pack and index-pack parts can be sent in separate patch
> sets, so you might want to concentrate on one command first.
>

Thanks, Will split and concentrate on one at a time.

>
> > diff --git a/fetch-pack.c b/fetch-pack.c
> > index 1734a573b01..1ca643f6491 100644
> > --- a/fetch-pack.c
> > +++ b/fetch-pack.c
> > @@ -1649,7 +1649,7 @@ static void update_shallow(struct fetch_pack_args *args,
> >               struct oid_array extra = OID_ARRAY_INIT;
> >               struct object_id *oid = si->shallow->oid;
> >               for (i = 0; i < si->shallow->nr; i++)
> > -                     if (has_object_file(&oid[i]))
> > +                     if (has_object_file_with_flags(&oid[i], OBJECT_INFO_SKIP_FETCH_OBJECT))
> >                               oid_array_append(&extra, &oid[i]);
> >               if (extra.nr) {
> >                       setup_alternate_shallow(&shallow_lock,
>
> Hmm...this triggers when the user requests a clone that is both partial
> and shallow, and the server reports a shallow object that it didn't send
> back as a packfile; and it causes another fetch to be sent. This is a
> separate issue, but Hariom, if you'd like to take a look at this, that
> would work out too. You'll need to figure out how to make the server
> send back shallow lines referencing objects that are not in the packfile
> - one way to do it is to use one-time-perl. (Search the codebase to see
> how it is used.) This is probably more complex, though.

I'm clueless about "one-time-perl" thing(till now!). Will surely going
to learn about that.

Thanks for the nice explanation.

Regards,
Hariom
Kousik Sanagavarapu Feb. 20, 2023, 6:13 p.m. UTC | #7
So here, we are partial-cloning from a shallow remote and some
objects are not sent due to our clone filters. Let's say that
the shallow remote has a 5-commit history and we are cloning it
into another repository with a blob:none filter. The expected
behavior is cloning the 5 commits, with no blobs, except
for the HEAD.

When executing the above process, it leads to errors:

	fatal: the remote end hung up unexpectedly
	fatal: protocol error: bad pack header
	warning: Clone succeeded, but checkout failed
	You can inspect what was checked out with 'git status'
	and retry with 'git restore --source=HEAD :/'

I looked into it a bit and it seems that packet_read() is not
successful. I'm not really sure how packet reading fits into
the big picture but it looks like the buffer is not read
completely.

It is a similar case with "bad pack header" too. The function
read_pack_header() fails because the pack header was not fully read.

Also, is the shallow object not sent when cloning due to the partial
clone filter and hence a subsequent fetching is done to ask for this
object? If so, then will such a fetch counted as an args->update_shallow?
Kousik Sanagavarapu Feb. 22, 2023, 5:45 p.m. UTC | #8
Sorry, it seems that I misunderstood the whole situation here. I didn't
realize that the problem was on my end and not something to do with the
code. Please ignore the above email except for the last question, which
I still don't seem to understand. So, I would be grateful if you could
clarify.

Thanks,
Kousik
Jonathan Tan Feb. 22, 2023, 6:19 p.m. UTC | #9
Kousik Sanagavarapu <five231003@gmail.com> writes:
> Also, is the shallow object not sent when cloning due to the partial
> clone filter and hence a subsequent fetching is done to ask for this
> object? If so, then will such a fetch counted as an args->update_shallow?

What do you mean by the shallow object? If you mean the last commit that
is sent (that is, without its parents), then that is a commit and is not
excluded by the filter.

As for args->update_shallow, that's a good question. Just glancing at
the code, even if it is set, I don't think there would be any difference
in operation since the lazy fetch does not fetch any refs (and in fact,
in protocol v2, we skip the ref advertisement in this case, as far as I
can remember).
diff mbox series

Patch

diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index dc1485c8aa1..38a45512918 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -57,8 +57,6 @@  int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
 	struct packet_reader reader;
 	enum protocol_version version;
 
-	fetch_if_missing = 0;
-
 	packet_trace_identity("fetch-pack");
 
 	memset(&args, 0, sizeof(args));
diff --git a/fetch-pack.c b/fetch-pack.c
index 1734a573b01..1ca643f6491 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1649,7 +1649,7 @@  static void update_shallow(struct fetch_pack_args *args,
 		struct oid_array extra = OID_ARRAY_INIT;
 		struct object_id *oid = si->shallow->oid;
 		for (i = 0; i < si->shallow->nr; i++)
-			if (has_object_file(&oid[i]))
+			if (has_object_file_with_flags(&oid[i], OBJECT_INFO_SKIP_FETCH_OBJECT))
 				oid_array_append(&extra, &oid[i]);
 		if (extra.nr) {
 			setup_alternate_shallow(&shallow_lock,