diff mbox series

[1/2] connected: allow supplying different view of reachable objects

Message ID a32e3d6146dd41af36f525a744d6cc099b42d6fb.1666967670.git.ps@pks.im (mailing list archive)
State New, archived
Headers show
Series receive-pack: use advertised reference tips to inform connectivity check | expand

Commit Message

Patrick Steinhardt Oct. 28, 2022, 2:42 p.m. UTC
The connectivity check is executed via git-receive-pack(1) to verify
that a client has provided all references that are required to satisfy a
set of reference updates. What the connectivity check does is to walk
the object graph with all reference tips as starting points while all
preexisting reference tips are marked as uninteresting.

Preexisting references are currently marked uninteresting by passing
`--not --all` to git-rev-list(1). Some users of the connectivity check
may have a better picture of which objects should be regarded as
uninteresting though, e.g. by reusing information from the reference
advertisement when serving a push.

Add a new field to `struct check_connected_options` that allows callers
to replace the `--not --all` logic with their own set of object IDs they
regard as uninteresting.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 connected.c | 9 ++++++++-
 connected.h | 7 +++++++
 2 files changed, 15 insertions(+), 1 deletion(-)

Comments

Ævar Arnfjörð Bjarmason Oct. 28, 2022, 2:54 p.m. UTC | #1
On Fri, Oct 28 2022, Patrick Steinhardt wrote:

> @@ -125,6 +125,13 @@ int check_connected(oid_iterate_fn fn, void *cb_data,
>  
>  	rev_list_in = xfdopen(rev_list.in, "w");
>  
> +	if (opt->reachable_oids_fn) {
> +		const struct object_id *reachable_oid;
> +		while ((reachable_oid = opt->reachable_oids_fn(opt->reachable_oids_data)) != NULL)
> +			if (fprintf(rev_list_in, "^%s\n", oid_to_hex(reachable_oid)) < 0)
> +				break;

Just a style nit, we tend to avoid != NULL, != 0 etc. comparisons. I see
connected.c has some of that already, but for new code let's just check
truthiness.

Also for such a small scope a shorter variable name helps us stay at the
usual column limits:

	if (opt->reachable_oids_fn) {
		const struct object_id *oid;
		while ((oid = opt->reachable_oids_fn(opt->reachable_oids_data)))
			if (fprintf(rev_list_in, "^%s\n", oid_to_hex(oid)) < 0)
				break;

The fprintf() return value checking seemed a bit odd, not because we
shouldn't do it, but because we usually don't bother. For other
reviewers: We have that form already in connected.c, so at least locally
we're not being diligently careful, only to have it undone by adjacent
code...

Looks good!
Junio C Hamano Oct. 28, 2022, 6:12 p.m. UTC | #2
Patrick Steinhardt <ps@pks.im> writes:

> diff --git a/connected.c b/connected.c
> index 74a20cb32e..2a4c4e0025 100644
> --- a/connected.c
> +++ b/connected.c
> @@ -98,7 +98,7 @@ int check_connected(oid_iterate_fn fn, void *cb_data,
>  	strvec_push(&rev_list.args, "--stdin");
>  	if (has_promisor_remote())
>  		strvec_push(&rev_list.args, "--exclude-promisor-objects");
> -	if (!opt->is_deepening_fetch) {
> +	if (!opt->is_deepening_fetch && !opt->reachable_oids_fn) {
>  		strvec_push(&rev_list.args, "--not");
>  		strvec_push(&rev_list.args, "--all");
>  	}
> @@ -125,6 +125,13 @@ int check_connected(oid_iterate_fn fn, void *cb_data,
>  
>  	rev_list_in = xfdopen(rev_list.in, "w");
>  
> +	if (opt->reachable_oids_fn) {
> +		const struct object_id *reachable_oid;
> +		while ((reachable_oid = opt->reachable_oids_fn(opt->reachable_oids_data)) != NULL)
> +			if (fprintf(rev_list_in, "^%s\n", oid_to_hex(reachable_oid)) < 0)
> +				break;
> +	}

It is good that these individual negative references are fed from
the standard input, not on the command line, as they can be many.

In the original code without the reachable_oids_fn, we refrain from
excluding when the is_deepening_fetch bit is set, but here we do not
pay attention to the bit at all.  Is that sensible, and if so why?
Taylor Blau Oct. 30, 2022, 6:49 p.m. UTC | #3
On Fri, Oct 28, 2022 at 11:12:33AM -0700, Junio C Hamano wrote:
> In the original code without the reachable_oids_fn, we refrain from
> excluding when the is_deepening_fetch bit is set, but here we do not
> pay attention to the bit at all.  Is that sensible, and if so why?

I was wondering the same thing. Thanks for asking.

Thanks,
Taylor
Patrick Steinhardt Oct. 31, 2022, 1:10 p.m. UTC | #4
On Fri, Oct 28, 2022 at 11:12:33AM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> > diff --git a/connected.c b/connected.c
> > index 74a20cb32e..2a4c4e0025 100644
> > --- a/connected.c
> > +++ b/connected.c
> > @@ -98,7 +98,7 @@ int check_connected(oid_iterate_fn fn, void *cb_data,
> >  	strvec_push(&rev_list.args, "--stdin");
> >  	if (has_promisor_remote())
> >  		strvec_push(&rev_list.args, "--exclude-promisor-objects");
> > -	if (!opt->is_deepening_fetch) {
> > +	if (!opt->is_deepening_fetch && !opt->reachable_oids_fn) {
> >  		strvec_push(&rev_list.args, "--not");
> >  		strvec_push(&rev_list.args, "--all");
> >  	}
> > @@ -125,6 +125,13 @@ int check_connected(oid_iterate_fn fn, void *cb_data,
> >  
> >  	rev_list_in = xfdopen(rev_list.in, "w");
> >  
> > +	if (opt->reachable_oids_fn) {
> > +		const struct object_id *reachable_oid;
> > +		while ((reachable_oid = opt->reachable_oids_fn(opt->reachable_oids_data)) != NULL)
> > +			if (fprintf(rev_list_in, "^%s\n", oid_to_hex(reachable_oid)) < 0)
> > +				break;
> > +	}
> 
> It is good that these individual negative references are fed from
> the standard input, not on the command line, as they can be many.
> 
> In the original code without the reachable_oids_fn, we refrain from
> excluding when the is_deepening_fetch bit is set, but here we do not
> pay attention to the bit at all.  Is that sensible, and if so why?

Hm, good point. On a deepening fetch the commits that were the previous
boundary will likely get replaced by new commits that are at a deeper
point in history, so they cannot be used as a well-defined boundary.
Instead, we do a complete graph-walk that doesn't stop at any previously
known commits at all. At least that's how I understand the code, the
explanation is likely a bit fuzzy.

I guess we should thus also pay attention to `is_deepening_fetch` here.
As this means that `is_deepening_fetch` and `reachable_oids_fn` are
mutually exclusive I'm inclined to go even further and `die()` if both
are set at the same time. We only adapt git-receive-pack(1) anyway, so
we should never run into this situation for now.

Patrick
Taylor Blau Nov. 1, 2022, 1:16 a.m. UTC | #5
On Mon, Oct 31, 2022 at 02:10:16PM +0100, Patrick Steinhardt wrote:
> I guess we should thus also pay attention to `is_deepening_fetch` here.
> As this means that `is_deepening_fetch` and `reachable_oids_fn` are
> mutually exclusive I'm inclined to go even further and `die()` if both
> are set at the same time. We only adapt git-receive-pack(1) anyway, so
> we should never run into this situation for now.

Yes, I agree.

Thanks,
Taylor
diff mbox series

Patch

diff --git a/connected.c b/connected.c
index 74a20cb32e..2a4c4e0025 100644
--- a/connected.c
+++ b/connected.c
@@ -98,7 +98,7 @@  int check_connected(oid_iterate_fn fn, void *cb_data,
 	strvec_push(&rev_list.args, "--stdin");
 	if (has_promisor_remote())
 		strvec_push(&rev_list.args, "--exclude-promisor-objects");
-	if (!opt->is_deepening_fetch) {
+	if (!opt->is_deepening_fetch && !opt->reachable_oids_fn) {
 		strvec_push(&rev_list.args, "--not");
 		strvec_push(&rev_list.args, "--all");
 	}
@@ -125,6 +125,13 @@  int check_connected(oid_iterate_fn fn, void *cb_data,
 
 	rev_list_in = xfdopen(rev_list.in, "w");
 
+	if (opt->reachable_oids_fn) {
+		const struct object_id *reachable_oid;
+		while ((reachable_oid = opt->reachable_oids_fn(opt->reachable_oids_data)) != NULL)
+			if (fprintf(rev_list_in, "^%s\n", oid_to_hex(reachable_oid)) < 0)
+				break;
+	}
+
 	do {
 		/*
 		 * If index-pack already checked that:
diff --git a/connected.h b/connected.h
index 6e59c92aa3..f09c7d7884 100644
--- a/connected.h
+++ b/connected.h
@@ -46,6 +46,13 @@  struct check_connected_options {
 	 * during a fetch.
 	 */
 	unsigned is_deepening_fetch : 1;
+
+	/*
+	 * If non-NULL, use this iterator to determine the set of reachable
+	 * objects instead of marking all references as unreachable.
+	 */
+	oid_iterate_fn reachable_oids_fn;
+	void *reachable_oids_data;
 };
 
 #define CHECK_CONNECTED_INIT { 0 }