[RFC,2/7] fetch-pack: allow NULL negotiator->known_common

Message ID 401227c2220b6b45d80e21b52e29b6821ca139f9.1596590295.git.jonathantanmy@google.com
State New
Series
  • Lazy fetch with subprocess

Commit Message

Jonathan Tan Aug. 5, 2020, 1:20 a.m. UTC
In a subsequent patch, a null fetch negotiator will be introduced. This
negotiator, among other things, will not need any information about
common objects and will have a NULL known_common. Teach fetch-pack to
allow this.

[NEEDSWORK]
Optimizing out the ref iteration also affects the execution
of everything_local(), which relies on COMPLETE being set. (Having said
that, the typical use case - lazy fetching - would be fine with
everything_local() always returning that not everything is local.)

This optimization is needed so that in the future, fetch_pack() can be
used to lazily fetch in a partial clone (without the no_dependents
flag). This means that fetch_pack() needs a way to execute without
relying on any targets of refs being present, and thus it cannot use the
ref iterator (because it checks and lazy-fetches any missing targets).
(Git currently does not have this problem because we use the
no_dependents flag, but lazy-fetching will in a subsequent patch be
changed to use the user-facing fetch command, which does not use this
flag.)
[/NEEDSWORK]

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 fetch-pack.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Junio C Hamano Aug. 5, 2020, 8:08 p.m. UTC | #1
Jonathan Tan <jonathantanmy@google.com> writes:

> In a subsequent patch, a null fetch negotiator will be introduced. This
> negotiator, among other things, will not need any information about
> common objects and will have a NULL known_common. Teach fetch-pack to
> allow this.

Hmph, both the default and the skipping negotiator seem to put NULL
in known_common and add_tip when its next() method is called.  Also
they clear known_common to NULL after add_tip is called even once.

So, how have we survived so far without this patch to "allow this
(i.e.  known_common method to be NULL)"?  Is there something that
makes sure a negotiator never gets called from this function after
its .next or .add_tip method is called?

Puzzled.  Or is this merely an optimization?  If so, it's not like
the change "allows this", but it starts to take advantage of it in
some way.

	... goes and looks at mark_complete_and_common_ref()

The function seems to have an unconditional call to ->known_common(),
so anybody passing a negotiator whose known_common is NULL would
already be segfaulting, so this does not appear to be an optimization
but necessary to keep the code from crashing.  I cannot quite tell
if it is avoiding unnecessary work, or sweeping crashes under the
rug, though.  

Is the untold assumption that mark_complete_and_common_ref() will
never be called after either mark_tips() or find_common() have been
called?
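
The behaviour described above can be illustrated with a minimal sketch. It is not git's code: struct toy_negotiator and every toy_* name are hypothetical, simplified stand-ins for struct fetch_negotiator and its methods. It assumes only what this thread states, namely that add_tip() clears known_common, that next() clears both known_common and add_tip, and that a caller must check for NULL before calling through the pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for git's struct fetch_negotiator. */
struct toy_negotiator {
	void (*known_common)(struct toy_negotiator *n);
	void (*add_tip)(struct toy_negotiator *n);
	int (*next)(struct toy_negotiator *n);
	int common_seen; /* how many times known_common() ran */
	int tips;        /* tips queued by add_tip() */
};

static void toy_known_common(struct toy_negotiator *n)
{
	n->common_seen++;
}

static void toy_add_tip(struct toy_negotiator *n)
{
	/* After the first tip, known_common() must not be used again. */
	n->known_common = NULL;
	n->tips++;
}

static int toy_next(struct toy_negotiator *n)
{
	/* Once iteration starts, both earlier-phase methods are retired. */
	n->known_common = NULL;
	n->add_tip = NULL;
	return n->tips > 0 ? n->tips-- : 0;
}

static void toy_init(struct toy_negotiator *n)
{
	n->known_common = toy_known_common;
	n->add_tip = toy_add_tip;
	n->next = toy_next;
	n->common_seen = 0;
	n->tips = 0;
}

/*
 * Guarded caller in the spirit of the patch: returns 1 if known_common()
 * ran, 0 if the pointer was NULL and the call was skipped. Without the
 * check, calling through the NULL pointer is undefined behaviour and in
 * practice a segfault.
 */
static int toy_mark_common(struct toy_negotiator *n)
{
	if (!n->known_common)
		return 0;
	n->known_common(n);
	return 1;
}

/* Walks the phases in order; returns how many known_common() calls ran. */
static int toy_demo(void)
{
	struct toy_negotiator n;

	toy_init(&n);
	assert(toy_mark_common(&n) == 1); /* pointer still set */
	n.add_tip(&n);
	assert(toy_mark_common(&n) == 0); /* cleared by add_tip() */
	n.next(&n);
	assert(n.add_tip == NULL);        /* cleared by next() */
	return n.common_seen;
}
```

In this sketch the guard makes a late call a silent no-op, which is exactly the ambiguity raised above: the same check that admits a negotiator with no known_common at all also hides a call made in the wrong order.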

Thanks.

> [NEEDSWORK]
> Optimizing out the ref iteration also affects the execution
> of everything_local(), which relies on COMPLETE being set. (Having said
> that, the typical use case - lazy fetching - would be fine with
> everything_local() always returning that not everything is local.)
>
> This optimization is needed so that in the future, fetch_pack() can be
> used to lazily fetch in a partial clone (without the no_dependents
> flag). This means that fetch_pack() needs a way to execute without
> relying on any targets of refs being present, and thus it cannot use the
> ref iterator (because it checks and lazy-fetches any missing targets).
> (Git currently does not have this problem because we use the
> no_dependents flag, but lazy-fetching will in a subsequent patch be
> changed to use the user-facing fetch command, which does not use this
> flag.)
> [/NEEDSWORK]
>
> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> ---
>  fetch-pack.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/fetch-pack.c b/fetch-pack.c
> index 6c786f5970..5f5474dbed 100644
> --- a/fetch-pack.c
> +++ b/fetch-pack.c
> @@ -677,6 +677,9 @@ static void mark_complete_and_common_ref(struct fetch_negotiator *negotiator,
>  	int old_save_commit_buffer = save_commit_buffer;
>  	timestamp_t cutoff = 0;
>  
> +	if (!negotiator->known_common)
> +		return;
> +
>  	save_commit_buffer = 0;
>  
>  	trace2_region_enter("fetch-pack", "parse_remote_refs_and_find_cutoff", NULL);

Junio C Hamano Aug. 5, 2020, 10:11 p.m. UTC | #2
Junio C Hamano <gitster@pobox.com> writes:

> Jonathan Tan <jonathantanmy@google.com> writes:
>
>> In a subsequent patch, a null fetch negotiator will be introduced. This
>> negotiator, among other things, will not need any information about
>> common objects and will have a NULL known_common. Teach fetch-pack to
>> allow this.
>
> Hmph, both the default and the skipping negotiator seem to put NULL
> in known_common and add_tip when its next() method is called.  Also
> they clear known_common to NULL after add_tip is called even once.
>
> So, how have we survived so far without this patch to "allow this
> (i.e.  known_common method to be NULL)"?  Is there something that
> makes sure a negotiator never gets called from this function after
> its .next or .add_tip method is called?
>
> Puzzled.  Or is this merely an optimization?  If so, it's not like
> the change "allows this", but it starts to take advantage of it in
> some way.
>
> 	... goes and looks at mark_complete_and_common_ref()
>
> The function seems to have an unconditional call to ->known_common(),
> so anybody passing a negotiator whose known_common is NULL would
> already be segfaulting, so this does not appear to be an optimization
> but necessary to keep the code from crashing.  I cannot quite tell
> if it is avoiding unnecessary work, or sweeping crashes under the
> rug, though.  
>
> Is the untold assumption that mark_complete_and_common_ref() will
> never be called after either mark_tips() or find_common() have been
> called?

Shot in the dark.  Perhaps clearing of .add_tip and .known_common in
the .next method was done to catch a wrong calling sequence where
mark_complete_and_common_ref() gets called after mark_tips() and/or
find_common() have been called, by forcing the code to segfault?  If so, this
patch removes the safety and we may want to add an equivalent safety
logic.  Perhaps by adding a state field in the negotiator instance
to record that mark_tips() and/or find_common() have been used and
call a BUG() if mark_complete_and_common_ref() gets called after that,
if enforcing such an invariant was the original reason why these
fields were cleared.
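
One possible shape for such a safety check, as a rough sketch only: the phase field, the enum, and all toy_* names are hypothetical and do not exist in git, and TOY_BUG is a local stand-in for git's BUG() that prints a message and aborts. The idea is to record explicitly that negotiation has started instead of relying on a NULL function pointer to crash.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for git's BUG(): report the problem and abort. */
#define TOY_BUG(msg) \
	do { fprintf(stderr, "BUG: %s\n", (msg)); abort(); } while (0)

enum toy_phase {
	TOY_MARKING,     /* known_common() may still be called */
	TOY_NEGOTIATING  /* add_tip() or next() has been used */
};

/* Hypothetical negotiator carrying an explicit state field. */
struct toy_negotiator {
	enum toy_phase phase;
	int common_seen;
	int tips;
};

static void toy_known_common(struct toy_negotiator *n)
{
	/* Enforce the calling order explicitly instead of via a NULL pointer. */
	if (n->phase != TOY_MARKING)
		TOY_BUG("known_common() called after negotiation started");
	n->common_seen++;
}

static void toy_add_tip(struct toy_negotiator *n)
{
	n->phase = TOY_NEGOTIATING;
	n->tips++;
}

static int toy_next(struct toy_negotiator *n)
{
	n->phase = TOY_NEGOTIATING;
	return n->tips > 0 ? n->tips-- : 0;
}

/* Runs the calls in the permitted order; returns the final phase. */
static enum toy_phase toy_demo(void)
{
	struct toy_negotiator n = { TOY_MARKING, 0, 0 };

	toy_known_common(&n);      /* allowed: still marking */
	assert(n.common_seen == 1);
	toy_add_tip(&n);
	assert(toy_next(&n) == 1); /* one tip was queued */
	/* Calling toy_known_common(&n) here would hit TOY_BUG and abort. */
	return n.phase;
}
```

With a field like this, a negotiator whose known_common is legitimately NULL can be skipped by the caller, while an out-of-order call still fails loudly with a message rather than a bare segfault.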

Jonathan Tan Aug. 7, 2020, 8:59 p.m. UTC | #3
> > Hmph, both the default and the skipping negotiator seem to put NULL
> > in known_common and add_tip when its next() method is called.  Also
> > they clear known_common to NULL after add_tip is called even once.
> >
> > So, how have we survived so far without this patch to "allow this
> > (i.e.  known_common method to be NULL)"?  Is there something that
> > makes sure a negotiator never gets called from this function after
> > its .next or .add_tip method is called?

[snip]

> > Is the untold assumption that mark_complete_and_common_ref() will
> > never be called after either mark_tips() or find_common() have been
> > called?
> 
> Shot in the dark.  Perhaps clearing of .add_tip and .known_common in
> the .next method was done to catch a wrong calling sequence where
> mark_complete_and_common_ref() gets called after mark_tips() and/or
> find_common() have been called, by forcing the code to segfault?

Ah...yes, if I remember correctly, that was my original intention when I
set them to NULL.

> If so, this
> patch removes the safety and we may want to add an equivalent safety
> logic.  Perhaps by adding a state field in the negotiator instance
> to record that mark_tips() and/or find_common() have been used and
> call a BUG() if mark_complete_and_common_ref() gets called after that,
> if enforcing such an invariant was the original reason why these
> fields were cleared.

Sounds good. As I said in my reply to your query on patch 1, we might
not need to set them to NULL anymore, but if we do, I'll do this.

Patch

diff --git a/fetch-pack.c b/fetch-pack.c
index 6c786f5970..5f5474dbed 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -677,6 +677,9 @@ static void mark_complete_and_common_ref(struct fetch_negotiator *negotiator,
 	int old_save_commit_buffer = save_commit_buffer;
 	timestamp_t cutoff = 0;
 
+	if (!negotiator->known_common)
+		return;
+
 	save_commit_buffer = 0;
 
 	trace2_region_enter("fetch-pack", "parse_remote_refs_and_find_cutoff", NULL);