mbox series

[00/13] upload-pack: use 'struct upload_pack_data' thoroughly, part 1

Message ID 20200515100454.14486-1-chriscool@tuxfamily.org (mailing list archive)
Headers show
Series upload-pack: use 'struct upload_pack_data' thoroughly, part 1 | expand

Message

Christian Couder May 15, 2020, 10:04 a.m. UTC
In the thread started by:

https://lore.kernel.org/git/20200507095829.16894-1-chriscool@tuxfamily.org/

which led to the following bug fix commit:

08450ef791 (upload-pack: clear filter_options for each v2 fetch
command, 2020-05-08)

it was agreed that having many static variables in 'upload-pack.c',
while upload_pack_v2() is called more than once per process, is very
bug prone and messy, and that a good way forward would be to use
'struct upload_pack_data' thoroughly, especially in upload_pack()
where it isn't used yet.

This patch series is the first part of an effort in this direction.

While there are still a lot of static variables at the top of
'upload-pack.c' after this patch series, it does a lot of ground work
and a number of cleanups.

This patch series is based on master at 172e8ff696 (The ninth batch,
2020-05-13).

Christian Couder (13):
  upload-pack: remove unused 'wants' from upload_pack_data
  upload-pack: move {want,have}_obj to upload_pack_data
  upload-pack: move 'struct upload_pack_data' around
  upload-pack: use 'struct upload_pack_data' in upload_pack()
  upload-pack: pass upload_pack_data to get_common_commits()
  upload-pack: pass upload_pack_data to receive_needs()
  upload-pack: use upload_pack_data writer in receive_needs()
  upload-pack: move symref to upload_pack_data
  upload-pack: pass upload_pack_data to send_ref()
  upload-pack: pass upload_pack_data to check_non_tip()
  upload-pack: remove static variable 'stateless_rpc'
  upload-pack: pass upload_pack_data to create_pack_file()
  upload-pack: use upload_pack_data fields in receive_needs()

 upload-pack.c | 298 ++++++++++++++++++++++++--------------------------
 1 file changed, 145 insertions(+), 153 deletions(-)

Comments

Derrick Stolee May 15, 2020, 10:43 a.m. UTC | #1
On 5/15/2020 6:04 AM, Christian Couder wrote:
> In the thread started by:
> 
> https://lore.kernel.org/git/20200507095829.16894-1-chriscool@tuxfamily.org/
> 
> which led to the following bug fix commit:
> 
> 08450ef791 (upload-pack: clear filter_options for each v2 fetch
> command, 2020-05-08)
> 
> it was agreed that having many static variables in 'upload-pack.c',
> while upload_pack_v2() is called more than once per process, is very
> bug prone and messy, and that a good way forward would be to use
> 'struct upload_pack_data' thoroughly, especially in upload_pack()
> where it isn't used yet.
> 
> This patch series is the first part of an effort in this direction.
> 
> While there are still a lot of static variables at the top of
> 'upload-pack.c' after this patch series, it does a lot of ground work
> and a number of cleanups.

The patches here are carefully organized to make review easy. Thanks!

I was surprised to see how many local or static variables you were able
to remove using members that already existed in 'struct upload_pack_data'.
That made the changes here particularly easy to trust.

Reviewed-by: Derrick Stolee <dstolee@microsoft.com>

Thanks,
-Stolee
Jeff King May 15, 2020, 6:47 p.m. UTC | #2
On Fri, May 15, 2020 at 12:04:41PM +0200, Christian Couder wrote:

> In the thread started by:
> 
> https://lore.kernel.org/git/20200507095829.16894-1-chriscool@tuxfamily.org/
> 
> which led to the following bug fix commit:
> 
> 08450ef791 (upload-pack: clear filter_options for each v2 fetch
> command, 2020-05-08)
> 
> it was agreed that having many static variables in 'upload-pack.c',
> while upload_pack_v2() is called more than once per process, is very
> bug prone and messy, and that a good way forward would be to use
> 'struct upload_pack_data' thoroughly, especially in upload_pack()
> where it isn't used yet.
> 
> This patch series is the first part of an effort in this direction.

Thanks for following up on this! I think all of the patches here are
moving in a good direction. I think there's a philosophical question
about whether some of the functions should be taking the minimum set of
fields from upload_pack_data, or just the whole struct. But seeing the
opportunities for cleanup near the end, especially around little flags,
I think passing the big struct around (like you've done here) is
probably the best choice.

> While there are still a lot of static variables at the top of
> 'upload-pack.c' after this patch series, it does a lot of ground work
> and a number of cleanups.

Yeah, I think all of use_thin_pack, use_ofs_delta, etc, should be easy
conversions on top (and will really give us the payoff).

-Peff
Jeff King May 15, 2020, 6:55 p.m. UTC | #3
On Fri, May 15, 2020 at 02:47:16PM -0400, Jeff King wrote:

> > While there are still a lot of static variables at the top of
> > 'upload-pack.c' after this patch series, it does a lot of ground work
> > and a number of cleanups.
> 
> Yeah, I think all of use_thin_pack, use_ofs_delta, etc, should be easy
> conversions on top (and will really give us the payoff).

Hmm, none of those fields in upload_pack_data are used, even for v2!
I.e., if I apply this patch:

    diff --git a/upload-pack.c b/upload-pack.c
    index 401c9e6c4b..522ae3ff6e 100644
    --- a/upload-pack.c
    +++ b/upload-pack.c
    @@ -91,10 +91,6 @@ struct upload_pack_data {
     
     	unsigned stateless_rpc : 1;
     
    -	unsigned use_thin_pack : 1;
    -	unsigned use_ofs_delta : 1;
    -	unsigned no_progress : 1;
    -	unsigned use_include_tag : 1;
     	unsigned done : 1;
     };
     

it still compiles. So starting to use those would be a behavior change,
as we accidentally let use_ofs_delta, etc, propagate from one v2 "fetch"
command to another for ssh/git/file connections (but not for http). I
think that's fixing a bug (but one nobody is likely to see, because it
would imply the client sending different capabilities for each request).

I think we'd want something like the patch below.

Some of the other globals, like multi_ack, are really v0 only (since v2
assumes certain baseline capabilities). We could either leave them
as-is, or roll them into upload_pack_data and just let v2 code ignore
them as it does now.

diff --git a/upload-pack.c b/upload-pack.c
index 401c9e6c4b..2fa645834a 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -46,8 +46,7 @@ static timestamp_t oldest_have;
 
 static int multi_ack;
 static int no_done;
-static int use_thin_pack, use_ofs_delta, use_include_tag;
-static int no_progress, daemon_mode;
+static int daemon_mode;
 /* Allow specifying sha1 if it is a ref tip. */
 #define ALLOW_TIP_SHA1	01
 /* Allow request of a sha1 if it is reachable from a ref (possibly hidden ref). */
@@ -186,17 +185,17 @@ static void create_pack_file(struct upload_pack_data *pack_data)
 	}
 	argv_array_push(&pack_objects.args, "pack-objects");
 	argv_array_push(&pack_objects.args, "--revs");
-	if (use_thin_pack)
+	if (pack_data->use_thin_pack)
 		argv_array_push(&pack_objects.args, "--thin");
 
 	argv_array_push(&pack_objects.args, "--stdout");
 	if (shallow_nr)
 		argv_array_push(&pack_objects.args, "--shallow");
-	if (!no_progress)
+	if (!pack_data->no_progress)
 		argv_array_push(&pack_objects.args, "--progress");
-	if (use_ofs_delta)
+	if (pack_data->use_ofs_delta)
 		argv_array_push(&pack_objects.args, "--delta-base-offset");
-	if (use_include_tag)
+	if (pack_data->use_include_tag)
 		argv_array_push(&pack_objects.args, "--include-tag");
 	if (pack_data->filter_options.choice) {
 		const char *spec =
@@ -955,17 +954,17 @@ static void receive_needs(struct upload_pack_data *data,
 		if (parse_feature_request(features, "no-done"))
 			no_done = 1;
 		if (parse_feature_request(features, "thin-pack"))
-			use_thin_pack = 1;
+			data->use_thin_pack = 1;
 		if (parse_feature_request(features, "ofs-delta"))
-			use_ofs_delta = 1;
+			data->use_ofs_delta = 1;
 		if (parse_feature_request(features, "side-band-64k"))
 			use_sideband = LARGE_PACKET_MAX;
 		else if (parse_feature_request(features, "side-band"))
 			use_sideband = DEFAULT_PACKET_MAX;
 		if (parse_feature_request(features, "no-progress"))
-			no_progress = 1;
+			data->no_progress = 1;
 		if (parse_feature_request(features, "include-tag"))
-			use_include_tag = 1;
+			data->use_include_tag = 1;
 		if (allow_filter && parse_feature_request(features, "filter"))
 			filter_capability_requested = 1;
 
@@ -997,7 +996,7 @@ static void receive_needs(struct upload_pack_data *data,
 		check_non_tip(data);
 
 	if (!use_sideband && daemon_mode)
-		no_progress = 1;
+		data->no_progress = 1;
 
 	if (data->depth == 0 && !data->deepen_rev_list && data->shallows.nr == 0)
 		return;
@@ -1279,19 +1278,19 @@ static void process_args(struct packet_reader *request,
 
 		/* process args like thin-pack */
 		if (!strcmp(arg, "thin-pack")) {
-			use_thin_pack = 1;
+			data->use_thin_pack = 1;
 			continue;
 		}
 		if (!strcmp(arg, "ofs-delta")) {
-			use_ofs_delta = 1;
+			data->use_ofs_delta = 1;
 			continue;
 		}
 		if (!strcmp(arg, "no-progress")) {
-			no_progress = 1;
+			data->no_progress = 1;
 			continue;
 		}
 		if (!strcmp(arg, "include-tag")) {
-			use_include_tag = 1;
+			data->use_include_tag = 1;
 			continue;
 		}
 		if (!strcmp(arg, "done")) {
Christian Couder May 19, 2020, 9:44 a.m. UTC | #4
On Fri, May 15, 2020 at 8:55 PM Jeff King <peff@peff.net> wrote:
>
> On Fri, May 15, 2020 at 02:47:16PM -0400, Jeff King wrote:
>
> > > While there are still a lot of static variables at the top of
> > > 'upload-pack.c' after this patch series, it does a lot of ground work
> > > and a number of cleanups.
> >
> > Yeah, I think all of use_thin_pack, use_ofs_delta, etc, should be easy
> > conversions on top (and will really give us the payoff).
>
> Hmm, none of those fields in upload_pack_data are used, even for v2!
> I.e., if I apply this patch:

[...]

> it still compiles. So starting to use those would be a behavior change,
> as we accidentally let use_ofs_delta, etc, propagate from one v2 "fetch"
> command to another for ssh/git/file connections (but not for http). I
> think that's fixing a bug (but one nobody is likely to see, because it
> would imply the client sending different capabilities for each request).

I agree.

> I think we'd want something like the patch below.

Thanks for your patch! I have included it as the first patch in the
"part 2" patch series I am planning to send once this "part 1" will be
merged to pu or next:

- the commit with your patch:
https://gitlab.com/gitlab-org/gitlab-git/-/commit/458b79f0f563226bf50924553fc3889cb0942864
- the whole branch with "part 1" and "part 2":
https://gitlab.com/gitlab-org/gitlab-git/-/commits/fix-upload-pack-variable-scope5/

> Some of the other globals, like multi_ack, are really v0 only (since v2
> assumes certain baseline capabilities). We could either leave them
> as-is, or roll them into upload_pack_data and just let v2 code ignore
> them as it does now.

I am in favor of rolling them into upload_pack_data. That's what I
started doing in "part 1" and continued doing in "part 2". I think it
makes the code more coherent and cleaner.

Thanks for your review,
Christian.
Christian Couder May 19, 2020, 9:49 a.m. UTC | #5
On Fri, May 15, 2020 at 12:43 PM Derrick Stolee <stolee@gmail.com> wrote:
>
> On 5/15/2020 6:04 AM, Christian Couder wrote:

> > While there are still a lot of static variables at the top of
> > 'upload-pack.c' after this patch series, it does a lot of ground work
> > and a number of cleanups.
>
> The patches here are carefully organized to make review easy. Thanks!
>
> I was surprised to see how many local or static variables you were able
> to remove using members that already existed in 'struct upload_pack_data'.
> That made the changes here particularly easy to trust.
>
> Reviewed-by: Derrick Stolee <dstolee@microsoft.com>

Thanks for your review and kind words! I would very much like to add
your "Reviewed-by:" to this patch series, but as no suggestion has
been made to improve it, I am a bit reluctant to resend with only your
"Reviewed-by:" added.

Junio, what would you prefer?

Thanks,
Christian.