diff mbox series

[7/8] receive-pack: skip connectivity checks on delete-only commands

Message ID 68c60aff0c77c562aba5613ccbb9ab33ad8e0e08.1621451532.git.ps@pks.im (mailing list archive)
State New, archived
Series Speed up connectivity checks via quarantine dir

Commit Message

Patrick Steinhardt May 19, 2021, 7:13 p.m. UTC
In the case where git-receive-pack(1) receives only commands which
delete references, then per technical specification the client MUST NOT
send a packfile. As a result, we know that no new objects have been
received, which makes it a moot point to check whether all received
objects are fully connected.

Fix this by not doing a connectivity check in case there were no pushed
objects. Given that git-rev-list(1) with only negative references will
not do any graph walk, no performance improvements are to be expected.
Conceptually, it is still the right thing to do though.
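
To illustrate the protocol rule this relies on, here is a minimal sketch of
how a delete-only command list could be recognized: a command deletes its
reference when the new object ID is all zeros. The names `ref_command`,
`is_delete_command`, `is_delete_only` and `needs_connectivity_check` are
hypothetical and exist only for this example, as is the fixed SHA-1 hex
length. The patch itself does not scan the commands like this; it checks
whether a quarantine directory (`tmp_objdir`) was set up.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define OID_HEXSZ 40 /* SHA-1 hex length; an assumption of this sketch */

/* Hypothetical, simplified stand-in for receive-pack's command list. */
struct ref_command {
	struct ref_command *next;
	char old_oid[OID_HEXSZ + 1];
	char new_oid[OID_HEXSZ + 1];
	const char *ref_name;
};

/* A command deletes its ref when the new object ID consists of zeros only. */
static int is_delete_command(const struct ref_command *cmd)
{
	assert(cmd != NULL);
	return strspn(cmd->new_oid, "0") == OID_HEXSZ;
}

/* Returns 1 if every command in the list is a deletion, 0 otherwise. */
static int is_delete_only(const struct ref_command *commands)
{
	const struct ref_command *cmd;

	for (cmd = commands; cmd; cmd = cmd->next)
		if (!is_delete_command(cmd))
			return 0;
	return 1;
}

/* A packfile can only have been sent if some command is not a deletion. */
static int needs_connectivity_check(const struct ref_command *commands)
{
	return commands != NULL && !is_delete_only(commands);
}
```

With such a helper, a list holding only deletions would skip the check,
while a single create or update in the list would still trigger it.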

The following tests were executed on linux.git and back up the above
expectation:

Test                                     v2.32.0-rc0             HEAD
--------------------------------------------------------------------------------------------
5400.3: receive-pack clone create        1.27(1.11+0.16)         1.26(1.12+0.14) -0.8%
5400.5: receive-pack clone update        1.27(1.13+0.13)         1.27(1.11+0.16) +0.0%
5400.7: receive-pack clone reset         0.13(0.11+0.02)         0.14(0.11+0.02) +7.7%
5400.9: receive-pack clone delete        0.02(0.01+0.01)         0.02(0.00+0.01) +0.0%
5400.11: receive-pack extrarefs create   33.01(18.80+14.43)      32.63(18.52+14.24) -1.2%
5400.13: receive-pack extrarefs update   33.13(18.85+14.50)      32.82(18.85+14.29) -0.9%
5400.15: receive-pack extrarefs reset    32.90(18.82+14.32)      32.70(18.76+14.20) -0.6%
5400.17: receive-pack extrarefs delete   9.13(4.35+4.77)         8.99(4.28+4.70) -1.5%
5400.19: receive-pack empty create       223.35(640.63+127.74)   226.96(655.16+131.93) +1.6%

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 builtin/receive-pack.c | 49 ++++++++++++++++++++++++++----------------
 1 file changed, 30 insertions(+), 19 deletions(-)

Comments

Felipe Contreras May 21, 2021, 6:53 p.m. UTC | #1
Patrick Steinhardt wrote:
> In the case where git-receive-pack(1) receives only commands which
> delete references, then per technical specification the client MUST NOT
> send a packfile. As a result, we know that no new objects have been
> received, which makes it a moot point to check whether all received
> objects are fully connected.

I don't know if this is related but yesterday I decided to delete a
bunch of refs from a forked repo in GitHub. I did it naively with a for
loop and so it was doing a bunch of `git push myrepo :ref`.

It was unbearably slow.

Sure, it was a stupid thing to do, but maybe it can help you do some
tests.

Cheers.

Jeff King May 27, 2021, 2:38 p.m. UTC | #2
On Fri, May 21, 2021 at 01:53:49PM -0500, Felipe Contreras wrote:

> Patrick Steinhardt wrote:
> > In the case where git-receive-pack(1) receives only commands which
> > delete references, then per technical specification the client MUST NOT
> > send a packfile. As a result, we know that no new objects have been
> > received, which makes it a moot point to check whether all received
> > objects are fully connected.
> 
> I don't know if this is related but yesterday I decided to delete a
> bunch of refs from a forked repo in GitHub. I did it naively with a for
> loop and so it was doing a bunch of `git push myrepo :ref`.
> 
> It was unbearably slow.
> 
> Sure, it was a stupid thing to do, but maybe it can help you do some
> tests.

Patrick's patch might help some, as it would avoid calling rev-list at
all. But we wouldn't do any traversal in that command if there are no
positive tips anyway, so it is really just saving the startup overhead
of iterating the ref tips to add them to the traversal.
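
As a rough illustration of that point, the toy walk below marks the negative
tips first and then traverses only from the positive tips. With no positive
tips the work queue starts out empty, so the only cost is the setup loop.
Everything here (`toy_graph`, `count_walked`, the fixed array sizes) is
hypothetical and much simpler than the real revision machinery; in
particular, it only stops at marked commits rather than propagating the
mark to their ancestors.

```c
#include <stddef.h>

#define MAX_COMMITS 16
#define MAX_PARENTS 2

/* Hypothetical toy commit graph; a parent index below zero means "none". */
struct toy_graph {
	int parents[MAX_COMMITS][MAX_PARENTS];
};

/*
 * Count the commits visited when walking from the positive tips and
 * stopping at any commit listed as a negative tip.
 */
static size_t count_walked(const struct toy_graph *g,
			   const int *positive, size_t nr_positive,
			   const int *negative, size_t nr_negative)
{
	int uninteresting[MAX_COMMITS] = {0};
	int seen[MAX_COMMITS] = {0};
	int stack[MAX_COMMITS];
	size_t top = 0, visited = 0, i;

	/* Setup cost: every negative tip is marked even if nothing is walked. */
	for (i = 0; i < nr_negative; i++)
		uninteresting[negative[i]] = 1;

	/* Only positive tips seed the traversal. */
	for (i = 0; i < nr_positive; i++) {
		if (!seen[positive[i]]) {
			seen[positive[i]] = 1;
			stack[top++] = positive[i];
		}
	}

	/* Each commit is pushed at most once, so the stack cannot overflow. */
	while (top) {
		int c = stack[--top];
		int j;

		if (uninteresting[c])
			continue;
		visited++;
		for (j = 0; j < MAX_PARENTS; j++) {
			int p = g->parents[c][j];
			if (p >= 0 && !seen[p]) {
				seen[p] = 1;
				stack[top++] = p;
			}
		}
	}
	return visited;
}
```

Calling `count_walked()` with `nr_positive` set to zero returns zero visited
commits no matter how many negative tips are passed in, which mirrors why
skipping the check mostly saves startup overhead.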

In the case of GitHub, the problem is much more likely outside of Git's
immediate control. Every push will run GitHub-specific hooks for things
like branch protections, etc, and there's a lot of overhead there.

-Peff

Patch

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index a34742513a..b9263cec15 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -1918,11 +1918,8 @@  static void execute_commands(struct command *commands,
 			     struct shallow_info *si,
 			     const struct string_list *push_options)
 {
-	struct check_connected_options opt = CHECK_CONNECTED_INIT;
 	struct command *cmd;
 	struct iterate_data data;
-	struct async muxer;
-	int err_fd = 0;
 	int run_proc_receive = 0;
 
 	if (unpacker_error) {
@@ -1931,25 +1928,39 @@  static void execute_commands(struct command *commands,
 		return;
 	}
 
-	if (use_sideband) {
-		memset(&muxer, 0, sizeof(muxer));
-		muxer.proc = copy_to_sideband;
-		muxer.in = -1;
-		if (!start_async(&muxer))
-			err_fd = muxer.in;
-		/* ...else, continue without relaying sideband */
-	}
-
 	data.cmds = commands;
 	data.si = si;
-	opt.err_fd = err_fd;
-	opt.progress = err_fd && !quiet;
-	opt.env = tmp_objdir_env(tmp_objdir);
-	if (check_connected(iterate_receive_command_list, &data, &opt))
-		set_connectivity_errors(commands, si);
 
-	if (use_sideband)
-		finish_async(&muxer);
+	/*
+	 * If received commands only consist of deletions, then the client MUST
+	 * NOT send a packfile because there cannot be any new objects in the
+	 * first place. As a result, we do not set up a quarantine environment
+	 * because we know no new objects will be received. And that in turn
+	 * means that we can skip connectivity checks here.
+	 */
+	if (tmp_objdir) {
+		struct check_connected_options opt = CHECK_CONNECTED_INIT;
+		struct async muxer;
+		int err_fd = 0;
+
+		if (use_sideband) {
+			memset(&muxer, 0, sizeof(muxer));
+			muxer.proc = copy_to_sideband;
+			muxer.in = -1;
+			if (!start_async(&muxer))
+				err_fd = muxer.in;
+			/* ...else, continue without relaying sideband */
+		}
+
+		opt.err_fd = err_fd;
+		opt.progress = err_fd && !quiet;
+		opt.env = tmp_objdir_env(tmp_objdir);
+		if (check_connected(iterate_receive_command_list, &data, &opt))
+			set_connectivity_errors(commands, si);
+
+		if (use_sideband)
+			finish_async(&muxer);
+	}
 
 	reject_updates_to_hidden(commands);