[2/2] upload-pack: use stdio in send_ref callbacks

Message ID	20210826100648.10333-2-jacob@gitlab.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <git-owner@kernel.org> From: Jacob Vosmaer <jacob@gitlab.com> To: peff@peff.net, me@ttaylorr.com, git@vger.kernel.org, gitster@pobox.com, ps@pks.im Cc: Jacob Vosmaer <jacob@gitlab.com> Subject: [PATCH 2/2] upload-pack: use stdio in send_ref callbacks Date: Thu, 26 Aug 2021 12:06:48 +0200 Message-Id: <20210826100648.10333-2-jacob@gitlab.com> In-Reply-To: <20210826100648.10333-1-jacob@gitlab.com> References: <CADMWQoMpURczcnZne=0cr2vavoLm_VT5eEMg4FCu3VeSg_UJaQ@mail.gmail.com> <20210826100648.10333-1-jacob@gitlab.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: bulk
Series	[1/2] pkt-line: add packet_fwrite \| expand [1/2] pkt-line: add packet_fwrite [2/2] upload-pack: use stdio in send_ref callbacks

Jacob Vosmaer Aug. 26, 2021, 10:06 a.m. UTC

In both protocol v0 and v2, upload-pack writes one pktline packet per
advertised ref to stdout. That means one or two write(2) syscalls per
ref. This is problematic if these writes become network sends with
high overhead.

This commit changes both send_ref callbacks to use buffered IO using
stdio. The default buffer size is usually 4KB, but with musl libc it
is only 1KB. So for consistent results we need to set a buffer size
ourselves. During startup of upload-pack, we set the stdout buffer
size to LARGE_PACKET_MAX, i.e. just shy of 64KB.

To give an example of the impact: I set up a single-threaded loop that
calls ls-remote (with HTTP and protocol v2) on a local GitLab
instance, on a repository with 11K refs. When I switch from Git
v2.32.0 to this patch, I see a 50% reduction in CPU time for Git, and
75% for Gitaly (GitLab's Git RPC service).

So using buffered IO not only saves syscalls in upload-pack, it also
saves time in things that consume upload-pack's output.

Signed-off-by: Jacob Vosmaer <jacob@gitlab.com>
---
 builtin/upload-pack.c | 10 ++++++++++
 ls-refs.c             |  5 ++++-
 upload-pack.c         | 15 ++++++++++++---
 3 files changed, 26 insertions(+), 4 deletions(-)

Junio C Hamano Aug. 26, 2021, 4:33 p.m. UTC | #1

Jacob Vosmaer <jacob@gitlab.com> writes:

> In both protocol v0 and v2, upload-pack writes one pktline packet per
> advertised ref to stdout. That means one or two write(2) syscalls per
> ref. This is problematic if these writes become network sends with
> high overhead.
>
> This commit changes both send_ref callbacks to use buffered IO using
> stdio. The default buffer size is usually 4KB, but with musl libc it
> is only 1KB. So for consistent results we need to set a buffer size
> ourselves. During startup of upload-pack, we set the stdout buffer
> size to LARGE_PACKET_MAX, i.e. just shy of 64KB.
>
> To give an example of the impact: I set up a single-threaded loop that
> calls ls-remote (with HTTP and protocol v2) on a local GitLab
> instance, on a repository with 11K refs. When I switch from Git
> v2.32.0 to this patch, I see a 50% reduction in CPU time for Git, and
> 75% for Gitaly (GitLab's Git RPC service).
>
> So using buffered IO not only saves syscalls in upload-pack, it also
> saves time in things that consume upload-pack's output.
>
> Signed-off-by: Jacob Vosmaer <jacob@gitlab.com>
> ---

Nicely explained.

> +	/*
> +	 * Increase the stdio buffer size for stdout, for the benefit of ref
> +	 * advertisement writes. We are only allowed to call setvbuf(3) "after
> +	 * opening a stream and before any other operations have been performed
> +	 * on it", so let's call it before we have written anything to stdout.
> +	 */
> +	if (setvbuf(stdout, xmalloc(LARGE_PACKET_MAX), _IOFBF,
> +			LARGE_PACKET_MAX))
> +		die_errno("failed to grow stdout buffer");

Nice to see a comment on the tricky part.  I do not think we mind if
we rounded up the allocation size to the next power of two here, but
there probably won't be any measurable benefit for doing so.

> diff --git a/ls-refs.c b/ls-refs.c
> index 88f6c3f60d..83f2948fc3 100644
> --- a/ls-refs.c
> +++ b/ls-refs.c
> @@ -105,7 +105,7 @@ static int send_ref(const char *refname, const struct object_id *oid,
>  	}
>  
>  	strbuf_addch(&refline, '\n');
> -	packet_write(1, refline.buf, refline.len);
> +	packet_fwrite(stdout, refline.buf, refline.len);
>  
>  	strbuf_release(&refline);
>  	return 0;
> @@ -171,6 +171,9 @@ int ls_refs(struct repository *r, struct strvec *keys,
>  		strvec_push(&data.prefixes, "");
>  	for_each_fullref_in_prefixes(get_git_namespace(), data.prefixes.v,
>  				     send_ref, &data, 0);
> +	/* Call fflush because send_ref uses stdio. */
> +	if (fflush(stdout))
> +		die_errno(_("write failure on standard output"));

There is maybe_flush_or_die() helper that does a bit too much (I do
not think this code path needs to worry about GIT_FLUSH) but does
call check_pipe(errno) like packet_write_fmt() does upon seeing a
write failure.

>  	packet_flush(1);

I briefly wondered if all the operations on refline can be turned
into direct operations on stdout, but the answer obviously is no.
We still need to have a strbuf to assemble a single packet and
measure its final length before we can send it out to stdout.

> diff --git a/upload-pack.c b/upload-pack.c
> index 297b76fcb4..b592ac6cfb 100644
> --- a/upload-pack.c
> +++ b/upload-pack.c
> @@ -58,6 +58,7 @@ enum allow_uor {
>   */
>  struct upload_pack_data {
>  	struct string_list symref;				/* v0 only */
> +	struct strbuf send_ref_buf;				/* v0 only */
>  	struct object_array want_obj;
>  	struct object_array have_obj;
>  	struct oid_array haves;					/* v2 only */
> @@ -126,6 +127,7 @@ static void upload_pack_data_init(struct upload_pack_data *data)
>  	struct string_list uri_protocols = STRING_LIST_INIT_DUP;
>  	struct object_array extra_edge_obj = OBJECT_ARRAY_INIT;
>  	struct string_list allowed_filters = STRING_LIST_INIT_DUP;
> +	struct strbuf send_ref_buf = STRBUF_INIT;

IIUC, the most notable difference between this round and the
previous one is that now we are no longer buffering more than one
packet worth of data because we let the stdio to accumulate them.
I was a bit surprised that we still want to have a strbuf inside
this structure (which is there only because it wants to persist
during the whole conversation with the other side).

Ahh, sorry, scratch that.  I do remember the discussion/patch that
it was hurting to make calls to strbuf-init/strbuf-release once per
ref, and it is an easy way to have the structure here to reuse it.

OK.  This step looks quite sensible, other than the same "do we want
check_pipe() before dying upon fflush() failure?" we see in the last
hunk below.  I didn't mention this in the review of 1/2, but the new
fwrite_or_die() helper function may also have the same issue.

Thanks.

> @@ -1348,6 +1354,9 @@ void upload_pack(struct upload_pack_options *options)
>  		reset_timeout(data.timeout);
>  		head_ref_namespaced(send_ref, &data);
>  		for_each_namespaced_ref(send_ref, &data);
> +		/* Call fflush because send_ref uses stdio. */
> +		if (fflush(stdout))
> +			die_errno(_("write failure on standard output"));
>  		advertise_shallow_grafts(1);
>  		packet_flush(1);
>  	} else {

Junio C Hamano Aug. 26, 2021, 8:21 p.m. UTC | #2

Junio C Hamano <gitster@pobox.com> writes:

> IIUC, the most notable difference between this round and the
> previous one is that now we are no longer buffering more than one
> packet worth of data because we let the stdio to accumulate them.
> I was a bit surprised that we still want to have a strbuf inside
> this structure (which is there only because it wants to persist
> during the whole conversation with the other side).
>
> Ahh, sorry, scratch that.  I do remember the discussion/patch that
> it was hurting to make calls to strbuf-init/strbuf-release once per
> ref, and it is an easy way to have the structure here to reuse it.

But that means this majorly overlaps what Patrick is already doing
in his ps/ls-refs-strbuf-optim topic.  Perhaps these should be
rebased on top of that topic branch, I wonder?

Taylor Blau Aug. 26, 2021, 10:35 p.m. UTC | #3

On Thu, Aug 26, 2021 at 09:33:08AM -0700, Junio C Hamano wrote:
> > @@ -171,6 +171,9 @@ int ls_refs(struct repository *r, struct strvec *keys,
> >  		strvec_push(&data.prefixes, "");
> >  	for_each_fullref_in_prefixes(get_git_namespace(), data.prefixes.v,
> >  				     send_ref, &data, 0);
> > +	/* Call fflush because send_ref uses stdio. */
> > +	if (fflush(stdout))
> > +		die_errno(_("write failure on standard output"));
>
> There is maybe_flush_or_die() helper that does a bit too much (I do
> not think this code path needs to worry about GIT_FLUSH) but does
> call check_pipe(errno) like packet_write_fmt() does upon seeing a
> write failure.
>
> >  	packet_flush(1);

I was somewhat surprised to see the fflush call here and not in a
companion to the existing packet_flush (when working with stdio instead
of file descriptors, of course).

What Jacob wrote is not wrong, of course, but I think having
packet_fflush() or similar would be less error-prone.

> OK.  This step looks quite sensible, other than the same "do we want
> check_pipe() before dying upon fflush() failure?" we see in the last
> hunk below.  I didn't mention this in the review of 1/2, but the new
> fwrite_or_die() helper function may also have the same issue.

Agreed.

Thanks,
Taylor

Jeff King Aug. 26, 2021, 11:24 p.m. UTC | #4

On Thu, Aug 26, 2021 at 09:33:08AM -0700, Junio C Hamano wrote:

> > +	/*
> > +	 * Increase the stdio buffer size for stdout, for the benefit of ref
> > +	 * advertisement writes. We are only allowed to call setvbuf(3) "after
> > +	 * opening a stream and before any other operations have been performed
> > +	 * on it", so let's call it before we have written anything to stdout.
> > +	 */
> > +	if (setvbuf(stdout, xmalloc(LARGE_PACKET_MAX), _IOFBF,
> > +			LARGE_PACKET_MAX))
> > +		die_errno("failed to grow stdout buffer");
> 
> Nice to see a comment on the tricky part.  I do not think we mind if
> we rounded up the allocation size to the next power of two here, but
> there probably won't be any measurable benefit for doing so.

I'm a little negative on this part, actually. The commit message claims
it's for consistency across platforms. But I would argue that if your
libc buffer size is sub-optimal, then we shouldn't be sprinkling these
adjustments through the code. Either:

 - we should consider it a quality-of-implementation issue, and people
   using that libc should push back on their platform to change the
   default size; or

 - we should fix it consistently and transparently throughout Git by
   adjusting std{in,out,err} in common-main, and using fopen()/fdopen()
   wrappers to adjust to something more sensible

I also find the use of LARGE_PACKET_MAX weird here. Is that really the
optimal stdio buffer size? The whole point of this is coalescing _small_
packets into a single write() syscall. So how many small packets fit
into LARGE_PACKET_MAX is comparing apples and oranges.

> >  	packet_flush(1);
> 
> I briefly wondered if all the operations on refline can be turned
> into direct operations on stdout, but the answer obviously is no.
> We still need to have a strbuf to assemble a single packet and
> measure its final length before we can send it out to stdout.

I think we could push quite a bit more down into the pkt-line code by
making a few more operations FILE-aware. E.g., the patch below shows an
alternate version of patch 2, where we have more helpers and the changes
to the caller are much smaller.

The proliferation of packet_write_fmt() vs packet_fwrite_fmt() is a
little ugly, but:

  - I didn't bother to factor out commonality for this proof-of-concept.
    I think we could be sharing a bit more code in pkt-line.c, as noted.

  - This would all be much nicer if we refactored upload-pack to use
    the packet_writer interface. And then the caller sets up the "FILE
    vs descriptor" choice once when creating the writer, and all of the
    code it calls doesn't have to care either way.

diff --git a/ls-refs.c b/ls-refs.c
index 88f6c3f60d..e6a2dbd962 100644
--- a/ls-refs.c
+++ b/ls-refs.c
@@ -105,7 +105,7 @@ static int send_ref(const char *refname, const struct object_id *oid,
 	}
 
 	strbuf_addch(&refline, '\n');
-	packet_write(1, refline.buf, refline.len);
+	packet_fwrite(stdout, refline.buf, refline.len);
 
 	strbuf_release(&refline);
 	return 0;
@@ -171,7 +171,7 @@ int ls_refs(struct repository *r, struct strvec *keys,
 		strvec_push(&data.prefixes, "");
 	for_each_fullref_in_prefixes(get_git_namespace(), data.prefixes.v,
 				     send_ref, &data, 0);
-	packet_flush(1);
+	packet_fflush(stdout);
 	strvec_clear(&data.prefixes);
 	return 0;
 }
diff --git a/pkt-line.c b/pkt-line.c
index 244b326708..12cf09804b 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -92,6 +92,14 @@ void packet_flush(int fd)
 		die_errno(_("unable to write flush packet"));
 }
 
+void packet_fflush(FILE *fh)
+{
+	packet_trace("0000", 4, 1);
+	fwrite_or_die(fh, "0000", 4);
+	if (fflush(fh))
+		die_errno(_("unable to flush packet stream"));
+}
+
 void packet_delim(int fd)
 {
 	packet_trace("0001", 4, 1);
@@ -194,6 +202,21 @@ int packet_write_fmt_gently(int fd, const char *fmt, ...)
 	return status;
 }
 
+/* this could be refactored to share code with packet_write_fmt_1 */
+void packet_fwrite_fmt(FILE *fh, const char *fmt, ...)
+{
+	static struct strbuf buf = STRBUF_INIT;
+	va_list args;
+
+	strbuf_reset(&buf);
+
+	va_start(args, fmt);
+	format_packet(&buf, "", fmt, args);
+	va_end(args);
+
+	fwrite_or_die(fh, buf.buf, buf.len);
+}
+
 static int do_packet_write(const int fd_out, const char *buf, size_t size,
 			   struct strbuf *err)
 {
diff --git a/pkt-line.h b/pkt-line.h
index c9cb5e1719..7899f8d8e1 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -21,9 +21,11 @@
  * side can't, we stay with pure read/write interfaces.
  */
 void packet_flush(int fd);
+void packet_fflush(FILE *fh);
 void packet_delim(int fd);
 void packet_response_end(int fd);
 void packet_write_fmt(int fd, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
+void packet_fwrite_fmt(FILE *fh, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
 void packet_buf_flush(struct strbuf *buf);
 void packet_buf_delim(struct strbuf *buf);
 void set_packet_header(char *buf, int size);
diff --git a/upload-pack.c b/upload-pack.c
index 297b76fcb4..7c98c04d73 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -1207,7 +1207,7 @@ static int send_ref(const char *refname, const struct object_id *oid,
 
 		format_symref_info(&symref_info, &data->symref);
 		format_session_id(&session_id, data);
-		packet_write_fmt(1, "%s %s%c%s%s%s%s%s%s%s object-format=%s agent=%s\n",
+		packet_fwrite_fmt(stdout, "%s %s%c%s%s%s%s%s%s%s object-format=%s agent=%s\n",
 			     oid_to_hex(oid), refname_nons,
 			     0, capabilities,
 			     (data->allow_uor & ALLOW_TIP_SHA1) ?
@@ -1223,11 +1223,11 @@ static int send_ref(const char *refname, const struct object_id *oid,
 		strbuf_release(&symref_info);
 		strbuf_release(&session_id);
 	} else {
-		packet_write_fmt(1, "%s %s\n", oid_to_hex(oid), refname_nons);
+		packet_fwrite_fmt(stdout, "%s %s\n", oid_to_hex(oid), refname_nons);
 	}
 	capabilities = NULL;
 	if (!peel_iterated_oid(oid, &peeled))
-		packet_write_fmt(1, "%s %s^{}\n", oid_to_hex(&peeled), refname_nons);
+		packet_fwrite_fmt(stdout, "%s %s^{}\n", oid_to_hex(&peeled), refname_nons);
 	return 0;
 }
 
@@ -1348,6 +1348,14 @@ void upload_pack(struct upload_pack_options *options)
 		reset_timeout(data.timeout);
 		head_ref_namespaced(send_ref, &data);
 		for_each_namespaced_ref(send_ref, &data);
+		/*
+		 * sadly we still have to fflush manually here, because
+		 * advertise_shallow_grafts() uses the descriptor directly
+		 * before we could call packet_fflush(). But isn't that a good
+		 * reason to convert that function to stdio, too?
+		 */
+		if (fflush(stdout))
+			die_errno(_("write failure on standard output"));
 		advertise_shallow_grafts(1);
 		packet_flush(1);
 	} else {

> > +	struct strbuf send_ref_buf = STRBUF_INIT;
> 
> IIUC, the most notable difference between this round and the
> previous one is that now we are no longer buffering more than one
> packet worth of data because we let the stdio to accumulate them.
> I was a bit surprised that we still want to have a strbuf inside
> this structure (which is there only because it wants to persist
> during the whole conversation with the other side).
> 
> Ahh, sorry, scratch that.  I do remember the discussion/patch that
> it was hurting to make calls to strbuf-init/strbuf-release once per
> ref, and it is an easy way to have the structure here to reuse it.

You'll note in my patch above that we format into a static strbuf to
avoid this cost. Perhaps a little ugly, but that's exactly what the
existing packet_write_fmt() code is doing!

If we want to get rid of that, I think there are two options:

  - for stdio handles, we don't ever need the fully-formatted packet. We
    can get the formatted size with a too-small snprintf(), and then
    directly fprintf() out the result (this doesn't work with
    descriptors, because we have to format the result to feed it to
    write()).

  - if this were all using packet_writer, it could easily learn to
    heap-allocate a single buffer for repeated use. That's roughly the
    same as what's happening in this series, but it would all be taken
    care of by the pkt-line code.

-Peff

Jeff King Aug. 26, 2021, 11:32 p.m. UTC | #5

On Thu, Aug 26, 2021 at 12:06:48PM +0200, Jacob Vosmaer wrote:

> @@ -126,6 +127,7 @@ static void upload_pack_data_init(struct upload_pack_data *data)
>  	struct string_list uri_protocols = STRING_LIST_INIT_DUP;
>  	struct object_array extra_edge_obj = OBJECT_ARRAY_INIT;
>  	struct string_list allowed_filters = STRING_LIST_INIT_DUP;
> +	struct strbuf send_ref_buf = STRBUF_INIT;
>  
>  	memset(data, 0, sizeof(*data));
>  	data->symref = symref;
> @@ -141,6 +143,7 @@ static void upload_pack_data_init(struct upload_pack_data *data)
>  	data->allow_filter_fallback = 1;
>  	data->tree_filter_max_depth = ULONG_MAX;
>  	packet_writer_init(&data->writer, 1);
> +	data->send_ref_buf = send_ref_buf;

This does a struct copy of the strbuf, which is usually a bad thing
(both copies think they own the pointer). It works here because the
original immediately goes out of scope. The usual thing would be to do
this instead:

  strbuf_init(&data->send_ref_buf, 0);

but I notice this whole function is somewhat odd that way (lots of
static initializers followed by struct assignment, rather than using
initialization functions).

I'm not sure if it's worth changing or not. Again, I don't think it's
doing the wrong thing, but just sort of unusual for our code base. A few
of the data structures in use here don't have _init() functions
(object_array and oid_array), but that probably means we ought to add
them.

-Peff

Junio C Hamano Aug. 27, 2021, 4:15 p.m. UTC | #6

Jeff King <peff@peff.net> writes:

> On Thu, Aug 26, 2021 at 09:33:08AM -0700, Junio C Hamano wrote:
>
>> > +	/*
>> > +	 * Increase the stdio buffer size for stdout, for the benefit of ref
>> > +	 * advertisement writes. We are only allowed to call setvbuf(3) "after
>> > +	 * opening a stream and before any other operations have been performed
>> > +	 * on it", so let's call it before we have written anything to stdout.
>> > +	 */
>> > +	if (setvbuf(stdout, xmalloc(LARGE_PACKET_MAX), _IOFBF,
>> > +			LARGE_PACKET_MAX))
>> > +		die_errno("failed to grow stdout buffer");
>> 
>> Nice to see a comment on the tricky part.  I do not think we mind if
>> we rounded up the allocation size to the next power of two here, but
>> there probably won't be any measurable benefit for doing so.
>
> I'm a little negative on this part, actually. The commit message claims
> it's for consistency across platforms. But I would argue that if your
> libc buffer size is sub-optimal, then we shouldn't be sprinkling these
> adjustments through the code. Either:
>
>  - we should consider it a quality-of-implementation issue, and people
>    using that libc should push back on their platform to change the
>    default size; or
>
>  - we should fix it consistently and transparently throughout Git by
>    adjusting std{in,out,err} in common-main, and using fopen()/fdopen()
>    wrappers to adjust to something more sensible

I agree that it would be ideal if we didn't do setvbuf() at all, the
second best would be to do so at a central place.  My "Nice" applied
only to the comment ;-)

> I also find the use of LARGE_PACKET_MAX weird here. Is that really the
> optimal stdio buffer size? The whole point of this is coalescing _small_
> packets into a single write() syscall. So how many small packets fit
> into LARGE_PACKET_MAX is comparing apples and oranges.

The sizing is all my fault.  The original used 32k threashold and
implemented manual buffering by flushing whenever the accumulated
data exceeded the threashold, as opposed to "if we buffer this new
piece, it would exceed, so let's flush first what we got so far,
which is smaller than the threashold", which I found indefensible in
two ways.  The "flush _after_ we go over it" semantics looked iffy,
and 32k was totally out of thin air.  As LARGE_PACKET_MAX is the
hard upper limit of each packet that has been with us forever, it
was more defensible than 32k ;-)

But if we are using stdio, I agree that it is much better not to
worry about sizing at all by not doing setvbuf() and leaving it to
libc implementation.  They ought to know what works on their
platform the best.

Thanks.

[2/2] upload-pack: use stdio in send_ref callbacks

Commit Message

Comments

Patch