From patchwork Thu Aug 26 10:06:47 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jacob Vosmaer X-Patchwork-Id: 12459379 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-24.5 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,MENTIONS_GIT_HOSTING, SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 63E57C432BE for ; Thu, 26 Aug 2021 10:09:45 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4A461610E9 for ; Thu, 26 Aug 2021 10:09:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241239AbhHZKKb (ORCPT ); Thu, 26 Aug 2021 06:10:31 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55218 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233880AbhHZKKa (ORCPT ); Thu, 26 Aug 2021 06:10:30 -0400 Received: from mail-ej1-x631.google.com (mail-ej1-x631.google.com [IPv6:2a00:1450:4864:20::631]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 12D3CC061757 for ; Thu, 26 Aug 2021 03:09:43 -0700 (PDT) Received: by mail-ej1-x631.google.com with SMTP id u3so4976375ejz.1 for ; Thu, 26 Aug 2021 03:09:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gitlab.com; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=a4w3yeW+/95eIzmS1csDT83uvsjpgObceYkw3P6sDfM=; b=B+JU664QXA3dZC8oidcPJw8Ru9Vrwc1qtXSIVFxbxtVuWt6J5FjvSauEvU4JzcsA/h JUKSY9HJNq/8Sms0F23o/ha2GFcS72A3Ks/wZ+G+VyYLQ2n1Uc7Vizr7UTKOIVspJsFy LXioivfvy3HG75I1wQhDh5hDfjDOeMPHw41K0= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=a4w3yeW+/95eIzmS1csDT83uvsjpgObceYkw3P6sDfM=; b=K/EpuTdk6TbBLoFAIzjL+7IZbTScMJjrq5vboCGO3Q+F9PgaiBuK7+/5y0W6DURipi +ZILRvk3DTJJoYDV9jH4FTA01ySHMAmfOLQ4bU8BBQCfDJvcA9/GldrCKVIwY7Mki/gF WhC24fWo2vYmkQJsnPREm+uOdvJNnXUUJtIpywOXDovD0yUr1bD5UESYjrVv6YWPA6G8 otj5pJ54z10KRRsB+pwSeV3Cn/oBEbzr7wBqjQDA+J6SZ+FTrcs8GMszVtDAtUTH+20a /R5QVyv7veYAv08y/10YEe421VyoMI1ISy8H9NIj48GrcsW6q5Vvb5AZHnypv2ufapDV NGng== X-Gm-Message-State: AOAM531iuq600YcvMlD4vD/4O8DTKkI/CGrAJJF8inuxkvBSd/VJlKnx 4rOKnt2sxYUXMMfEk/jfMmBKN/81IPFwlg== X-Google-Smtp-Source: ABdhPJxRZ2TXtqSn1EiQlC+1cjS2nhhpeWC8YmyjKZb43p2idGN3zersp5fcSUuGhNlmWF8TQUeH+Q== X-Received: by 2002:a17:906:4cc1:: with SMTP id q1mr3539186ejt.415.1629972581656; Thu, 26 Aug 2021 03:09:41 -0700 (PDT) Received: from localhost.localdomain (e93008.upc-e.chello.nl. [213.93.93.8]) by smtp.gmail.com with ESMTPSA id bt24sm1132649ejb.77.2021.08.26.03.09.40 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 26 Aug 2021 03:09:41 -0700 (PDT) From: Jacob Vosmaer To: peff@peff.net, me@ttaylorr.com, git@vger.kernel.org, gitster@pobox.com, ps@pks.im Cc: Jacob Vosmaer Subject: [PATCH 1/2] pkt-line: add packet_fwrite Date: Thu, 26 Aug 2021 12:06:47 +0200 Message-Id: <20210826100648.10333-1-jacob@gitlab.com> X-Mailer: git-send-email 2.32.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org This adds a new function packet_fwrite which works like packet_write, except it writes to a stdio stream. Helped-by: Patrick Steinhardt Signed-off-by: Jacob Vosmaer --- cache.h | 1 + pkt-line.c | 16 ++++++++++++++++ pkt-line.h | 1 + write-or-die.c | 6 ++++++ 4 files changed, 24 insertions(+) diff --git a/cache.h b/cache.h index bd4869beee..fbefe0927b 100644 --- a/cache.h +++ b/cache.h @@ -1736,6 +1736,7 @@ extern const char *git_mailmap_blob; void maybe_flush_or_die(FILE *, const char *); __attribute__((format (printf, 2, 3))) void fprintf_or_die(FILE *, const char *fmt, ...); +void fwrite_or_die(FILE *f, const void *buf, size_t count); #define COPY_READ_ERROR (-2) #define COPY_WRITE_ERROR (-3) diff --git a/pkt-line.c b/pkt-line.c index 9f63eae2e6..244b326708 100644 --- a/pkt-line.c +++ b/pkt-line.c @@ -243,6 +243,22 @@ void packet_write(int fd_out, const char *buf, size_t size) die("%s", err.buf); } +void packet_fwrite(FILE *f, const char *buf, size_t size) +{ + size_t packet_size; + char header[4]; + + if (size > LARGE_PACKET_DATA_MAX) + die(_("packet write failed - data exceeds max packet size")); + + packet_trace(buf, size, 1); + packet_size = size + 4; + + set_packet_header(header, packet_size); + fwrite_or_die(f, header, 4); + fwrite_or_die(f, buf, size); +} + void packet_buf_write(struct strbuf *buf, const char *fmt, ...) { va_list args; diff --git a/pkt-line.h b/pkt-line.h index 5af5f45687..c9cb5e1719 100644 --- a/pkt-line.h +++ b/pkt-line.h @@ -28,6 +28,7 @@ void packet_buf_flush(struct strbuf *buf); void packet_buf_delim(struct strbuf *buf); void set_packet_header(char *buf, int size); void packet_write(int fd_out, const char *buf, size_t size); +void packet_fwrite(FILE *f, const char *buf, size_t size); void packet_buf_write(struct strbuf *buf, const char *fmt, ...) __attribute__((format (printf, 2, 3))); void packet_buf_write_len(struct strbuf *buf, const char *data, size_t len); int packet_flush_gently(int fd); diff --git a/write-or-die.c b/write-or-die.c index d33e68f6ab..d82ef54f90 100644 --- a/write-or-die.c +++ b/write-or-die.c @@ -70,3 +70,9 @@ void write_or_die(int fd, const void *buf, size_t count) die_errno("write error"); } } + +void fwrite_or_die(FILE *f, const void *buf, size_t count) +{ + if (fwrite(buf, count, 1, f) != 1) + die_errno("write error"); +} From patchwork Thu Aug 26 10:06:48 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jacob Vosmaer X-Patchwork-Id: 12459381 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-24.5 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI,MENTIONS_GIT_HOSTING, SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6C2F3C432BE for ; Thu, 26 Aug 2021 10:09:57 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 50B6B610E9 for ; Thu, 26 Aug 2021 10:09:57 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241270AbhHZKKn (ORCPT ); Thu, 26 Aug 2021 06:10:43 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55280 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233880AbhHZKKm (ORCPT ); Thu, 26 Aug 2021 06:10:42 -0400 Received: from mail-ej1-x633.google.com (mail-ej1-x633.google.com [IPv6:2a00:1450:4864:20::633]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AEAF2C061757 for ; Thu, 26 Aug 2021 03:09:55 -0700 (PDT) Received: by mail-ej1-x633.google.com with SMTP id bt14so5089793ejb.3 for ; Thu, 26 Aug 2021 03:09:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gitlab.com; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=RgLWzHLwXQQC2y2bjePQzQv7bB6tdxEyxfKqHQCTsqw=; b=fBB6Ox3+sSdv+JL1V5LgtEUvDDaCX4nVaoNq18sEEGHBtYQ+bmOcy4YUCaTeMgeuE3 nqNI+xeqIjwOKe0O8hpuMa/F1vn9CxWE8NZeLrqK/donO1FP5wX8YkJHv6wmsoI5gdJV qlVmP8AKXNJyixvjimHqRE1C8KPwfAEO3SLIQ= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=RgLWzHLwXQQC2y2bjePQzQv7bB6tdxEyxfKqHQCTsqw=; b=gFa1ne23hsU9qZcQnNl1zTe3wWSzCzfHHGkzITPiGj+w7T2wJ/C+2CjaTM7VWKrIHp ppoAjtJDOEnws1xZ2r/Ea+FOo6al7+C67C+N/PABm9uGr67zSBlkq0TL8yQRXckXPOzJ ijAQfq9UyjBt4pB5FcZzKELRODajk1zEzyekP8bXUemGBfqtEiqteKMb2OSxlTq5X+8s EFNdyrSsW/L7ud1K7h48orYPHkUNoqGVUTOQ3F6x+HifkVxT13pQovqQNQOY3/B0PDNp EaWXuZG3/NWInuOmzevAdc2zUUFTF1+slq2y/7b+CtmhOM9aTJKhWOXGOPiBM+/fGXvl Pj4w== X-Gm-Message-State: AOAM530Jxoe/YBkBQ4cJCXSn0H6xBLTOr9Euo6QiSG+Pv+F27pO2bpWA 6Bv92EpLGvclKJPO2bQRC8qSPIrqdixuBg== X-Google-Smtp-Source: ABdhPJwxtsATqhKZktqXrOR0GQ2W/tCyPrCdJfG0ksfqaJvUTW5a6Z7J1hSqeaKISzmSGihcYh8O0g== X-Received: by 2002:a17:906:3157:: with SMTP id e23mr3476969eje.29.1629972594257; Thu, 26 Aug 2021 03:09:54 -0700 (PDT) Received: from localhost.localdomain (e93008.upc-e.chello.nl. [213.93.93.8]) by smtp.gmail.com with ESMTPSA id bt24sm1132649ejb.77.2021.08.26.03.09.53 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 26 Aug 2021 03:09:53 -0700 (PDT) From: Jacob Vosmaer To: peff@peff.net, me@ttaylorr.com, git@vger.kernel.org, gitster@pobox.com, ps@pks.im Cc: Jacob Vosmaer Subject: [PATCH 2/2] upload-pack: use stdio in send_ref callbacks Date: Thu, 26 Aug 2021 12:06:48 +0200 Message-Id: <20210826100648.10333-2-jacob@gitlab.com> X-Mailer: git-send-email 2.32.0 In-Reply-To: <20210826100648.10333-1-jacob@gitlab.com> References: <20210826100648.10333-1-jacob@gitlab.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org In both protocol v0 and v2, upload-pack writes one pktline packet per advertised ref to stdout. That means one or two write(2) syscalls per ref. This is problematic if these writes become network sends with high overhead. This commit changes both send_ref callbacks to use buffered IO using stdio. The default buffer size is usually 4KB, but with musl libc it is only 1KB. So for consistent results we need to set a buffer size ourselves. During startup of upload-pack, we set the stdout buffer size to LARGE_PACKET_MAX, i.e. just shy of 64KB. To give an example of the impact: I set up a single-threaded loop that calls ls-remote (with HTTP and protocol v2) on a local GitLab instance, on a repository with 11K refs. When I switch from Git v2.32.0 to this patch, I see a 50% reduction in CPU time for Git, and 75% for Gitaly (GitLab's Git RPC service). So using buffered IO not only saves syscalls in upload-pack, it also saves time in things that consume upload-pack's output. Signed-off-by: Jacob Vosmaer --- builtin/upload-pack.c | 10 ++++++++++ ls-refs.c | 5 ++++- upload-pack.c | 15 ++++++++++++--- 3 files changed, 26 insertions(+), 4 deletions(-) diff --git a/builtin/upload-pack.c b/builtin/upload-pack.c index 6da8fa2607..8033f84124 100644 --- a/builtin/upload-pack.c +++ b/builtin/upload-pack.c @@ -48,6 +48,16 @@ int cmd_upload_pack(int argc, const char **argv, const char *prefix) if (!enter_repo(dir, strict)) die("'%s' does not appear to be a git repository", dir); + /* + * Increase the stdio buffer size for stdout, for the benefit of ref + * advertisement writes. We are only allowed to call setvbuf(3) "after + * opening a stream and before any other operations have been performed + * on it", so let's call it before we have written anything to stdout. + */ + if (setvbuf(stdout, xmalloc(LARGE_PACKET_MAX), _IOFBF, + LARGE_PACKET_MAX)) + die_errno("failed to grow stdout buffer"); + switch (determine_protocol_version_server()) { case protocol_v2: serve_opts.advertise_capabilities = opts.advertise_refs; diff --git a/ls-refs.c b/ls-refs.c index 88f6c3f60d..83f2948fc3 100644 --- a/ls-refs.c +++ b/ls-refs.c @@ -105,7 +105,7 @@ static int send_ref(const char *refname, const struct object_id *oid, } strbuf_addch(&refline, '\n'); - packet_write(1, refline.buf, refline.len); + packet_fwrite(stdout, refline.buf, refline.len); strbuf_release(&refline); return 0; @@ -171,6 +171,9 @@ int ls_refs(struct repository *r, struct strvec *keys, strvec_push(&data.prefixes, ""); for_each_fullref_in_prefixes(get_git_namespace(), data.prefixes.v, send_ref, &data, 0); + /* Call fflush because send_ref uses stdio. */ + if (fflush(stdout)) + die_errno(_("write failure on standard output")); packet_flush(1); strvec_clear(&data.prefixes); return 0; diff --git a/upload-pack.c b/upload-pack.c index 297b76fcb4..b592ac6cfb 100644 --- a/upload-pack.c +++ b/upload-pack.c @@ -58,6 +58,7 @@ enum allow_uor { */ struct upload_pack_data { struct string_list symref; /* v0 only */ + struct strbuf send_ref_buf; /* v0 only */ struct object_array want_obj; struct object_array have_obj; struct oid_array haves; /* v2 only */ @@ -126,6 +127,7 @@ static void upload_pack_data_init(struct upload_pack_data *data) struct string_list uri_protocols = STRING_LIST_INIT_DUP; struct object_array extra_edge_obj = OBJECT_ARRAY_INIT; struct string_list allowed_filters = STRING_LIST_INIT_DUP; + struct strbuf send_ref_buf = STRBUF_INIT; memset(data, 0, sizeof(*data)); data->symref = symref; @@ -141,6 +143,7 @@ static void upload_pack_data_init(struct upload_pack_data *data) data->allow_filter_fallback = 1; data->tree_filter_max_depth = ULONG_MAX; packet_writer_init(&data->writer, 1); + data->send_ref_buf = send_ref_buf; data->keepalive = 5; data->advertise_sid = 0; @@ -158,6 +161,7 @@ static void upload_pack_data_clear(struct upload_pack_data *data) object_array_clear(&data->extra_edge_obj); list_objects_filter_release(&data->filter_options); string_list_clear(&data->allowed_filters, 0); + strbuf_release(&data->send_ref_buf); free((char *)data->pack_objects_hook); } @@ -1201,13 +1205,14 @@ static int send_ref(const char *refname, const struct object_id *oid, if (mark_our_ref(refname_nons, refname, oid)) return 0; + strbuf_reset(&data->send_ref_buf); if (capabilities) { struct strbuf symref_info = STRBUF_INIT; struct strbuf session_id = STRBUF_INIT; format_symref_info(&symref_info, &data->symref); format_session_id(&session_id, data); - packet_write_fmt(1, "%s %s%c%s%s%s%s%s%s%s object-format=%s agent=%s\n", + packet_buf_write(&data->send_ref_buf, "%s %s%c%s%s%s%s%s%s%s object-format=%s agent=%s\n", oid_to_hex(oid), refname_nons, 0, capabilities, (data->allow_uor & ALLOW_TIP_SHA1) ? @@ -1223,11 +1228,12 @@ static int send_ref(const char *refname, const struct object_id *oid, strbuf_release(&symref_info); strbuf_release(&session_id); } else { - packet_write_fmt(1, "%s %s\n", oid_to_hex(oid), refname_nons); + packet_buf_write(&data->send_ref_buf, "%s %s\n", oid_to_hex(oid), refname_nons); } capabilities = NULL; if (!peel_iterated_oid(oid, &peeled)) - packet_write_fmt(1, "%s %s^{}\n", oid_to_hex(&peeled), refname_nons); + packet_buf_write(&data->send_ref_buf, "%s %s^{}\n", oid_to_hex(&peeled), refname_nons); + fwrite_or_die(stdout, data->send_ref_buf.buf, data->send_ref_buf.len); return 0; } @@ -1348,6 +1354,9 @@ void upload_pack(struct upload_pack_options *options) reset_timeout(data.timeout); head_ref_namespaced(send_ref, &data); for_each_namespaced_ref(send_ref, &data); + /* Call fflush because send_ref uses stdio. */ + if (fflush(stdout)) + die_errno(_("write failure on standard output")); advertise_shallow_grafts(1); packet_flush(1); } else {