From patchwork Mon Apr 8 12:23:52 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13621066 Received: from fhigh6-smtp.messagingengine.com (fhigh6-smtp.messagingengine.com [103.168.172.157]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E47596A8AD for ; Mon, 8 Apr 2024 12:23:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=103.168.172.157 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712579038; cv=none; b=OWeEuZJklYDITC3F5OmjLsD70B+DGX+JYSXcutr4WDhJlEO1WkUExK6ebLKaIhOtRjUcSDxghEWiQqR8v249JoSN7Y4vbXHJO7kzvNy6+AoOOeJ9HZwGX6gwqOxQETNIotFvGlY9jFWYLf10p9hEPV4axagesJAUXQT/u+s009M= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712579038; c=relaxed/simple; bh=xALB8ROpc8WleFGYIqMZX0NLRNkm7IhngodHrZd4P5s=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=Yyfaim10dhgXy9ePNeBfYNwmUV+G2/Ala3Wmq5vbj6yrghjs7FpJDMQFJp96H6cJl0TrU+E5AuOA164M7Hg/W2UWVupRMVyyjBNdQ0q0cNR9kZMmI8ZgooPR9iVXatH9Ji59WLrVSxIj2v2gL1N2gc0i7r+Xpn6bbqXK41HbOX8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im; spf=pass smtp.mailfrom=pks.im; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b=D5mNi6GN; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=ZFJlHy48; arc=none smtp.client-ip=103.168.172.157 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="D5mNi6GN"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="ZFJlHy48" Received: from compute3.internal (compute3.nyi.internal [10.202.2.43]) by mailfhigh.nyi.internal (Postfix) with ESMTP id EF8111140112; Mon, 8 Apr 2024 08:23:55 -0400 (EDT) Received: from mailfrontend2 ([10.202.2.163]) by compute3.internal (MEProxy); Mon, 08 Apr 2024 08:23:55 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc:cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm2; t=1712579035; x=1712665435; bh=YrnkX2gvdT rBRJ6mq3fMg1CMCFkS5ZChUk0SDm2zSAU=; b=D5mNi6GNv2LIGfcSiMsOLOMdWU z3Ckb0QTqXKnVqwK/TvRnf7vknf4Ancs8TIX9wzNezrjTl5P9yM389jAj/rP8Ail esXh7Z4PUroJRElpSjZT1GsoYFXFAxRaDPAJJ/Kv+fWA/1bjBEPcClGQPSH8OJC+ sj7bXJVGfupnQ3aMTt6eSz1k10R6NiCR+vHvWCE/APbadk8l51Y6ReqA1b5jkmYG ElurYxGiW7ymytqgnpshnceAF07kGc4UNfvnIyOdwptr6N7maFYpadqgDrxZYdiy uzZZzFR5z+AyLCT49n2pI/klgYN/psmjqyinJ2L7FJkkKQq5DFcpU34C+WFg== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; t=1712579035; x=1712665435; bh=YrnkX2gvdTrBRJ6mq3fMg1CMCFkS 5ZChUk0SDm2zSAU=; b=ZFJlHy48EZNoHdJuigJGLpxwvaeGbPoHxIS70SPpy6ds jxS918FBawdB7hHXwK/UINPwTcv5BKoid9Ez3fv3oX1JUxkF2HzUkBvC4BQbqLoh OvpyEnZyumnmYptispe5SQqZsDr2aExNcTbrJ+QMNPq5lAVQ7cyZvdnEkHTXZzIl DYC70pkjiV5vgwO1DaNkgctaoRMFLXOBwu2t1vsdkQGS4eFEUhR+yN5Tx5sAx0iM zQMlREGiQeG0ojx1yU/dDy76BK1EA4ww4XXPybUqWQ10LTo8/UuAgAgSY3wNXysX 7M10cEb6zA4Mg+w9pctyaQNUETaGzDXpBVOQU59ftw== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvledrudegiedgheduucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucenucfjughrpeffhffvvefukfhfgggtuggjsehgtd erredttddvnecuhfhrohhmpefrrghtrhhitghkucfuthgvihhnhhgrrhguthcuoehpshes phhkshdrihhmqeenucggtffrrghtthgvrhhnpeeukedtvedtffevleejtefgheehieegke eluddvfeefgeehgfeltddtheejleffteenucevlhhushhtvghrufhiiigvpedtnecurfgr rhgrmhepmhgrihhlfhhrohhmpehpshesphhkshdrihhm X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Mon, 8 Apr 2024 08:23:54 -0400 (EDT) Received: by localhost (OpenSMTPD) with ESMTPSA id 120ce5c1 (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Mon, 8 Apr 2024 12:23:48 +0000 (UTC) Date: Mon, 8 Apr 2024 14:23:52 +0200 From: Patrick Steinhardt To: git@vger.kernel.org Cc: Junio C Hamano , Han-Wen Nienhuys Subject: [PATCH v3 01/11] refs/reftable: fix D/F conflict error message on ref copy Message-ID: References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: The `write_copy_table()` function is shared between the reftable implementations for renaming and copying refs. The only difference between those two cases is that the rename will also delete the old reference, whereas copying won't. This has resulted in a bug though where we don't properly verify refname availability. When calling `refs_verify_refname_available()`, we always add the old ref name to the list of refs to be skipped when computing availability, which indicates that the name would be available even if it already exists at the current point in time. This is only the right thing to do for renames though, not for copies. The consequence of this bug is quite harmless because the reftable backend has its own checks for D/F conflicts further down in the call stack, and thus we refuse the update regardless of the bug. But all the user gets in this case is an uninformative message that copying the ref has failed, without any further details. Fix the bug and only add the old name to the skip-list in case we rename the ref. Consequently, this error case will now be handled by `refs_verify_refname_available()`, which knows to provide a proper error message. Signed-off-by: Patrick Steinhardt --- refs/reftable-backend.c | 3 ++- t/t0610-reftable-basics.sh | 33 +++++++++++++++++++++++++++++++++ 2 files changed, 35 insertions(+), 1 deletion(-) diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c index e206d5a073..0358da14db 100644 --- a/refs/reftable-backend.c +++ b/refs/reftable-backend.c @@ -1351,7 +1351,8 @@ static int write_copy_table(struct reftable_writer *writer, void *cb_data) /* * Verify that the new refname is available. */ - string_list_insert(&skip, arg->oldname); + if (arg->delete_old) + string_list_insert(&skip, arg->oldname); ret = refs_verify_refname_available(&arg->refs->base, arg->newname, NULL, &skip, &errbuf); if (ret < 0) { diff --git a/t/t0610-reftable-basics.sh b/t/t0610-reftable-basics.sh index 686781192e..055231a707 100755 --- a/t/t0610-reftable-basics.sh +++ b/t/t0610-reftable-basics.sh @@ -730,6 +730,39 @@ test_expect_success 'reflog: updates via HEAD update HEAD reflog' ' ) ' +test_expect_success 'branch: copying branch with D/F conflict' ' + test_when_finished "rm -rf repo" && + git init repo && + ( + cd repo && + test_commit A && + git branch branch && + cat >expect <<-EOF && + error: ${SQ}refs/heads/branch${SQ} exists; cannot create ${SQ}refs/heads/branch/moved${SQ} + fatal: branch copy failed + EOF + test_must_fail git branch -c branch branch/moved 2>err && + test_cmp expect err + ) +' + +test_expect_success 'branch: moving branch with D/F conflict' ' + test_when_finished "rm -rf repo" && + git init repo && + ( + cd repo && + test_commit A && + git branch branch && + git branch conflict && + cat >expect <<-EOF && + error: ${SQ}refs/heads/conflict${SQ} exists; cannot create ${SQ}refs/heads/conflict/moved${SQ} + fatal: branch rename failed + EOF + test_must_fail git branch -m branch conflict/moved 2>err && + test_cmp expect err + ) +' + test_expect_success 'worktree: adding worktree creates separate stack' ' test_when_finished "rm -rf repo worktree" && git init repo && From patchwork Mon Apr 8 12:23:56 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13621067 Received: from fout3-smtp.messagingengine.com (fout3-smtp.messagingengine.com [103.168.172.146]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 021EB6A8AD for ; Mon, 8 Apr 2024 12:24:00 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=103.168.172.146 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712579042; cv=none; b=ZN7TSPfxdGr9QGOZxRUuUNt0DFARe5AIYekgp26bw4O/ERskNywUQWHbhww6VhoFrkL/e6g6JUnvwNYldXI5rtAfb+aA4QniNAJJ4OA5nc3jypt22ynwqTUFTpcPweJ7DuAlzrN81/UO0x4mwEuQan7dTfSAAvXXw+cTPtI/3Ic= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712579042; c=relaxed/simple; bh=8UEk3P16u8wm7vxHESNtofyoU/W56N/LhrF0GXD0cTE=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=KixxgaL1khwpTHivzcsNQxZwbxDzsV63KIHDwIt52ozRIzp1Fgme0Fyvx1Rpk7kge6w+n4rXYOwKKXFHPhnQUKjVV2z9v1Beq+oDZL8vo7VkDR88OcKKN4hGyOdB7rVGum8+3DgsDkE90yo7xtKrdQ5WAWX9GzYQbb5+wnJHylU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im; spf=pass smtp.mailfrom=pks.im; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b=p1mOzCqs; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=b711VvMp; arc=none smtp.client-ip=103.168.172.146 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="p1mOzCqs"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="b711VvMp" Received: from compute3.internal (compute3.nyi.internal [10.202.2.43]) by mailfout.nyi.internal (Postfix) with ESMTP id 173A013800D5; Mon, 8 Apr 2024 08:24:00 -0400 (EDT) Received: from mailfrontend2 ([10.202.2.163]) by compute3.internal (MEProxy); Mon, 08 Apr 2024 08:24:00 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc:cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm2; t=1712579040; x=1712665440; bh=P6xXQdSZK2 A5GAunZCCBmZl6zF+TSASzbY/6+1qtQXo=; b=p1mOzCqsdc/A34JSGdQYn1NtZO eR5U7qaCZCLTHcs1NPRPftRlyl0KMw0c661J6Y7SE2khi1MOrsVesHZJSldHzLqB c5JS9eWjLTn2YkII0jEJuh4vzu+yh/Zf71/6nib/phDVitbS/v5OTYF81fVTIIrY fclBwbzce/kH7Rk7nwzzP4ZCUIoct6EY790rsE4QJVJQandAwjMTT5TGBLMyJTE1 k5R3S5NB5ues2RMwRGEL3k8pvnFvrU7Hw3jET+5zfIXImKtF4dVVWOB5LcSVHUnV C1Lhbrcy2+jJk7K975ar29VtGqyfGrMLwMJWyp2z6PpCwYTVDgMdTTBe5IAQ== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; t=1712579040; x=1712665440; bh=P6xXQdSZK2A5GAunZCCBmZl6zF+T SASzbY/6+1qtQXo=; b=b711VvMpg7VLmiYQ2pCQ7MYUiTZ7FPbte857Xd3vSj9U zlMPhyE0Amy3Wb4VHIw/EaJ+7nX8oJYE5w+fMJO041GSXzZzV4RWxToI+zDFwYge rxRyLi+GYa9oCkUkqYkgY5o2hVHSbSaq8yxF20YRFcnk85IjrNQnA42sFdF1mjW/ 6oVFwUVpkT1+Lvu5hextaPeW8+zozdh/xiZTGMgvuCLcqAOhQi/ea7buJ27GuUyS AJRwBbvfwXN3eg/Nure7uiNLt1G3/Iy2S25HeyDq85zcrqKEczedxrl3eFl/NpG/ e5xicHo8JO6ovzOiB3wSypFI/cCBMEDJ2Qh8epi71g== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvledrudegiedgheduucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucenucfjughrpeffhffvvefukfhfgggtuggjsehgtd erredttddvnecuhfhrohhmpefrrghtrhhitghkucfuthgvihhnhhgrrhguthcuoehpshes phhkshdrihhmqeenucggtffrrghtthgvrhhnpeeukedtvedtffevleejtefgheehieegke eluddvfeefgeehgfeltddtheejleffteenucevlhhushhtvghrufhiiigvpedunecurfgr rhgrmhepmhgrihhlfhhrohhmpehpshesphhkshdrihhm X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Mon, 8 Apr 2024 08:23:59 -0400 (EDT) Received: by localhost (OpenSMTPD) with ESMTPSA id b793b8a1 (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Mon, 8 Apr 2024 12:23:53 +0000 (UTC) Date: Mon, 8 Apr 2024 14:23:56 +0200 From: Patrick Steinhardt To: git@vger.kernel.org Cc: Junio C Hamano , Han-Wen Nienhuys Subject: [PATCH v3 02/11] refs/reftable: perform explicit D/F check when writing symrefs Message-ID: References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: We already perform explicit D/F checks in all reftable callbacks which write refs, except when writing symrefs. For one this leads to an error message which isn't perfectly actionable because we only tell the user that there was a D/F conflict, but not which refs conflicted with each other. But second, once all ref updating callbacks explicitly check for D/F conflicts, we can disable the D/F checks in the reftable library itself and thus avoid some duplicated efforts. Refactor the code that writes symref tables to explicitly call into `refs_verify_refname_available()` when writing symrefs. Signed-off-by: Patrick Steinhardt --- refs/reftable-backend.c | 20 +++++++++++++++++--- t/t0610-reftable-basics.sh | 2 +- 2 files changed, 18 insertions(+), 4 deletions(-) diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c index 0358da14db..8a54b0d8b2 100644 --- a/refs/reftable-backend.c +++ b/refs/reftable-backend.c @@ -1217,6 +1217,7 @@ static int reftable_be_pack_refs(struct ref_store *ref_store, struct write_create_symref_arg { struct reftable_ref_store *refs; struct reftable_stack *stack; + struct strbuf *err; const char *refname; const char *target; const char *logmsg; @@ -1239,6 +1240,11 @@ static int write_create_symref_table(struct reftable_writer *writer, void *cb_da reftable_writer_set_limits(writer, ts, ts); + ret = refs_verify_refname_available(&create->refs->base, create->refname, + NULL, NULL, create->err); + if (ret < 0) + return ret; + ret = reftable_writer_add_ref(writer, &ref); if (ret) return ret; @@ -1280,12 +1286,14 @@ static int reftable_be_create_symref(struct ref_store *ref_store, struct reftable_ref_store *refs = reftable_be_downcast(ref_store, REF_STORE_WRITE, "create_symref"); struct reftable_stack *stack = stack_for(refs, refname, &refname); + struct strbuf err = STRBUF_INIT; struct write_create_symref_arg arg = { .refs = refs, .stack = stack, .refname = refname, .target = target, .logmsg = logmsg, + .err = &err, }; int ret; @@ -1301,9 +1309,15 @@ static int reftable_be_create_symref(struct ref_store *ref_store, done: assert(ret != REFTABLE_API_ERROR); - if (ret) - error("unable to write symref for %s: %s", refname, - reftable_error_str(ret)); + if (ret) { + if (err.len) + error("%s", err.buf); + else + error("unable to write symref for %s: %s", refname, + reftable_error_str(ret)); + } + + strbuf_release(&err); return ret; } diff --git a/t/t0610-reftable-basics.sh b/t/t0610-reftable-basics.sh index 055231a707..12b0004781 100755 --- a/t/t0610-reftable-basics.sh +++ b/t/t0610-reftable-basics.sh @@ -255,7 +255,7 @@ test_expect_success 'ref transaction: creating symbolic ref fails with F/D confl git init repo && test_commit -C repo A && cat >expect <<-EOF && - error: unable to write symref for refs/heads: file/directory conflict + error: ${SQ}refs/heads/main${SQ} exists; cannot create ${SQ}refs/heads${SQ} EOF test_must_fail git -C repo symbolic-ref refs/heads refs/heads/foo 2>err && test_cmp expect err From patchwork Mon Apr 8 12:24:01 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13621068 Received: from fout3-smtp.messagingengine.com (fout3-smtp.messagingengine.com [103.168.172.146]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 25F436A340 for ; Mon, 8 Apr 2024 12:24:05 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=103.168.172.146 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712579047; cv=none; b=eP39zZSt9oZX/e6vQNjkcm+c1aHTs/c5Q9M8FNwqld3BqnSyQGYtr+eMVNWT3WT2nacJOLeabvfNvpgGJX/gLs9exbr9sYS9lkq8Avp6rCwDXnZL8ojRlR7tPbJ6xmSuYvwIwc/f1sJhKRMLt6orEYWCUcfhmITyL+EVci0tJjY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712579047; c=relaxed/simple; bh=PzM9mGZBtv/60xeuQhgpBNrEVg+fyyEsHxSvMmJALSY=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=Oz4l3p4q2gfh0dm8HfMuxEY88Ieo0OO2b1KZN+g8oWEr2q0TzCYZx0yTAHa/ZhK48KKYdhJRx7wYwHRzt2IfBD81sR15fyQ1/ia9gYuSS59cIrPSLx1+D5K+B8FpH88vxllsusU0em7FmVRMLyOcAFr3yuWGf+4uknVCSa36Rzw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im; spf=pass smtp.mailfrom=pks.im; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b=Sz3Uuzxg; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=GqiD2aP+; arc=none smtp.client-ip=103.168.172.146 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="Sz3Uuzxg"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="GqiD2aP+" Received: from compute1.internal (compute1.nyi.internal [10.202.2.41]) by mailfout.nyi.internal (Postfix) with ESMTP id 369EE138010B; Mon, 8 Apr 2024 08:24:05 -0400 (EDT) Received: from mailfrontend2 ([10.202.2.163]) by compute1.internal (MEProxy); Mon, 08 Apr 2024 08:24:05 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc:cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm2; t=1712579045; x=1712665445; bh=6VkoQ0LeaP fr1rq2ORipEk5JxrHckoUW4dtch2bJmtw=; b=Sz3UuzxgsyenmdfN8I1Fr9Cw80 R719tLVfwvfpSGqHlcAbJE7diprSpLFJST1RVsjs/CarUjgqWaJws6YPHzv/kQKn XXxir0918BimHXrZrW4DZXMhqTc8XXLs0+BUXFEoceuGOU2FYlULM5f6NthGDjnD LJeE7FXE+FrduKWwVrRqkvtnbRZnpXUn99EYXMmkhzV9Wafsuy5B+pY4JqSSsZzb wNr629f9IqVdXfG+oEAPhgk4/H5pDomnIU9PQt1Z+2taAKAjSGAdONKP20Q7dkNa sCHD8gOc5hw1va6nYy7L1mZu3gL1tiGcQWPm+k5kBtate1BP6pEjlgT/HudA== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; t=1712579045; x=1712665445; bh=6VkoQ0LeaPfr1rq2ORipEk5JxrHc koUW4dtch2bJmtw=; b=GqiD2aP++n1/7qGtWPkHCL3r/npVILiJc2MCV9DwDWQt Rg7iXKU40zhW4nvO4vTIVwkgBvIZLTFjR1/RKcdU8KnjTIQ/XyPVZ6AENx2yOJeO pZknOKESVTwgDShVq5dykNgNadVjQ/b56iZmz0LSQwZPDC2cAmc/ICEZ1kYiIS+a SpKNKR1Q6Yr2mkBJi53dwgsquu+a7WG7bghNyPaQT90tYUI0PAQfSahqvLNCUVSF jhaNIXABtZKCZK4NV2EJnqHOdjnhVsCbTPTWyaKOWBeMc3wwsycJ6NfvhbxRYI9X E7+acpADiiaVCfCkd8j8DCQvVOub49ERxQ7TQpHuBw== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvledrudegiedgheduucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucenucfjughrpeffhffvvefukfhfgggtuggjsehgtd erredttdejnecuhfhrohhmpefrrghtrhhitghkucfuthgvihhnhhgrrhguthcuoehpshes phhkshdrihhmqeenucggtffrrghtthgvrhhnpeetueevhffhudefvdegieeuieelgedthf egfedtueevjeejtdfgjeehudejuedtudenucevlhhushhtvghrufhiiigvpedtnecurfgr rhgrmhepmhgrihhlfhhrohhmpehpshesphhkshdrihhm X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Mon, 8 Apr 2024 08:24:04 -0400 (EDT) Received: by localhost (OpenSMTPD) with ESMTPSA id 2fb0d75b (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Mon, 8 Apr 2024 12:23:57 +0000 (UTC) Date: Mon, 8 Apr 2024 14:24:01 +0200 From: Patrick Steinhardt To: git@vger.kernel.org Cc: Junio C Hamano , Han-Wen Nienhuys Subject: [PATCH v3 03/11] refs/reftable: skip duplicate name checks Message-ID: <763c6fdfcd93651dac46de9c308c66f10d73d3d2.1712578837.git.ps@pks.im> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: All the callback functions which write refs in the reftable backend perform D/F conflict checks via `refs_verify_refname_available()`. But in reality we perform these D/F conflict checks a second time in the reftable library via `stack_check_addition()`. Interestingly, the code in the reftable library is inferior compared to the generic function: - It is slower than `refs_verify_refname_available()`, even though this can probably be optimized. - It does not provide a proper error message to the caller, and thus all the user would see is a generic "file/directory conflict" message. Disable the D/F conflict checks in the reftable library by setting the `skip_name_check` write option. This results in a non-negligible speedup when writing many refs. The following benchmark writes 100k refs in a single transaction: Benchmark 1: update-ref: create many refs (HEAD~) Time (mean ± σ): 3.241 s ± 0.040 s [User: 1.854 s, System: 1.381 s] Range (min … max): 3.185 s … 3.454 s 100 runs Benchmark 2: update-ref: create many refs (HEAD) Time (mean ± σ): 2.878 s ± 0.024 s [User: 1.506 s, System: 1.367 s] Range (min … max): 2.838 s … 2.960 s 100 runs Summary update-ref: create many refs (HEAD~) ran 1.13 ± 0.02 times faster than update-ref: create many refs (HEAD) Signed-off-by: Patrick Steinhardt --- refs/reftable-backend.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c index 8a54b0d8b2..7515dd3019 100644 --- a/refs/reftable-backend.c +++ b/refs/reftable-backend.c @@ -247,6 +247,11 @@ static struct ref_store *reftable_be_init(struct repository *repo, refs->write_options.block_size = 4096; refs->write_options.hash_id = repo->hash_algo->format_id; refs->write_options.default_permissions = calc_shared_perm(0666 & ~mask); + /* + * We verify names via `refs_verify_refname_available()`, so there is + * no need to do the same checks in the reftable library again. + */ + refs->write_options.skip_name_check = 1; /* * Set up the main reftable stack that is hosted in GIT_COMMON_DIR. From patchwork Mon Apr 8 12:24:06 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13621069 Received: from fhigh6-smtp.messagingengine.com (fhigh6-smtp.messagingengine.com [103.168.172.157]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4C9D16BB23 for ; Mon, 8 Apr 2024 12:24:11 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=103.168.172.157 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712579053; cv=none; b=nhiK9rUzGHMV12cwazDCyoTFPxlkr+P53jTxgXgUsW1m3g7WHa0avNEWlFMlmXHooCdazakASYOqBfWd3nIEKCQUO3Lz9yoS8fssFBZ+5lmnbHm7s7gds5UVJC5pUUChyDjIKK3o/BQbQNsMhzXwm7mG2kjvKm8ZLFbgeIz3ivs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712579053; c=relaxed/simple; bh=Hhx4iOIn1uMbW0oJpEhMXloqHv0u+EFrVisA3LyMgN8=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=AOOPOhdwH0NAJxciIWZvHl8GdHV9B1REJe8hzXHuYONRelWwJZHcoj2AAskjxkN6JSj7vwj9S4MUwflbIpo1wmSRtHiZS+7zUEyoCgcmOgcB+gFEFqAu6LZULDN9AeYO/77AItSoy/gfDNmfnJewkaAqVRkZbopmy27MGWKypeQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im; spf=pass smtp.mailfrom=pks.im; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b=ZVzMvOJJ; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=CHrSp6uM; arc=none smtp.client-ip=103.168.172.157 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="ZVzMvOJJ"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="CHrSp6uM" Received: from compute5.internal (compute5.nyi.internal [10.202.2.45]) by mailfhigh.nyi.internal (Postfix) with ESMTP id 6979911400E0; Mon, 8 Apr 2024 08:24:10 -0400 (EDT) Received: from mailfrontend2 ([10.202.2.163]) by compute5.internal (MEProxy); Mon, 08 Apr 2024 08:24:10 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc:cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm2; t=1712579050; x=1712665450; bh=N1t4zfVEqR fazGQ3CfKlzKQ9nLul0NSJ41Crk3ULLCU=; b=ZVzMvOJJW4kRQLXI/KMKcJxVyY baT/ET1NHb8ug1ukxVpQ8PjoTx7+M7V0+p+dY1gAH1GpmAD3uZTvRLZe5yZT2OAx 5B9S6336jKBXzG70nXRA4TH3C533NMo8dDj5H1N2Kf25gErXg7udMojd6v2hzboB bS87WFe9k5IIWv7xfJCTcWP52qbD9w1o7C+bhtP92n6Ttz8fJ1qxRRx58DDALL3D E+EWauxWWXRSgSIRsx2/gv3wObpCitL5AUq4AuGRISbZeFv1YKQUKqXNmRral+xB HNg21Mi/+A/wqQm1TGIHQstdmBVff+TUi6FMW6Q+dL2ZYmxofHIMaQc6Jp9Q== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; t=1712579050; x=1712665450; bh=N1t4zfVEqRfazGQ3CfKlzKQ9nLul 0NSJ41Crk3ULLCU=; b=CHrSp6uMK3aOhQ3rkCi+Uwg19fkB5NjSqI0tNdpUx+ix WRanXrsNkjx3d0Otr1Np5KfZhKL4JihaPbW4ih6XtRpCXlSKkv34zWWqViSlvbFW us/DwT6IgvNnrpVbAP3AMkOT5DF8Y0CuMxufo2V7SByu5eTEm/7mz4ozdUiU69Jv VsPmUsBqKL7IT1GZyy+PkMmWAKPXvL/Ouhy9tYbKfDZQVtNcOq/ZgNOfjJsuzwDP U8cWNTZCxFZ59pIIL0yYcoxqdQrYb0UJR0FauJODYS2/MkxXPvCAMUctKrOXI92G ttkiw+jTgCcRasd3WsU89kEbpGCLdOPwNCqmHL4cKw== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvledrudegiedghedvucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucenucfjughrpeffhffvvefukfhfgggtuggjsehgtd erredttddvnecuhfhrohhmpefrrghtrhhitghkucfuthgvihhnhhgrrhguthcuoehpshes phhkshdrihhmqeenucggtffrrghtthgvrhhnpeegiefhheekudeuueefjeeuteefleelje ekudekhfeiffejveejheefleelheffveenucffohhmrghinhepghhoohhglhgvrdgtohhm necuvehluhhsthgvrhfuihiivgeptdenucfrrghrrghmpehmrghilhhfrhhomhepphhsse hpkhhsrdhimh X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Mon, 8 Apr 2024 08:24:09 -0400 (EDT) Received: by localhost (OpenSMTPD) with ESMTPSA id be6e92dd (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Mon, 8 Apr 2024 12:24:02 +0000 (UTC) Date: Mon, 8 Apr 2024 14:24:06 +0200 From: Patrick Steinhardt To: git@vger.kernel.org Cc: Junio C Hamano , Han-Wen Nienhuys Subject: [PATCH v3 04/11] reftable: remove name checks Message-ID: <2a5f07627a445f214940bf093487af9492e27684.1712578837.git.ps@pks.im> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: In the preceding commit we have disabled name checks in the "reftable" backend. These checks were responsible for verifying multiple things when writing records to the reftable stack: - Detecting file/directory conflicts. Starting with the preceding commits this is now handled by the reftable backend itself via `refs_verify_refname_available()`. - Validating refnames. This is handled by `check_refname_format()` in the generic ref transacton layer. The code in the reftable library is thus not used anymore and likely to bitrot over time. Remove it. Signed-off-by: Patrick Steinhardt --- Makefile | 2 - refs/reftable-backend.c | 5 - reftable/error.c | 2 - reftable/refname.c | 206 ------------------------------------- reftable/refname.h | 29 ------ reftable/refname_test.c | 101 ------------------ reftable/reftable-error.h | 3 - reftable/reftable-tests.h | 1 - reftable/reftable-writer.h | 4 - reftable/stack.c | 67 +----------- reftable/stack_test.c | 39 ------- t/helper/test-reftable.c | 1 - 12 files changed, 1 insertion(+), 459 deletions(-) delete mode 100644 reftable/refname.c delete mode 100644 reftable/refname.h delete mode 100644 reftable/refname_test.c diff --git a/Makefile b/Makefile index c43c1bd1a0..05e3d37581 100644 --- a/Makefile +++ b/Makefile @@ -2655,7 +2655,6 @@ REFTABLE_OBJS += reftable/merged.o REFTABLE_OBJS += reftable/pq.o REFTABLE_OBJS += reftable/reader.o REFTABLE_OBJS += reftable/record.o -REFTABLE_OBJS += reftable/refname.o REFTABLE_OBJS += reftable/generic.o REFTABLE_OBJS += reftable/stack.o REFTABLE_OBJS += reftable/tree.o @@ -2668,7 +2667,6 @@ REFTABLE_TEST_OBJS += reftable/merged_test.o REFTABLE_TEST_OBJS += reftable/pq_test.o REFTABLE_TEST_OBJS += reftable/record_test.o REFTABLE_TEST_OBJS += reftable/readwrite_test.o -REFTABLE_TEST_OBJS += reftable/refname_test.o REFTABLE_TEST_OBJS += reftable/stack_test.o REFTABLE_TEST_OBJS += reftable/test_framework.o REFTABLE_TEST_OBJS += reftable/tree_test.o diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c index 7515dd3019..8a54b0d8b2 100644 --- a/refs/reftable-backend.c +++ b/refs/reftable-backend.c @@ -247,11 +247,6 @@ static struct ref_store *reftable_be_init(struct repository *repo, refs->write_options.block_size = 4096; refs->write_options.hash_id = repo->hash_algo->format_id; refs->write_options.default_permissions = calc_shared_perm(0666 & ~mask); - /* - * We verify names via `refs_verify_refname_available()`, so there is - * no need to do the same checks in the reftable library again. - */ - refs->write_options.skip_name_check = 1; /* * Set up the main reftable stack that is hosted in GIT_COMMON_DIR. diff --git a/reftable/error.c b/reftable/error.c index 0d1766735e..169f89d2f1 100644 --- a/reftable/error.c +++ b/reftable/error.c @@ -27,8 +27,6 @@ const char *reftable_error_str(int err) return "misuse of the reftable API"; case REFTABLE_ZLIB_ERROR: return "zlib failure"; - case REFTABLE_NAME_CONFLICT: - return "file/directory conflict"; case REFTABLE_EMPTY_TABLE_ERROR: return "wrote empty table"; case REFTABLE_REFNAME_ERROR: diff --git a/reftable/refname.c b/reftable/refname.c deleted file mode 100644 index bbfde15754..0000000000 --- a/reftable/refname.c +++ /dev/null @@ -1,206 +0,0 @@ -/* - Copyright 2020 Google LLC - - Use of this source code is governed by a BSD-style - license that can be found in the LICENSE file or at - https://developers.google.com/open-source/licenses/bsd -*/ - -#include "system.h" -#include "reftable-error.h" -#include "basics.h" -#include "refname.h" -#include "reftable-iterator.h" - -struct refname_needle_lesseq_args { - char **haystack; - const char *needle; -}; - -static int refname_needle_lesseq(size_t k, void *_args) -{ - struct refname_needle_lesseq_args *args = _args; - return strcmp(args->needle, args->haystack[k]) <= 0; -} - -static int modification_has_ref(struct modification *mod, const char *name) -{ - struct reftable_ref_record ref = { NULL }; - int err = 0; - - if (mod->add_len > 0) { - struct refname_needle_lesseq_args args = { - .haystack = mod->add, - .needle = name, - }; - size_t idx = binsearch(mod->add_len, refname_needle_lesseq, &args); - if (idx < mod->add_len && !strcmp(mod->add[idx], name)) - return 0; - } - - if (mod->del_len > 0) { - struct refname_needle_lesseq_args args = { - .haystack = mod->del, - .needle = name, - }; - size_t idx = binsearch(mod->del_len, refname_needle_lesseq, &args); - if (idx < mod->del_len && !strcmp(mod->del[idx], name)) - return 1; - } - - err = reftable_table_read_ref(&mod->tab, name, &ref); - reftable_ref_record_release(&ref); - return err; -} - -static void modification_release(struct modification *mod) -{ - /* don't delete the strings themselves; they're owned by ref records. - */ - FREE_AND_NULL(mod->add); - FREE_AND_NULL(mod->del); - mod->add_len = 0; - mod->del_len = 0; -} - -static int modification_has_ref_with_prefix(struct modification *mod, - const char *prefix) -{ - struct reftable_iterator it = { NULL }; - struct reftable_ref_record ref = { NULL }; - int err = 0; - - if (mod->add_len > 0) { - struct refname_needle_lesseq_args args = { - .haystack = mod->add, - .needle = prefix, - }; - size_t idx = binsearch(mod->add_len, refname_needle_lesseq, &args); - if (idx < mod->add_len && - !strncmp(prefix, mod->add[idx], strlen(prefix))) - goto done; - } - err = reftable_table_seek_ref(&mod->tab, &it, prefix); - if (err) - goto done; - - while (1) { - err = reftable_iterator_next_ref(&it, &ref); - if (err) - goto done; - - if (mod->del_len > 0) { - struct refname_needle_lesseq_args args = { - .haystack = mod->del, - .needle = ref.refname, - }; - size_t idx = binsearch(mod->del_len, refname_needle_lesseq, &args); - if (idx < mod->del_len && - !strcmp(ref.refname, mod->del[idx])) - continue; - } - - if (strncmp(ref.refname, prefix, strlen(prefix))) { - err = 1; - goto done; - } - err = 0; - goto done; - } - -done: - reftable_ref_record_release(&ref); - reftable_iterator_destroy(&it); - return err; -} - -static int validate_refname(const char *name) -{ - while (1) { - char *next = strchr(name, '/'); - if (!*name) { - return REFTABLE_REFNAME_ERROR; - } - if (!next) { - return 0; - } - if (next - name == 0 || (next - name == 1 && *name == '.') || - (next - name == 2 && name[0] == '.' && name[1] == '.')) - return REFTABLE_REFNAME_ERROR; - name = next + 1; - } - return 0; -} - -int validate_ref_record_addition(struct reftable_table tab, - struct reftable_ref_record *recs, size_t sz) -{ - struct modification mod = { - .tab = tab, - .add = reftable_calloc(sz, sizeof(*mod.add)), - .del = reftable_calloc(sz, sizeof(*mod.del)), - }; - int i = 0; - int err = 0; - for (; i < sz; i++) { - if (reftable_ref_record_is_deletion(&recs[i])) { - mod.del[mod.del_len++] = recs[i].refname; - } else { - mod.add[mod.add_len++] = recs[i].refname; - } - } - - err = modification_validate(&mod); - modification_release(&mod); - return err; -} - -static void strbuf_trim_component(struct strbuf *sl) -{ - while (sl->len > 0) { - int is_slash = (sl->buf[sl->len - 1] == '/'); - strbuf_setlen(sl, sl->len - 1); - if (is_slash) - break; - } -} - -int modification_validate(struct modification *mod) -{ - struct strbuf slashed = STRBUF_INIT; - int err = 0; - int i = 0; - for (; i < mod->add_len; i++) { - err = validate_refname(mod->add[i]); - if (err) - goto done; - strbuf_reset(&slashed); - strbuf_addstr(&slashed, mod->add[i]); - strbuf_addstr(&slashed, "/"); - - err = modification_has_ref_with_prefix(mod, slashed.buf); - if (err == 0) { - err = REFTABLE_NAME_CONFLICT; - goto done; - } - if (err < 0) - goto done; - - strbuf_reset(&slashed); - strbuf_addstr(&slashed, mod->add[i]); - while (slashed.len) { - strbuf_trim_component(&slashed); - err = modification_has_ref(mod, slashed.buf); - if (err == 0) { - err = REFTABLE_NAME_CONFLICT; - goto done; - } - if (err < 0) - goto done; - } - } - err = 0; -done: - strbuf_release(&slashed); - return err; -} diff --git a/reftable/refname.h b/reftable/refname.h deleted file mode 100644 index a24b40fcb4..0000000000 --- a/reftable/refname.h +++ /dev/null @@ -1,29 +0,0 @@ -/* - Copyright 2020 Google LLC - - Use of this source code is governed by a BSD-style - license that can be found in the LICENSE file or at - https://developers.google.com/open-source/licenses/bsd -*/ -#ifndef REFNAME_H -#define REFNAME_H - -#include "reftable-record.h" -#include "reftable-generic.h" - -struct modification { - struct reftable_table tab; - - char **add; - size_t add_len; - - char **del; - size_t del_len; -}; - -int validate_ref_record_addition(struct reftable_table tab, - struct reftable_ref_record *recs, size_t sz); - -int modification_validate(struct modification *mod); - -#endif diff --git a/reftable/refname_test.c b/reftable/refname_test.c deleted file mode 100644 index b9cc62554e..0000000000 --- a/reftable/refname_test.c +++ /dev/null @@ -1,101 +0,0 @@ -/* -Copyright 2020 Google LLC - -Use of this source code is governed by a BSD-style -license that can be found in the LICENSE file or at -https://developers.google.com/open-source/licenses/bsd -*/ - -#include "basics.h" -#include "block.h" -#include "blocksource.h" -#include "reader.h" -#include "record.h" -#include "refname.h" -#include "reftable-error.h" -#include "reftable-writer.h" -#include "system.h" - -#include "test_framework.h" -#include "reftable-tests.h" - -struct testcase { - char *add; - char *del; - int error_code; -}; - -static void test_conflict(void) -{ - struct reftable_write_options opts = { 0 }; - struct strbuf buf = STRBUF_INIT; - struct reftable_writer *w = - reftable_new_writer(&strbuf_add_void, &noop_flush, &buf, &opts); - struct reftable_ref_record rec = { - .refname = "a/b", - .value_type = REFTABLE_REF_SYMREF, - .value.symref = "destination", /* make sure it's not a symref. - */ - .update_index = 1, - }; - int err; - int i; - struct reftable_block_source source = { NULL }; - struct reftable_reader *rd = NULL; - struct reftable_table tab = { NULL }; - struct testcase cases[] = { - { "a/b/c", NULL, REFTABLE_NAME_CONFLICT }, - { "b", NULL, 0 }, - { "a", NULL, REFTABLE_NAME_CONFLICT }, - { "a", "a/b", 0 }, - - { "p/", NULL, REFTABLE_REFNAME_ERROR }, - { "p//q", NULL, REFTABLE_REFNAME_ERROR }, - { "p/./q", NULL, REFTABLE_REFNAME_ERROR }, - { "p/../q", NULL, REFTABLE_REFNAME_ERROR }, - - { "a/b/c", "a/b", 0 }, - { NULL, "a//b", 0 }, - }; - reftable_writer_set_limits(w, 1, 1); - - err = reftable_writer_add_ref(w, &rec); - EXPECT_ERR(err); - - err = reftable_writer_close(w); - EXPECT_ERR(err); - reftable_writer_free(w); - - block_source_from_strbuf(&source, &buf); - err = reftable_new_reader(&rd, &source, "filename"); - EXPECT_ERR(err); - - reftable_table_from_reader(&tab, rd); - - for (i = 0; i < ARRAY_SIZE(cases); i++) { - struct modification mod = { - .tab = tab, - }; - - if (cases[i].add) { - mod.add = &cases[i].add; - mod.add_len = 1; - } - if (cases[i].del) { - mod.del = &cases[i].del; - mod.del_len = 1; - } - - err = modification_validate(&mod); - EXPECT(err == cases[i].error_code); - } - - reftable_reader_free(rd); - strbuf_release(&buf); -} - -int refname_test_main(int argc, const char *argv[]) -{ - RUN_TEST(test_conflict); - return 0; -} diff --git a/reftable/reftable-error.h b/reftable/reftable-error.h index 4c457aaaf8..3a5f5b92c6 100644 --- a/reftable/reftable-error.h +++ b/reftable/reftable-error.h @@ -48,9 +48,6 @@ enum reftable_error { /* Wrote a table without blocks. */ REFTABLE_EMPTY_TABLE_ERROR = -8, - /* Dir/file conflict. */ - REFTABLE_NAME_CONFLICT = -9, - /* Invalid ref name. */ REFTABLE_REFNAME_ERROR = -10, diff --git a/reftable/reftable-tests.h b/reftable/reftable-tests.h index 0019cbcfa4..114cc3d053 100644 --- a/reftable/reftable-tests.h +++ b/reftable/reftable-tests.h @@ -14,7 +14,6 @@ int block_test_main(int argc, const char **argv); int merged_test_main(int argc, const char **argv); int pq_test_main(int argc, const char **argv); int record_test_main(int argc, const char **argv); -int refname_test_main(int argc, const char **argv); int readwrite_test_main(int argc, const char **argv); int stack_test_main(int argc, const char **argv); int tree_test_main(int argc, const char **argv); diff --git a/reftable/reftable-writer.h b/reftable/reftable-writer.h index 7c7cae5f99..3c119e2bbb 100644 --- a/reftable/reftable-writer.h +++ b/reftable/reftable-writer.h @@ -38,10 +38,6 @@ struct reftable_write_options { /* Default mode for creating files. If unset, use 0666 (+umask) */ unsigned int default_permissions; - /* boolean: do not check ref names for validity or dir/file conflicts. - */ - unsigned skip_name_check : 1; - /* boolean: copy log messages exactly. If unset, check that the message * is a single line, and add '\n' if missing. */ diff --git a/reftable/stack.c b/reftable/stack.c index 1ecf1b9751..e264df5ced 100644 --- a/reftable/stack.c +++ b/reftable/stack.c @@ -12,8 +12,8 @@ license that can be found in the LICENSE file or at #include "system.h" #include "merged.h" #include "reader.h" -#include "refname.h" #include "reftable-error.h" +#include "reftable-generic.h" #include "reftable-record.h" #include "reftable-merged.h" #include "writer.h" @@ -27,8 +27,6 @@ static int stack_write_compact(struct reftable_stack *st, struct reftable_writer *wr, size_t first, size_t last, struct reftable_log_expiry_config *config); -static int stack_check_addition(struct reftable_stack *st, - const char *new_tab_name); static void reftable_addition_close(struct reftable_addition *add); static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st, int reuse_open); @@ -781,10 +779,6 @@ int reftable_addition_add(struct reftable_addition *add, goto done; } - err = stack_check_addition(add->stack, get_tempfile_path(tab_file)); - if (err < 0) - goto done; - if (wr->min_update_index < add->next_update_index) { err = REFTABLE_API_ERROR; goto done; @@ -1340,65 +1334,6 @@ int reftable_stack_read_log(struct reftable_stack *st, const char *refname, return err; } -static int stack_check_addition(struct reftable_stack *st, - const char *new_tab_name) -{ - int err = 0; - struct reftable_block_source src = { NULL }; - struct reftable_reader *rd = NULL; - struct reftable_table tab = { NULL }; - struct reftable_ref_record *refs = NULL; - struct reftable_iterator it = { NULL }; - int cap = 0; - int len = 0; - int i = 0; - - if (st->config.skip_name_check) - return 0; - - err = reftable_block_source_from_file(&src, new_tab_name); - if (err < 0) - goto done; - - err = reftable_new_reader(&rd, &src, new_tab_name); - if (err < 0) - goto done; - - err = reftable_reader_seek_ref(rd, &it, ""); - if (err > 0) { - err = 0; - goto done; - } - if (err < 0) - goto done; - - while (1) { - struct reftable_ref_record ref = { NULL }; - err = reftable_iterator_next_ref(&it, &ref); - if (err > 0) - break; - if (err < 0) - goto done; - - REFTABLE_ALLOC_GROW(refs, len + 1, cap); - refs[len++] = ref; - } - - reftable_table_from_merged_table(&tab, reftable_stack_merged_table(st)); - - err = validate_ref_record_addition(tab, refs, len); - -done: - for (i = 0; i < len; i++) { - reftable_ref_record_release(&refs[i]); - } - - free(refs); - reftable_iterator_destroy(&it); - reftable_reader_free(rd); - return err; -} - static int is_table_name(const char *s) { const char *dot = strrchr(s, '.'); diff --git a/reftable/stack_test.c b/reftable/stack_test.c index 0dc9a44648..b88097c3b6 100644 --- a/reftable/stack_test.c +++ b/reftable/stack_test.c @@ -353,44 +353,6 @@ static void test_reftable_stack_transaction_api_performs_auto_compaction(void) clear_dir(dir); } -static void test_reftable_stack_validate_refname(void) -{ - struct reftable_write_options cfg = { 0 }; - struct reftable_stack *st = NULL; - int err; - char *dir = get_tmp_dir(__LINE__); - - int i; - struct reftable_ref_record ref = { - .refname = "a/b", - .update_index = 1, - .value_type = REFTABLE_REF_SYMREF, - .value.symref = "master", - }; - char *additions[] = { "a", "a/b/c" }; - - err = reftable_new_stack(&st, dir, cfg); - EXPECT_ERR(err); - - err = reftable_stack_add(st, &write_test_ref, &ref); - EXPECT_ERR(err); - - for (i = 0; i < ARRAY_SIZE(additions); i++) { - struct reftable_ref_record ref = { - .refname = additions[i], - .update_index = 1, - .value_type = REFTABLE_REF_SYMREF, - .value.symref = "master", - }; - - err = reftable_stack_add(st, &write_test_ref, &ref); - EXPECT(err == REFTABLE_NAME_CONFLICT); - } - - reftable_stack_destroy(st); - clear_dir(dir); -} - static int write_error(struct reftable_writer *wr, void *arg) { return *((int *)arg); @@ -1097,7 +1059,6 @@ int stack_test_main(int argc, const char *argv[]) RUN_TEST(test_reftable_stack_transaction_api_performs_auto_compaction); RUN_TEST(test_reftable_stack_update_index_check); RUN_TEST(test_reftable_stack_uptodate); - RUN_TEST(test_reftable_stack_validate_refname); RUN_TEST(test_sizes_to_segments); RUN_TEST(test_sizes_to_segments_all_equal); RUN_TEST(test_sizes_to_segments_empty); diff --git a/t/helper/test-reftable.c b/t/helper/test-reftable.c index 00237ef0d9..bae731669c 100644 --- a/t/helper/test-reftable.c +++ b/t/helper/test-reftable.c @@ -13,7 +13,6 @@ int cmd__reftable(int argc, const char **argv) readwrite_test_main(argc, argv); merged_test_main(argc, argv); stack_test_main(argc, argv); - refname_test_main(argc, argv); return 0; } From patchwork Mon Apr 8 12:24:11 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13621070 Received: from fhigh6-smtp.messagingengine.com (fhigh6-smtp.messagingengine.com [103.168.172.157]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6D2B871753 for ; Mon, 8 Apr 2024 12:24:15 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=103.168.172.157 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712579057; cv=none; b=XjmbZMJYLTjRMkl1oajlFs25CSHeBk+vlBocHBT74A1t0KLkIakZSUuvC/eZWNTTtZKf0vLRaGkz58ERgDjPsfa/upzWe5DJ2K2EppeXyAFzijRoMj6rTLAchOPQBkFyNi1yYRyeH/OcrpwCQwovRGYk2RHqiEkY+XxGGuDKltM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712579057; c=relaxed/simple; bh=L2q72KL0Vi29cvlSAQQGLoE+STH3emnnsBLdYbgS+2M=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=YNVDVZDy0vlkJgMPD9cI/FfZg7LlCxhghMxHj9hyDa8+6i1xG0rJZxjEdR+aasgCRgp5M+R0rmT54Q8jGbvxV0QJbEpoIJQWIke4wlbbZSAwH0xAa5jTgeJM82ndcRrheO+xzlC8wjrKTF1hTNp0CuNssYjUrq/svFKYvw3cx24= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im; spf=pass smtp.mailfrom=pks.im; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b=TFFQRzz+; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=oVCWU1+l; arc=none smtp.client-ip=103.168.172.157 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="TFFQRzz+"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="oVCWU1+l" Received: from compute5.internal (compute5.nyi.internal [10.202.2.45]) by mailfhigh.nyi.internal (Postfix) with ESMTP id 8AF70114010A; Mon, 8 Apr 2024 08:24:14 -0400 (EDT) Received: from mailfrontend2 ([10.202.2.163]) by compute5.internal (MEProxy); Mon, 08 Apr 2024 08:24:14 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc:cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm2; t=1712579054; x=1712665454; bh=a6sLCF3Fil PnNH7WT8y8pvBvc70+AlJq82Xu2HY6chI=; b=TFFQRzz+OHsOHiQFCrd2Pzabvp AlD5LZqPTW13O+ExYdfX4trlkvXXDltTOyP1aJx+GAhsRkjfBUmd2E6IiiYMUJpA pEPYiBS+ChFdoiF5rjNTG579D1u/maFK+cAjSGgsJUZqbKjM3ZqHG7DDs2nvtFZn lDUrvFzyPvDgLM3qqwcNsAOkRxk8VFQIqraW5NswoLFhu1MWxfS4+bIQte/TH6rt krltCmxyEVPGEtRO4kW3VKwDNV6KcRlHNo1Wp5HuQU5Hm7NfObNpCY4Ck4SubDrl ou0RXY4nNGksq+XmXaJUI0PsdQm7X6pMDJPSUEYoK0Bq/QanEeJZu08HgpFQ== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; t=1712579054; x=1712665454; bh=a6sLCF3FilPnNH7WT8y8pvBvc70+ AlJq82Xu2HY6chI=; b=oVCWU1+lpdqyQsR/qKXkPud98I90NdgtekqVbrKn4E9A mMUIJQGEX9KB30a04cQyb8WESGAoduZQFTQspNVhqQZnKN4QPOZfpQX1EMl7XAe2 yYp07+pyUu6y9jIjgCMWA/N0TflgyYjJZXjjP4xMuQ4nH7Ag8BthqXxc87tS72d/ HWKIbaRuRaGEFQftSibGZDAxu0iIgHbMIKPNwmF1zBmVHJ6Ic3U7CGu7TZsZaIR9 cde3YNyQjLYHqjHuMZE+dTh2Pja2ZthOKh9fnIgw+y+lxz1m5q4ZIhJIE7rm2kgh vFuSI92XigDlxvNZwCEglBYIHutT5qQg5yBmeM734A== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvledrudegiedghedvucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucenucfjughrpeffhffvvefukfhfgggtuggjsehgtd erredttdejnecuhfhrohhmpefrrghtrhhitghkucfuthgvihhnhhgrrhguthcuoehpshes phhkshdrihhmqeenucggtffrrghtthgvrhhnpeejvedufedttdeigfeuiedugfetiefgve etleefgedvheefffdvtdfffefhhfegteenucffohhmrghinhepuhhpuggrthgvrdhnrghm vgdpuhhpuggrthgvrdgvmhgrihhlpdhuphgurghtvgdrthiipdhuphgurghtvgdrnhgvfi enucevlhhushhtvghrufhiiigvpedtnecurfgrrhgrmhepmhgrihhlfhhrohhmpehpshes phhkshdrihhm X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Mon, 8 Apr 2024 08:24:13 -0400 (EDT) Received: by localhost (OpenSMTPD) with ESMTPSA id 6018bc4e (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Mon, 8 Apr 2024 12:24:07 +0000 (UTC) Date: Mon, 8 Apr 2024 14:24:11 +0200 From: Patrick Steinhardt To: git@vger.kernel.org Cc: Junio C Hamano , Han-Wen Nienhuys Subject: [PATCH v3 05/11] refs/reftable: don't recompute committer ident Message-ID: <1ca7d9b6cff2eeceddf2dc1c8cb8f5d0216487cc.1712578837.git.ps@pks.im> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: In order to write reflog entries we need to compute the committer's identity as it gets encoded in the log record itself. The reftable backend does this via `git_committer_info()` and `split_ident_line()` in `fill_reftable_log_record()`, which use the Git config as well as environment variables to figure out the identity. While most callers would only call `fill_reftable_log_record()` once or twice, `write_transaction_table()` will call it as many times as there are queued ref updates. This can be quite a waste of effort when writing many refs with reflog entries in a single transaction. Refactor the code to pre-compute the committer information. This results in a small speedup when writing 100000 refs in a single transaction: Benchmark 1: update-ref: create many refs (HEAD~) Time (mean ± σ): 2.895 s ± 0.020 s [User: 1.516 s, System: 1.374 s] Range (min … max): 2.868 s … 2.983 s 100 runs Benchmark 2: update-ref: create many refs (HEAD) Time (mean ± σ): 2.845 s ± 0.017 s [User: 1.461 s, System: 1.379 s] Range (min … max): 2.803 s … 2.913 s 100 runs Summary update-ref: create many refs (HEAD) ran 1.02 ± 0.01 times faster than update-ref: create many refs (HEAD~) Signed-off-by: Patrick Steinhardt --- refs/reftable-backend.c | 52 +++++++++++++++++++++++++++-------------- 1 file changed, 34 insertions(+), 18 deletions(-) diff --git a/refs/reftable-backend.c b/refs/reftable-backend.c index 8a54b0d8b2..a5ef36ffa9 100644 --- a/refs/reftable-backend.c +++ b/refs/reftable-backend.c @@ -171,32 +171,30 @@ static int should_write_log(struct ref_store *refs, const char *refname) } } -static void fill_reftable_log_record(struct reftable_log_record *log) +static void fill_reftable_log_record(struct reftable_log_record *log, const struct ident_split *split) { - const char *info = git_committer_info(0); - struct ident_split split = {0}; + const char *tz_begin; int sign = 1; - if (split_ident_line(&split, info, strlen(info))) - BUG("failed splitting committer info"); - reftable_log_record_release(log); log->value_type = REFTABLE_LOG_UPDATE; log->value.update.name = - xstrndup(split.name_begin, split.name_end - split.name_begin); + xstrndup(split->name_begin, split->name_end - split->name_begin); log->value.update.email = - xstrndup(split.mail_begin, split.mail_end - split.mail_begin); - log->value.update.time = atol(split.date_begin); - if (*split.tz_begin == '-') { + xstrndup(split->mail_begin, split->mail_end - split->mail_begin); + log->value.update.time = atol(split->date_begin); + + tz_begin = split->tz_begin; + if (*tz_begin == '-') { sign = -1; - split.tz_begin++; + tz_begin++; } - if (*split.tz_begin == '+') { + if (*tz_begin == '+') { sign = 1; - split.tz_begin++; + tz_begin++; } - log->value.update.tz_offset = sign * atoi(split.tz_begin); + log->value.update.tz_offset = sign * atoi(tz_begin); } static int read_ref_without_reload(struct reftable_stack *stack, @@ -1018,9 +1016,15 @@ static int write_transaction_table(struct reftable_writer *writer, void *cb_data reftable_stack_merged_table(arg->stack); uint64_t ts = reftable_stack_next_update_index(arg->stack); struct reftable_log_record *logs = NULL; + struct ident_split committer_ident = {0}; size_t logs_nr = 0, logs_alloc = 0, i; + const char *committer_info; int ret = 0; + committer_info = git_committer_info(0); + if (split_ident_line(&committer_ident, committer_info, strlen(committer_info))) + BUG("failed splitting committer info"); + QSORT(arg->updates, arg->updates_nr, transaction_update_cmp); reftable_writer_set_limits(writer, ts, ts); @@ -1086,7 +1090,7 @@ static int write_transaction_table(struct reftable_writer *writer, void *cb_data log = &logs[logs_nr++]; memset(log, 0, sizeof(*log)); - fill_reftable_log_record(log); + fill_reftable_log_record(log, &committer_ident); log->update_index = ts; log->refname = xstrdup(u->refname); memcpy(log->value.update.new_hash, u->new_oid.hash, GIT_MAX_RAWSZ); @@ -1233,9 +1237,11 @@ static int write_create_symref_table(struct reftable_writer *writer, void *cb_da .value.symref = (char *)create->target, .update_index = ts, }; + struct ident_split committer_ident = {0}; struct reftable_log_record log = {0}; struct object_id new_oid; struct object_id old_oid; + const char *committer_info; int ret; reftable_writer_set_limits(writer, ts, ts); @@ -1263,7 +1269,11 @@ static int write_create_symref_table(struct reftable_writer *writer, void *cb_da !should_write_log(&create->refs->base, create->refname)) return 0; - fill_reftable_log_record(&log); + committer_info = git_committer_info(0); + if (split_ident_line(&committer_ident, committer_info, strlen(committer_info))) + BUG("failed splitting committer info"); + + fill_reftable_log_record(&log, &committer_ident); log.refname = xstrdup(create->refname); log.update_index = ts; log.value.update.message = xstrndup(create->logmsg, @@ -1339,10 +1349,16 @@ static int write_copy_table(struct reftable_writer *writer, void *cb_data) struct reftable_log_record old_log = {0}, *logs = NULL; struct reftable_iterator it = {0}; struct string_list skip = STRING_LIST_INIT_NODUP; + struct ident_split committer_ident = {0}; struct strbuf errbuf = STRBUF_INIT; size_t logs_nr = 0, logs_alloc = 0, i; + const char *committer_info; int ret; + committer_info = git_committer_info(0); + if (split_ident_line(&committer_ident, committer_info, strlen(committer_info))) + BUG("failed splitting committer info"); + if (reftable_stack_read_ref(arg->stack, arg->oldname, &old_ref)) { ret = error(_("refname %s not found"), arg->oldname); goto done; @@ -1417,7 +1433,7 @@ static int write_copy_table(struct reftable_writer *writer, void *cb_data) ALLOC_GROW(logs, logs_nr + 1, logs_alloc); memset(&logs[logs_nr], 0, sizeof(logs[logs_nr])); - fill_reftable_log_record(&logs[logs_nr]); + fill_reftable_log_record(&logs[logs_nr], &committer_ident); logs[logs_nr].refname = (char *)arg->newname; logs[logs_nr].update_index = deletion_ts; logs[logs_nr].value.update.message = @@ -1449,7 +1465,7 @@ static int write_copy_table(struct reftable_writer *writer, void *cb_data) */ ALLOC_GROW(logs, logs_nr + 1, logs_alloc); memset(&logs[logs_nr], 0, sizeof(logs[logs_nr])); - fill_reftable_log_record(&logs[logs_nr]); + fill_reftable_log_record(&logs[logs_nr], &committer_ident); logs[logs_nr].refname = (char *)arg->newname; logs[logs_nr].update_index = creation_ts; logs[logs_nr].value.update.message = From patchwork Mon Apr 8 12:24:16 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13621071 Received: from fhigh6-smtp.messagingengine.com (fhigh6-smtp.messagingengine.com [103.168.172.157]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 989EA7B3EB for ; Mon, 8 Apr 2024 12:24:20 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=103.168.172.157 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712579062; cv=none; b=dPCY7UplvTozc+5+7AWk9QhU+H+vBjuSTyoJW3VChkDkjlRAgoMoVUC/rzxqpBAuD97vj5yZQthIpiPGCTsY+IrdAS+ReAV7Xr/DiowTbeL1fdPbXr1+8y316/Y2xmAabyXA1HPJTZuImG1rOY2All4arbVqTwnqRaMQs8Qt7SU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712579062; c=relaxed/simple; bh=P20/KeVieAHPSyhApAVgZGqz2EEikN3uMe+hXD1H9Ow=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=BjYRJiYn51CZsWJyEj2oUNyUNNF6tYhJ6o4JRxY4srLAdNcFNAU+g5hOtUGmMi5VVRrcauvgIIMdhULPCL1OsbK55f3/kPK6Zcw4Oe6JPn+SK1IZB6vvuVFH742YwAcNORi+r5opz+lz+IGsP+UEozB7vRQEGMVP8vV5CGpAxok= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im; spf=pass smtp.mailfrom=pks.im; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b=FUiep67z; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=E5o2JYuc; arc=none smtp.client-ip=103.168.172.157 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="FUiep67z"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="E5o2JYuc" Received: from compute2.internal (compute2.nyi.internal [10.202.2.46]) by mailfhigh.nyi.internal (Postfix) with ESMTP id AAAC01140118; Mon, 8 Apr 2024 08:24:19 -0400 (EDT) Received: from mailfrontend2 ([10.202.2.163]) by compute2.internal (MEProxy); Mon, 08 Apr 2024 08:24:19 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc:cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm2; t=1712579059; x=1712665459; bh=RYdmMcH0l9 DAgvcx45US1gMYgfiMKVsYfGmgWtM9TzE=; b=FUiep67zLwmPF0QtUSTZIiNyaW amuKP9DQyTzT0r9PHrTEwtNw+ZPDgDg3Hlwb+PkYN1/yLak1Q+FYGRBooJVMfMah AOHjTwczLfvKxh9GUc8ibTxbSSq8AxuD908j54fhhrCdpepCk9a8n5wWLfaCJGas vBeLnHfSPM6KYp0X0b2lbKbrDr0LLoXG8lY6Dxxq1CD+791LAKRGFaTNYuLjSAZv quhGODi4x6MZUU+PSeNWIp7QeliHwLqY9gYCoAn3B4X4HOxll3lv6N+pAkEZRuiz RQpkTGP7eobHF7sfco60rvK4RUEwgNQaE0Jtxvrs+pyGYU+jh/nwJdwhbUuw== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; t=1712579059; x=1712665459; bh=RYdmMcH0l9DAgvcx45US1gMYgfiM KVsYfGmgWtM9TzE=; b=E5o2JYuc8yZT+sHiOdX+UhYbRIetdo6Cpmq/8Gi81pDs VOpWLjJR/+TujhgR+IF7LihemC5+NOG33Jk+gM3A/6qRKa0QGRbfAR0pRM/2o3DE UKND3H7/YRK6bavRqgFoPuNHSSN855ijuR4dL+Y986ggcEM3lxExecUsRDJmHqYA bkVF1LgUSKBhiIIfuTWtJAUbqkb8pCzK3RIl4DHYYiL67eWaYtlG9c0ufu7C0Ihl tXqQrDRmqo/accCzwLMfPBDzyf1YWbaKB0+TZi3h/YeSSTp31wee/bXkSvB1OmF+ rNu52DmGvpOhQzZbtvnToyoKPM6URSN3T+LBMD6xvQ== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvledrudegiedgheduucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucenucfjughrpeffhffvvefukfhfgggtuggjsehgtd erredttddvnecuhfhrohhmpefrrghtrhhitghkucfuthgvihhnhhgrrhguthcuoehpshes phhkshdrihhmqeenucggtffrrghtthgvrhhnpeeukedtvedtffevleejtefgheehieegke eluddvfeefgeehgfeltddtheejleffteenucevlhhushhtvghrufhiiigvpedunecurfgr rhgrmhepmhgrihhlfhhrohhmpehpshesphhkshdrihhm X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Mon, 8 Apr 2024 08:24:18 -0400 (EDT) Received: by localhost (OpenSMTPD) with ESMTPSA id 8078336e (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Mon, 8 Apr 2024 12:24:12 +0000 (UTC) Date: Mon, 8 Apr 2024 14:24:16 +0200 From: Patrick Steinhardt To: git@vger.kernel.org Cc: Junio C Hamano , Han-Wen Nienhuys Subject: [PATCH v3 06/11] reftable/writer: refactorings for `writer_add_record()` Message-ID: References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Large parts of the reftable library do not conform to Git's typical code style. Refactor `writer_add_record()` such that it conforms better to it and add some documentation that explains some of its more intricate behaviour. Signed-off-by: Patrick Steinhardt --- reftable/writer.c | 38 +++++++++++++++++++++++++++----------- 1 file changed, 27 insertions(+), 11 deletions(-) diff --git a/reftable/writer.c b/reftable/writer.c index 1d9ff0fbfa..0ad5eb8887 100644 --- a/reftable/writer.c +++ b/reftable/writer.c @@ -209,7 +209,8 @@ static int writer_add_record(struct reftable_writer *w, struct reftable_record *rec) { struct strbuf key = STRBUF_INIT; - int err = -1; + int err; + reftable_record_key(rec, &key); if (strbuf_cmp(&w->last_key, &key) >= 0) { err = REFTABLE_API_ERROR; @@ -218,27 +219,42 @@ static int writer_add_record(struct reftable_writer *w, strbuf_reset(&w->last_key); strbuf_addbuf(&w->last_key, &key); - if (!w->block_writer) { + if (!w->block_writer) writer_reinit_block_writer(w, reftable_record_type(rec)); - } - assert(block_writer_type(w->block_writer) == reftable_record_type(rec)); + if (block_writer_type(w->block_writer) != reftable_record_type(rec)) + BUG("record of type %d added to writer of type %d", + reftable_record_type(rec), block_writer_type(w->block_writer)); - if (block_writer_add(w->block_writer, rec) == 0) { + /* + * Try to add the record to the writer. If this succeeds then we're + * done. Otherwise the block writer may have hit the block size limit + * and needs to be flushed. + */ + if (!block_writer_add(w->block_writer, rec)) { err = 0; goto done; } + /* + * The current block is full, so we need to flush and reinitialize the + * writer to start writing the next block. + */ err = writer_flush_block(w); - if (err < 0) { + if (err < 0) goto done; - } - writer_reinit_block_writer(w, reftable_record_type(rec)); + + /* + * Try to add the record to the writer again. If this still fails then + * the record does not fit into the block size. + * + * TODO: it would be great to have `block_writer_add()` return proper + * error codes so that we don't have to second-guess the failure + * mode here. + */ err = block_writer_add(w->block_writer, rec); - if (err == -1) { - /* we are writing into memory, so an error can only mean it - * doesn't fit. */ + if (err) { err = REFTABLE_ENTRY_TOO_BIG_ERROR; goto done; } From patchwork Mon Apr 8 12:24:20 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13621072 Received: from fhigh6-smtp.messagingengine.com (fhigh6-smtp.messagingengine.com [103.168.172.157]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CB7486E61B for ; Mon, 8 Apr 2024 12:24:24 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=103.168.172.157 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712579066; cv=none; b=cr4sLwI5v8XKafKbl2L+Giaek5Gd0Nz2I+MYTQZFfpzuYA8tnZFnyXg85bbPVIEDbd9tr550Pa+pbZ742H3WMgrjsJuOoWaRJurXdb2GRaMN+nKlXG5n/HFmkteamaVoD7PDa0a4x7dvLwP3f9ZXx8X7TuRLqN0n5f+2+IpsIoM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712579066; c=relaxed/simple; bh=uegcfR614/E+bVvj4mocBYU+y1tmIt+NxF/gibYhXBI=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=oloIbm7HHlGJUUdCLvDnnnBnI9e9bTiUqYSK8POK+oMisoLwy6C/eFM0U6qEMZDnFcEZP+1wzyl4t4D9Gd/dw5cqkJ2itFtkAGXs/SxaISLh6X13qNBAwKBbhhe/zt5CGicFNlKpluWKPE4WgknQAIS9TeKpKphaCZZNfvbFWTM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im; spf=pass smtp.mailfrom=pks.im; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b=Sop9FdP5; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=OesvhYR6; arc=none smtp.client-ip=103.168.172.157 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="Sop9FdP5"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="OesvhYR6" Received: from compute2.internal (compute2.nyi.internal [10.202.2.46]) by mailfhigh.nyi.internal (Postfix) with ESMTP id D4F3A1140121; Mon, 8 Apr 2024 08:24:23 -0400 (EDT) Received: from mailfrontend2 ([10.202.2.163]) by compute2.internal (MEProxy); Mon, 08 Apr 2024 08:24:23 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc:cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm2; t=1712579063; x=1712665463; bh=pUUn7C4MOC YYeERrdAVCms0J6BU7raCL+zMX9QM+5jA=; b=Sop9FdP5wNIpnxwK8DH/Tzy0s0 DujKHLqne7pGI2xvtlc6psNiPEM1lxcce+T1oT7ef4pUcmeLNKqkAgz4YTT+XYLm rOWS6+VWi6EDnZl8GsdftKEyPTBU5hlC6Ui1FEwqltTa6Un2oBwNKK1LeCr9NRYz jUsG3kuqhOLY2kwU0MSJXW+c3RiByyWyTx59HX6KsAzdG11KXi5O59X408vcuynd hVN4Rz+9AQbQ4wznnW3laT65RAbJwiRY/Vql/YoUGGZW4XQs/RkjZPRWSpPwzv4w i51RvuJaXE1VDyOmLJgHc86mR6Exar0/0wGEZu/JZJlPrVbSNuLn/xoVaTYA== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; t=1712579063; x=1712665463; bh=pUUn7C4MOCYYeERrdAVCms0J6BU7 raCL+zMX9QM+5jA=; b=OesvhYR6j7PNIf5sMj08Du5UKzscopC2lluLJ6Aomh0t S1B+I52SfrJTNAhC70k5UuV5esFxQbkjhkfo6ioRb+j3yYdbPDDG8HAPocnApdOb VBo3Lu6+yNo6alE1hszWHTIjceq9WKFqCBKYiRqjHUEZF1XkmPWseXBWunoKNAj0 9AW+/6pMAJi8ta3TitGuLyXd64iRJoGlCI0l2j8DoxmPhLZUooIwFQT+HnVrB060 svtCfp+TTjefBux12tiEJWvWTXixEwJexKmeYWSo38TwlJs6St1LlD/hCcDXEwSk H0UNWIbYhJ6cwfpD2dNgRVXvLvw1XMkRo5Koo/YFPg== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvledrudegiedgheduucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucenucfjughrpeffhffvvefukfhfgggtuggjsehgtd erredttddvnecuhfhrohhmpefrrghtrhhitghkucfuthgvihhnhhgrrhguthcuoehpshes phhkshdrihhmqeenucggtffrrghtthgvrhhnpeeukedtvedtffevleejtefgheehieegke eluddvfeefgeehgfeltddtheejleffteenucevlhhushhtvghrufhiiigvpedvnecurfgr rhgrmhepmhgrihhlfhhrohhmpehpshesphhkshdrihhm X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Mon, 8 Apr 2024 08:24:22 -0400 (EDT) Received: by localhost (OpenSMTPD) with ESMTPSA id 6f892efb (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Mon, 8 Apr 2024 12:24:17 +0000 (UTC) Date: Mon, 8 Apr 2024 14:24:20 +0200 From: Patrick Steinhardt To: git@vger.kernel.org Cc: Junio C Hamano , Han-Wen Nienhuys Subject: [PATCH v3 07/11] reftable/writer: refactorings for `writer_flush_nonempty_block()` Message-ID: References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Large parts of the reftable library do not conform to Git's typical code style. Refactor `writer_flush_nonempty_block()` such that it conforms better to it and add some documentation that explains some of its more intricate behaviour. Signed-off-by: Patrick Steinhardt --- reftable/writer.c | 72 +++++++++++++++++++++++++++++------------------ 1 file changed, 44 insertions(+), 28 deletions(-) diff --git a/reftable/writer.c b/reftable/writer.c index 0ad5eb8887..d347ec4cc6 100644 --- a/reftable/writer.c +++ b/reftable/writer.c @@ -659,58 +659,74 @@ static void writer_clear_index(struct reftable_writer *w) w->index_cap = 0; } -static const int debug = 0; - static int writer_flush_nonempty_block(struct reftable_writer *w) { + struct reftable_index_record index_record = { + .last_key = STRBUF_INIT, + }; uint8_t typ = block_writer_type(w->block_writer); - struct reftable_block_stats *bstats = - writer_reftable_block_stats(w, typ); - uint64_t block_typ_off = (bstats->blocks == 0) ? w->next : 0; - int raw_bytes = block_writer_finish(w->block_writer); - int padding = 0; - int err = 0; - struct reftable_index_record ir = { .last_key = STRBUF_INIT }; + struct reftable_block_stats *bstats; + int raw_bytes, padding = 0, err; + uint64_t block_typ_off; + + /* + * Finish the current block. This will cause the block writer to emit + * restart points and potentially compress records in case we are + * writing a log block. + * + * Note that this is still happening in memory. + */ + raw_bytes = block_writer_finish(w->block_writer); if (raw_bytes < 0) return raw_bytes; - if (!w->opts.unpadded && typ != BLOCK_TYPE_LOG) { + /* + * By default, all records except for log records are padded to the + * block size. + */ + if (!w->opts.unpadded && typ != BLOCK_TYPE_LOG) padding = w->opts.block_size - raw_bytes; - } - if (block_typ_off > 0) { + bstats = writer_reftable_block_stats(w, typ); + block_typ_off = (bstats->blocks == 0) ? w->next : 0; + if (block_typ_off > 0) bstats->offset = block_typ_off; - } - bstats->entries += w->block_writer->entries; bstats->restarts += w->block_writer->restart_len; bstats->blocks++; w->stats.blocks++; - if (debug) { - fprintf(stderr, "block %c off %" PRIu64 " sz %d (%d)\n", typ, - w->next, raw_bytes, - get_be24(w->block + w->block_writer->header_off + 1)); - } - - if (w->next == 0) { + /* + * If this is the first block we're writing to the table then we need + * to also write the reftable header. + */ + if (!w->next) writer_write_header(w, w->block); - } err = padded_write(w, w->block, raw_bytes, padding); if (err < 0) return err; + /* + * Add an index record for every block that we're writing. If we end up + * having more than a threshold of index records we will end up writing + * an index section in `writer_finish_section()`. Each index record + * contains the last record key of the block it is indexing as well as + * the offset of that block. + * + * Note that this also applies when flushing index blocks, in which + * case we will end up with a multi-level index. + */ REFTABLE_ALLOC_GROW(w->index, w->index_len + 1, w->index_cap); - - ir.offset = w->next; - strbuf_reset(&ir.last_key); - strbuf_addbuf(&ir.last_key, &w->block_writer->last_key); - w->index[w->index_len] = ir; - + index_record.offset = w->next; + strbuf_reset(&index_record.last_key); + strbuf_addbuf(&index_record.last_key, &w->block_writer->last_key); + w->index[w->index_len] = index_record; w->index_len++; + w->next += padding + raw_bytes; w->block_writer = NULL; + return 0; } From patchwork Mon Apr 8 12:24:25 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13621073 Received: from fhigh6-smtp.messagingengine.com (fhigh6-smtp.messagingengine.com [103.168.172.157]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CEFDA6A8D2 for ; Mon, 8 Apr 2024 12:24:29 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=103.168.172.157 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712579071; cv=none; b=mN2OIMV2WnQwAJQ8haaf7rY8Jk1FW2tWuIpHbIXOfNbrFBdNTWQn5yR4qmlCy+vZ+vgSLf26BVq8+LEFIg7z0pDh9/sWgI7qQGrjIECgQ8lne0qiKTFABfTOsonvy3pI+5NDwq4IM5lZlPt6kwWhrIaxCHDf9zy4kgv9I54V1WI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712579071; c=relaxed/simple; bh=BiBspI9+mrc6TFPaES0CgNKrmc+90BP7R/kgywU2R+s=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=fgByTIXoo0shis1xEhVLwdq3WSbYsKSwIuENPzCGkaOTG1RbHpSrY/zLqcGDdLST5b5HMPU2+hNvfSrlxTntHxClZv6mkRYa1nycoYMfjFFhAwnITAQjjcgfMVFuz/OwmlErPbSVC042V9Ho2RXSqVCWoGpARH7qWVX2Ni+bBkY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im; spf=pass smtp.mailfrom=pks.im; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b=Ojkj3VHZ; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=nWlLplmp; arc=none smtp.client-ip=103.168.172.157 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="Ojkj3VHZ"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="nWlLplmp" Received: from compute7.internal (compute7.nyi.internal [10.202.2.48]) by mailfhigh.nyi.internal (Postfix) with ESMTP id EE374114011E; Mon, 8 Apr 2024 08:24:28 -0400 (EDT) Received: from mailfrontend2 ([10.202.2.163]) by compute7.internal (MEProxy); Mon, 08 Apr 2024 08:24:28 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc:cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm2; t=1712579068; x=1712665468; bh=27ye/T69+A 69LkuZbKou1uiTrdEOlDCBWBk9lXGjAiA=; b=Ojkj3VHZuM4ZznINKhiQ/nD5Kk YjAbdW/XGNVfNCWrJtLVy7OWtkMCuDzwGtxRNcVb8kwz1ORq7RzO8maK1bC+AIua 1t7JoEw1Irojbw5aJtmTLt0WMmoznH71NlgE33zVmftnlC9BcRnlj1z16JyRghyT TX5qL0+8mQHXwRzSGeiXN4jYzm1hZQVwCmluy+StUnkAyqBc9gbLBAKT8q6Q3oVQ QmRbIMByxUfFvcapTAZK/OLqGahv/KjVu0QYNc4UoDo3vryYxNIBXJOOW2+VJium srAwy7SajU6Q/pG29fGwXMiex5AGrP7R8M7HW6j1gHWt2YtEPPSDQZAz1M7g== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; t=1712579068; x=1712665468; bh=27ye/T69+A69LkuZbKou1uiTrdEO lDCBWBk9lXGjAiA=; b=nWlLplmprMTeuHdh+8Fu5eYnpr/akjBVenU3i78Ip711 K08xpMue9iV2Xlp9xdeli/uRsEyT5BOZrmgBsSbje9YL3GrQw2H7EN5nPjvF0acr 1bQQljLD0uyTl3q2cIl9WI++aLLtN741gYD7KYl6DCoGJHF+VduhZ7cmdQ7a3HNl n0WtFkioibHEJkPJBJJF7ze5b2ySjaCiOCDk6/uKms9N3bxHUpJ7CG2PMnTD+fBs HZcJZsFoaKXWYPe+B4GOvu/Ra/vVQ2inL3P83Bfc7LAPdsfn8zNNJUQH1zE9AjzB PED1RC7R6VvH6jtlctCIuuVR/LHz5sTpiA5gwltKIw== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvledrudegiedgheduucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucenucfjughrpeffhffvvefukfhfgggtuggjsehgtd erredttddvnecuhfhrohhmpefrrghtrhhitghkucfuthgvihhnhhgrrhguthcuoehpshes phhkshdrihhmqeenucggtffrrghtthgvrhhnpeeukedtvedtffevleejtefgheehieegke eluddvfeefgeehgfeltddtheejleffteenucevlhhushhtvghrufhiiigvpedtnecurfgr rhgrmhepmhgrihhlfhhrohhmpehpshesphhkshdrihhm X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Mon, 8 Apr 2024 08:24:27 -0400 (EDT) Received: by localhost (OpenSMTPD) with ESMTPSA id 77064373 (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Mon, 8 Apr 2024 12:24:21 +0000 (UTC) Date: Mon, 8 Apr 2024 14:24:25 +0200 From: Patrick Steinhardt To: git@vger.kernel.org Cc: Junio C Hamano , Han-Wen Nienhuys Subject: [PATCH v3 08/11] reftable/writer: unify releasing memory Message-ID: <76d4a1f73b10c0a62707abea80f38fe18f4b8e7b.1712578837.git.ps@pks.im> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: There are two code paths which release memory of the reftable writer: - `reftable_writer_close()` releases internal state after it has written data. - `reftable_writer_free()` releases the block that was written to and the writer itself. Both code paths free different parts of the writer, and consequently the caller must make sure to call both. And while callers mostly do this already, this falls apart when a write failure causes the caller to skip calling `reftable_write_close()`. Introduce a new function `reftable_writer_release()` that releases all internal state and call it from both paths. Like this it is fine for the caller to not call `reftable_writer_close()`. Signed-off-by: Patrick Steinhardt --- reftable/writer.c | 23 +++++++++++++++-------- 1 file changed, 15 insertions(+), 8 deletions(-) diff --git a/reftable/writer.c b/reftable/writer.c index d347ec4cc6..4eeb736445 100644 --- a/reftable/writer.c +++ b/reftable/writer.c @@ -149,11 +149,21 @@ void reftable_writer_set_limits(struct reftable_writer *w, uint64_t min, w->max_update_index = max; } +static void writer_release(struct reftable_writer *w) +{ + if (w) { + reftable_free(w->block); + w->block = NULL; + block_writer_release(&w->block_writer_data); + w->block_writer = NULL; + writer_clear_index(w); + strbuf_release(&w->last_key); + } +} + void reftable_writer_free(struct reftable_writer *w) { - if (!w) - return; - reftable_free(w->block); + writer_release(w); reftable_free(w); } @@ -643,16 +653,13 @@ int reftable_writer_close(struct reftable_writer *w) } done: - /* free up memory. */ - block_writer_release(&w->block_writer_data); - writer_clear_index(w); - strbuf_release(&w->last_key); + writer_release(w); return err; } static void writer_clear_index(struct reftable_writer *w) { - for (size_t i = 0; i < w->index_len; i++) + for (size_t i = 0; w->index && i < w->index_len; i++) strbuf_release(&w->index[i].last_key); FREE_AND_NULL(w->index); w->index_len = 0; From patchwork Mon Apr 8 12:24:30 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13621074 Received: from fout3-smtp.messagingengine.com (fout3-smtp.messagingengine.com [103.168.172.146]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 065206F07E for ; Mon, 8 Apr 2024 12:24:34 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=103.168.172.146 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712579076; cv=none; b=tQ/qmRd1EZzKu91xyhjHqQ8Wl/7gws6hAaSxKB0ffJ2RtCA6motxpPV/uqqGo/zsnyV+tKAD9S8c5qVknw5p4RsP7nGZMRp9ClpLnBecTWm1757CsJInM14hPdTf0cManqDT9vm2FSeYjLCk8IeFwIUnjbWSi05SrHMH0Hs8oW0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712579076; c=relaxed/simple; bh=AO5CrgfjljE4bryFD0f5JDpPpjfmEkyjdnqHtqfWWow=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=kOAYSLMfWz+7UJlIxsP40K6B1WzdHDjBKjItyRMyfXheTZenB1T82EwCE3NbjBFZyo+90MEWulE78HEr46r++cHpyVfVApRyeTJ+EgraYg+smKO28JVQbBiHTv6AuBjQQ8I03yPqwwp1Kk7Nq2YMsKY84fjcuYOf530rNuzX7pI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im; spf=pass smtp.mailfrom=pks.im; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b=VCQXthNM; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=I1Yw12KQ; arc=none smtp.client-ip=103.168.172.146 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="VCQXthNM"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="I1Yw12KQ" Received: from compute5.internal (compute5.nyi.internal [10.202.2.45]) by mailfout.nyi.internal (Postfix) with ESMTP id 1A4951380108; Mon, 8 Apr 2024 08:24:34 -0400 (EDT) Received: from mailfrontend2 ([10.202.2.163]) by compute5.internal (MEProxy); Mon, 08 Apr 2024 08:24:34 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc:cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm2; t=1712579074; x=1712665474; bh=NNdXD3mgba Vw2Y+KS7JZpW31ipNkI+XAR16SwSi+EDY=; b=VCQXthNMww1bcNk//XrEwmiHJ8 +Mf184kI4HhPhf78Go7pAEKbdEkGkQTfUbt8bEmfi4W1NubG4rEJyKgwnlXqNHh4 v8QK/ocwXYSh8De62TiAJo3+5i50x/6rVFWx0YAZlmAzfolOA3uzM5kcvpRG0kFZ Rt8kF5OBaHWPn78MuhHV+qdxtAfBNYUXYln1eRd1kijL3zSjQ/vIzIUQdIXlg/IJ DfPdLjr3MO76D9x+KKT3FztPcIFce5WPCrAzZQoz/oWMc0OfgqSntEWvvnMUs/OE nh0QDYPT3bS3X38aFfsul5RLala2YfLiMPlMgGJBPWol/drgqU6wcuQA/wmQ== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; t=1712579074; x=1712665474; bh=NNdXD3mgbaVw2Y+KS7JZpW31ipNk I+XAR16SwSi+EDY=; b=I1Yw12KQqa/zcVgVBx7K/NCtMcPlDYv+EbLxZfo+rB08 hR49IAgnaEC0r0KcRFk+W7dOcHZ0b6+N79IhSrUarVq6ai31bh4tQIXK1xHpiurC 8a3XS83BDFPWMOxHYmiOW6I5opI5IrAur9UOPsWeuzX1kQG4IdY/KntLrmfwXjY1 SxjbV839Fl9qoDPAGZZEPw8maJ4C7V2K+DwnDYrSXMp8m1AwzvuofX4nEkJCoVH6 3cK6Z4dfMrzm2SFOCYHRvv8okPAyi35p21cfy6/kIcvcpGGLuIcrYxjbcGjecb4x DmRavHmGBC6epyxoRQKXDm+ZaVNHeF8xu1Nx3RGJsA== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvledrudegiedghedvucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucenucfjughrpeffhffvvefukfhfgggtuggjsehgtd erredttddvnecuhfhrohhmpefrrghtrhhitghkucfuthgvihhnhhgrrhguthcuoehpshes phhkshdrihhmqeenucggtffrrghtthgvrhhnpeeukedtvedtffevleejtefgheehieegke eluddvfeefgeehgfeltddtheejleffteenucevlhhushhtvghrufhiiigvpedtnecurfgr rhgrmhepmhgrihhlfhhrohhmpehpshesphhkshdrihhm X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Mon, 8 Apr 2024 08:24:33 -0400 (EDT) Received: by localhost (OpenSMTPD) with ESMTPSA id effdc585 (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Mon, 8 Apr 2024 12:24:26 +0000 (UTC) Date: Mon, 8 Apr 2024 14:24:30 +0200 From: Patrick Steinhardt To: git@vger.kernel.org Cc: Junio C Hamano , Han-Wen Nienhuys Subject: [PATCH v3 09/11] reftable/writer: reset `last_key` instead of releasing it Message-ID: <722ab0ee281060ad43882471d472e31fc066a339.1712578837.git.ps@pks.im> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: The reftable writer tracks the last key that it has written so that it can properly compute the compressed prefix for the next record it is about to write. This last key must be reset whenever we move on to write the next block, which is done in `writer_reinit_block_writer()`. We do this by calling `strbuf_release()` though, which needlessly deallocates the underlying buffer. Convert the code to use `strbuf_reset()` instead, which saves one allocation per block we're about to write. This requires us to also amend `reftable_writer_free()` to release the buffer's memory now as we previously seemingly relied on `writer_reinit_block_writer()` to release the memory for us. Releasing memory here is the right thing to do anyway. While at it, convert a callsite where we truncate the buffer by setting its length to zero to instead use `strbuf_reset()`, too. Signed-off-by: Patrick Steinhardt --- reftable/writer.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/reftable/writer.c b/reftable/writer.c index 4eeb736445..10eccaaa07 100644 --- a/reftable/writer.c +++ b/reftable/writer.c @@ -109,7 +109,7 @@ static void writer_reinit_block_writer(struct reftable_writer *w, uint8_t typ) block_start = header_size(writer_version(w)); } - strbuf_release(&w->last_key); + strbuf_reset(&w->last_key); block_writer_init(&w->block_writer_data, typ, w->block, w->opts.block_size, block_start, hash_size(w->opts.hash_id)); @@ -478,7 +478,7 @@ static int writer_finish_section(struct reftable_writer *w) bstats->max_index_level = max_level; /* Reinit lastKey, as the next section can start with any key. */ - w->last_key.len = 0; + strbuf_reset(&w->last_key); return 0; } From patchwork Mon Apr 8 12:24:35 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13621075 Received: from fout3-smtp.messagingengine.com (fout3-smtp.messagingengine.com [103.168.172.146]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4415C6F08B for ; Mon, 8 Apr 2024 12:24:40 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=103.168.172.146 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712579081; cv=none; b=KMcyyJNs+1YNWyt7TL1GXtwcIuJmf28sGxa4Z9L/bV+eTTYPfBuLuECmTYvpaKHe76qKF71CgnpuD9ZE7HeA3HeRumigg8m6XJWEoiFLv84FPDJdViQI1a+edBzDtFQPNTIxol13KAbF+qBmnIaYI3Y+f/IIcB7Tg04/NjgIX88= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712579081; c=relaxed/simple; bh=0seTZIEkQNX0xnpW/2R8vF4WeAs4mo3LNMBfOqqUoQc=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=sNQdbxfNWJLsFG8aCWgbP1/SijiGFr6+asH62Mh9whgH+me4igpynNevHJYxBrjnqwTplMgV3A8cQuTM3L4FTThhFpWvOSVAGsuwb/7ALwncgQ53pdhZOnIa38nQYe8yMUv1R14AcfybhPJXN7Rf6AIr1URHsLHD6dP5qXWp7DI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im; spf=pass smtp.mailfrom=pks.im; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b=XPBfg2cR; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=roMoE4Vg; arc=none smtp.client-ip=103.168.172.146 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="XPBfg2cR"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="roMoE4Vg" Received: from compute5.internal (compute5.nyi.internal [10.202.2.45]) by mailfout.nyi.internal (Postfix) with ESMTP id 3987E13800D8; Mon, 8 Apr 2024 08:24:39 -0400 (EDT) Received: from mailfrontend2 ([10.202.2.163]) by compute5.internal (MEProxy); Mon, 08 Apr 2024 08:24:39 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc:cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm2; t=1712579079; x=1712665479; bh=FyYqKNeGmX 3sV0fNB4gMa3tfUfwXDLWIqEZlYfONt6A=; b=XPBfg2cRIMjjROLimV2+TajCbv yZ6GDudWmLRD5GsgmmaKtwB7PkTiXb+AYnvcCmsUXauHU4T1rASeUzAIVmmbQ5q+ MxPJuXhzblvIzk/moshoQDDLDPinKm7dbRqbDS8ytN+DYOpdA5iIpbUYpHM46+F0 5dPWrSGvPnxInFFbiyUuuvwU7RiiJNXvTcTuHDO/Q/y1PnGca9FOiGKpik8PWpK6 vSFlfV2iSBOiX21S0NAshgCeiX9ZgVDPvCNziOT1HiQa9GFf93bsPYn7+4GVlF5O EALKb40Cq2ISjfKqDX+wrEp99oxV4fX0Jc+JvZslla11xXRkbwbX6z+oQL6Q== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; t=1712579079; x=1712665479; bh=FyYqKNeGmX3sV0fNB4gMa3tfUfwX DLWIqEZlYfONt6A=; b=roMoE4Vg1Ejrcf4e0eWZvKX6OQI7biDBM0CEvXWp42N7 WICbg+jnbBYQOnN1iDClxoi/i3wPY/4CdUMjeTqlR7q1LTMgXs9WE5zW9JlZUWzi w9We0UFKZdubaS/tjx6+1SR8xfkB7JEU35vXemBpc+BtaalSepdhsQd2GBB96yUe s84WNc/pLiYXHiIJSa6QO+wLWIDhf6YYy3lgj1kgJxsVPrg+jELQ9tcLIVaXtTkd YhuzlY7+dDdfBhp2cMO4Nz/dSxobHStAZEbbJ+FQKHrn/2+xflGggwC665LUwsMw 6XCEvZb49Tj3Qu2Q1EQxnHLs5HwxkLfxEMF0qVNo6w== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvledrudegiedghedvucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucenucfjughrpeffhffvvefukfhfgggtuggjsehgtd erredttddvnecuhfhrohhmpefrrghtrhhitghkucfuthgvihhnhhgrrhguthcuoehpshes phhkshdrihhmqeenucggtffrrghtthgvrhhnpeeukedtvedtffevleejtefgheehieegke eluddvfeefgeehgfeltddtheejleffteenucevlhhushhtvghrufhiiigvpedunecurfgr rhgrmhepmhgrihhlfhhrohhmpehpshesphhkshdrihhm X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Mon, 8 Apr 2024 08:24:38 -0400 (EDT) Received: by localhost (OpenSMTPD) with ESMTPSA id 96f87336 (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Mon, 8 Apr 2024 12:24:32 +0000 (UTC) Date: Mon, 8 Apr 2024 14:24:35 +0200 From: Patrick Steinhardt To: git@vger.kernel.org Cc: Junio C Hamano , Han-Wen Nienhuys Subject: [PATCH v3 10/11] reftable/block: reuse zstream when writing log blocks Message-ID: <962a96003b02ba21511da8ea820bbada2a767468.1712578837.git.ps@pks.im> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: While most reftable blocks are written to disk as-is, blocks for log records are compressed with zlib. To compress them we use `compress2()`, which is a simple wrapper around the more complex `zstream` interface that would require multiple function invocations. One downside of this interface is that `compress2()` will reallocate internal state of the `zstream` interface on every single invocation. Consequently, as we call `compress2()` for every single log block which we are about to write, this can lead to quite some memory allocation churn. Refactor the code so that the block writer reuses a `zstream`. This significantly reduces the number of bytes allocated when writing many refs in a single transaction, as demonstrated by the following benchmark that writes 100k refs in a single transaction. Before: HEAP SUMMARY: in use at exit: 671,931 bytes in 151 blocks total heap usage: 22,631,887 allocs, 22,631,736 frees, 1,854,670,793 bytes allocated After: HEAP SUMMARY: in use at exit: 671,931 bytes in 151 blocks total heap usage: 22,620,528 allocs, 22,620,377 frees, 1,245,549,984 bytes allocated Signed-off-by: Patrick Steinhardt --- reftable/block.c | 80 +++++++++++++++++++++++++++++++----------------- reftable/block.h | 1 + 2 files changed, 53 insertions(+), 28 deletions(-) diff --git a/reftable/block.c b/reftable/block.c index 298e8c56b9..d182561b4d 100644 --- a/reftable/block.c +++ b/reftable/block.c @@ -76,6 +76,10 @@ void block_writer_init(struct block_writer *bw, uint8_t typ, uint8_t *buf, bw->entries = 0; bw->restart_len = 0; bw->last_key.len = 0; + if (!bw->zstream) { + REFTABLE_CALLOC_ARRAY(bw->zstream, 1); + deflateInit(bw->zstream, 9); + } } uint8_t block_writer_type(struct block_writer *bw) @@ -139,39 +143,57 @@ int block_writer_finish(struct block_writer *w) w->next += 2; put_be24(w->buf + 1 + w->header_off, w->next); + /* + * Log records are stored zlib-compressed. Note that the compression + * also spans over the restart points we have just written. + */ if (block_writer_type(w) == BLOCK_TYPE_LOG) { int block_header_skip = 4 + w->header_off; - uLongf src_len = w->next - block_header_skip; - uLongf dest_cap = src_len * 1.001 + 12; - uint8_t *compressed; - - REFTABLE_ALLOC_ARRAY(compressed, dest_cap); - - while (1) { - uLongf out_dest_len = dest_cap; - int zresult = compress2(compressed, &out_dest_len, - w->buf + block_header_skip, - src_len, 9); - if (zresult == Z_BUF_ERROR && dest_cap < LONG_MAX) { - dest_cap *= 2; - compressed = - reftable_realloc(compressed, dest_cap); - if (compressed) - continue; - } - - if (Z_OK != zresult) { - reftable_free(compressed); - return REFTABLE_ZLIB_ERROR; - } - - memcpy(w->buf + block_header_skip, compressed, - out_dest_len); - w->next = out_dest_len + block_header_skip; + uLongf src_len = w->next - block_header_skip, compressed_len; + unsigned char *compressed; + int ret; + + ret = deflateReset(w->zstream); + if (ret != Z_OK) + return REFTABLE_ZLIB_ERROR; + + /* + * Precompute the upper bound of how many bytes the compressed + * data may end up with. Combined with `Z_FINISH`, `deflate()` + * is guaranteed to return `Z_STREAM_END`. + */ + compressed_len = deflateBound(w->zstream, src_len); + REFTABLE_ALLOC_ARRAY(compressed, compressed_len); + + w->zstream->next_out = compressed; + w->zstream->avail_out = compressed_len; + w->zstream->next_in = w->buf + block_header_skip; + w->zstream->avail_in = src_len; + + /* + * We want to perform all decompression in a single step, which + * is why we can pass Z_FINISH here. As we have precomputed the + * deflated buffer's size via `deflateBound()` this function is + * guaranteed to succeed according to the zlib documentation. + */ + ret = deflate(w->zstream, Z_FINISH); + if (ret != Z_STREAM_END) { reftable_free(compressed); - break; + return REFTABLE_ZLIB_ERROR; } + + /* + * Overwrite the uncompressed data we have already written and + * adjust the `next` pointer to point right after the + * compressed data. + */ + memcpy(w->buf + block_header_skip, compressed, + w->zstream->total_out); + w->next = w->zstream->total_out + block_header_skip; + + reftable_free(compressed); } + return w->next; } @@ -480,6 +502,8 @@ int block_reader_seek(struct block_reader *br, struct block_iter *it, void block_writer_release(struct block_writer *bw) { + deflateEnd(bw->zstream); + FREE_AND_NULL(bw->zstream); FREE_AND_NULL(bw->restarts); strbuf_release(&bw->last_key); /* the block is not owned. */ diff --git a/reftable/block.h b/reftable/block.h index 47acc62c0a..1375957fc8 100644 --- a/reftable/block.h +++ b/reftable/block.h @@ -18,6 +18,7 @@ license that can be found in the LICENSE file or at * allocation overhead. */ struct block_writer { + z_stream *zstream; uint8_t *buf; uint32_t block_size; From patchwork Mon Apr 8 12:24:40 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13621076 Received: from fhigh6-smtp.messagingengine.com (fhigh6-smtp.messagingengine.com [103.168.172.157]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 155436CDA6 for ; Mon, 8 Apr 2024 12:24:45 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=103.168.172.157 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712579086; cv=none; b=GM2OuiI9FDIa0sC7B6miP0b6YNjnQ4BcwZd0dqteo00Ni1CJ+mlY3LIZ4ukvNxFT0IMgG+8NbKxKg2t85jjknIAU0d4UssomiBZGH6CeRjy7BNkWUVjRli0hQUMRVHHl6jFPeVcLDuGMUJbNYXfQvb67gNQ0/oS/XuuIfCU6VXI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1712579086; c=relaxed/simple; bh=7/H0xgSFeJ4iUUksFvwoIHmoIevf6MuSGHxoU7Lfdbk=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=DshK4qA0KFKRQiXr78+Qfq9krvVVkhU/aGqfLnP4aKfXRbx280bAuBnlu5y6zJyRgxoNzndFF2RzJUh3iz+73U5J8R1D2WIYwAOPJdbbdo7QgGxTciCi/2eD1HVOtAwcW/I7I0N86O4Ig27BafbAbg1BZBNfbJJ3LJlJFCepHD0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im; spf=pass smtp.mailfrom=pks.im; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b=e9bnhuvn; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=TuI1hPl6; arc=none smtp.client-ip=103.168.172.157 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="e9bnhuvn"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="TuI1hPl6" Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailfhigh.nyi.internal (Postfix) with ESMTP id 654AC1140126; Mon, 8 Apr 2024 08:24:44 -0400 (EDT) Received: from mailfrontend2 ([10.202.2.163]) by compute4.internal (MEProxy); Mon, 08 Apr 2024 08:24:44 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc:cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm2; t=1712579084; x=1712665484; bh=VAr0clLvOF n6hHRIW2ybtPgtChyT4RNjjoRjx0LaHH0=; b=e9bnhuvnJh+k6dJYnZIjrJrJPs 4dhzCa8XaZ6DXGQO6WUVmw5e3zhifqzmH7ZRYh1mHisiosSJ+pLKfVPbChZ+nr/F iS4vCHV+H32AwFs7TIllqfNoQcMyECvxlKLq2ujhWAyTB3vE3oNrHHVrGamT3qEj IQay3zfGZDA48adOUiVsXtHlOb59Owbm2OCpG1j7UX3csenZMGZWCaZskZ/D2BNg t7Vq/yflokGg799/regUgnyW65yfS5dwHmRYsYV08b5Eu5ELXfjVfobagLCD4hW0 ow0lY3/9zgiQ6ZIAZY02dpmm26DuLqJ+v9e3Px+YEKRUuhHzs93HdWJZJyfA== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; t=1712579084; x=1712665484; bh=VAr0clLvOFn6hHRIW2ybtPgtChyT 4RNjjoRjx0LaHH0=; b=TuI1hPl6KeWolaL0CaX/tK+cm2k40h0uqe4c0+/0kNe3 XcY8RdRNYDwtI/yiY9HIk9jKVp00KJ4Ow5foQnYNm7uB2PMP1SdysihS04r2+YiC /7XwFjVmkR+f2CazSnX+HyxNz7O3jFEl91a0HPi0UTxyUtM/zPx4kVT98IAsiahX Tz4Ir++Q7/EJGIqqMG3mbSs5OIMExM4aVWACYSQTIa2RKr1ebA/ZGYzx+mj5t8GT ARClKvScCb4n33dJeMyLwzD364zIsH3UztXCT3T7Ht60eIYA+BmJ4VhFPtRLklAJ 6CJ7LUFmB83ajaKHvh8ntQ/focmBoTrVn3cQeegljw== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvledrudegiedgheduucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucenucfjughrpeffhffvvefukfhfgggtuggjsehgtd erredttddvnecuhfhrohhmpefrrghtrhhitghkucfuthgvihhnhhgrrhguthcuoehpshes phhkshdrihhmqeenucggtffrrghtthgvrhhnpeeukedtvedtffevleejtefgheehieegke eluddvfeefgeehgfeltddtheejleffteenucevlhhushhtvghrufhiiigvpedtnecurfgr rhgrmhepmhgrihhlfhhrohhmpehpshesphhkshdrihhm X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Mon, 8 Apr 2024 08:24:43 -0400 (EDT) Received: by localhost (OpenSMTPD) with ESMTPSA id 14d1959f (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Mon, 8 Apr 2024 12:24:37 +0000 (UTC) Date: Mon, 8 Apr 2024 14:24:40 +0200 From: Patrick Steinhardt To: git@vger.kernel.org Cc: Junio C Hamano , Han-Wen Nienhuys Subject: [PATCH v3 11/11] reftable/block: reuse compressed array Message-ID: <323892841a3dc31cf0d4c5e614f9f0059a147c1d.1712578837.git.ps@pks.im> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Similar to the preceding commit, let's reuse the `compressed` array that we use to store compressed data in. This results in a small reduction in memory allocations when writing many refs. Before: HEAP SUMMARY: in use at exit: 671,931 bytes in 151 blocks total heap usage: 22,620,528 allocs, 22,620,377 frees, 1,245,549,984 bytes allocated After: HEAP SUMMARY: in use at exit: 671,931 bytes in 151 blocks total heap usage: 22,618,257 allocs, 22,618,106 frees, 1,236,351,528 bytes allocated So while the reduction in allocations isn't really all that big, it's a low hanging fruit and thus there isn't much of a reason not to pick it. Signed-off-by: Patrick Steinhardt --- reftable/block.c | 14 +++++--------- reftable/block.h | 3 +++ 2 files changed, 8 insertions(+), 9 deletions(-) diff --git a/reftable/block.c b/reftable/block.c index d182561b4d..fd494876b7 100644 --- a/reftable/block.c +++ b/reftable/block.c @@ -150,7 +150,6 @@ int block_writer_finish(struct block_writer *w) if (block_writer_type(w) == BLOCK_TYPE_LOG) { int block_header_skip = 4 + w->header_off; uLongf src_len = w->next - block_header_skip, compressed_len; - unsigned char *compressed; int ret; ret = deflateReset(w->zstream); @@ -163,9 +162,9 @@ int block_writer_finish(struct block_writer *w) * is guaranteed to return `Z_STREAM_END`. */ compressed_len = deflateBound(w->zstream, src_len); - REFTABLE_ALLOC_ARRAY(compressed, compressed_len); + REFTABLE_ALLOC_GROW(w->compressed, compressed_len, w->compressed_cap); - w->zstream->next_out = compressed; + w->zstream->next_out = w->compressed; w->zstream->avail_out = compressed_len; w->zstream->next_in = w->buf + block_header_skip; w->zstream->avail_in = src_len; @@ -177,21 +176,17 @@ int block_writer_finish(struct block_writer *w) * guaranteed to succeed according to the zlib documentation. */ ret = deflate(w->zstream, Z_FINISH); - if (ret != Z_STREAM_END) { - reftable_free(compressed); + if (ret != Z_STREAM_END) return REFTABLE_ZLIB_ERROR; - } /* * Overwrite the uncompressed data we have already written and * adjust the `next` pointer to point right after the * compressed data. */ - memcpy(w->buf + block_header_skip, compressed, + memcpy(w->buf + block_header_skip, w->compressed, w->zstream->total_out); w->next = w->zstream->total_out + block_header_skip; - - reftable_free(compressed); } return w->next; @@ -505,6 +500,7 @@ void block_writer_release(struct block_writer *bw) deflateEnd(bw->zstream); FREE_AND_NULL(bw->zstream); FREE_AND_NULL(bw->restarts); + FREE_AND_NULL(bw->compressed); strbuf_release(&bw->last_key); /* the block is not owned. */ } diff --git a/reftable/block.h b/reftable/block.h index 1375957fc8..657498014c 100644 --- a/reftable/block.h +++ b/reftable/block.h @@ -19,6 +19,9 @@ license that can be found in the LICENSE file or at */ struct block_writer { z_stream *zstream; + unsigned char *compressed; + size_t compressed_cap; + uint8_t *buf; uint32_t block_size;