From patchwork Mon Aug 5 13:07:42 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Patrick Steinhardt X-Patchwork-Id: 13753602 Received: from fout2-smtp.messagingengine.com (fout2-smtp.messagingengine.com [103.168.172.145]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 14E6B1553BD for ; Mon, 5 Aug 2024 13:07:46 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=103.168.172.145 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1722863268; cv=none; b=oat8CfX7FD7B8JHk5/YiKM1z41U5kdwKYJc2YkQ+5m1p6xxMylX1zkbdzO90aH+7WY4Rjhp1rBvJ2fQCKRoRwPFelQ7kYZC0/BlxV98G48UyrXzzNGRfsb15BZAwunZIWCRgQJN+56aHBEhmZsOjT55DHaQ69e3amWccc/6Z+QE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1722863268; c=relaxed/simple; bh=YtbwhYx/9xRYCzpbjSQEpYdvFaeI04O1SQW9Tew1puw=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=j51/b/3FZ4sqfY1WxbyNF9xulh87QiEuE7sSgsEDqOjBIv5tm5ggnoxl1UJwUfX1a5aywssz2+O8EBWNzILtZy4QYCRYyLvsys43KH15uxIlE951ZlQs8O9+KskehpfZW4XLflN51AYIpOV37IKyHVMLyLMaB043lFnlAbLrM3A= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=pks.im; spf=pass smtp.mailfrom=pks.im; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b=ISKD8sPH; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b=T8svHqvf; arc=none smtp.client-ip=103.168.172.145 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=pks.im Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=pks.im Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=pks.im header.i=@pks.im header.b="ISKD8sPH"; dkim=pass (2048-bit key) header.d=messagingengine.com header.i=@messagingengine.com header.b="T8svHqvf" Received: from compute3.internal (compute3.nyi.internal [10.202.2.43]) by mailfout.nyi.internal (Postfix) with ESMTP id 47B84138FC79; Mon, 5 Aug 2024 09:07:46 -0400 (EDT) Received: from mailfrontend2 ([10.202.2.163]) by compute3.internal (MEProxy); Mon, 05 Aug 2024 09:07:46 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=pks.im; h=cc:cc :content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm3; t=1722863266; x=1722949666; bh=1bey1DrG++ XMtdLjud32glRAexDw5z0xLuxrq4e+ytk=; b=ISKD8sPH76cxbP5yBFsCjWdpdv n5FBzOduzsVcgyIKe2HFJwbBi+Ho76zZQnW9Jo4OUwgfw+EamjaIYiGttA4y0/QC 0m56qNSVNIIjTMfDupbcAmqVS598NZnjQ0cr7jGZn6TXOfb8FjdQRgdJISJmyED1 n7JGciq6pLHvERwECKsyZ/oOS7wOeDApQiAbqEcFMf9n10RuOsHM2xUX0ifYKRwn MvvyC5tMPvL9Xq9Vl6PW9/UuTM9j6Zjlz4m1xhfEeqtDWVyl+QcL7dZ1WTJq7Mn2 wnW0uIAV727h348bed3D2J6R0ZmNcefQiF9k0CtJWlAmRRs4fV1LTBond9OQ== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s= fm3; t=1722863266; x=1722949666; bh=1bey1DrG++XMtdLjud32glRAexDw 5z0xLuxrq4e+ytk=; b=T8svHqvf2L6TIWbbPKHsa3IlvZRgRSlEd2LowHIeLbq7 s3rLp8SK1Z+qHwC74KOIOZGDXQ73dFx7Q8XRSz5FVyk/LmfoRzWFeLtjDikmb8sR x2Z+0JB3qqxUHMfVDCzZLeFd0BUtcVRwC52PFkrypGXxsmucvLmVVVUsqd2JkPpp Op0P9qrGrSRsCZO9yIoCo1lQ+JqeOQmpIynYQGe3PTR3AiQqawLduZwsT0py9rWA RkjvRjq8WnRuHu4lSwOufRn0Sen1xLjsP30GWgOXtBdCS64B53M0kuOdXzYcpWXo daQu8iXWUkXy2/K1z0EFhNve/d6rx/Axy9raIok2mw== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeeftddrkeeigdeiudcutefuodetggdotefrodftvf curfhrohhfihhlvgemucfhrghsthforghilhdpqfgfvfdpuffrtefokffrpgfnqfghnecu uegrihhlohhuthemuceftddtnecusecvtfgvtghiphhivghnthhsucdlqddutddtmdenuc fjughrpeffhffvvefukfhfgggtuggjsehgtderredttddvnecuhfhrohhmpefrrghtrhhi tghkucfuthgvihhnhhgrrhguthcuoehpshesphhkshdrihhmqeenucggtffrrghtthgvrh hnpeeukedtvedtffevleejtefgheehieegkeeluddvfeefgeehgfeltddtheejleffteen ucevlhhushhtvghrufhiiigvpedtnecurfgrrhgrmhepmhgrihhlfhhrohhmpehpshesph hkshdrihhmpdhnsggprhgtphhtthhopedt X-ME-Proxy: Feedback-ID: i197146af:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Mon, 5 Aug 2024 09:07:45 -0400 (EDT) Received: by vm-mail (OpenSMTPD) with ESMTPSA id 12d38d0a (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO); Mon, 5 Aug 2024 13:07:43 +0000 (UTC) Date: Mon, 5 Aug 2024 15:07:42 +0200 From: Patrick Steinhardt To: git@vger.kernel.org Cc: Justin Tobler , Junio C Hamano Subject: [PATCH v2 0/9] reftable: improvements and fixes for compaction Message-ID: References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Hi, this is the second version of my patch series that aims to improve the way reftable stack perform compaction. Changes compared to v1: - Extract a test helper function that sets up a stack with N tables containing refs. - Reuse file descriptor that we have already stored in a local variable instead of calling `lock_file_fd()` a second time. - Remove a no-op change in the last patch. - Add a comment explaining why we have to allocate N+1 many table names. - Some typo fixes. Thanks! Patrick Patrick Steinhardt (9): reftable/stack: refactor function to gather table sizes reftable/stack: extract function to setup stack with N tables reftable/stack: test compaction with already-locked tables reftable/stack: update stats on failed full compaction reftable/stack: simplify tracking of table locks reftable/stack: do not die when fsyncing lock file files reftable/stack: use lock_file when adding table to "tables.list" reftable/stack: fix corruption on concurrent compaction reftable/stack: handle locked tables during auto-compaction reftable/stack.c | 231 +++++++++++++++++++++++++++++-------- reftable/stack_test.c | 138 +++++++++++++++++----- t/t0610-reftable-basics.sh | 21 ++-- 3 files changed, 306 insertions(+), 84 deletions(-) Range-diff against v1: 1: 5d99191f5c = 1: 6d2b54ba8e reftable/stack: refactor function to gather table sizes -: ---------- > 2: ff17306cc0 reftable/stack: extract function to setup stack with N tables 2: 123fb9d80e ! 3: 63e64c8d82 reftable/stack: test compaction with already-locked tables @@ reftable/stack_test.c: static void test_reftable_stack_auto_compaction(void) + err = reftable_new_stack(&st, dir, &opts); + EXPECT_ERR(err); + -+ for (size_t i = 0; i < 5; i++) { -+ struct reftable_ref_record ref = { -+ .update_index = reftable_stack_next_update_index(st), -+ .value_type = REFTABLE_REF_VAL1, -+ .value.val1 = { i }, -+ }; -+ -+ strbuf_reset(&buf); -+ strbuf_addf(&buf, "refs/heads/branch-%04" PRIuMAX, (uintmax_t) i); -+ ref.refname = buf.buf; -+ -+ err = reftable_stack_add(st, &write_test_ref, &ref); -+ EXPECT_ERR(err); -+ } ++ write_n_ref_tables(st, &opts, 5); + EXPECT(st->merged->stack_len == 5); + + /* @@ reftable/stack_test.c: static void test_reftable_stack_add_performs_auto_compact + err = reftable_new_stack(&st, dir, &opts); + EXPECT_ERR(err); + -+ for (size_t i = 0; i < 3; i++) { -+ struct reftable_ref_record ref = { -+ .update_index = reftable_stack_next_update_index(st), -+ .value_type = REFTABLE_REF_VAL1, -+ .value.val1 = { i }, -+ }; -+ -+ strbuf_reset(&buf); -+ strbuf_addf(&buf, "refs/heads/branch-%04" PRIuMAX, (uintmax_t) i); -+ ref.refname = buf.buf; -+ -+ err = reftable_stack_add(st, &write_test_ref, &ref); -+ EXPECT_ERR(err); -+ } ++ write_n_ref_tables(st, &opts, 3); + EXPECT(st->merged->stack_len == 3); + + /* Lock one of the tables that we're about to compact. */ 3: 1fa7acbddf = 4: 1989dafcb4 reftable/stack: update stats on failed full compaction 4: 40d9f75cf2 = 5: 73e5d104eb reftable/stack: simplify tracking of table locks 5: 9233c36894 = 6: e411e14904 reftable/stack: do not die when fsyncing lock file files 6: 9703246156 ! 7: b868a518d6 reftable/stack: use lock_file when adding table to "tables.list" @@ Commit message compacting the stack, we manually handle the lock when adding a new table to it. While not wrong, it is at least inconsistent. - Refactor the code to consistenly lock "tables.list" via the `lock_file` + Refactor the code to consistently lock "tables.list" via the `lock_file` subsytem. Signed-off-by: Patrick Steinhardt @@ reftable/stack.c: int reftable_addition_commit(struct reftable_addition *add) goto done; } -- err = fsync_component(FSYNC_COMPONENT_REFERENCE, lock_file_fd); -+ err = fsync_component(FSYNC_COMPONENT_REFERENCE, -+ get_lock_file_fd(&add->tables_list_lock)); - if (err < 0) { - err = REFTABLE_IO_ERROR; - goto done; - } - - err = rename_tempfile(&add->lock_file, add->stack->list_file); + err = commit_lock_file(&add->tables_list_lock); if (err < 0) { 7: b746565bf0 ! 8: ff17414d26 reftable/stack: fix corruption on concurrent compaction @@ Commit message Letting concurrent processes modify the "tables.list" file while we are doing the compaction is very much part of the design and thus expected. After all, it may take some time to compact tables in the case where we - are compacting a lot or very large tables. + are compacting a lot of very large tables. But there is a bug in the code. Suppose we have two processes which are compacting two slices of the table. Given that we lock each of the @@ Commit message process will always impact what the other process needs to write to the "tables.list" file. - Right now , we do not check whether the "tables.list" has been - changed after we have locked it for the second time in (5). This has the + Right now, we do not check whether the "tables.list" has been changed + after we have locked it for the second time in (5). This has the consequence that we will always commit the old, cached in-core tables to disk without paying to respect what the other process has written. This scenario would then lead to data loss and corruption. @@ reftable/stack.c: static int stack_compact_range(struct reftable_stack *st, + last_to_replace = last + (new_offset - first); + first_to_replace = new_offset; + } else { ++ /* ++ * `fd_read_lines()` uses a `NULL` sentinel to indicate that ++ * the array is at its end. As we use `free_names()` to free ++ * the array, we need to include this sentinel value here and ++ * thus have to allocate `stack_len + 1` many entries. ++ */ + REFTABLE_CALLOC_ARRAY(names, st->merged->stack_len + 1); + for (size_t i = 0; i < st->merged->stack_len; i++) + names[i] = xstrdup(st->readers[i]->name); 8: dc22178307 ! 9: 1ef560feb1 reftable/stack: handle locked tables during auto-compaction @@ reftable/stack.c: static int stack_compact_range(struct reftable_stack *st, } /* -@@ reftable/stack.c: static int stack_compact_range(struct reftable_stack *st, - * delete the files after we closed them on Windows, so this needs to - * happen first. - */ -- err = reftable_stack_reload_maybe_reuse(st, first < last); -+ err = reftable_stack_reload_maybe_reuse(st, first_to_replace < last_to_replace); - if (err < 0) - goto done; - @@ reftable/stack.c: static int stack_compact_range(struct reftable_stack *st, static int stack_compact_range_stats(struct reftable_stack *st, base-commit: 39bf06adf96da25b87c9aa7d35a32ef3683eb4a4