From patchwork Thu Apr 13 05:57:17 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Qu Wenruo
X-Patchwork-Id: 13209779
From: Qu Wenruo
To: linux-btrfs@vger.kernel.org
Subject: [PATCH 1/2] btrfs: scrub: try harder to mark RAID56 block groups read-only
Date: Thu, 13 Apr 2023 13:57:17 +0800
Message-Id: <7d6a0d16c3e67bd917bb2b51d91a632c69aad09b.1681364951.git.wqu@suse.com>
X-Mailer: git-send-email 2.39.2
X-Mailing-List: linux-btrfs@vger.kernel.org

Currently we allow a block group not to be marked read-only for scrub.

But for RAID56 block groups, if we require the block group to be
read-only, then we're allowed to use cached content from the scrub
stripe to reduce unnecessary RAID56 reads.

So this patch would:

- Make btrfs_inc_block_group_ro() try harder
  During my tests, for cases like btrfs/061 and btrfs/064, we can hit
  -ENOSPC from btrfs_inc_block_group_ro() calls during scrub.

  The reason is that if we only have one single data chunk and try to
  scrub it, we won't have any space left for any newer data writes.

  But this check should be done by the caller, especially for scrub,
  where we only temporarily mark the chunk read-only.
  And newer data writes would always try to allocate a new data chunk
  when needed.

- Return error for scrub if we failed to mark a RAID56 chunk read-only

Signed-off-by: Qu Wenruo
---
 fs/btrfs/block-group.c | 16 ++++++++++++++--
 fs/btrfs/scrub.c       |  9 ++++++++-
 2 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c
index 957ad1c31c4f..1da798752159 100644
--- a/fs/btrfs/block-group.c
+++ b/fs/btrfs/block-group.c
@@ -2818,10 +2818,22 @@ int btrfs_inc_block_group_ro(struct btrfs_block_group *cache,
 	}
 
 	ret = inc_block_group_ro(cache, 0);
-	if (!do_chunk_alloc || ret == -ETXTBSY)
-		goto unlock_out;
 	if (!ret)
 		goto out;
+	if (ret == -ETXTBSY)
+		goto unlock_out;
+
+	/*
+	 * Skip chunk alloc if the bg is SYSTEM, this is to avoid
+	 * system chunk allocation storm to exhaust the system chunk
+	 * array.
+	 * Otherwise we still want to try our best to mark the
+	 * block group read-only.
+	 */
+	if (!do_chunk_alloc && ret == -ENOSPC &&
+	    (cache->flags & BTRFS_BLOCK_GROUP_SYSTEM))
+		goto unlock_out;
+
 	alloc_flags = btrfs_get_alloc_profile(fs_info, cache->space_info->flags);
 	ret = btrfs_chunk_alloc(trans, alloc_flags, CHUNK_ALLOC_FORCE);
 	if (ret < 0)
diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index 836725a19661..22ce3a628eb5 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -2518,13 +2518,20 @@ int scrub_enumerate_chunks(struct scrub_ctx *sctx,
 
 		if (ret == 0) {
 			ro_set = 1;
-		} else if (ret == -ENOSPC && !sctx->is_dev_replace) {
+		} else if (ret == -ENOSPC && !sctx->is_dev_replace &&
+			   !(cache->flags & BTRFS_BLOCK_GROUP_RAID56_MASK)) {
 			/*
 			 * btrfs_inc_block_group_ro return -ENOSPC when it
 			 * failed in creating new chunk for metadata.
 			 * It is not a problem for scrub, because
 			 * metadata are always cowed, and our scrub paused
 			 * commit_transactions.
+			 *
+			 * For RAID56 chunks, we have to mark them read-only
+			 * for scrub, as later we would use our own cache
+			 * out of RAID56 realm.
+			 * Thus we want the RAID56 bg to be marked RO to
+			 * prevent RMW from screwing up our cache.
 			 */
 			ro_set = 0;
 		} else if (ret == -ETXTBSY) {
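
For reference, the branch order that the two hunks above end up with can be
modeled in a few lines of userspace C. This is only a sketch and not kernel
code: ro_next_step(), scrub_may_skip_ro(), enum ro_next and the BG_* macros
are made-up names for illustration, and the flag values are local stand-ins
rather than the real BTRFS_BLOCK_GROUP_* definitions.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the kernel flag bits; values are illustrative only. */
#define BG_SYSTEM	(1ULL << 1)
#define BG_RAID56_MASK	((1ULL << 7) | (1ULL << 8))

/* Hypothetical outcome of the checks that follow inc_block_group_ro(). */
enum ro_next {
	RO_DONE,		/* block group is already read-only */
	RO_GIVE_UP,		/* bail out without allocating a chunk */
	RO_TRY_CHUNK_ALLOC,	/* force a chunk allocation and retry */
};

/* Models the new branch order in btrfs_inc_block_group_ro(). */
static enum ro_next ro_next_step(int ret, bool do_chunk_alloc, uint64_t flags)
{
	if (!ret)
		return RO_DONE;
	if (ret == -ETXTBSY)
		return RO_GIVE_UP;
	/* Only SYSTEM block groups still skip the chunk allocation. */
	if (!do_chunk_alloc && ret == -ENOSPC && (flags & BG_SYSTEM))
		return RO_GIVE_UP;
	return RO_TRY_CHUNK_ALLOC;
}

/* Models the scrub side: may scrub go on with the bg left read-write? */
static bool scrub_may_skip_ro(int ret, bool is_dev_replace, uint64_t flags)
{
	return ret == -ENOSPC && !is_dev_replace && !(flags & BG_RAID56_MASK);
}
```

With this model, a data block group that hits -ENOSPC with do_chunk_alloc
unset now yields RO_TRY_CHUNK_ALLOC instead of giving up, and a RAID56 block
group that still fails with -ENOSPC is no longer scrubbed read-write.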