From patchwork Mon Feb 22 06:33:46 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 12097935 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id DB837C433DB for ; Mon, 22 Feb 2021 06:34:59 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 8E2C464EB4 for ; Mon, 22 Feb 2021 06:34:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229852AbhBVGe6 (ORCPT ); Mon, 22 Feb 2021 01:34:58 -0500 Received: from mx2.suse.de ([195.135.220.15]:48642 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229518AbhBVGe6 (ORCPT ); Mon, 22 Feb 2021 01:34:58 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1613975651; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Ey9yUxBVthR1KRIXXv4cpz056bkrd8cNuTD0/sJQ98M=; b=hcWWR6+btpFRc2UIMQKCqNW+CGSC0dND+TYPZUFm09ejvDVf3YFEzwXNTjvUg/f1Clf1cN vmEc2BEWYHSWmChuD0G6e+L++XE+PrkvVfDu+tl915EYWi148FBVV3IwqQF6v2Ux1Z5hw6 umBdWtLUu6ZS1YLc6tr8xAFQh2AgcBo= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 51405B11B for ; Mon, 22 Feb 2021 06:34:11 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH 01/12] btrfs: subpage: introduce helper for subpage dirty status Date: Mon, 22 Feb 2021 14:33:46 +0800 Message-Id: <20210222063357.92930-2-wqu@suse.com> X-Mailer: git-send-email 2.30.0 In-Reply-To: <20210222063357.92930-1-wqu@suse.com> References: <20210222063357.92930-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org This patch introduce the following functions to handle btrfs subpage dirty status: - btrfs_subpage_set_dirty() - btrfs_subpage_clear_dirty() - btrfs_subpage_test_dirty() Those helpers can only be called when the range is ensured to be inside the page. - btrfs_page_set_dirty() - btrfs_page_clear_dirty() - btrfs_page_test_dirty() Those helpers can handle both regular sector size and subpage without problem. Thus those would be used to replace PageDirty() related calls in later commits. There is one special point to note here, just like set_page_dirty() and clear_page_dirty_for_io(), btrfs_*page_set_dirty() and btrfs_*page_clear_dirty() must be called with page locked. Signed-off-by: Qu Wenruo --- fs/btrfs/subpage.c | 42 ++++++++++++++++++++++++++++++++++++++++++ fs/btrfs/subpage.h | 15 +++++++++++++++ 2 files changed, 57 insertions(+) diff --git a/fs/btrfs/subpage.c b/fs/btrfs/subpage.c index c69049e7daa9..16dd6fcd258d 100644 --- a/fs/btrfs/subpage.c +++ b/fs/btrfs/subpage.c @@ -220,6 +220,45 @@ void btrfs_subpage_clear_error(const struct btrfs_fs_info *fs_info, spin_unlock_irqrestore(&subpage->lock, flags); } +void btrfs_subpage_set_dirty(const struct btrfs_fs_info *fs_info, + struct page *page, u64 start, u32 len) +{ + struct btrfs_subpage *subpage = (struct btrfs_subpage *)page->private; + u16 tmp = btrfs_subpage_calc_bitmap(fs_info, page, start, len); + unsigned long flags; + + spin_lock_irqsave(&subpage->lock, flags); + subpage->dirty_bitmap |= tmp; + spin_unlock_irqrestore(&subpage->lock, flags); + set_page_dirty(page); +} + +bool btrfs_subpage_clear_and_test_dirty(const struct btrfs_fs_info *fs_info, + struct page *page, u64 start, u32 len) +{ + struct btrfs_subpage *subpage = (struct btrfs_subpage *)page->private; + u16 tmp = btrfs_subpage_calc_bitmap(fs_info, page, start, len); + unsigned long flags; + bool last = false; + + + spin_lock_irqsave(&subpage->lock, flags); + subpage->dirty_bitmap &= ~tmp; + if (subpage->dirty_bitmap == 0) + last = true; + spin_unlock_irqrestore(&subpage->lock, flags); + return last; +} + +void btrfs_subpage_clear_dirty(const struct btrfs_fs_info *fs_info, + struct page *page, u64 start, u32 len) +{ + bool last; + last = btrfs_subpage_clear_and_test_dirty(fs_info, page, start, len); + if (last) + clear_page_dirty_for_io(page); +} + /* * Unlike set/clear which is dependent on each page status, for test all bits * are tested in the same way. @@ -240,6 +279,7 @@ bool btrfs_subpage_test_##name(const struct btrfs_fs_info *fs_info, \ } IMPLEMENT_BTRFS_SUBPAGE_TEST_OP(uptodate); IMPLEMENT_BTRFS_SUBPAGE_TEST_OP(error); +IMPLEMENT_BTRFS_SUBPAGE_TEST_OP(dirty); /* * Note that, in selftests (extent-io-tests), we can have empty fs_info passed @@ -276,3 +316,5 @@ bool btrfs_page_test_##name(const struct btrfs_fs_info *fs_info, \ IMPLEMENT_BTRFS_PAGE_OPS(uptodate, SetPageUptodate, ClearPageUptodate, PageUptodate); IMPLEMENT_BTRFS_PAGE_OPS(error, SetPageError, ClearPageError, PageError); +IMPLEMENT_BTRFS_PAGE_OPS(dirty, set_page_dirty, clear_page_dirty_for_io, + PageDirty); diff --git a/fs/btrfs/subpage.h b/fs/btrfs/subpage.h index b86a4881475d..adaece5ce294 100644 --- a/fs/btrfs/subpage.h +++ b/fs/btrfs/subpage.h @@ -20,6 +20,7 @@ struct btrfs_subpage { spinlock_t lock; u16 uptodate_bitmap; u16 error_bitmap; + u16 dirty_bitmap; union { /* * Structures only used by metadata @@ -87,5 +88,19 @@ bool btrfs_page_test_##name(const struct btrfs_fs_info *fs_info, \ DECLARE_BTRFS_SUBPAGE_OPS(uptodate); DECLARE_BTRFS_SUBPAGE_OPS(error); +DECLARE_BTRFS_SUBPAGE_OPS(dirty); + +/* + * Extra clear_and_test function for subpage dirty bitmap. + * + * Return true if we're the last bits in the dirty_bitmap and clear the + * dirty_bitmap. + * Return false otherwise. + * + * NOTE: Callers should manually clear page dirty for true case, as we have + * extra handling for tree blocks. + */ +bool btrfs_subpage_clear_and_test_dirty(const struct btrfs_fs_info *fs_info, + struct page *page, u64 start, u32 len); #endif From patchwork Mon Feb 22 06:33:47 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 12097937 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id EEEADC433E0 for ; Mon, 22 Feb 2021 06:35:01 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9EA0B64EB4 for ; Mon, 22 Feb 2021 06:35:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229902AbhBVGfA (ORCPT ); Mon, 22 Feb 2021 01:35:00 -0500 Received: from mx2.suse.de ([195.135.220.15]:48652 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229518AbhBVGfA (ORCPT ); Mon, 22 Feb 2021 01:35:00 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1613975653; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=isxCZDriwNbADVw9hMWR1mBXxpw/F1Tb19jjrH4/NDU=; b=Bh08+17RIucRkQlgH/uz8pOTgYKWnoihQQi/lqVH4et6gCThmIPaWVmrXMcLWmpJbxPIIa 7dYT+0pt2LRjmmVzcdwJuIGsML0KH0JDQMaPe4Lduu9Z8Hp1p80/PM24nKnqJuATuGSMPr HGVs8rXZx2J6LXgzDejWLcDmf99nMWU= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 4A1E8B11D for ; Mon, 22 Feb 2021 06:34:13 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH 02/12] btrfs: subpage: introduce helper for subpage writeback status Date: Mon, 22 Feb 2021 14:33:47 +0800 Message-Id: <20210222063357.92930-3-wqu@suse.com> X-Mailer: git-send-email 2.30.0 In-Reply-To: <20210222063357.92930-1-wqu@suse.com> References: <20210222063357.92930-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org This patch introduce the following functions to handle btrfs subpage writeback status: - btrfs_subpage_set_writeback() - btrfs_subpage_clear_writeback() - btrfs_subpage_test_writeback() Those helpers can only be called when the range is ensured to be inside the page. - btrfs_page_set_writeback() - btrfs_page_clear_writeback() - btrfs_page_test_writeback() Those helpers can handle both regular sector size and subpage without problem. Signed-off-by: Qu Wenruo --- fs/btrfs/subpage.c | 30 ++++++++++++++++++++++++++++++ fs/btrfs/subpage.h | 2 ++ 2 files changed, 32 insertions(+) diff --git a/fs/btrfs/subpage.c b/fs/btrfs/subpage.c index 16dd6fcd258d..9bc212d16d3c 100644 --- a/fs/btrfs/subpage.c +++ b/fs/btrfs/subpage.c @@ -259,6 +259,33 @@ void btrfs_subpage_clear_dirty(const struct btrfs_fs_info *fs_info, clear_page_dirty_for_io(page); } +void btrfs_subpage_set_writeback(const struct btrfs_fs_info *fs_info, + struct page *page, u64 start, u32 len) +{ + struct btrfs_subpage *subpage = (struct btrfs_subpage *)page->private; + u16 tmp = btrfs_subpage_calc_bitmap(fs_info, page, start, len); + unsigned long flags; + + spin_lock_irqsave(&subpage->lock, flags); + subpage->writeback_bitmap |= tmp; + set_page_writeback(page); + spin_unlock_irqrestore(&subpage->lock, flags); +} + +void btrfs_subpage_clear_writeback(const struct btrfs_fs_info *fs_info, + struct page *page, u64 start, u32 len) +{ + struct btrfs_subpage *subpage = (struct btrfs_subpage *)page->private; + u16 tmp = btrfs_subpage_calc_bitmap(fs_info, page, start, len); + unsigned long flags; + + spin_lock_irqsave(&subpage->lock, flags); + subpage->writeback_bitmap &= ~tmp; + if (subpage->writeback_bitmap == 0) + end_page_writeback(page); + spin_unlock_irqrestore(&subpage->lock, flags); +} + /* * Unlike set/clear which is dependent on each page status, for test all bits * are tested in the same way. @@ -280,6 +307,7 @@ bool btrfs_subpage_test_##name(const struct btrfs_fs_info *fs_info, \ IMPLEMENT_BTRFS_SUBPAGE_TEST_OP(uptodate); IMPLEMENT_BTRFS_SUBPAGE_TEST_OP(error); IMPLEMENT_BTRFS_SUBPAGE_TEST_OP(dirty); +IMPLEMENT_BTRFS_SUBPAGE_TEST_OP(writeback); /* * Note that, in selftests (extent-io-tests), we can have empty fs_info passed @@ -318,3 +346,5 @@ IMPLEMENT_BTRFS_PAGE_OPS(uptodate, SetPageUptodate, ClearPageUptodate, IMPLEMENT_BTRFS_PAGE_OPS(error, SetPageError, ClearPageError, PageError); IMPLEMENT_BTRFS_PAGE_OPS(dirty, set_page_dirty, clear_page_dirty_for_io, PageDirty); +IMPLEMENT_BTRFS_PAGE_OPS(writeback, set_page_writeback, end_page_writeback, + PageWriteback); diff --git a/fs/btrfs/subpage.h b/fs/btrfs/subpage.h index adaece5ce294..fe43267e31f3 100644 --- a/fs/btrfs/subpage.h +++ b/fs/btrfs/subpage.h @@ -21,6 +21,7 @@ struct btrfs_subpage { u16 uptodate_bitmap; u16 error_bitmap; u16 dirty_bitmap; + u16 writeback_bitmap; union { /* * Structures only used by metadata @@ -89,6 +90,7 @@ bool btrfs_page_test_##name(const struct btrfs_fs_info *fs_info, \ DECLARE_BTRFS_SUBPAGE_OPS(uptodate); DECLARE_BTRFS_SUBPAGE_OPS(error); DECLARE_BTRFS_SUBPAGE_OPS(dirty); +DECLARE_BTRFS_SUBPAGE_OPS(writeback); /* * Extra clear_and_test function for subpage dirty bitmap. From patchwork Mon Feb 22 06:33:48 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 12097939 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 89543C433DB for ; Mon, 22 Feb 2021 06:35:04 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 544D264EB4 for ; Mon, 22 Feb 2021 06:35:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229907AbhBVGfD (ORCPT ); Mon, 22 Feb 2021 01:35:03 -0500 Received: from mx2.suse.de ([195.135.220.15]:48662 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229518AbhBVGfC (ORCPT ); Mon, 22 Feb 2021 01:35:02 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1613975656; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=7VG85PunvCabKM/ek4kInj6TnCxrgu8jwummNcZ23X4=; b=lHSjcFI4mbWn3xBVLh1KS/r6i/lBqbFqrcIYe5RYQPUZpIFe5ZSWC4720qtvAiI4QsYz1G r7lADpOeA54CcH6ZpagXMg+cGpyNqqNEyNEorVz7w/VnxO1ieDRikK8NFfqVTOFogGDwOr T/RuejZGsJY43UkYeVZLxsZB/J+zGKY= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id EA6E0B11E for ; Mon, 22 Feb 2021 06:34:15 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH 03/12] btrfs: disk-io: allow btree_set_page_dirty() to do more sanity check on subpage metadata Date: Mon, 22 Feb 2021 14:33:48 +0800 Message-Id: <20210222063357.92930-4-wqu@suse.com> X-Mailer: git-send-email 2.30.0 In-Reply-To: <20210222063357.92930-1-wqu@suse.com> References: <20210222063357.92930-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org For btree_set_page_dirty(), we should also check the extent buffer sanity for subpage support. Unlike the regular sector size case, since one page can contain multiple extent buffers, we need to make sure there is at least one dirty extent buffer in the page. So this patch will iterate through the btrfs_subpage::dirty_bitmap to get the extent buffers, and check if any dirty extent buffer in the page range has EXTENT_BUFFER_DIRTY and proper refs. Signed-off-by: Qu Wenruo --- fs/btrfs/disk-io.c | 47 ++++++++++++++++++++++++++++++++++++++++------ 1 file changed, 41 insertions(+), 6 deletions(-) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index c2576c5fe62e..437e6b2163c7 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -42,6 +42,7 @@ #include "discard.h" #include "space-info.h" #include "zoned.h" +#include "subpage.h" #define BTRFS_SUPER_FLAG_SUPP (BTRFS_HEADER_FLAG_WRITTEN |\ BTRFS_HEADER_FLAG_RELOC |\ @@ -992,14 +993,48 @@ static void btree_invalidatepage(struct page *page, unsigned int offset, static int btree_set_page_dirty(struct page *page) { #ifdef DEBUG + struct btrfs_fs_info *fs_info = btrfs_sb(page->mapping->host->i_sb); + struct btrfs_subpage *subpage; struct extent_buffer *eb; + int cur_bit; + u64 page_start = page_offset(page); + + if (fs_info->sectorsize == PAGE_SIZE) { + BUG_ON(!PagePrivate(page)); + eb = (struct extent_buffer *)page->private; + BUG_ON(!eb); + BUG_ON(!test_bit(EXTENT_BUFFER_DIRTY, &eb->bflags)); + BUG_ON(!atomic_read(&eb->refs)); + btrfs_assert_tree_locked(eb); + return __set_page_dirty_nobuffers(page); + } + ASSERT(PagePrivate(page) && page->private); + subpage = (struct btrfs_subpage *)page->private; + + ASSERT(subpage->dirty_bitmap); + while (cur_bit < BTRFS_SUBPAGE_BITMAP_SIZE) { + unsigned long flags; + u64 cur; + u16 tmp = (1 << cur_bit); + + spin_lock_irqsave(&subpage->lock, flags); + if (!(tmp & subpage->dirty_bitmap)) { + spin_unlock_irqrestore(&subpage->lock, flags); + cur_bit++; + continue; + } + spin_unlock_irqrestore(&subpage->lock, flags); + cur = page_start + cur_bit * fs_info->sectorsize; - BUG_ON(!PagePrivate(page)); - eb = (struct extent_buffer *)page->private; - BUG_ON(!eb); - BUG_ON(!test_bit(EXTENT_BUFFER_DIRTY, &eb->bflags)); - BUG_ON(!atomic_read(&eb->refs)); - btrfs_assert_tree_locked(eb); + eb = find_extent_buffer(fs_info, cur); + ASSERT(eb); + ASSERT(test_bit(EXTENT_BUFFER_DIRTY, &eb->bflags)); + ASSERT(atomic_read(&eb->refs)); + btrfs_assert_tree_locked(eb); + free_extent_buffer(eb); + + cur_bit += (fs_info->nodesize >> fs_info->sectorsize_bits); + } #endif return __set_page_dirty_nobuffers(page); } From patchwork Mon Feb 22 06:33:49 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 12097941 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 70408C433E0 for ; Mon, 22 Feb 2021 06:35:09 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 14A6F64EB4 for ; Mon, 22 Feb 2021 06:35:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229918AbhBVGfH (ORCPT ); Mon, 22 Feb 2021 01:35:07 -0500 Received: from mx2.suse.de ([195.135.220.15]:48672 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229518AbhBVGfH (ORCPT ); Mon, 22 Feb 2021 01:35:07 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1613975660; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=pj04379aq97FYYoWVJBcgIeXLAvDAJFQB8aT2QVoW9Q=; b=YuenxpT7/vAUfaKQcGWBqb2E4G5trLDGSAHTzUqOkt7aiZHZ6+32Gn6bgW2n9miZdpWlrb sdJqULH19ON9J3aNUpJ3vQaimSUjYIpjv1FFe1VML1PVCRPvKsZoi3Z24q0qiClwwmhzcz caYg1OsGHyVZGMmH9jLl5EBusjIFDtk= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 54AE7B11F for ; Mon, 22 Feb 2021 06:34:20 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH 04/12] btrfs: disk-io: support subpage metadata csum calculation at write time Date: Mon, 22 Feb 2021 14:33:49 +0800 Message-Id: <20210222063357.92930-5-wqu@suse.com> X-Mailer: git-send-email 2.30.0 In-Reply-To: <20210222063357.92930-1-wqu@suse.com> References: <20210222063357.92930-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Add a new helper, csum_dirty_subpage_buffers(), to iterate through all dirty extent buffers in one bvec. Also extract the code of calculating csum for one extent buffer into csum_one_extent_buffer(), so that both the existing csum_dirty_buffer() and the new csum_dirty_subpage_buffers() can reuse the same routine. Signed-off-by: Qu Wenruo --- fs/btrfs/disk-io.c | 96 ++++++++++++++++++++++++++++++++++------------ 1 file changed, 72 insertions(+), 24 deletions(-) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 437e6b2163c7..3c00a65f6679 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -441,6 +441,74 @@ static int btree_read_extent_buffer_pages(struct extent_buffer *eb, return ret; } +static int csum_one_extent_buffer(struct extent_buffer *eb) +{ + struct btrfs_fs_info *fs_info = eb->fs_info; + u8 result[BTRFS_CSUM_SIZE]; + int ret; + + ASSERT(memcmp_extent_buffer(eb, fs_info->fs_devices->metadata_uuid, + offsetof(struct btrfs_header, fsid), + BTRFS_FSID_SIZE) == 0); + csum_tree_block(eb, result); + + if (btrfs_header_level(eb)) + ret = btrfs_check_node(eb); + else + ret = btrfs_check_leaf_full(eb); + + if (ret < 0) { + btrfs_print_tree(eb, 0); + btrfs_err(fs_info, + "block=%llu write time tree block corruption detected", + eb->start); + WARN_ON(IS_ENABLED(CONFIG_BTRFS_DEBUG)); + return ret; + } + write_extent_buffer(eb, result, 0, fs_info->csum_size); + + return 0; +} + +/* Checksum all dirty extent buffers in one bio_vec. */ +static int csum_dirty_subpage_buffers(struct btrfs_fs_info *fs_info, + struct bio_vec *bvec) +{ + struct page *page = bvec->bv_page; + u64 bvec_start = page_offset(page) + bvec->bv_offset; + u64 cur; + int ret = 0; + + for (cur = bvec_start; cur < bvec_start + bvec->bv_len; + cur += fs_info->nodesize) { + struct extent_buffer *eb; + bool uptodate; + + eb = find_extent_buffer(fs_info, cur); + uptodate = btrfs_subpage_test_uptodate(fs_info, page, cur, + fs_info->nodesize); + + /* A dirty eb shouldn't disappera from buffer_radix */ + if (WARN_ON(!eb)) + return -EUCLEAN; + + if (WARN_ON(cur != btrfs_header_bytenr(eb))) { + free_extent_buffer(eb); + return -EUCLEAN; + } + if (WARN_ON(!uptodate)) { + free_extent_buffer(eb); + return -EUCLEAN; + } + + ret = csum_one_extent_buffer(eb); + free_extent_buffer(eb); + if (ret < 0) + return ret; + } + return ret; +} + /* * Checksum a dirty tree block before IO. This has extra checks to make sure * we only fill in the checksum field in the first page of a multi-page block. @@ -451,9 +519,10 @@ static int csum_dirty_buffer(struct btrfs_fs_info *fs_info, struct bio_vec *bvec struct page *page = bvec->bv_page; u64 start = page_offset(page); u64 found_start; - u8 result[BTRFS_CSUM_SIZE]; struct extent_buffer *eb; - int ret; + + if (fs_info->sectorsize < PAGE_SIZE) + return csum_dirty_subpage_buffers(fs_info, bvec); eb = (struct extent_buffer *)page->private; if (page != eb->pages[0]) @@ -475,28 +544,7 @@ static int csum_dirty_buffer(struct btrfs_fs_info *fs_info, struct bio_vec *bvec if (WARN_ON(!PageUptodate(page))) return -EUCLEAN; - ASSERT(memcmp_extent_buffer(eb, fs_info->fs_devices->metadata_uuid, - offsetof(struct btrfs_header, fsid), - BTRFS_FSID_SIZE) == 0); - - csum_tree_block(eb, result); - - if (btrfs_header_level(eb)) - ret = btrfs_check_node(eb); - else - ret = btrfs_check_leaf_full(eb); - - if (ret < 0) { - btrfs_print_tree(eb, 0); - btrfs_err(fs_info, - "block=%llu write time tree block corruption detected", - eb->start); - WARN_ON(IS_ENABLED(CONFIG_BTRFS_DEBUG)); - return ret; - } - write_extent_buffer(eb, result, 0, fs_info->csum_size); - - return 0; + return csum_one_extent_buffer(eb); } static int check_tree_block_fsid(struct extent_buffer *eb) From patchwork Mon Feb 22 06:33:50 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 12097943 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id F0788C433DB for ; Mon, 22 Feb 2021 06:35:40 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9CA4964EB4 for ; Mon, 22 Feb 2021 06:35:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229949AbhBVGfi (ORCPT ); Mon, 22 Feb 2021 01:35:38 -0500 Received: from mx2.suse.de ([195.135.220.15]:48794 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229943AbhBVGfi (ORCPT ); Mon, 22 Feb 2021 01:35:38 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1613975663; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=gqu16gpPs/SgfjfojO/0IZnMruFKZCCIaHY9dEYtH0E=; b=Ucfp989sL7eeTzxncNHy92upernoYHkjb0WKTWEDDmcD8FGbLd+k/+XBQjQOFAuyOgYaDP 8PiqtvBJdINfcuRQZjWiHvQoWunKT75nDfR6/nM19EHdkmIKVIOKQubDazHPeM5GxuCcIC mIB1nvfFK0kSLAeuvqjpAVyaXPahZwk= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id F38F5B120 for ; Mon, 22 Feb 2021 06:34:22 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH 05/12] btrfs: extent_io: make alloc_extent_buffer() check subpage dirty bitmap Date: Mon, 22 Feb 2021 14:33:50 +0800 Message-Id: <20210222063357.92930-6-wqu@suse.com> X-Mailer: git-send-email 2.30.0 In-Reply-To: <20210222063357.92930-1-wqu@suse.com> References: <20210222063357.92930-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org In alloc_extent_buffer(), we make sure that the newly allocated page is never dirty. This is fine for sector size == PAGE_SIZE case, but for subpage it's possible that one extent buffer in the page is dirty, thus the whole page is marked dirty, and could cause false alert. To support subpage, call btrfs_page_test_dirty() to handle both cases. Signed-off-by: Qu Wenruo --- fs/btrfs/extent_io.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 4dfb3ead1175..6637535d264d 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -5625,7 +5625,7 @@ struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info, btrfs_page_inc_eb_refs(fs_info, p); spin_unlock(&mapping->private_lock); - WARN_ON(PageDirty(p)); + WARN_ON(btrfs_page_test_dirty(fs_info, p, eb->start, eb->len)); eb->pages[i] = p; if (!PageUptodate(p)) uptodate = 0; From patchwork Mon Feb 22 06:33:51 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 12097945 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D8107C433E0 for ; Mon, 22 Feb 2021 06:35:41 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 8790E64EB4 for ; Mon, 22 Feb 2021 06:35:41 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229953AbhBVGfk (ORCPT ); Mon, 22 Feb 2021 01:35:40 -0500 Received: from mx2.suse.de ([195.135.220.15]:48796 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229518AbhBVGfi (ORCPT ); Mon, 22 Feb 2021 01:35:38 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1613975664; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=8Kk3TbH+JgHWK19p7qRETcPxctW6N+H/z4WpfIoAOJk=; b=TcnDKWODME+xHM71c861EMwgY3FxZXNlPxzmjZRQQTTAq0fpF0pcFkWPu1p3cuJvsnd5DY ZgdB8IehAGObSsBW830gGT0oG9gTaTpZBo9y4IKatvmGKwaHJi7n4L+gxdTr8yzm9UsriY Y5f4Vct5Of2r1x2O7kY0yVgbImlVVYg= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id BD1D2B121 for ; Mon, 22 Feb 2021 06:34:24 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH 06/12] btrfs: extent_io: make the page uptodate assert check to handle subpage Date: Mon, 22 Feb 2021 14:33:51 +0800 Message-Id: <20210222063357.92930-7-wqu@suse.com> X-Mailer: git-send-email 2.30.0 In-Reply-To: <20210222063357.92930-1-wqu@suse.com> References: <20210222063357.92930-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org There are quite some assert test on page uptodate in extent buffer write accessors. They ensure the destination page is already uptodate. This is fine for regular sector size case, but not for subpage case, as for subpage we only mark the page uptodate if the page contains no hole and all its extent buffers are uptodate. So instead of checking PageUptodate(), for subpage case we check the uptodate bitmap of btrfs_subpage structure. To make the check more elegant, introduce a helper, assert_eb_page_uptodate() to do the check for both subpage and regular sector size cases. The following functions are involved: - write_extent_buffer_chunk_tree_uuid() - write_extent_buffer_fsid() - write_extent_buffer() - memzero_extent_buffer() - copy_extent_buffer() - extent_buffer_test_bit() - extent_buffer_bitmap_set() - extent_buffer_bitmap_clear() Signed-off-by: Qu Wenruo --- fs/btrfs/extent_io.c | 42 ++++++++++++++++++++++++++++++++---------- 1 file changed, 32 insertions(+), 10 deletions(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 6637535d264d..13ba7a012425 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -6177,12 +6177,34 @@ int memcmp_extent_buffer(const struct extent_buffer *eb, const void *ptrv, return ret; } +/* + * A helper to ensure that the extent buffer is uptodate. + * + * For regular sector size == PAGE_SIZE case, check if @page is uptodate. + * For subpage case, check if the range covered by the eb has EXTENT_UPTODATE. + */ +static void assert_eb_page_uptodate(const struct extent_buffer *eb, + struct page *page) +{ + struct btrfs_fs_info *fs_info = eb->fs_info; + + if (fs_info->sectorsize < PAGE_SIZE) { + bool uptodate; + + uptodate = btrfs_subpage_test_uptodate(fs_info, page, + eb->start, eb->len); + WARN_ON(!uptodate); + } else { + WARN_ON(!PageUptodate(page)); + } +} + void write_extent_buffer_chunk_tree_uuid(const struct extent_buffer *eb, const void *srcv) { char *kaddr; - WARN_ON(!PageUptodate(eb->pages[0])); + assert_eb_page_uptodate(eb, eb->pages[0]); kaddr = page_address(eb->pages[0]) + get_eb_offset_in_page(eb, 0); memcpy(kaddr + offsetof(struct btrfs_header, chunk_tree_uuid), srcv, BTRFS_FSID_SIZE); @@ -6192,7 +6214,7 @@ void write_extent_buffer_fsid(const struct extent_buffer *eb, const void *srcv) { char *kaddr; - WARN_ON(!PageUptodate(eb->pages[0])); + assert_eb_page_uptodate(eb, eb->pages[0]); kaddr = page_address(eb->pages[0]) + get_eb_offset_in_page(eb, 0); memcpy(kaddr + offsetof(struct btrfs_header, fsid), srcv, BTRFS_FSID_SIZE); @@ -6217,7 +6239,7 @@ void write_extent_buffer(const struct extent_buffer *eb, const void *srcv, while (len > 0) { page = eb->pages[i]; - WARN_ON(!PageUptodate(page)); + assert_eb_page_uptodate(eb, page); cur = min(len, PAGE_SIZE - offset); kaddr = page_address(page); @@ -6246,7 +6268,7 @@ void memzero_extent_buffer(const struct extent_buffer *eb, unsigned long start, while (len > 0) { page = eb->pages[i]; - WARN_ON(!PageUptodate(page)); + assert_eb_page_uptodate(eb, page); cur = min(len, PAGE_SIZE - offset); kaddr = page_address(page); @@ -6304,7 +6326,7 @@ void copy_extent_buffer(const struct extent_buffer *dst, while (len > 0) { page = dst->pages[i]; - WARN_ON(!PageUptodate(page)); + assert_eb_page_uptodate(dst, page); cur = min(len, (unsigned long)(PAGE_SIZE - offset)); @@ -6366,7 +6388,7 @@ int extent_buffer_test_bit(const struct extent_buffer *eb, unsigned long start, eb_bitmap_offset(eb, start, nr, &i, &offset); page = eb->pages[i]; - WARN_ON(!PageUptodate(page)); + assert_eb_page_uptodate(eb, page); kaddr = page_address(page); return 1U & (kaddr[offset] >> (nr & (BITS_PER_BYTE - 1))); } @@ -6391,7 +6413,7 @@ void extent_buffer_bitmap_set(const struct extent_buffer *eb, unsigned long star eb_bitmap_offset(eb, start, pos, &i, &offset); page = eb->pages[i]; - WARN_ON(!PageUptodate(page)); + assert_eb_page_uptodate(eb, page); kaddr = page_address(page); while (len >= bits_to_set) { @@ -6402,7 +6424,7 @@ void extent_buffer_bitmap_set(const struct extent_buffer *eb, unsigned long star if (++offset >= PAGE_SIZE && len > 0) { offset = 0; page = eb->pages[++i]; - WARN_ON(!PageUptodate(page)); + assert_eb_page_uptodate(eb, page); kaddr = page_address(page); } } @@ -6434,7 +6456,7 @@ void extent_buffer_bitmap_clear(const struct extent_buffer *eb, eb_bitmap_offset(eb, start, pos, &i, &offset); page = eb->pages[i]; - WARN_ON(!PageUptodate(page)); + assert_eb_page_uptodate(eb, page); kaddr = page_address(page); while (len >= bits_to_clear) { @@ -6445,7 +6467,7 @@ void extent_buffer_bitmap_clear(const struct extent_buffer *eb, if (++offset >= PAGE_SIZE && len > 0) { offset = 0; page = eb->pages[++i]; - WARN_ON(!PageUptodate(page)); + assert_eb_page_uptodate(eb, page); kaddr = page_address(page); } } From patchwork Mon Feb 22 06:33:52 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 12097949 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id DC853C433E6 for ; Mon, 22 Feb 2021 06:35:55 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 91CAA64EB4 for ; Mon, 22 Feb 2021 06:35:55 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229991AbhBVGfy (ORCPT ); Mon, 22 Feb 2021 01:35:54 -0500 Received: from mx2.suse.de ([195.135.220.15]:48810 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229961AbhBVGfp (ORCPT ); Mon, 22 Feb 2021 01:35:45 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1613975666; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=4s77KBwjFZdumwy5kDJuMYYOMm7m2X1e+Q4BIFMulMo=; b=R9ImD69331M5pnR8ESBhNXr7u0T1STcxgHay6HjiMwCt5WlGDG5GinfR6RvSMfmm9cq0Al d+OgfwgRKD6ofCgOiBUvaULt+5GXacYmwhKmV0q4iKj1Kz5uOm50smjGyy6WkA743ARiQE e8PuWXllWQZYMd97CespYgdQ+9Q4Tq0= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id D524CB124 for ; Mon, 22 Feb 2021 06:34:26 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH 07/12] btrfs: extent_io: make set/clear_extent_buffer_dirty() to support subpage sized metadata Date: Mon, 22 Feb 2021 14:33:52 +0800 Message-Id: <20210222063357.92930-8-wqu@suse.com> X-Mailer: git-send-email 2.30.0 In-Reply-To: <20210222063357.92930-1-wqu@suse.com> References: <20210222063357.92930-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org For set_extent_buffer_dirty() to support subpage sized metadata, just call btrfs_page_set_dirty() to handle both cases. For clear_extent_buffer_dirty(), it needs to clear the page dirty if and only if all extent buffers in the page range are no longer dirty. Also do the same for page error. This is pretty different from the exist clear_extent_buffer_dirty() routine, so add a new helper function, clear_subpage_extent_buffer_dirty() to do this for subpage metadata. Also since the main part of clearing page dirty code is still the same, extract that into btree_clear_page_dirty() so that it can be utilized for both cases. But there is a special race between set_extent_buffer_dirty() and clear_extent_buffer_dirty(), where we can clear the page dirty. [POSSIBLE RACE WINDOW] For the race window between clear_subpage_extent_buffer_dirty() and set_extent_buffer_dirty(), due to the fact that we can't call clear_page_dirty_for_io() under subpage spin lock, we can race like below: T1 (eb1 in the same page) | T2 (eb2 in the same page) -------------------------------+------------------------------ set_extent_buffer_dirty() | clear_extent_buffer_dirty() |- was_dirty = false; | |- clear_subpagE_extent_buffer_dirty() | | |- btrfs_clear_and_test_dirty() | | | Since eb2 is the last dirty page | | | we got: | | | last == true; | | | |- btrfs_page_set_dirty() | | | We set the page dirty and | | | subpage dirty bitmap | | | | |- if (last) | | | Since we don't have subpage lock | | | hold, now @last is no longer | | | correct | | |- btree_clear_page_dirty() | | Now PageDirty == false, even we | | have dirty_bitmap not zero. |- ASSERT(PageDirty()); | ^^^^ CRASH The solution here is to also lock the eb->pages[0] for subpage case of set_extent_buffer_dirty(), to prevent racing with clear_extent_buffer_dirty(). Signed-off-by: Qu Wenruo --- fs/btrfs/extent_io.c | 65 ++++++++++++++++++++++++++++++++++++-------- 1 file changed, 53 insertions(+), 12 deletions(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 13ba7a012425..ea0089c8aefb 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -5774,28 +5774,51 @@ void free_extent_buffer_stale(struct extent_buffer *eb) release_extent_buffer(eb); } +static void btree_clear_page_dirty(struct page *page) +{ + ASSERT(PageDirty(page)); + ASSERT(PageLocked(page)); + clear_page_dirty_for_io(page); + xa_lock_irq(&page->mapping->i_pages); + if (!PageDirty(page)) + __xa_clear_mark(&page->mapping->i_pages, + page_index(page), PAGECACHE_TAG_DIRTY); + xa_unlock_irq(&page->mapping->i_pages); +} + +static void clear_subpage_extent_buffer_dirty(const struct extent_buffer *eb) +{ + struct btrfs_fs_info *fs_info = eb->fs_info; + struct page *page = eb->pages[0]; + bool last; + + /* btree_clear_page_dirty() needs page locked */ + lock_page(page); + last = btrfs_subpage_clear_and_test_dirty(fs_info, page, eb->start, + eb->len); + if (last) + btree_clear_page_dirty(page); + unlock_page(page); + WARN_ON(atomic_read(&eb->refs) == 0); +} + void clear_extent_buffer_dirty(const struct extent_buffer *eb) { int i; int num_pages; struct page *page; + if (eb->fs_info->sectorsize < PAGE_SIZE) + return clear_subpage_extent_buffer_dirty(eb); + num_pages = num_extent_pages(eb); for (i = 0; i < num_pages; i++) { page = eb->pages[i]; if (!PageDirty(page)) continue; - lock_page(page); - WARN_ON(!PagePrivate(page)); - - clear_page_dirty_for_io(page); - xa_lock_irq(&page->mapping->i_pages); - if (!PageDirty(page)) - __xa_clear_mark(&page->mapping->i_pages, - page_index(page), PAGECACHE_TAG_DIRTY); - xa_unlock_irq(&page->mapping->i_pages); + btree_clear_page_dirty(page); ClearPageError(page); unlock_page(page); } @@ -5816,10 +5839,28 @@ bool set_extent_buffer_dirty(struct extent_buffer *eb) WARN_ON(atomic_read(&eb->refs) == 0); WARN_ON(!test_bit(EXTENT_BUFFER_TREE_REF, &eb->bflags)); - if (!was_dirty) - for (i = 0; i < num_pages; i++) - set_page_dirty(eb->pages[i]); + if (!was_dirty) { + bool subpage = eb->fs_info->sectorsize < PAGE_SIZE; + /* + * For subpage case, we can have other extent buffers in the + * same page, and in clear_subpage_extent_buffer_dirty() we + * have to clear page dirty without subapge lock hold. + * This can cause race where our page gets dirty cleared after + * we just set it. + * + * Thankfully, clear_subpage_extent_buffer_dirty() has locked + * its page for other reasons, we can use page lock to + * prevent above race. + */ + if (subpage) + lock_page(eb->pages[0]); + for (i = 0; i < num_pages; i++) + btrfs_page_set_dirty(eb->fs_info, eb->pages[i], + eb->start, eb->len); + if (subpage) + unlock_page(eb->pages[0]); + } #ifdef CONFIG_BTRFS_DEBUG for (i = 0; i < num_pages; i++) ASSERT(PageDirty(eb->pages[i])); From patchwork Mon Feb 22 06:33:53 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 12097947 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1D743C433E0 for ; Mon, 22 Feb 2021 06:35:54 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id C464C64E76 for ; Mon, 22 Feb 2021 06:35:53 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229983AbhBVGfs (ORCPT ); Mon, 22 Feb 2021 01:35:48 -0500 Received: from mx2.suse.de ([195.135.220.15]:48812 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229518AbhBVGfp (ORCPT ); Mon, 22 Feb 2021 01:35:45 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1613975668; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=zesXryega3CuVqX2gk90EQ/m2Pj2HxJSfvgWMzSQkxw=; b=UUiOtIBAH5jkVbs4sGJDyNDqFNocQdFNYN1jyPtHgkWvpv+SrX8cL8KHs+9DqhVXNlv0gJ 9v84FVQc2wMa6Mkjcd7vX8ydNUZ98QfuMc3pxzCPwGN/zaG9Fu3Ws7N/2jBsAnFiO0yNF5 wfTxE3+lJeTu/mhqU+W2yYMRJcnCmMM= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id ADFF9B125 for ; Mon, 22 Feb 2021 06:34:28 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH 08/12] btrfs: extent_io: make set_btree_ioerr() accept extent buffer and handle subpage metadata Date: Mon, 22 Feb 2021 14:33:53 +0800 Message-Id: <20210222063357.92930-9-wqu@suse.com> X-Mailer: git-send-email 2.30.0 In-Reply-To: <20210222063357.92930-1-wqu@suse.com> References: <20210222063357.92930-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Current set_btree_ioerr() only accepts @page parameter and grabs extent buffer from page::private. This works fine for sector size == PAGE_SIZE case, but not for subpage case. Adds an extra parameter, @eb, for callers to pass extent buffer to this function, so that subpage code can reuse this function. And also add subpage special handling to update btrfs_subpage::error_bitmap. Signed-off-by: Qu Wenruo --- fs/btrfs/extent_io.c | 18 ++++++++---------- 1 file changed, 8 insertions(+), 10 deletions(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index ea0089c8aefb..96ac72d3f3a0 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -3972,12 +3972,11 @@ static noinline_for_stack int lock_extent_buffer_for_io(struct extent_buffer *eb return ret; } -static void set_btree_ioerr(struct page *page) +static void set_btree_ioerr(struct page *page, struct extent_buffer *eb) { - struct extent_buffer *eb = (struct extent_buffer *)page->private; - struct btrfs_fs_info *fs_info; + struct btrfs_fs_info *fs_info = eb->fs_info; - SetPageError(page); + btrfs_page_set_error(fs_info, page, eb->start, eb->len); if (test_and_set_bit(EXTENT_BUFFER_WRITE_ERR, &eb->bflags)) return; @@ -3985,7 +3984,6 @@ static void set_btree_ioerr(struct page *page) * If we error out, we should add back the dirty_metadata_bytes * to make it consistent. */ - fs_info = eb->fs_info; percpu_counter_add_batch(&fs_info->dirty_metadata_bytes, eb->len, fs_info->dirty_metadata_batch); @@ -4029,13 +4027,13 @@ static void set_btree_ioerr(struct page *page) */ switch (eb->log_index) { case -1: - set_bit(BTRFS_FS_BTREE_ERR, &eb->fs_info->flags); + set_bit(BTRFS_FS_BTREE_ERR, &fs_info->flags); break; case 0: - set_bit(BTRFS_FS_LOG1_ERR, &eb->fs_info->flags); + set_bit(BTRFS_FS_LOG1_ERR, &fs_info->flags); break; case 1: - set_bit(BTRFS_FS_LOG2_ERR, &eb->fs_info->flags); + set_bit(BTRFS_FS_LOG2_ERR, &fs_info->flags); break; default: BUG(); /* unexpected, logic error */ @@ -4060,7 +4058,7 @@ static void end_bio_extent_buffer_writepage(struct bio *bio) if (bio->bi_status || test_bit(EXTENT_BUFFER_WRITE_ERR, &eb->bflags)) { ClearPageUptodate(page); - set_btree_ioerr(page); + set_btree_ioerr(page, eb); } end_page_writeback(page); @@ -4116,7 +4114,7 @@ static noinline_for_stack int write_one_eb(struct extent_buffer *eb, end_bio_extent_buffer_writepage, 0, 0, 0, false); if (ret) { - set_btree_ioerr(p); + set_btree_ioerr(p, eb); if (PageWriteback(p)) end_page_writeback(p); if (atomic_sub_and_test(num_pages - i, &eb->io_pages)) From patchwork Mon Feb 22 06:33:54 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 12097951 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 40083C433E0 for ; Mon, 22 Feb 2021 06:35:56 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E6F6D64ED6 for ; Mon, 22 Feb 2021 06:35:55 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229995AbhBVGfy (ORCPT ); Mon, 22 Feb 2021 01:35:54 -0500 Received: from mx2.suse.de ([195.135.220.15]:48820 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229967AbhBVGfr (ORCPT ); Mon, 22 Feb 2021 01:35:47 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1613975670; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=TRWV6dtPSTUWKuuRZBYdjtXv/QFwGW80DdxKdvkotUw=; b=JmO04Tb1gOiowPOzpMRdLXhQsULqfOmMK3+YVAk5QU8/8yYgS29I84v+wmX5jgUgIaLKTE hfKb+TUiJo8AqrrM5g0ZTXsd9iw7eLNNkHnTDiHOZYTzG8i0lQZanrZR9iFvEwPR0DAP4f GwSx1d5KtUM1IOAhJa9H6SlgxkAK0+4= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id AF2ABB126 for ; Mon, 22 Feb 2021 06:34:30 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH 09/12] btrfs: extent_io: introduce end_bio_subpage_eb_writepage() function Date: Mon, 22 Feb 2021 14:33:54 +0800 Message-Id: <20210222063357.92930-10-wqu@suse.com> X-Mailer: git-send-email 2.30.0 In-Reply-To: <20210222063357.92930-1-wqu@suse.com> References: <20210222063357.92930-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org The new function, end_bio_subpage_eb_writepage(), will handle the metadata writeback endio. The major difference involved is: - How to grab extent buffer Now page::private is a pointer to btrfs_subpage, we can no longer grab extent buffer directly. Thus we need to use the bv_offset to locate the extent buffer manually and iterate through the whole range. - Use btrfs_subpage_end_writeback() caller This helper will handle the subpage writeback for us. Since this function is executed under endio context, when grabbing extent buffers it can't grab eb->refs_lock as that lock is not designed to be grabbed under hardirq context. So here introduce a helper, find_extent_buffer_nospinlock(), for such situation, and convert find_extent_buffer() to use that helper. Signed-off-by: Qu Wenruo --- fs/btrfs/extent_io.c | 135 +++++++++++++++++++++++++++++++++---------- 1 file changed, 106 insertions(+), 29 deletions(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 96ac72d3f3a0..d0afff0eb252 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -4040,13 +4040,97 @@ static void set_btree_ioerr(struct page *page, struct extent_buffer *eb) } } +/* + * This is the endio specific version which won't touch any unsafe spinlock + * in endio context. + */ +static struct extent_buffer *find_extent_buffer_nospinlock( + struct btrfs_fs_info *fs_info, u64 start) +{ + struct extent_buffer *eb; + + rcu_read_lock(); + eb = radix_tree_lookup(&fs_info->buffer_radix, + start >> fs_info->sectorsize_bits); + if (eb && atomic_inc_not_zero(&eb->refs)) { + rcu_read_unlock(); + return eb; + } + rcu_read_unlock(); + return NULL; +} +/* + * The endio function for subpage extent buffer write. + * + * Unlike end_bio_extent_buffer_writepage(), we only call end_page_writeback() + * after all extent buffers in the page has finished their writeback. + */ +static void end_bio_subpage_eb_writepage(struct btrfs_fs_info *fs_info, + struct bio *bio) +{ + struct bio_vec *bvec; + struct bvec_iter_all iter_all; + + ASSERT(!bio_flagged(bio, BIO_CLONED)); + bio_for_each_segment_all(bvec, bio, iter_all) { + struct page *page = bvec->bv_page; + u64 bvec_start = page_offset(page) + bvec->bv_offset; + u64 bvec_end = bvec_start + bvec->bv_len - 1; + u64 cur_bytenr = bvec_start; + + ASSERT(IS_ALIGNED(bvec->bv_len, fs_info->nodesize)); + + /* Iterate through all extent buffers in the range */ + while (cur_bytenr <= bvec_end) { + struct extent_buffer *eb; + int done; + + /* + * Here we can't use find_extent_buffer(), as it may + * try to lock eb->refs_lock, which is not safe in endio + * context. + */ + eb = find_extent_buffer_nospinlock(fs_info, cur_bytenr); + ASSERT(eb); + + cur_bytenr = eb->start + eb->len; + + ASSERT(test_bit(EXTENT_BUFFER_WRITEBACK, &eb->bflags)); + done = atomic_dec_and_test(&eb->io_pages); + ASSERT(done); + + if (bio->bi_status || + test_bit(EXTENT_BUFFER_WRITE_ERR, &eb->bflags)) { + ClearPageUptodate(page); + set_btree_ioerr(page, eb); + } + + btrfs_subpage_clear_writeback(fs_info, page, eb->start, + eb->len); + end_extent_buffer_writeback(eb); + /* + * free_extent_buffer() will grab spinlock which is not + * safe in endio context. Thus here we manually dec + * the ref. + */ + atomic_dec(&eb->refs); + } + } + bio_put(bio); +} + static void end_bio_extent_buffer_writepage(struct bio *bio) { + struct btrfs_fs_info *fs_info; struct bio_vec *bvec; struct extent_buffer *eb; int done; struct bvec_iter_all iter_all; + fs_info = btrfs_sb(bio_first_page_all(bio)->mapping->host->i_sb); + if (fs_info->sectorsize < PAGE_SIZE) + return end_bio_subpage_eb_writepage(fs_info, bio); + ASSERT(!bio_flagged(bio, BIO_CLONED)); bio_for_each_segment_all(bvec, bio, iter_all) { struct page *page = bvec->bv_page; @@ -5427,36 +5511,29 @@ struct extent_buffer *find_extent_buffer(struct btrfs_fs_info *fs_info, { struct extent_buffer *eb; - rcu_read_lock(); - eb = radix_tree_lookup(&fs_info->buffer_radix, - start >> fs_info->sectorsize_bits); - if (eb && atomic_inc_not_zero(&eb->refs)) { - rcu_read_unlock(); - /* - * Lock our eb's refs_lock to avoid races with - * free_extent_buffer. When we get our eb it might be flagged - * with EXTENT_BUFFER_STALE and another task running - * free_extent_buffer might have seen that flag set, - * eb->refs == 2, that the buffer isn't under IO (dirty and - * writeback flags not set) and it's still in the tree (flag - * EXTENT_BUFFER_TREE_REF set), therefore being in the process - * of decrementing the extent buffer's reference count twice. - * So here we could race and increment the eb's reference count, - * clear its stale flag, mark it as dirty and drop our reference - * before the other task finishes executing free_extent_buffer, - * which would later result in an attempt to free an extent - * buffer that is dirty. - */ - if (test_bit(EXTENT_BUFFER_STALE, &eb->bflags)) { - spin_lock(&eb->refs_lock); - spin_unlock(&eb->refs_lock); - } - mark_extent_buffer_accessed(eb, NULL); - return eb; + eb = find_extent_buffer_nospinlock(fs_info, start); + if (!eb) + return NULL; + /* + * Lock our eb's refs_lock to avoid races with free_extent_buffer(). + * When we get our eb it might be flagged with EXTENT_BUFFER_STALE and + * another task running free_extent_buffer() might have seen that flag + * set, eb->refs == 2, that the buffer isn't under IO (dirty and + * writeback flags not set) and it's still in the tree (flag + * EXTENT_BUFFER_TREE_REF set), therefore being in the process + * of decrementing the extent buffer's reference count twice. + * So here we could race and increment the eb's reference count, + * clear its stale flag, mark it as dirty and drop our reference + * before the other task finishes executing free_extent_buffer, + * which would later result in an attempt to free an extent + * buffer that is dirty. + */ + if (test_bit(EXTENT_BUFFER_STALE, &eb->bflags)) { + spin_lock(&eb->refs_lock); + spin_unlock(&eb->refs_lock); } - rcu_read_unlock(); - - return NULL; + mark_extent_buffer_accessed(eb, NULL); + return eb; } #ifdef CONFIG_BTRFS_FS_RUN_SANITY_TESTS From patchwork Mon Feb 22 06:33:55 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 12097953 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id DBEB4C433E9 for ; Mon, 22 Feb 2021 06:35:57 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9467764E76 for ; Mon, 22 Feb 2021 06:35:57 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229999AbhBVGf4 (ORCPT ); Mon, 22 Feb 2021 01:35:56 -0500 Received: from mx2.suse.de ([195.135.220.15]:48818 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229966AbhBVGfr (ORCPT ); Mon, 22 Feb 2021 01:35:47 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1613975673; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=CAemwNuTMOLE3xNCcYQwyeX7JZLs7sR5/WFUAAzd4/s=; b=KzvvuOJRvJftjIcdXd9BO97dRFTpWULdi0vogrApYSh2+93GYckU+vwC5Te+TRssOeqyTU TiDE0AcY/QA/dIaOUFQeo+gf5tCq38qnuSGCZPwkVd3lcR7Bet9+RE1svKNadx90qembvh xnj8/wTvCxXoDT9aIhqmtbArZ8mikYA= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 4A58BB128 for ; Mon, 22 Feb 2021 06:34:33 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH 10/12] btrfs: extent_io: introduce write_one_subpage_eb() function Date: Mon, 22 Feb 2021 14:33:55 +0800 Message-Id: <20210222063357.92930-11-wqu@suse.com> X-Mailer: git-send-email 2.30.0 In-Reply-To: <20210222063357.92930-1-wqu@suse.com> References: <20210222063357.92930-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org The new function, write_one_subpage_eb(), as a subroutine for subpage metadata write, will handle the extent buffer bio submission. The main difference between the new write_one_subpage_eb() and write_one_eb() is: - No page locking When entering write_one_subpage_eb() the page is no longer locked. We only lock the page for its status update, and unlock immediately. Now we completely rely on extent io tree locking. - Extra bitmap update along with page status update Now page dirty and writeback is controlled by btrfs_subpage dirty_bitmap and writeback_bitmap. They both follow the schema that any sector is dirty/writeback, then the full page get dirty/writeback. - When to update the nr_written number Now we take a short cut, if we have cleared the last dirty bit of the page, we update nr_written. This is not completely perfect, but should emulate the old behavior good enough. Signed-off-by: Qu Wenruo --- fs/btrfs/extent_io.c | 55 ++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 55 insertions(+) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index d0afff0eb252..ea603c1f2994 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -4156,6 +4156,58 @@ static void end_bio_extent_buffer_writepage(struct bio *bio) bio_put(bio); } +/* + * Unlike the work in write_one_eb(), we rely completely on extent locking. + * Page locking is only utizlied at minimal to keep the VM code happy. + * + * Caller should still call write_one_eb() other than this function directly. + * As write_one_eb() has extra prepration before submitting the extent buffer. + */ +static int write_one_subpage_eb(struct extent_buffer *eb, + struct writeback_control *wbc, + struct extent_page_data *epd) +{ + struct btrfs_fs_info *fs_info = eb->fs_info; + struct page *page = eb->pages[0]; + unsigned int write_flags = wbc_to_write_flags(wbc) | REQ_META; + bool no_dirty_ebs = false; + int ret; + + /* clear_page_dirty_for_io() in subpage helper need page locked. */ + lock_page(page); + btrfs_subpage_set_writeback(fs_info, page, eb->start, eb->len); + + /* If we're the last dirty bit to update nr_written */ + no_dirty_ebs = btrfs_subpage_clear_and_test_dirty(fs_info, page, + eb->start, eb->len); + if (no_dirty_ebs) + clear_page_dirty_for_io(page); + + ret = submit_extent_page(REQ_OP_WRITE | write_flags, wbc, page, + eb->start, eb->len, eb->start - page_offset(page), + &epd->bio, end_bio_extent_buffer_writepage, 0, 0, 0, + false); + if (ret) { + btrfs_subpage_clear_writeback(fs_info, page, eb->start, + eb->len); + set_btree_ioerr(page, eb); + unlock_page(page); + + if (atomic_dec_and_test(&eb->io_pages)) + end_extent_buffer_writeback(eb); + return -EIO; + } + unlock_page(page); + /* + * Submission finishes without problem, if no range of the page is + * dirty anymore, we have submitted a page. + * Update the nr_written in wbc. + */ + if (no_dirty_ebs) + update_nr_written(wbc, 1); + return ret; +} + static noinline_for_stack int write_one_eb(struct extent_buffer *eb, struct writeback_control *wbc, struct extent_page_data *epd) @@ -4187,6 +4239,9 @@ static noinline_for_stack int write_one_eb(struct extent_buffer *eb, memzero_extent_buffer(eb, start, end - start); } + if (eb->fs_info->sectorsize < PAGE_SIZE) + return write_one_subpage_eb(eb, wbc, epd); + for (i = 0; i < num_pages; i++) { struct page *p = eb->pages[i]; From patchwork Mon Feb 22 06:33:56 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 12097955 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id AA147C433E0 for ; Mon, 22 Feb 2021 06:36:00 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 62FA664EB4 for ; Mon, 22 Feb 2021 06:36:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230006AbhBVGf6 (ORCPT ); Mon, 22 Feb 2021 01:35:58 -0500 Received: from mx2.suse.de ([195.135.220.15]:48824 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229989AbhBVGft (ORCPT ); Mon, 22 Feb 2021 01:35:49 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1613975675; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=vvuUuzSKfj/jx0OJYHxsZG+rzK+hrgLUfMHDGvrqel8=; b=mXE7c6GxLlvPaRZCLrWPnmCeFKIpgC91I3OE9Sihv5OsMZQgfHBEkmfhXNs6kVyfZi0JIP HthyJInUhylfQP52cJ4ZJZFErPLe8BLn6Nl7buKYBaQTnCbSjMhEsBQpDQZUaG+XR10Xve Sgmi5VVAdsoU/D4KhHwey1nsp2c1X28= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 671E1B12A for ; Mon, 22 Feb 2021 06:34:35 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH 11/12] btrfs: extent_io: make lock_extent_buffer_for_io() to support subpage metadata Date: Mon, 22 Feb 2021 14:33:56 +0800 Message-Id: <20210222063357.92930-12-wqu@suse.com> X-Mailer: git-send-email 2.30.0 In-Reply-To: <20210222063357.92930-1-wqu@suse.com> References: <20210222063357.92930-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org For subpage metadata, we don't use page locking at all. So just skip the page locking part for subpage. All the remaining routine can be reused. Signed-off-by: Qu Wenruo --- fs/btrfs/extent_io.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index ea603c1f2994..edfca14a158e 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -3927,7 +3927,13 @@ static noinline_for_stack int lock_extent_buffer_for_io(struct extent_buffer *eb btrfs_tree_unlock(eb); - if (!ret) + /* + * Either we don't need to submit any tree block, or we're submitting + * subpage. + * Subpage metadata doesn't use page locking at all, so we can skip + * the page locking. + */ + if (!ret || fs_info->sectorsize < PAGE_SIZE) return ret; num_pages = num_extent_pages(eb); From patchwork Mon Feb 22 06:33:57 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 12097957 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B4908C433DB for ; Mon, 22 Feb 2021 06:36:01 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 86EB364EB4 for ; Mon, 22 Feb 2021 06:36:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230014AbhBVGgA (ORCPT ); Mon, 22 Feb 2021 01:36:00 -0500 Received: from mx2.suse.de ([195.135.220.15]:48822 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229987AbhBVGft (ORCPT ); Mon, 22 Feb 2021 01:35:49 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1613975677; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Iva4/kpHQOKTgngGUiAe4tEuGw/7aB2OVt2YUrVq7E0=; b=M15r+LDqgm/TfJNezP2oOChPCltySpYcEyCEG+2oQZvVqLI7fmFhL4Z/tqYXBl+iZoi+VC UlL85HUobjTwu5f5Ylm6Of5euc38rb1uKnRPmm7XQaCmn0WicopsAxK8IyEPlECgedkoO1 hCfAVe7vYvWkqWHOGFM4PrvfhOa2fWA= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 39E1FB12B for ; Mon, 22 Feb 2021 06:34:37 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH 12/12] btrfs: extent_io: introduce submit_eb_subpage() to submit a subpage metadata page Date: Mon, 22 Feb 2021 14:33:57 +0800 Message-Id: <20210222063357.92930-13-wqu@suse.com> X-Mailer: git-send-email 2.30.0 In-Reply-To: <20210222063357.92930-1-wqu@suse.com> References: <20210222063357.92930-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org The new function, submit_eb_subpage(), will submit all the dirty extent buffers in the page. Signed-off-by: Qu Wenruo --- fs/btrfs/extent_io.c | 95 ++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 95 insertions(+) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index edfca14a158e..191bd47c04e0 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -4283,6 +4283,98 @@ static noinline_for_stack int write_one_eb(struct extent_buffer *eb, return ret; } +/* + * Submit one subpage btree page. + * + * The main difference between submit_eb_page() is: + * - Page locking + * For subpage, we don't rely on page locking at all. + * + * - Flush write bio + * We only flush bio if we may be unable to fit current extent buffers into + * current bio. + * + * Return >=0 for the number of submitted extent buffers. + * Return <0 for fatal error. + */ +static int submit_eb_subpage(struct page *page, + struct writeback_control *wbc, + struct extent_page_data *epd) +{ + struct btrfs_fs_info *fs_info = btrfs_sb(page->mapping->host->i_sb); + int submitted = 0; + u64 page_start = page_offset(page); + int bit_start = 0; + int nbits = BTRFS_SUBPAGE_BITMAP_SIZE; + int sectors_per_node = fs_info->nodesize >> fs_info->sectorsize_bits; + int ret; + + /* Lock and write each dirty extent buffers in the range */ + while (bit_start < nbits) { + struct btrfs_subpage *subpage = (struct btrfs_subpage *)page->private; + struct extent_buffer *eb; + unsigned long flags; + u64 start; + + /* + * Take private lock to ensure the subpage won't be detached + * halfway. + */ + spin_lock(&page->mapping->private_lock); + if (!PagePrivate(page)) { + spin_unlock(&page->mapping->private_lock); + break; + } + spin_lock_irqsave(&subpage->lock, flags); + if (!((1 << bit_start) & subpage->dirty_bitmap)) { + spin_unlock_irqrestore(&subpage->lock, flags); + spin_unlock(&page->mapping->private_lock); + bit_start++; + continue; + } + + start = page_start + bit_start * fs_info->sectorsize; + bit_start += sectors_per_node; + + /* + * Here we just want to grab the eb without touching extra + * spin locks. So here we call find_extent_buffer_nospinlock(). + */ + eb = find_extent_buffer_nospinlock(fs_info, start); + spin_unlock_irqrestore(&subpage->lock, flags); + spin_unlock(&page->mapping->private_lock); + + /* + * The eb has already reached 0 refs thus find_extent_buffer() + * doesn't return it. We don't need to write back such eb + * anyway. + */ + if (!eb) + continue; + + ret = lock_extent_buffer_for_io(eb, epd); + if (ret == 0) { + free_extent_buffer(eb); + continue; + } + if (ret < 0) { + free_extent_buffer(eb); + goto cleanup; + } + ret = write_one_eb(eb, wbc, epd); + free_extent_buffer(eb); + if (ret < 0) + goto cleanup; + submitted++; + } + return submitted; + +cleanup: + /* We hit error, end bio for the submitted extent buffers */ + end_write_bio(epd, ret); + return ret; +} + /* * Submit all page(s) of one extent buffer. * @@ -4315,6 +4407,9 @@ static int submit_eb_page(struct page *page, struct writeback_control *wbc, if (!PagePrivate(page)) return 0; + if (btrfs_sb(page->mapping->host->i_sb)->sectorsize < PAGE_SIZE) + return submit_eb_subpage(page, wbc, epd); + spin_lock(&mapping->private_lock); if (!PagePrivate(page)) { spin_unlock(&mapping->private_lock);