From patchwork Sat Jan 16 07:15:16 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 12024639 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 14DB3C4332B for ; Sat, 16 Jan 2021 07:16:59 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id BE94D23AC4 for ; Sat, 16 Jan 2021 07:16:58 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726518AbhAPHQ2 (ORCPT ); Sat, 16 Jan 2021 02:16:28 -0500 Received: from mx2.suse.de ([195.135.220.15]:55940 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726391AbhAPHQ2 (ORCPT ); Sat, 16 Jan 2021 02:16:28 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1610781341; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=jT2FlbFb5n2kN9losXqoiv1O32/bpQbWYnwrcDXeOlk=; b=HZb3cm5Qajqijp1i85+Frto5Ssurzsl4uYQRDphTWoTdNO0FRay6Jn4RIVUubLD057+GIA ehQxj/GWQu0T8xTvjE206zflw+zbP+fGKqjXyp+dzrKyCdwBxqxV2NHLhw5AUtjXFUEN4J bpR6r//qOtfllp/G37c21167xtUmanw= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 825D6AC63 for ; Sat, 16 Jan 2021 07:15:41 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org 
Subject: [PATCH v4 01/18] btrfs: update locked page dirty/writeback/error bits in __process_pages_contig()
Date: Sat, 16 Jan 2021 15:15:16 +0800
Message-Id: <20210116071533.105780-2-wqu@suse.com>
X-Mailer: git-send-email 2.30.0
In-Reply-To: <20210116071533.105780-1-wqu@suse.com>
References: <20210116071533.105780-1-wqu@suse.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org

When __process_pages_contig() gets called for
extent_clear_unlock_delalloc(), if we hit the locked page, only the
Private2 bit is updated, but the dirty/writeback/error bits are all
skipped.

Several call sites call extent_clear_unlock_delalloc() with
@locked_page and PAGE_CLEAR_DIRTY/PAGE_SET_WRITEBACK/PAGE_END_WRITEBACK:

- cow_file_range()
- run_delalloc_nocow()
- cow_file_range_async()
  All in their error handling branches.

For those call sites, since we skip the locked page for
dirty/error/writeback bit updates, the locked page will still have its
dirty bit set.

Thankfully, since all those call sites can only be hit with various
serious errors, it's pretty hard to hit and shouldn't affect regular
btrfs operations.

But still, we shouldn't leave the locked_page with its
dirty/error/writeback bits untouched.

Fix this by only skipping lock/unlock page operations for locked_page.
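
The intended ordering can be modelled in userspace roughly as below.
This is only a sketch: struct toy_page, the OP_* flags and
toy_process_pages() are hypothetical stand-ins, not kernel APIs, and
page references and error bits are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the page_ops flags; not the kernel macros. */
enum {
	OP_UNLOCK        = 1 << 0,
	OP_CLEAR_DIRTY   = 1 << 1,
	OP_SET_WRITEBACK = 1 << 2,
	OP_END_WRITEBACK = 1 << 3,
};

/* Toy page: only the state bits this example cares about. */
struct toy_page {
	bool locked;
	bool dirty;
	bool writeback;
};

/*
 * Apply @ops to every page. The locked page gets the same dirty/writeback
 * updates as the others; only its lock state is left to the caller.
 */
void toy_process_pages(struct toy_page *pages, size_t nr,
		       const struct toy_page *locked_page,
		       unsigned long ops)
{
	for (size_t i = 0; i < nr; i++) {
		struct toy_page *p = &pages[i];

		if (ops & OP_CLEAR_DIRTY)
			p->dirty = false;
		if (ops & OP_SET_WRITEBACK)
			p->writeback = true;
		if (ops & OP_END_WRITEBACK)
			p->writeback = false;

		/* The skip happens only after the state bits are updated. */
		if (locked_page && p == locked_page)
			continue;

		if (ops & OP_UNLOCK)
			p->locked = false;
	}
}
```

With the skip placed first, as before this patch, the locked page
would keep its dirty flag in this model as well.
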
Signed-off-by: Qu Wenruo
---
 fs/btrfs/extent_io.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 7f689ad7709c..3442f1746683 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -1970,11 +1970,6 @@ static int __process_pages_contig(struct address_space *mapping,
 			if (page_ops & PAGE_SET_PRIVATE2)
 				SetPagePrivate2(pages[i]);
 
-			if (locked_page && pages[i] == locked_page) {
-				put_page(pages[i]);
-				pages_processed++;
-				continue;
-			}
 			if (page_ops & PAGE_CLEAR_DIRTY)
 				clear_page_dirty_for_io(pages[i]);
 			if (page_ops & PAGE_SET_WRITEBACK)
@@ -1983,6 +1978,11 @@ static int __process_pages_contig(struct address_space *mapping,
 				SetPageError(pages[i]);
 			if (page_ops & PAGE_END_WRITEBACK)
 				end_page_writeback(pages[i]);
+			if (locked_page && pages[i] == locked_page) {
+				put_page(pages[i]);
+				pages_processed++;
+				continue;
+			}
 			if (page_ops & PAGE_UNLOCK)
 				unlock_page(pages[i]);
 			if (page_ops & PAGE_LOCK) {

From patchwork Sat Jan 16 07:15:17 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Qu Wenruo
X-Patchwork-Id: 12024635
From: Qu Wenruo
To: linux-btrfs@vger.kernel.org
Subject: [PATCH v4 02/18] btrfs: merge PAGE_CLEAR_DIRTY and PAGE_SET_WRITEBACK into PAGE_START_WRITEBACK
Date: Sat, 16 Jan 2021 15:15:17 +0800
Message-Id: <20210116071533.105780-3-wqu@suse.com>
X-Mailer: git-send-email 2.30.0
In-Reply-To: <20210116071533.105780-1-wqu@suse.com>
References: <20210116071533.105780-1-wqu@suse.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org

PAGE_CLEAR_DIRTY and PAGE_SET_WRITEBACK are two macros used in
__process_pages_contig(), to inform the function to clear page dirty and
then set page writeback.

However, page writeback and dirty are two conflicting states (at least
for the sector size == PAGE_SIZE case), which means those two macros are
always used together.

This means we can merge PAGE_CLEAR_DIRTY and PAGE_SET_WRITEBACK into one
macro, PAGE_START_WRITEBACK.
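
As a rough userspace illustration of why one flag is enough: the
TOY_* names and struct toy_page_state below are hypothetical, and the
real helpers (clear_page_dirty_for_io(), set_page_writeback()) do far
more than flip two booleans.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace copies of the merged flag layout. */
#define TOY_PAGE_UNLOCK          (1UL << 0)
#define TOY_PAGE_START_WRITEBACK (1UL << 1)
#define TOY_PAGE_END_WRITEBACK   (1UL << 2)

struct toy_page_state {
	bool dirty;
	bool writeback;
};

/* Clearing dirty and setting writeback always travel together. */
void toy_apply_writeback_ops(struct toy_page_state *p, unsigned long ops)
{
	if (ops & TOY_PAGE_START_WRITEBACK) {
		p->dirty = false;
		p->writeback = true;
	}
	if (ops & TOY_PAGE_END_WRITEBACK)
		p->writeback = false;
}
```

In this model a caller can no longer request "set writeback" while
leaving the page dirty, which is the state combination the merged
macro rules out.
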
Signed-off-by: Qu Wenruo
Reviewed-by: Josef Bacik
---
 fs/btrfs/extent_io.c |  4 ++--
 fs/btrfs/extent_io.h | 12 ++++++------
 fs/btrfs/inode.c     | 28 ++++++++++------------------
 3 files changed, 18 insertions(+), 26 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 3442f1746683..a816ba4a8537 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -1970,10 +1970,10 @@ static int __process_pages_contig(struct address_space *mapping,
 			if (page_ops & PAGE_SET_PRIVATE2)
 				SetPagePrivate2(pages[i]);
 
-			if (page_ops & PAGE_CLEAR_DIRTY)
+			if (page_ops & PAGE_START_WRITEBACK) {
 				clear_page_dirty_for_io(pages[i]);
-			if (page_ops & PAGE_SET_WRITEBACK)
 				set_page_writeback(pages[i]);
+			}
 			if (page_ops & PAGE_SET_ERROR)
 				SetPageError(pages[i]);
 			if (page_ops & PAGE_END_WRITEBACK)
diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
index 19221095c635..bedf761a0300 100644
--- a/fs/btrfs/extent_io.h
+++ b/fs/btrfs/extent_io.h
@@ -35,12 +35,12 @@ enum {
 
 /* these are flags for __process_pages_contig */
 #define PAGE_UNLOCK		(1 << 0)
-#define PAGE_CLEAR_DIRTY	(1 << 1)
-#define PAGE_SET_WRITEBACK	(1 << 2)
-#define PAGE_END_WRITEBACK	(1 << 3)
-#define PAGE_SET_PRIVATE2	(1 << 4)
-#define PAGE_SET_ERROR		(1 << 5)
-#define PAGE_LOCK		(1 << 6)
+/* This one will clear page dirty and then set page writeback */
+#define PAGE_START_WRITEBACK	(1 << 1)
+#define PAGE_END_WRITEBACK	(1 << 2)
+#define PAGE_SET_PRIVATE2	(1 << 3)
+#define PAGE_SET_ERROR		(1 << 4)
+#define PAGE_LOCK		(1 << 5)
 
 /*
  * page->private values.
Every page that is controlled by the extent diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index ef6cb7b620d0..1ab5cb89c530 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -692,8 +692,7 @@ static noinline int compress_file_range(struct async_chunk *async_chunk) NULL, clear_flags, PAGE_UNLOCK | - PAGE_CLEAR_DIRTY | - PAGE_SET_WRITEBACK | + PAGE_START_WRITEBACK | page_error_op | PAGE_END_WRITEBACK); @@ -934,8 +933,7 @@ static noinline void submit_compressed_extents(struct async_chunk *async_chunk) async_extent->start + async_extent->ram_size - 1, NULL, EXTENT_LOCKED | EXTENT_DELALLOC, - PAGE_UNLOCK | PAGE_CLEAR_DIRTY | - PAGE_SET_WRITEBACK); + PAGE_UNLOCK | PAGE_START_WRITEBACK); if (btrfs_submit_compressed_write(inode, async_extent->start, async_extent->ram_size, ins.objectid, @@ -971,9 +969,8 @@ static noinline void submit_compressed_extents(struct async_chunk *async_chunk) NULL, EXTENT_LOCKED | EXTENT_DELALLOC | EXTENT_DELALLOC_NEW | EXTENT_DEFRAG | EXTENT_DO_ACCOUNTING, - PAGE_UNLOCK | PAGE_CLEAR_DIRTY | - PAGE_SET_WRITEBACK | PAGE_END_WRITEBACK | - PAGE_SET_ERROR); + PAGE_UNLOCK | PAGE_START_WRITEBACK | + PAGE_END_WRITEBACK | PAGE_SET_ERROR); free_async_extent_pages(async_extent); kfree(async_extent); goto again; @@ -1071,8 +1068,7 @@ static noinline int cow_file_range(struct btrfs_inode *inode, EXTENT_LOCKED | EXTENT_DELALLOC | EXTENT_DELALLOC_NEW | EXTENT_DEFRAG | EXTENT_DO_ACCOUNTING, PAGE_UNLOCK | - PAGE_CLEAR_DIRTY | PAGE_SET_WRITEBACK | - PAGE_END_WRITEBACK); + PAGE_START_WRITEBACK | PAGE_END_WRITEBACK); *nr_written = *nr_written + (end - start + PAGE_SIZE) / PAGE_SIZE; *page_started = 1; @@ -1194,8 +1190,7 @@ static noinline int cow_file_range(struct btrfs_inode *inode, out_unlock: clear_bits = EXTENT_LOCKED | EXTENT_DELALLOC | EXTENT_DELALLOC_NEW | EXTENT_DEFRAG | EXTENT_CLEAR_META_RESV; - page_ops = PAGE_UNLOCK | PAGE_CLEAR_DIRTY | PAGE_SET_WRITEBACK | - PAGE_END_WRITEBACK; + page_ops = PAGE_UNLOCK | PAGE_START_WRITEBACK | PAGE_END_WRITEBACK; 
/* * If we reserved an extent for our delalloc range (or a subrange) and * failed to create the respective ordered extent, then it means that @@ -1320,9 +1315,8 @@ static int cow_file_range_async(struct btrfs_inode *inode, unsigned clear_bits = EXTENT_LOCKED | EXTENT_DELALLOC | EXTENT_DELALLOC_NEW | EXTENT_DEFRAG | EXTENT_DO_ACCOUNTING; - unsigned long page_ops = PAGE_UNLOCK | PAGE_CLEAR_DIRTY | - PAGE_SET_WRITEBACK | PAGE_END_WRITEBACK | - PAGE_SET_ERROR; + unsigned long page_ops = PAGE_UNLOCK | PAGE_START_WRITEBACK | + PAGE_END_WRITEBACK | PAGE_SET_ERROR; extent_clear_unlock_delalloc(inode, start, end, locked_page, clear_bits, page_ops); @@ -1519,8 +1513,7 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode, EXTENT_LOCKED | EXTENT_DELALLOC | EXTENT_DO_ACCOUNTING | EXTENT_DEFRAG, PAGE_UNLOCK | - PAGE_CLEAR_DIRTY | - PAGE_SET_WRITEBACK | + PAGE_START_WRITEBACK | PAGE_END_WRITEBACK); return -ENOMEM; } @@ -1842,8 +1835,7 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode, locked_page, EXTENT_LOCKED | EXTENT_DELALLOC | EXTENT_DEFRAG | EXTENT_DO_ACCOUNTING, PAGE_UNLOCK | - PAGE_CLEAR_DIRTY | - PAGE_SET_WRITEBACK | + PAGE_START_WRITEBACK | PAGE_END_WRITEBACK); btrfs_free_path(path); return ret; From patchwork Sat Jan 16 07:15:18 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 12024633 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B84AAC433E9 for ; Sat, 16 Jan 2021 
07:16:58 +0000 (UTC)
From: Qu Wenruo
To: linux-btrfs@vger.kernel.org
Cc: Josef Bacik
Subject: [PATCH v4 03/18] btrfs: introduce the skeleton of btrfs_subpage structure
Date: Sat, 16 Jan 2021 15:15:18 +0800
Message-Id: <20210116071533.105780-4-wqu@suse.com>
X-Mailer: git-send-email 2.30.0
In-Reply-To: <20210116071533.105780-1-wqu@suse.com>
References: <20210116071533.105780-1-wqu@suse.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org

For btrfs subpage support, we need a structure to record extra info for
the status of each sector of a page.

This patch introduces the skeleton structure for future btrfs subpage
support.
All subpage related code will go into subpage.[ch] to avoid polluting
the existing code base.
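
To give a feel for the per-sector tracking this structure is meant to
enable, here is a minimal userspace sketch. It assumes a 64K page and
a 4K sector size; toy_sector_mask() and the TOY_* constants are
hypothetical and are not part of this patch, which only adds the
skeleton.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SIZE	65536u	/* assumed 64K page */
#define TOY_SECTORSIZE	4096u	/* assumed 4K sector */

/*
 * Return a bitmap with one bit set per sector covered by the range
 * [offset, offset + len) inside a single page.
 */
uint16_t toy_sector_mask(uint32_t offset, uint32_t len)
{
	uint32_t first = offset / TOY_SECTORSIZE;
	uint32_t nbits = len / TOY_SECTORSIZE;

	assert(offset % TOY_SECTORSIZE == 0 && len % TOY_SECTORSIZE == 0);
	assert(len > 0 && offset + len <= TOY_PAGE_SIZE);

	/* Build the run of ones in 32 bits so that nbits == 16 is safe. */
	return (uint16_t)(((1u << nbits) - 1u) << first);
}
```

With these sizes a page holds 16 sectors, so each kind of per-sector
status fits in a single 16-bit word.
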
Reviewed-by: Josef Bacik
Signed-off-by: Qu Wenruo
---
 fs/btrfs/Makefile  |  3 ++-
 fs/btrfs/subpage.c | 39 +++++++++++++++++++++++++++++++++++++++
 fs/btrfs/subpage.h | 31 +++++++++++++++++++++++++++++++
 3 files changed, 72 insertions(+), 1 deletion(-)
 create mode 100644 fs/btrfs/subpage.c
 create mode 100644 fs/btrfs/subpage.h

diff --git a/fs/btrfs/Makefile b/fs/btrfs/Makefile
index 9f1b1a88e317..942562e11456 100644
--- a/fs/btrfs/Makefile
+++ b/fs/btrfs/Makefile
@@ -11,7 +11,8 @@ btrfs-y += super.o ctree.o extent-tree.o print-tree.o root-tree.o dir-item.o \
 	   compression.o delayed-ref.o relocation.o delayed-inode.o scrub.o \
 	   reada.o backref.o ulist.o qgroup.o send.o dev-replace.o raid56.o \
 	   uuid-tree.o props.o free-space-tree.o tree-checker.o space-info.o \
-	   block-rsv.o delalloc-space.o block-group.o discard.o reflink.o
+	   block-rsv.o delalloc-space.o block-group.o discard.o reflink.o \
+	   subpage.o
 
 btrfs-$(CONFIG_BTRFS_FS_POSIX_ACL) += acl.o
 btrfs-$(CONFIG_BTRFS_FS_CHECK_INTEGRITY) += check-integrity.o
diff --git a/fs/btrfs/subpage.c b/fs/btrfs/subpage.c
new file mode 100644
index 000000000000..c6ab32db3995
--- /dev/null
+++ b/fs/btrfs/subpage.c
@@ -0,0 +1,39 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include "subpage.h"
+
+int btrfs_attach_subpage(struct btrfs_fs_info *fs_info, struct page *page)
+{
+	struct btrfs_subpage *subpage;
+
+	/*
+	 * We have cases like a dummy extent buffer page, which is not
+	 * mapped and doesn't need to be locked.
+	 */
+	if (page->mapping)
+		ASSERT(PageLocked(page));
+	/* Either not subpage, or the page already has private attached */
+	if (fs_info->sectorsize == PAGE_SIZE || PagePrivate(page))
+		return 0;
+
+	subpage = kzalloc(sizeof(*subpage), GFP_NOFS);
+	if (!subpage)
+		return -ENOMEM;
+
+	spin_lock_init(&subpage->lock);
+	attach_page_private(page, subpage);
+	return 0;
+}
+
+void btrfs_detach_subpage(struct btrfs_fs_info *fs_info, struct page *page)
+{
+	struct btrfs_subpage *subpage;
+
+	/* Either not subpage, or already detached */
+	if (fs_info->sectorsize == PAGE_SIZE || !PagePrivate(page))
+		return;
+
+	subpage = (struct btrfs_subpage *)detach_page_private(page);
+	ASSERT(subpage);
+	kfree(subpage);
+}
diff --git a/fs/btrfs/subpage.h b/fs/btrfs/subpage.h
new file mode 100644
index 000000000000..96f3b226913e
--- /dev/null
+++ b/fs/btrfs/subpage.h
@@ -0,0 +1,31 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef BTRFS_SUBPAGE_H
+#define BTRFS_SUBPAGE_H
+
+#include
+#include "ctree.h"
+
+/*
+ * Since the maximum page size btrfs is going to support is 64K while the
+ * minimum sectorsize is 4K, this means a u16 bitmap is enough.
+ *
+ * The regular bitmap requires 32 bits as minimal bitmap size, so we can't use
+ * existing bitmap_* helpers here.
+ */
+#define BTRFS_SUBPAGE_BITMAP_SIZE	16
+
+/*
+ * Structure to trace status of each sector inside a page.
+ *
+ * Will be attached to page::private for both data and metadata inodes.
+ */
+struct btrfs_subpage {
+	/* Common members for both data and metadata pages */
+	spinlock_t lock;
+};
+
+int btrfs_attach_subpage(struct btrfs_fs_info *fs_info, struct page *page);
+void btrfs_detach_subpage(struct btrfs_fs_info *fs_info, struct page *page);
+
+#endif /* BTRFS_SUBPAGE_H */

From patchwork Sat Jan 16 07:15:19 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Qu Wenruo
X-Patchwork-Id: 12024637
From: Qu Wenruo
To: linux-btrfs@vger.kernel.org
Subject: [PATCH v4 04/18] btrfs: make attach_extent_buffer_page() to handle subpage case
Date: Sat, 16 Jan 2021 15:15:19 +0800
Message-Id: <20210116071533.105780-5-wqu@suse.com>
X-Mailer: git-send-email 2.30.0
In-Reply-To: <20210116071533.105780-1-wqu@suse.com>
References: <20210116071533.105780-1-wqu@suse.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org

For the subpage case, we need to allocate new memory for each metadata
page.

So we need to:

- Allow attach_extent_buffer_page() to return int
  To indicate allocation failure.

- Prealloc btrfs_subpage structure for alloc_extent_buffer()
  We don't want to allocate memory with a spinlock held, so do the
  preallocation before we acquire mapping->private_lock.

- Handle subpage and regular case differently in
  attach_extent_buffer_page()
  For the regular case, just do the usual thing.
  For the subpage case, allocate new memory or use the preallocated
  memory.

For future subpage metadata, we will make more use of the radix tree to
grab extent buffers.
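
The preallocation pattern can be sketched in userspace as below. This
is an analogy only: a pthread mutex stands in for
mapping->private_lock, calloc() for kzalloc(), and struct toy_page,
struct toy_private and toy_attach_private() are hypothetical names,
not kernel code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

struct toy_private {
	int initialized;
};

struct toy_page {
	pthread_mutex_t private_lock;	/* stands in for mapping->private_lock */
	struct toy_private *private;
};

/*
 * Allocate first, then take the lock. If the page already has private
 * data attached, the preallocated memory is simply freed.
 * Returns 0 on success, -1 if the preallocation failed.
 */
int toy_attach_private(struct toy_page *page)
{
	/* The allocation may fail or block, so it happens outside the lock. */
	struct toy_private *prealloc = calloc(1, sizeof(*prealloc));

	if (!prealloc)
		return -1;

	pthread_mutex_lock(&page->private_lock);
	if (page->private) {
		/* Already attached, drop the preallocation. */
		free(prealloc);
	} else {
		prealloc->initialized = 1;
		page->private = prealloc;
	}
	pthread_mutex_unlock(&page->private_lock);
	return 0;
}
```

Because the only step that can fail runs before the lock is taken, the
attach step inside the critical section cannot fail in this model.
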
Signed-off-by: Qu Wenruo --- fs/btrfs/extent_io.c | 75 ++++++++++++++++++++++++++++++++++++++------ fs/btrfs/subpage.h | 17 ++++++++++ 2 files changed, 82 insertions(+), 10 deletions(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index a816ba4a8537..320731487ac0 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -24,6 +24,7 @@ #include "rcu-string.h" #include "backref.h" #include "disk-io.h" +#include "subpage.h" static struct kmem_cache *extent_state_cache; static struct kmem_cache *extent_buffer_cache; @@ -3140,9 +3141,13 @@ static int submit_extent_page(unsigned int opf, return ret; } -static void attach_extent_buffer_page(struct extent_buffer *eb, - struct page *page) +static int attach_extent_buffer_page(struct extent_buffer *eb, + struct page *page, + struct btrfs_subpage *prealloc) { + struct btrfs_fs_info *fs_info = eb->fs_info; + int ret; + /* * If the page is mapped to btree inode, we should hold the private * lock to prevent race. @@ -3152,10 +3157,32 @@ static void attach_extent_buffer_page(struct extent_buffer *eb, if (page->mapping) lockdep_assert_held(&page->mapping->private_lock); - if (!PagePrivate(page)) - attach_page_private(page, eb); - else - WARN_ON(page->private != (unsigned long)eb); + if (fs_info->sectorsize == PAGE_SIZE) { + if (!PagePrivate(page)) + attach_page_private(page, eb); + else + WARN_ON(page->private != (unsigned long)eb); + return 0; + } + + /* Already mapped, just free prealloc */ + if (PagePrivate(page)) { + kfree(prealloc); + return 0; + } + + if (prealloc) { + /* Has preallocated memory for subpage */ + spin_lock_init(&prealloc->lock); + attach_page_private(page, prealloc); + } else { + /* Do new allocation to attach subpage */ + ret = btrfs_attach_subpage(fs_info, page); + if (ret < 0) + return ret; + } + + return 0; } void set_page_extent_mapped(struct page *page) @@ -5062,21 +5089,29 @@ struct extent_buffer *btrfs_clone_extent_buffer(const struct extent_buffer *src) if (new == NULL) return NULL; 
+ set_bit(EXTENT_BUFFER_UPTODATE, &new->bflags); + set_bit(EXTENT_BUFFER_UNMAPPED, &new->bflags); + for (i = 0; i < num_pages; i++) { + int ret; + p = alloc_page(GFP_NOFS); if (!p) { btrfs_release_extent_buffer(new); return NULL; } - attach_extent_buffer_page(new, p); + ret = attach_extent_buffer_page(new, p, NULL); + if (ret < 0) { + put_page(p); + btrfs_release_extent_buffer(new); + return NULL; + } WARN_ON(PageDirty(p)); SetPageUptodate(p); new->pages[i] = p; copy_page(page_address(p), page_address(src->pages[i])); } - set_bit(EXTENT_BUFFER_UPTODATE, &new->bflags); - set_bit(EXTENT_BUFFER_UNMAPPED, &new->bflags); return new; } @@ -5308,12 +5343,28 @@ struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info, num_pages = num_extent_pages(eb); for (i = 0; i < num_pages; i++, index++) { + struct btrfs_subpage *prealloc = NULL; + p = find_or_create_page(mapping, index, GFP_NOFS|__GFP_NOFAIL); if (!p) { exists = ERR_PTR(-ENOMEM); goto free_eb; } + /* + * Preallocate page->private for subpage case, so that + * we won't allocate memory with private_lock hold. + * The memory will be freed by attach_extent_buffer_page() or + * freed manually if exit earlier. 
+ */ + ret = btrfs_alloc_subpage(fs_info, &prealloc); + if (ret < 0) { + unlock_page(p); + put_page(p); + exists = ERR_PTR(ret); + goto free_eb; + } + spin_lock(&mapping->private_lock); exists = grab_extent_buffer(p); if (exists) { @@ -5321,10 +5372,14 @@ struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info, unlock_page(p); put_page(p); mark_extent_buffer_accessed(exists, p); + kfree(prealloc); goto free_eb; } - attach_extent_buffer_page(eb, p); + /* Should not fail, as we have preallocated the memory */ + ret = attach_extent_buffer_page(eb, p, prealloc); + ASSERT(!ret); spin_unlock(&mapping->private_lock); + WARN_ON(PageDirty(p)); eb->pages[i] = p; if (!PageUptodate(p)) diff --git a/fs/btrfs/subpage.h b/fs/btrfs/subpage.h index 96f3b226913e..f701256dd1e2 100644 --- a/fs/btrfs/subpage.h +++ b/fs/btrfs/subpage.h @@ -23,8 +23,25 @@ struct btrfs_subpage { /* Common members for both data and metadata pages */ spinlock_t lock; + union { + /* Structures only used by metadata */ + /* Structures only used by data */ + }; }; +/* For rare cases where we need to pre-allocate a btrfs_subpage structure */ +static inline int btrfs_alloc_subpage(struct btrfs_fs_info *fs_info, + struct btrfs_subpage **ret) +{ + if (fs_info->sectorsize == PAGE_SIZE) + return 0; + + *ret = kzalloc(sizeof(struct btrfs_subpage), GFP_NOFS); + if (!*ret) + return -ENOMEM; + return 0; +} + int btrfs_attach_subpage(struct btrfs_fs_info *fs_info, struct page *page); void btrfs_detach_subpage(struct btrfs_fs_info *fs_info, struct page *page); From patchwork Sat Jan 16 07:15:20 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 12024641 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, 
DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5D152C433E0 for ; Sat, 16 Jan 2021 07:17:30 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 02B9B23AC4 for ; Sat, 16 Jan 2021 07:17:29 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726680AbhAPHRN (ORCPT ); Sat, 16 Jan 2021 02:17:13 -0500 Received: from mx2.suse.de ([195.135.220.15]:56144 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726653AbhAPHRN (ORCPT ); Sat, 16 Jan 2021 02:17:13 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1610781354; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=dIUiWj736mPErI3yMq7kL8uX6l7OOGl8OPpmeFasdaI=; b=qaMchPwruLPMR/mGdESbxSkuxHJ0Od1toL5DdOc7aWcuEPAaaGEi4vWRleZ/aYCxVydDrH rIEf5yySnv89FeOr5YkpbLBuJE3rcscaMMAfQRd+akp2XOvdUvotkLKmBYYBXKG12vBREF CUBtcPhq8mmU1wYDd7aasYWXYWYuq2M= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 498B9B7F6 for ; Sat, 16 Jan 2021 07:15:54 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 05/18] btrfs: make grab_extent_buffer_from_page() to handle subpage case Date: Sat, 16 Jan 2021 15:15:20 +0800 Message-Id: <20210116071533.105780-6-wqu@suse.com> X-Mailer: git-send-email 2.30.0 In-Reply-To: <20210116071533.105780-1-wqu@suse.com> References: <20210116071533.105780-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: 
linux-btrfs@vger.kernel.org

For the subpage case, grab_extent_buffer() can't really get an extent
buffer just from btrfs_subpage.

Thankfully we have the radix tree lock protecting us from inserting the
same eb into the tree twice.

Thus we don't need the extra hassle; just let alloc_extent_buffer()
handle an existing eb in the radix tree.

Now if two ebs are being allocated at the same time, one will fail with
-EEXIST when inserting its eb into the radix tree.

So for grab_extent_buffer(), just always return NULL for the subpage
case.

Signed-off-by: Qu Wenruo
---
 fs/btrfs/extent_io.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 320731487ac0..b2f8ac5e9a9e 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -5282,10 +5282,19 @@ struct extent_buffer *alloc_test_extent_buffer(struct btrfs_fs_info *fs_info,
 }
 #endif
 
-static struct extent_buffer *grab_extent_buffer(struct page *page)
+static struct extent_buffer *grab_extent_buffer(
+		struct btrfs_fs_info *fs_info, struct page *page)
 {
 	struct extent_buffer *exists;
 
+	/*
+	 * For subpage case, we completely rely on radix tree to ensure we
+	 * don't try to insert two eb for the same bytenr.
+	 * So here we always return NULL and just continue.
+	 */
+	if (fs_info->sectorsize < PAGE_SIZE)
+		return NULL;
+
 	/* Page not yet attached to an extent buffer */
 	if (!PagePrivate(page))
 		return NULL;
@@ -5366,7 +5375,7 @@ struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info,
 		}
 
 		spin_lock(&mapping->private_lock);
-		exists = grab_extent_buffer(p);
+		exists = grab_extent_buffer(fs_info, p);
 		if (exists) {
 			spin_unlock(&mapping->private_lock);
 			unlock_page(p);

From patchwork Sat Jan 16 07:15:21 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Qu Wenruo
X-Patchwork-Id: 12024643
From: Qu Wenruo
To: linux-btrfs@vger.kernel.org
Subject: [PATCH v4 06/18] btrfs: support subpage for extent buffer page release
Date: Sat, 16 Jan 2021 15:15:21 +0800
Message-Id: <20210116071533.105780-7-wqu@suse.com>
X-Mailer: git-send-email 2.30.0
In-Reply-To: <20210116071533.105780-1-wqu@suse.com>
References: <20210116071533.105780-1-wqu@suse.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org

In btrfs_release_extent_buffer_pages(), we need to add extra handling
for subpage.

To do so, introduce a new helper, detach_extent_buffer_page(), to do
different handling for regular and subpage cases.

For the subpage case, the new trick is about when to detach the page
private.

For unmapped (dummy or cloned) ebs, we can detach the page private
immediately as the page can only be attached to one unmapped eb.

For mapped ebs, we have to ensure there is no eb in the page range
before we delete it, as page->private is shared between all ebs in the
same page.

But there is a subpage specific race, where we can race with extent
buffer allocation, and clear the page private while the new eb is still
being utilized, like this:

Extent buffer A is the new extent buffer which will be allocated,
while extent buffer B is the last existing extent buffer of the page.
  T1 (eb A)                    | T2 (eb B)
-------------------------------+------------------------------
alloc_extent_buffer()          | btrfs_release_extent_buffer_pages()
|- p = find_or_create_page()   | |
|- attach_extent_buffer_page() | |
|                              | |- detach_extent_buffer_page()
|                              |    |- if (!page_range_has_eb())
|                              |    |  No new eb in the page range yet
|                              |    |  As new eb A hasn't yet been
|                              |    |  inserted into radix tree.
|                              |    |- btrfs_detach_subpage()
|                              |       |- detach_page_private();
|- radix_tree_insert()         |

Then we have a metadata eb whose page has no private bit.

To avoid such a race, we introduce a subpage metadata specific member, btrfs_subpage::under_alloc. In alloc_extent_buffer() we set that bit within the critical section of private_lock, so that page_range_has_eb() will return true for detach_extent_buffer_page(), which then will not detach the page private.

New helpers are introduced to do the start/end work:
- btrfs_page_start_meta_alloc()
- btrfs_page_end_meta_alloc()

Signed-off-by: Qu Wenruo
---
 fs/btrfs/extent_io.c | 123 +++++++++++++++++++++++++++++++++++++------
 fs/btrfs/subpage.h   |  33 ++++++++++++
 2 files changed, 139 insertions(+), 17 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index b2f8ac5e9a9e..fb800f237099 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -4997,25 +4997,55 @@ int extent_buffer_under_io(const struct extent_buffer *eb) test_bit(EXTENT_BUFFER_DIRTY, &eb->bflags)); } -/* - * Release all pages attached to the extent buffer.
- */ -static void btrfs_release_extent_buffer_pages(struct extent_buffer *eb) +static bool page_range_has_eb(struct btrfs_fs_info *fs_info, + struct page *page) { - int i; - int num_pages; - int mapped = !test_bit(EXTENT_BUFFER_UNMAPPED, &eb->bflags); + struct extent_buffer *gang[BTRFS_SUBPAGE_BITMAP_SIZE]; + struct btrfs_subpage *subpage; + int ret; - BUG_ON(extent_buffer_under_io(eb)); + lockdep_assert_held(&fs_info->buffer_lock); + lockdep_assert_held(&page->mapping->private_lock); + ASSERT(PAGE_SIZE / fs_info->nodesize <= BTRFS_SUBPAGE_BITMAP_SIZE); - num_pages = num_extent_pages(eb); - for (i = 0; i < num_pages; i++) { - struct page *page = eb->pages[i]; + /* We have eb under allocation in the page */ + if (PagePrivate(page)) { + subpage = (struct btrfs_subpage *)page->private; + if (subpage->under_alloc) + return true; + } + ret = radix_tree_gang_lookup(&fs_info->buffer_radix, (void **)gang, + page_offset(page) >> fs_info->sectorsize_bits, + PAGE_SIZE / fs_info->nodesize); + /* + * Either no eb at all, or the first found eb is already beyond the + * page end, then it means no eb in the page range. + */ + if (ret == 0 || gang[0]->start >= page_offset(page) + PAGE_SIZE) + return false; + return true; +} - if (!page) - continue; +static void detach_extent_buffer_page(struct extent_buffer *eb, + struct page *page) +{ + struct btrfs_fs_info *fs_info = eb->fs_info; + bool mapped = !test_bit(EXTENT_BUFFER_UNMAPPED, &eb->bflags); + + /* + * For mapped eb, we're going to change the page private, which should be + * done under the private_lock. 
+ */ + if (mapped) + spin_lock(&page->mapping->private_lock); + + if (!PagePrivate(page)) { if (mapped) - spin_lock(&page->mapping->private_lock); + spin_unlock(&page->mapping->private_lock); + return; + } + + if (fs_info->sectorsize == PAGE_SIZE) { /* * We do this since we'll remove the pages after we've * removed the eb from the radix tree, so we could race @@ -5034,9 +5064,54 @@ static void btrfs_release_extent_buffer_pages(struct extent_buffer *eb) */ detach_page_private(page); } - if (mapped) spin_unlock(&page->mapping->private_lock); + return; + } + + /* + * For subpage, we can have dummy eb with page private. + * In this case, we can directly detach the private as such page is + * only attached to one dummy eb, no sharing. + */ + if (!mapped) { + btrfs_detach_subpage(fs_info, page); + return; + } + + /* + * We can only detach the page private if there are no other eb in the + * page range. + * + * We want an atomic snapshot of the radix tree, thus we go spinlock + * other than RCU here. + */ + spin_lock(&fs_info->buffer_lock); + if (!page_range_has_eb(fs_info, page)) + btrfs_detach_subpage(fs_info, page); + spin_unlock(&fs_info->buffer_lock); + + spin_unlock(&page->mapping->private_lock); +} + +/* + * Release all pages attached to the extent buffer. 
+ */ +static void btrfs_release_extent_buffer_pages(struct extent_buffer *eb) +{ + int i; + int num_pages; + + ASSERT(!extent_buffer_under_io(eb)); + + num_pages = num_extent_pages(eb); + for (i = 0; i < num_pages; i++) { + struct page *page = eb->pages[i]; + + if (!page) + continue; + + detach_extent_buffer_page(eb, page); /* One for when we allocated the page */ put_page(page); @@ -5387,6 +5462,12 @@ struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info, /* Should not fail, as we have preallocated the memory */ ret = attach_extent_buffer_page(eb, p, prealloc); ASSERT(!ret); + /* + * To inform we have extra eb under allocation, so that + * detach_extent_buffer_page() won't release the page private + * when the eb hasn't yet been inserted into radix tree. + */ + btrfs_page_start_meta_alloc(fs_info, p); spin_unlock(&mapping->private_lock); WARN_ON(PageDirty(p)); @@ -5432,15 +5513,23 @@ struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info, * btree_releasepage will correctly detect that a page belongs to a * live buffer and won't free them prematurely. */ - for (i = 0; i < num_pages; i++) + for (i = 0; i < num_pages; i++) { + /* + * The eb is in radix tree now, no longer needs the extra + * indicator. 
+ */ + btrfs_page_end_meta_alloc(fs_info, eb->pages[i]); unlock_page(eb->pages[i]); + } return eb; free_eb: WARN_ON(!atomic_dec_and_test(&eb->refs)); for (i = 0; i < num_pages; i++) { - if (eb->pages[i]) + if (eb->pages[i]) { + btrfs_page_end_meta_alloc(fs_info, eb->pages[i]); unlock_page(eb->pages[i]); + } } btrfs_release_extent_buffer(eb); diff --git a/fs/btrfs/subpage.h b/fs/btrfs/subpage.h index f701256dd1e2..d8b34879368d 100644 --- a/fs/btrfs/subpage.h +++ b/fs/btrfs/subpage.h @@ -25,6 +25,7 @@ struct btrfs_subpage { spinlock_t lock; union { /* Structures only used by metadata */ + bool under_alloc; /* Structures only used by data */ }; }; @@ -42,6 +43,38 @@ static inline int btrfs_alloc_subpage(struct btrfs_fs_info *fs_info, return 0; } +/* + * To inform that the page is under metadata allocation, so that + * page private shouldn't be freed. + */ +static inline void btrfs_page_start_meta_alloc(struct btrfs_fs_info *fs_info, + struct page *page) +{ + struct btrfs_subpage *subpage; + + if (fs_info->sectorsize == PAGE_SIZE) + return; + + ASSERT(PagePrivate(page) && page->mapping); + + subpage = (struct btrfs_subpage *)page->private; + subpage->under_alloc = true; +} + +static inline void btrfs_page_end_meta_alloc(struct btrfs_fs_info *fs_info, + struct page *page) +{ + struct btrfs_subpage *subpage; + + if (fs_info->sectorsize == PAGE_SIZE) + return; + + ASSERT(PagePrivate(page) && page->mapping); + + subpage = (struct btrfs_subpage *)page->private; + subpage->under_alloc = false; +} + int btrfs_attach_subpage(struct btrfs_fs_info *fs_info, struct page *page); void btrfs_detach_subpage(struct btrfs_fs_info *fs_info, struct page *page); From patchwork Sat Jan 16 07:15:22 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 12024647 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: 
X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B872AC433E6 for ; Sat, 16 Jan 2021 07:17:30 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 5E30F23AF8 for ; Sat, 16 Jan 2021 07:17:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726714AbhAPHRO (ORCPT ); Sat, 16 Jan 2021 02:17:14 -0500 Received: from mx2.suse.de ([195.135.220.15]:56156 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726701AbhAPHRO (ORCPT ); Sat, 16 Jan 2021 02:17:14 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1610781358; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=YUfWAyhbQG+1m6zDQGC63vBbQkN2NuPvOKVSpKvbno8=; b=H/xBxIQBADFqmgzkK1W6z33Ro1B5fA+L+n633IvZT5OJERLtkKFumS5MXY5p5sqOgKTr/f k/kv1wEPnIxTtnvcW8YbQUYHwQ4F3r5g47aNLsdqsp2IPgTm3/KODoBfOTbKufDOi3YPvL pU+x0vapHJcYE82Tu5x3Tg9iEaKU314= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id D786DB8F2 for ; Sat, 16 Jan 2021 07:15:58 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 07/18] btrfs: attach private to dummy extent buffer pages Date: Sat, 16 Jan 2021 15:15:22 +0800 Message-Id: <20210116071533.105780-8-wqu@suse.com> X-Mailer: git-send-email 2.30.0 In-Reply-To: <20210116071533.105780-1-wqu@suse.com> References: <20210116071533.105780-1-wqu@suse.com> MIME-Version: 1.0 
Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Even for regular btrfs, there are locations where we allocate dummy extent buffers for temporary usage, like tree_mod_log_rewind() and get_old_root(). Those dummy extent buffers will be handled by the same eb accessors, and if they don't have page::private, subpage eb accessors can fail. To address such problems, make __alloc_dummy_extent_buffer() attach page private for dummy extent buffers too. Signed-off-by: Qu Wenruo --- fs/btrfs/extent_io.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index fb800f237099..7f94f00936d7 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -5204,9 +5204,14 @@ struct extent_buffer *__alloc_dummy_extent_buffer(struct btrfs_fs_info *fs_info, num_pages = num_extent_pages(eb); for (i = 0; i < num_pages; i++) { + int ret; + eb->pages[i] = alloc_page(GFP_NOFS); if (!eb->pages[i]) goto err; + ret = attach_extent_buffer_page(eb, eb->pages[i], NULL); + if (ret < 0) + goto err; } set_extent_buffer_uptodate(eb); btrfs_set_header_nritems(eb, 0); @@ -5214,8 +5219,10 @@ struct extent_buffer *__alloc_dummy_extent_buffer(struct btrfs_fs_info *fs_info, return eb; err: - for (; i > 0; i--) + for (; i > 0; i--) { + detach_extent_buffer_page(eb, eb->pages[i - 1]); __free_page(eb->pages[i - 1]); + } __free_extent_buffer(eb); return NULL; } From patchwork Sat Jan 16 07:15:23 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 12024645 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham
autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CD8ABC433E9 for ; Sat, 16 Jan 2021 07:17:30 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 8F02F23AC4 for ; Sat, 16 Jan 2021 07:17:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726722AbhAPHRP (ORCPT ); Sat, 16 Jan 2021 02:17:15 -0500 Received: from mx2.suse.de ([195.135.220.15]:56154 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726653AbhAPHRP (ORCPT ); Sat, 16 Jan 2021 02:17:15 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1610781365; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=0uWXYvTEdjZsKzp6m1nuGl0nVLq/mq2rMKyNgnW8ysQ=; b=adOSqopPrt8QW7SUYfgu2UMXnyR3NYWS4GmJ+TWwUHelDpgVvnbalCn8GmaVPUsvpg/+1x RsEETtKvft3qO5vUwVB2jc4tXOfwYhC86DwDbZ//YCdZ+kZ2o7G8Z9h4RIIXtIU9tZifPH cNecLEnQcxldcgCv0MlwofoTqR4FuPM= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 1AB42B7F9 for ; Sat, 16 Jan 2021 07:16:05 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 08/18] btrfs: introduce helper for subpage uptodate status Date: Sat, 16 Jan 2021 15:15:23 +0800 Message-Id: <20210116071533.105780-9-wqu@suse.com> X-Mailer: git-send-email 2.30.0 In-Reply-To: <20210116071533.105780-1-wqu@suse.com> References: <20210116071533.105780-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org This patch introduces the following functions to handle btrfs subpage uptodate status: - btrfs_subpage_set_uptodate() - btrfs_subpage_clear_uptodate() -
btrfs_subpage_test_uptodate() Those helpers can only be called when the range is ensured to be inside the page. - btrfs_page_set_uptodate() - btrfs_page_clear_uptodate() - btrfs_page_test_uptodate() Those helpers can handle both regular sector size and subpage without problem. Although caller should still ensure that the range is inside the page. Signed-off-by: Qu Wenruo --- fs/btrfs/subpage.h | 115 +++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 115 insertions(+) diff --git a/fs/btrfs/subpage.h b/fs/btrfs/subpage.h index d8b34879368d..3373ef4ffec1 100644 --- a/fs/btrfs/subpage.h +++ b/fs/btrfs/subpage.h @@ -23,6 +23,7 @@ struct btrfs_subpage { /* Common members for both data and metadata pages */ spinlock_t lock; + u16 uptodate_bitmap; union { /* Structures only used by metadata */ bool under_alloc; @@ -78,4 +79,118 @@ static inline void btrfs_page_end_meta_alloc(struct btrfs_fs_info *fs_info, int btrfs_attach_subpage(struct btrfs_fs_info *fs_info, struct page *page); void btrfs_detach_subpage(struct btrfs_fs_info *fs_info, struct page *page); +/* + * Convert the [start, start + len) range into a u16 bitmap + * + * E.g. if start == page_offset() + 16K, len = 16K, we get 0x00f0. + */ +static inline u16 btrfs_subpage_calc_bitmap(struct btrfs_fs_info *fs_info, + struct page *page, u64 start, u32 len) +{ + int bit_start = offset_in_page(start) >> fs_info->sectorsize_bits; + int nbits = len >> fs_info->sectorsize_bits; + + /* Basic checks */ + ASSERT(PagePrivate(page) && page->private); + ASSERT(IS_ALIGNED(start, fs_info->sectorsize) && + IS_ALIGNED(len, fs_info->sectorsize)); + + /* + * The range check only works for mapped page, we can + * still have unampped page like dummy extent buffer pages. + */ + if (page->mapping) + ASSERT(page_offset(page) <= start && + start + len <= page_offset(page) + PAGE_SIZE); + /* + * Here nbits can be 16, thus can go beyond u16 range. 
Here we make the + * first left shift to be calculated in unsigned long (u32), then + * truncate the result to u16. + */ + return (u16)(((1UL << nbits) - 1) << bit_start); +} + +static inline void btrfs_subpage_set_uptodate(struct btrfs_fs_info *fs_info, + struct page *page, u64 start, u32 len) +{ + struct btrfs_subpage *subpage = (struct btrfs_subpage *)page->private; + u16 tmp = btrfs_subpage_calc_bitmap(fs_info, page, start, len); + unsigned long flags; + + spin_lock_irqsave(&subpage->lock, flags); + subpage->uptodate_bitmap |= tmp; + if (subpage->uptodate_bitmap == U16_MAX) + SetPageUptodate(page); + spin_unlock_irqrestore(&subpage->lock, flags); +} + +static inline void btrfs_subpage_clear_uptodate(struct btrfs_fs_info *fs_info, + struct page *page, u64 start, u32 len) +{ + struct btrfs_subpage *subpage = (struct btrfs_subpage *)page->private; + u16 tmp = btrfs_subpage_calc_bitmap(fs_info, page, start, len); + unsigned long flags; + + spin_lock_irqsave(&subpage->lock, flags); + subpage->uptodate_bitmap &= ~tmp; + ClearPageUptodate(page); + spin_unlock_irqrestore(&subpage->lock, flags); +} + +/* + * Unlike set/clear which is dependent on each page status, for test all bits + * are tested in the same way. + */ +#define DECLARE_BTRFS_SUBPAGE_TEST_OP(name) \ +static inline bool btrfs_subpage_test_##name(struct btrfs_fs_info *fs_info, \ + struct page *page, u64 start, u32 len) \ +{ \ + struct btrfs_subpage *subpage = (struct btrfs_subpage *)page->private; \ + u16 tmp = btrfs_subpage_calc_bitmap(fs_info, page, start, len); \ + unsigned long flags; \ + bool ret; \ + \ + spin_lock_irqsave(&subpage->lock, flags); \ + ret = ((subpage->name##_bitmap & tmp) == tmp); \ + spin_unlock_irqrestore(&subpage->lock, flags); \ + return ret; \ +} +DECLARE_BTRFS_SUBPAGE_TEST_OP(uptodate); + +/* + * Note that, in selftest, especially extent-io-tests, we can have empty + * fs_info passed in. 
+ * Thankfully in selftest, we only test sectorsize == PAGE_SIZE cases so far, + * thus we can fall back to regular sectorsize branch. + */ +#define DECLARE_BTRFS_PAGE_OPS(name, set_page_func, clear_page_func, \ + test_page_func) \ +static inline void btrfs_page_set_##name(struct btrfs_fs_info *fs_info, \ + struct page *page, u64 start, u32 len) \ +{ \ + if (unlikely(!fs_info) || fs_info->sectorsize == PAGE_SIZE) { \ + set_page_func(page); \ + return; \ + } \ + btrfs_subpage_set_##name(fs_info, page, start, len); \ +} \ +static inline void btrfs_page_clear_##name(struct btrfs_fs_info *fs_info, \ + struct page *page, u64 start, u32 len) \ +{ \ + if (unlikely(!fs_info) || fs_info->sectorsize == PAGE_SIZE) { \ + clear_page_func(page); \ + return; \ + } \ + btrfs_subpage_clear_##name(fs_info, page, start, len); \ +} \ +static inline bool btrfs_page_test_##name(struct btrfs_fs_info *fs_info, \ + struct page *page, u64 start, u32 len) \ +{ \ + if (unlikely(!fs_info) || fs_info->sectorsize == PAGE_SIZE) \ + return test_page_func(page); \ + return btrfs_subpage_test_##name(fs_info, page, start, len); \ +} +DECLARE_BTRFS_PAGE_OPS(uptodate, SetPageUptodate, ClearPageUptodate, + PageUptodate); + #endif /* BTRFS_SUBPAGE_H */ From patchwork Sat Jan 16 07:15:24 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 12024653 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 54931C4332B for ; Sat, 16 Jan 2021 07:17:31 +0000 
(UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id F04D123AC6 for ; Sat, 16 Jan 2021 07:17:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726763AbhAPHRT (ORCPT ); Sat, 16 Jan 2021 02:17:19 -0500 Received: from mx2.suse.de ([195.135.220.15]:56168 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726727AbhAPHRT (ORCPT ); Sat, 16 Jan 2021 02:17:19 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1610781368; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=EuRHCjMv7e8q3orOfqgdyxwEpoXwto5U/KOfeFagjxQ=; b=jlKYPc92e9fCiQLaJEdKwz2xfVCwajCiBMYJOZ+II3kFKpBfpvtMZwT7glTCfkk5efm+oY iK7afIFXChyT1AF9cAIasA1LRmny8pDK7l56lbvX1Y1aQlIh4QRW4wSncXhWuIn2CztmdR wJiWnwmGiA4j86NENguNSw569aGgklw= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 73C10B900 for ; Sat, 16 Jan 2021 07:16:08 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 09/18] btrfs: introduce helper for subpage error status Date: Sat, 16 Jan 2021 15:15:24 +0800 Message-Id: <20210116071533.105780-10-wqu@suse.com> X-Mailer: git-send-email 2.30.0 In-Reply-To: <20210116071533.105780-1-wqu@suse.com> References: <20210116071533.105780-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org This patch introduces the following functions to handle btrfs subpage error status: - btrfs_subpage_set_error() - btrfs_subpage_clear_error() - btrfs_subpage_test_error() Those helpers can only be called when the range is guaranteed to be inside the page.
- btrfs_page_set_error() - btrfs_page_clear_error() - btrfs_page_test_error() Those helpers can handle both regular sector size and subpage without problem. Signed-off-by: Qu Wenruo --- fs/btrfs/subpage.h | 32 ++++++++++++++++++++++++++++++++ 1 file changed, 32 insertions(+) diff --git a/fs/btrfs/subpage.h b/fs/btrfs/subpage.h index 3373ef4ffec1..5da5441c08cb 100644 --- a/fs/btrfs/subpage.h +++ b/fs/btrfs/subpage.h @@ -24,6 +24,7 @@ struct btrfs_subpage { /* Common members for both data and metadata pages */ spinlock_t lock; u16 uptodate_bitmap; + u16 error_bitmap; union { /* Structures only used by metadata */ bool under_alloc; @@ -137,6 +138,35 @@ static inline void btrfs_subpage_clear_uptodate(struct btrfs_fs_info *fs_info, spin_unlock_irqrestore(&subpage->lock, flags); } +static inline void btrfs_subpage_set_error(struct btrfs_fs_info *fs_info, + struct page *page, u64 start, + u32 len) +{ + struct btrfs_subpage *subpage = (struct btrfs_subpage *)page->private; + u16 tmp = btrfs_subpage_calc_bitmap(fs_info, page, start, len); + unsigned long flags; + + spin_lock_irqsave(&subpage->lock, flags); + subpage->error_bitmap |= tmp; + SetPageError(page); + spin_unlock_irqrestore(&subpage->lock, flags); +} + +static inline void btrfs_subpage_clear_error(struct btrfs_fs_info *fs_info, + struct page *page, u64 start, + u32 len) +{ + struct btrfs_subpage *subpage = (struct btrfs_subpage *)page->private; + u16 tmp = btrfs_subpage_calc_bitmap(fs_info, page, start, len); + unsigned long flags; + + spin_lock_irqsave(&subpage->lock, flags); + subpage->error_bitmap &= ~tmp; + if (subpage->error_bitmap == 0) + ClearPageError(page); + spin_unlock_irqrestore(&subpage->lock, flags); +} + /* * Unlike set/clear which is dependent on each page status, for test all bits * are tested in the same way. 
@@ -156,6 +186,7 @@ static inline bool btrfs_subpage_test_##name(struct btrfs_fs_info *fs_info, \ return ret; \ } DECLARE_BTRFS_SUBPAGE_TEST_OP(uptodate); +DECLARE_BTRFS_SUBPAGE_TEST_OP(error); /* * Note that, in selftest, especially extent-io-tests, we can have empty @@ -192,5 +223,6 @@ static inline bool btrfs_page_test_##name(struct btrfs_fs_info *fs_info, \ } DECLARE_BTRFS_PAGE_OPS(uptodate, SetPageUptodate, ClearPageUptodate, PageUptodate); +DECLARE_BTRFS_PAGE_OPS(error, SetPageError, ClearPageError, PageError); #endif /* BTRFS_SUBPAGE_H */ From patchwork Sat Jan 16 07:15:25 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 12024651 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.0 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, UNWANTED_LANGUAGE_BODY,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2488AC43381 for ; Sat, 16 Jan 2021 07:17:31 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id C5E8823AF8 for ; Sat, 16 Jan 2021 07:17:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726754AbhAPHRS (ORCPT ); Sat, 16 Jan 2021 02:17:18 -0500 Received: from mx2.suse.de ([195.135.220.15]:56166 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726653AbhAPHRS (ORCPT ); Sat, 16 Jan 2021 02:17:18 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1610781370; 
h=from:from:reply-to:date:date:message-id:message-id:to:to:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=tsHc5uf3/iu/mZIDjAjKCyHl7cajkBKg5XGXWlUfG/Y=; b=o9oWyAOfI8XtnLcCuH072O2FFNr5EIEcIWmtkGx4qk2Rn/rRucl3Ba9QmgKZcoaUu1R4l8 r09E8CC87ZIdBQxnzqpueh8wc2NdFYbQ1dOcNfWK1nXAd4GaIcYSRkTgiZUuyY63iBxeXw DlbO+ui1/+hd6IAyRjTK1ajLoVMw6bI= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 95100B902 for ; Sat, 16 Jan 2021 07:16:10 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 10/18] btrfs: make set/clear_extent_buffer_uptodate() to support subpage size Date: Sat, 16 Jan 2021 15:15:25 +0800 Message-Id: <20210116071533.105780-11-wqu@suse.com> X-Mailer: git-send-email 2.30.0 In-Reply-To: <20210116071533.105780-1-wqu@suse.com> References: <20210116071533.105780-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org For those functions, to support subpage size they just need to call btrfs_page_set/clear_uptodate() wrappers. 
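As an illustration only (a minimal userspace sketch, not code from this series), the following shows what the subpage branch of those wrappers does with the eb->start/eb->len range, assuming a 64K page and 4K sectors. All sketch_* names and SKETCH_* constants are hypothetical; the arithmetic follows btrfs_subpage_calc_bitmap() and btrfs_subpage_set_uptodate() from the earlier patch, without the kernel locking or page flag handling.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace constants: 64K page, 4K sectors, 16 sectors per page. */
#define SKETCH_PAGE_SIZE       65536u
#define SKETCH_SECTORSIZE      4096u
#define SKETCH_SECTORSIZE_BITS 12u

/*
 * Convert [start, start + len) into a 16-bit mask, one bit per sector.
 * Same arithmetic as btrfs_subpage_calc_bitmap(), minus the kernel checks.
 */
static uint16_t sketch_calc_bitmap(uint64_t page_start, uint64_t start,
				   uint32_t len)
{
	unsigned int bit_start;
	unsigned int nbits;

	/* The range must be sector aligned and fully inside the page. */
	assert(start % SKETCH_SECTORSIZE == 0 && len % SKETCH_SECTORSIZE == 0);
	assert(page_start <= start &&
	       start + len <= page_start + SKETCH_PAGE_SIZE);

	bit_start = (unsigned int)((start - page_start) >> SKETCH_SECTORSIZE_BITS);
	nbits = len >> SKETCH_SECTORSIZE_BITS;

	/* nbits can be 16, so shift in a wider type before truncating. */
	return (uint16_t)(((1UL << nbits) - 1) << bit_start);
}

/*
 * Merge @mask into @bitmap and report whether the whole page is now
 * uptodate, i.e. whether the page-level flag would be set.
 */
static int sketch_set_uptodate(uint16_t *bitmap, uint16_t mask)
{
	*bitmap |= mask;
	return *bitmap == UINT16_MAX;
}
```

With these assumptions, a 16K eb starting 16K into the page yields the mask 0x00f0, and the page-level uptodate state is only reached once every sector bit is set.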
Signed-off-by: Qu Wenruo --- fs/btrfs/extent_io.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 7f94f00936d7..c2459cf56950 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -5690,30 +5690,33 @@ bool set_extent_buffer_dirty(struct extent_buffer *eb) void clear_extent_buffer_uptodate(struct extent_buffer *eb) { - int i; + struct btrfs_fs_info *fs_info = eb->fs_info; struct page *page; int num_pages; + int i; clear_bit(EXTENT_BUFFER_UPTODATE, &eb->bflags); num_pages = num_extent_pages(eb); for (i = 0; i < num_pages; i++) { page = eb->pages[i]; if (page) - ClearPageUptodate(page); + btrfs_page_clear_uptodate(fs_info, page, + eb->start, eb->len); } } void set_extent_buffer_uptodate(struct extent_buffer *eb) { - int i; + struct btrfs_fs_info *fs_info = eb->fs_info; struct page *page; int num_pages; + int i; set_bit(EXTENT_BUFFER_UPTODATE, &eb->bflags); num_pages = num_extent_pages(eb); for (i = 0; i < num_pages; i++) { page = eb->pages[i]; - SetPageUptodate(page); + btrfs_page_set_uptodate(fs_info, page, eb->start, eb->len); } } From patchwork Sat Jan 16 07:15:26 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 12024649 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5DE85C4332D for ; Sat, 16 Jan 2021 07:17:31 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) 
with ESMTP id 2826D23AFE for ; Sat, 16 Jan 2021 07:17:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726780AbhAPHRV (ORCPT ); Sat, 16 Jan 2021 02:17:21 -0500 Received: from mx2.suse.de ([195.135.220.15]:56170 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726727AbhAPHRU (ORCPT ); Sat, 16 Jan 2021 02:17:20 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1610781372; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=67oxTlFBGfM+7X1D+mSit2Atd1veT8boa2zvEYJRE4w=; b=dnBscI17ABPbN+LbHNieVMxT1NEzqXsXTyk00jXOEowCkjtCNLYySS67g7TohASNF+DWhK Kizct4OUCAj3GpK0L1cbdb5rtLIOVdQJRQvfgXfSXuhWt7v44z8RnELWOJOkm2xuZjGH74 Fpy1VzOWAqd91bRPQMUklcY71xxkU9Q= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 9B164B903 for ; Sat, 16 Jan 2021 07:16:12 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 11/18] btrfs: make btrfs_clone_extent_buffer() to be subpage compatible Date: Sat, 16 Jan 2021 15:15:26 +0800 Message-Id: <20210116071533.105780-12-wqu@suse.com> X-Mailer: git-send-email 2.30.0 In-Reply-To: <20210116071533.105780-1-wqu@suse.com> References: <20210116071533.105780-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org btrfs_clone_extent_buffer() is mostly the same code as __alloc_dummy_extent_buffer(), except it has an extra page copy. So to make it subpage compatible, we only need to: - Call set_extent_buffer_uptodate() instead of SetPageUptodate() This will set the correct uptodate bit for both subpage and regular sector size cases.
Since we're calling set_extent_buffer_uptodate() which will also set EXTENT_BUFFER_UPTODATE bit, we don't need to manually set that bit either. Signed-off-by: Qu Wenruo --- fs/btrfs/extent_io.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index c2459cf56950..74a37eec921f 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -5164,7 +5164,6 @@ struct extent_buffer *btrfs_clone_extent_buffer(const struct extent_buffer *src) if (new == NULL) return NULL; - set_bit(EXTENT_BUFFER_UPTODATE, &new->bflags); set_bit(EXTENT_BUFFER_UNMAPPED, &new->bflags); for (i = 0; i < num_pages; i++) { @@ -5182,11 +5181,10 @@ struct extent_buffer *btrfs_clone_extent_buffer(const struct extent_buffer *src) return NULL; } WARN_ON(PageDirty(p)); - SetPageUptodate(p); new->pages[i] = p; copy_page(page_address(p), page_address(src->pages[i])); } - + set_extent_buffer_uptodate(new); return new; } From patchwork Sat Jan 16 07:15:27 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 12024657 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8D041C4332E for ; Sat, 16 Jan 2021 07:17:31 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 51E1823AF8 for ; Sat, 16 Jan 2021 07:17:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726787AbhAPHRX (ORCPT ); Sat, 16 Jan 2021 
02:17:23 -0500 Received: from mx2.suse.de ([195.135.220.15]:56172 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726774AbhAPHRV (ORCPT ); Sat, 16 Jan 2021 02:17:21 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1610781376; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=S+9vN7M2PZa6q9YYsjhrEE6T+u8wJu0+OeJ8lum1vZ4=; b=rlDaLD/TcNnWQOnjY1KAgToFzY6k2lukQe/VS2cBiF7j7SdrlBT515YvCNfYdqKkVrVIx2 UQUsTbVBaf/OeXdy68H/ugpAiOU3Kx5G4sgQaaqwKfuBFRB9rBok8isg6v909A2ElpB7sj B/y9MDu9sOQS+oi9bMPjz0hIk8PMdA4= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 141B8B904 for ; Sat, 16 Jan 2021 07:16:16 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 12/18] btrfs: implement try_release_extent_buffer() for subpage metadata support Date: Sat, 16 Jan 2021 15:15:27 +0800 Message-Id: <20210116071533.105780-13-wqu@suse.com> X-Mailer: git-send-email 2.30.0 In-Reply-To: <20210116071533.105780-1-wqu@suse.com> References: <20210116071533.105780-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Unlike the original try_release_extent_buffer(), try_release_subpage_extent_buffer() will iterate through all the ebs in the page, and try to release each eb. Only if the page has no private attached, which implies we have released all ebs of the page, can we release the full page.
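The iteration described above can be sketched in userspace as follows. This is an illustration under stated assumptions, not the kernel implementation: a sorted array stands in for the buffer radix tree, the page size is assumed to be 64K, and struct sketch_eb plus every sketch_* function are hypothetical names. Locking, refcount checks and the actual release are left out; only the cursor walk over the page range is shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace constant: 64K page. */
#define SKETCH_PAGE_SIZE 65536u

/* Hypothetical stand-in for an extent buffer: only the byte range matters. */
struct sketch_eb {
	uint64_t start;
	uint32_t len;
};

/*
 * Return the first eb starting at or after @bytenr and before the page
 * end, or NULL if there is none. @ebs must be sorted by start; the sorted
 * array replaces the radix tree gang lookup used by the kernel code.
 */
static const struct sketch_eb *sketch_next_eb(const struct sketch_eb *ebs,
					      size_t nr, uint64_t page_start,
					      uint64_t bytenr)
{
	size_t i;

	assert(ebs != NULL || nr == 0);
	for (i = 0; i < nr; i++) {
		/* Already beyond the page end */
		if (ebs[i].start >= page_start + SKETCH_PAGE_SIZE)
			break;
		/* Found one */
		if (ebs[i].start >= bytenr)
			return &ebs[i];
	}
	return NULL;
}

/*
 * Walk every eb inside the page the way the release loop does: look up the
 * next eb at or after the cursor, then move the cursor past it. Returns
 * how many ebs were visited.
 */
static size_t sketch_count_ebs_in_page(const struct sketch_eb *ebs, size_t nr,
				       uint64_t page_start)
{
	const uint64_t end = page_start + SKETCH_PAGE_SIZE;
	uint64_t cur = page_start;
	size_t visited = 0;

	while (cur < end) {
		const struct sketch_eb *eb;

		eb = sketch_next_eb(ebs, nr, page_start, cur);
		if (!eb)
			break;
		cur = eb->start + eb->len;
		visited++;
	}
	return visited;
}
```

Advancing the cursor to eb->start + eb->len means an eb that cannot be released is skipped rather than retried, which is why the final decision rests on whether the page private is still attached.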
Signed-off-by: Qu Wenruo --- fs/btrfs/extent_io.c | 106 ++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 104 insertions(+), 2 deletions(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 74a37eec921f..9414219fa28b 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -6335,13 +6335,115 @@ void memmove_extent_buffer(const struct extent_buffer *dst, } } +static struct extent_buffer *get_next_extent_buffer( + struct btrfs_fs_info *fs_info, struct page *page, u64 bytenr) +{ + struct extent_buffer *gang[BTRFS_SUBPAGE_BITMAP_SIZE]; + struct extent_buffer *found = NULL; + u64 page_start = page_offset(page); + int ret; + int i; + + ASSERT(in_range(bytenr, page_start, PAGE_SIZE)); + ASSERT(PAGE_SIZE / fs_info->nodesize <= BTRFS_SUBPAGE_BITMAP_SIZE); + lockdep_assert_held(&fs_info->buffer_lock); + + ret = radix_tree_gang_lookup(&fs_info->buffer_radix, (void **)gang, + bytenr >> fs_info->sectorsize_bits, + PAGE_SIZE / fs_info->nodesize); + for (i = 0; i < ret; i++) { + /* Already beyond page end */ + if (gang[i]->start >= page_start + PAGE_SIZE) + break; + /* Found one */ + if (gang[i]->start >= bytenr) { + found = gang[i]; + break; + } + } + return found; +} + +static int try_release_subpage_extent_buffer(struct page *page) +{ + struct btrfs_fs_info *fs_info = btrfs_sb(page->mapping->host->i_sb); + u64 cur = page_offset(page); + const u64 end = page_offset(page) + PAGE_SIZE; + int ret; + + while (cur < end) { + struct extent_buffer *eb = NULL; + + /* + * Unlike try_release_extent_buffer() which uses page->private + * to grab buffer, for subpage case we rely on radix tree, thus + * we need to ensure radix tree consistency. + * + * We also want an atomic snapshot of the radix tree, thus go + * spinlock other than RCU. 
+ */ + spin_lock(&fs_info->buffer_lock); + eb = get_next_extent_buffer(fs_info, page, cur); + if (!eb) { + /* No more eb in the page range after or at @cur */ + spin_unlock(&fs_info->buffer_lock); + break; + } + cur = eb->start + eb->len; + + /* + * The same as try_release_extent_buffer(), to ensure the eb + * won't disappear out from under us. + */ + spin_lock(&eb->refs_lock); + if (atomic_read(&eb->refs) != 1 || extent_buffer_under_io(eb)) { + spin_unlock(&eb->refs_lock); + spin_unlock(&fs_info->buffer_lock); + continue; + } + spin_unlock(&fs_info->buffer_lock); + + /* + * If tree ref isn't set then we know the ref on this eb is a + * real ref, so just return, this eb will likely be freed soon + * anyway. + */ + if (!test_and_clear_bit(EXTENT_BUFFER_TREE_REF, &eb->bflags)) { + spin_unlock(&eb->refs_lock); + continue; + } + + /* + * Here we don't care the return value, we will always check + * the page private at the end. + * And release_extent_buffer() will release the refs_lock. + */ + release_extent_buffer(eb); + } + /* + * Finally to check if we have cleared page private, as if we have + * released all ebs in the page, the page private should be cleared now. + */ + spin_lock(&page->mapping->private_lock); + if (!PagePrivate(page)) + ret = 1; + else + ret = 0; + spin_unlock(&page->mapping->private_lock); + return ret; + +} + int try_release_extent_buffer(struct page *page) { struct extent_buffer *eb; + if (btrfs_sb(page->mapping->host->i_sb)->sectorsize < PAGE_SIZE) + return try_release_subpage_extent_buffer(page); + /* - * We need to make sure nobody is attaching this page to an eb right - * now. + * We need to make sure nobody is change page->private, as we rely on + * page->private as the pointer to extent buffer. 
*/ spin_lock(&page->mapping->private_lock); if (!PagePrivate(page)) { From patchwork Sat Jan 16 07:15:28 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 12024655 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id AB2D8C43332 for ; Sat, 16 Jan 2021 07:17:31 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 7969823AFE for ; Sat, 16 Jan 2021 07:17:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726820AbhAPHRZ (ORCPT ); Sat, 16 Jan 2021 02:17:25 -0500 Received: from mx2.suse.de ([195.135.220.15]:56174 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726774AbhAPHRZ (ORCPT ); Sat, 16 Jan 2021 02:17:25 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1610781378; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=/sT2fqdjxOyZSG2aBq9ajFfzhdjRWPKjHFubMvulyHA=; b=IQwt4xiF+2Z7zGTOhcmFV29nv1tuM1xGJyjrD/RMpe7WnSA5aWeZ0HR6SCLR3FmKG0JVb4 ohNnQr4Y9xwp+ldGhCK30lmMAv2gvKfvxf2oHU6xkHS5tskqMAXz31eF3Pl/PthbFRgFPA uvWzb0awGe4j0cqTXYh8mCuOjXj37LU= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 86BCAB905 for ; Sat, 16 Jan 2021 
07:16:18 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 13/18] btrfs: introduce read_extent_buffer_subpage() Date: Sat, 16 Jan 2021 15:15:28 +0800 Message-Id: <20210116071533.105780-14-wqu@suse.com> X-Mailer: git-send-email 2.30.0 In-Reply-To: <20210116071533.105780-1-wqu@suse.com> References: <20210116071533.105780-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org

Introduce a new helper, read_extent_buffer_subpage(), to do the subpage extent buffer read.

The differences between the regular and subpage routines are:

- No page locking
  Here we completely rely on extent locking. Page locking can reduce the concurrency greatly, as if we lock one page to read one extent buffer, all the other extent buffers in the same page will have to wait.

- Extent uptodate condition
  Besides the existing PageUptodate() and EXTENT_BUFFER_UPTODATE checks, we also need to check btrfs_subpage::uptodate_bitmap.

- No page loop
  Just one page, no need to loop; this greatly simplifies the subpage routine.

This patch only implements the bio submit part, no endio support yet.
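As a rough illustration of the "extent uptodate condition" above (not code from the series), the three-way check can be modelled in userspace. The sketch assumes a 64K page with 4K sectors, so one 16-bit bitmap covers the page; struct model_subpage and the model_* functions are hypothetical names chosen for this example only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE		65536u	/* assumption: 64K page */
#define MODEL_SECTORSIZE	4096u	/* assumption: 4K sectors */

/* Hypothetical stand-in for the per-page subpage state. */
struct model_subpage {
	uint16_t uptodate_bitmap;	/* one bit per sector in the page */
};

/* Build the bitmask covering [offset, offset + len) inside the page. */
static uint16_t model_range_mask(uint32_t offset, uint32_t len)
{
	uint32_t first = offset / MODEL_SECTORSIZE;
	uint32_t nbits = len / MODEL_SECTORSIZE;

	assert(offset % MODEL_SECTORSIZE == 0 && len % MODEL_SECTORSIZE == 0);
	assert(nbits > 0 && first + nbits <= MODEL_PAGE_SIZE / MODEL_SECTORSIZE);
	return (uint16_t)(((1u << nbits) - 1u) << first);
}

/* Mark every sector in the range uptodate. */
static void model_set_uptodate(struct model_subpage *sp, uint32_t offset,
			       uint32_t len)
{
	sp->uptodate_bitmap |= model_range_mask(offset, len);
}

/* True only if every sector of the range is uptodate. */
static bool model_test_uptodate(const struct model_subpage *sp,
				uint32_t offset, uint32_t len)
{
	uint16_t mask = model_range_mask(offset, len);

	return (sp->uptodate_bitmap & mask) == mask;
}

/*
 * A read is needed only if none of the three sources says the range is
 * uptodate: the eb flag, the page flag, or the subpage bitmap range.
 */
static bool model_eb_needs_read(bool eb_uptodate, bool page_uptodate,
				const struct model_subpage *sp,
				uint32_t eb_offset, uint32_t eb_len)
{
	return !(eb_uptodate || page_uptodate ||
		 model_test_uptodate(sp, eb_offset, eb_len));
}
```

Under these assumptions a 16K tree block at page offset 16K maps to bits 4-7, so reading it leaves the other tree blocks in the same page untouched.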
Signed-off-by: Qu Wenruo --- fs/btrfs/extent_io.c | 70 ++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 70 insertions(+) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 9414219fa28b..291ff76d5b2e 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -5718,6 +5718,73 @@ void set_extent_buffer_uptodate(struct extent_buffer *eb) } } +static int read_extent_buffer_subpage(struct extent_buffer *eb, int wait, + int mirror_num) +{ + struct btrfs_fs_info *fs_info = eb->fs_info; + struct extent_io_tree *io_tree; + struct page *page = eb->pages[0]; + struct bio *bio = NULL; + int ret = 0; + + ASSERT(!test_bit(EXTENT_BUFFER_UNMAPPED, &eb->bflags)); + ASSERT(PagePrivate(page)); + io_tree = &BTRFS_I(fs_info->btree_inode)->io_tree; + + if (wait == WAIT_NONE) { + ret = try_lock_extent(io_tree, eb->start, + eb->start + eb->len - 1); + if (ret <= 0) + return ret; + } else { + ret = lock_extent(io_tree, eb->start, eb->start + eb->len - 1); + if (ret < 0) + return ret; + } + + ret = 0; + if (test_bit(EXTENT_BUFFER_UPTODATE, &eb->bflags) || + PageUptodate(page) || + btrfs_subpage_test_uptodate(fs_info, page, eb->start, eb->len)) { + set_bit(EXTENT_BUFFER_UPTODATE, &eb->bflags); + unlock_extent(io_tree, eb->start, eb->start + eb->len - 1); + return ret; + } + + clear_bit(EXTENT_BUFFER_READ_ERR, &eb->bflags); + eb->read_mirror = 0; + atomic_set(&eb->io_pages, 1); + check_buffer_tree_ref(eb); + + ret = submit_extent_page(REQ_OP_READ | REQ_META, NULL, page, eb->start, + eb->len, eb->start - page_offset(page), &bio, + end_bio_extent_readpage, mirror_num, 0, 0, + true); + if (ret) { + /* + * In the endio function, if we hit something wrong we will + * increase the io_pages, so here we need to decrease it for error + * path. 
+ */ + atomic_dec(&eb->io_pages); + } + if (bio) { + int tmp; + + tmp = submit_one_bio(bio, mirror_num, 0); + if (tmp < 0) + return tmp; + } + if (ret || wait != WAIT_COMPLETE) + return ret; + + wait_extent_bit(io_tree, eb->start, eb->start + eb->len - 1, + EXTENT_LOCKED); + if (!test_bit(EXTENT_BUFFER_UPTODATE, &eb->bflags)) + ret = -EIO; + return ret; +} + int read_extent_buffer_pages(struct extent_buffer *eb, int wait, int mirror_num) { int i; @@ -5734,6 +5801,9 @@ int read_extent_buffer_pages(struct extent_buffer *eb, int wait, int mirror_num) if (test_bit(EXTENT_BUFFER_UPTODATE, &eb->bflags)) return 0; + if (eb->fs_info->sectorsize < PAGE_SIZE) + return read_extent_buffer_subpage(eb, wait, mirror_num); + num_pages = num_extent_pages(eb); for (i = 0; i < num_pages; i++) { page = eb->pages[i]; From patchwork Sat Jan 16 07:15:29 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 12024659 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E02C0C43331 for ; Sat, 16 Jan 2021 07:17:31 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A278023AFB for ; Sat, 16 Jan 2021 07:17:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726817AbhAPHRZ (ORCPT ); Sat, 16 Jan 2021 02:17:25 -0500 Received: from mx2.suse.de ([195.135.220.15]:56176 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id 
S1726788AbhAPHRZ (ORCPT ); Sat, 16 Jan 2021 02:17:25 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1610781381; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=iXI4NdOM1RXIwxjqJJA5y+kcBUsHY7XuwMtAopogOjA=; b=VH2h3EsWSDN96ZusE27u+b1XMJQShF0ZoWHGdP0IwST0qJwvjyLRKAsBJ7Y0yLYYbm7u4P PG+L4gYiH/kGpZiXEPOjI0haA6ckOQrIBgcfuB/DsS5HAuAJ9p10xWbPd0dl5fKh022gC2 Xl4qo9dpRbjR8S1pdQFylCREoUa6p7o= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id E8DE2B906 for ; Sat, 16 Jan 2021 07:16:20 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 14/18] btrfs: extent_io: make endio_readpage_update_page_status() to handle subpage case Date: Sat, 16 Jan 2021 15:15:29 +0800 Message-Id: <20210116071533.105780-15-wqu@suse.com> X-Mailer: git-send-email 2.30.0 In-Reply-To: <20210116071533.105780-1-wqu@suse.com> References: <20210116071533.105780-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org

To handle subpage status update, add the following new tricks:

- Use btrfs_page_*() helpers to update page status
  Now we can handle both cases well.

- No page unlock for subpage metadata
  Since subpage metadata doesn't utilize page locking at all, skip it.
  For subpage data locking, it's handled in later commits.
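The two points above can be sketched as a small userspace model (illustration only, not the kernel helpers). It assumes a 64K page split into sixteen 4K sectors, and it assumes the btrfs_page_*() helpers set the page-level uptodate flag only once every sector bit is set; struct sketch_page and the sketch_* functions are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE	65536u	/* assumption: 64K page */
#define SKETCH_SECTORSIZE	4096u	/* assumption: 4K sectors (subpage) */

/* Hypothetical stand-in for struct page plus its subpage bitmap. */
struct sketch_page {
	bool locked;
	bool uptodate;
	bool error;
	uint16_t uptodate_bitmap;	/* used only in the subpage case */
};

/* Bitmask covering [offset, offset + len) inside the page. */
static uint16_t sketch_mask(uint32_t offset, uint32_t len)
{
	uint32_t nbits = len / SKETCH_SECTORSIZE;

	assert(nbits > 0 && offset + len <= SKETCH_PAGE_SIZE);
	return (uint16_t)(((1u << nbits) - 1u) << (offset / SKETCH_SECTORSIZE));
}

/* Models the endio page status update for both cases. */
static void sketch_endio_update(struct sketch_page *page, bool subpage,
				bool uptodate, uint32_t offset, uint32_t len)
{
	if (subpage) {
		uint16_t mask = sketch_mask(offset, len);

		if (uptodate) {
			page->uptodate_bitmap |= mask;
			/* Full page is uptodate only when all sectors are. */
			if (page->uptodate_bitmap == UINT16_MAX)
				page->uptodate = true;
		} else {
			page->uptodate_bitmap &= (uint16_t)~mask;
			page->uptodate = false;
			page->error = true;
		}
		/* No unlock here: subpage metadata never locked the page. */
		return;
	}

	/* Regular case: whole-page flags, then unlock the page. */
	if (uptodate) {
		page->uptodate = true;
	} else {
		page->uptodate = false;
		page->error = true;
	}
	page->locked = false;
}
```

The range arguments are ignored in the regular case, which matches the idea that one helper call site serves both sector sizes.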
Signed-off-by: Qu Wenruo --- fs/btrfs/extent_io.c | 21 +++++++++++++++------ 1 file changed, 15 insertions(+), 6 deletions(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 291ff76d5b2e..35fbef15d84e 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -2839,15 +2839,24 @@ static void endio_readpage_release_extent(struct processed_extent *processed, processed->uptodate = uptodate; } -static void endio_readpage_update_page_status(struct page *page, bool uptodate) +static void endio_readpage_update_page_status(struct page *page, bool uptodate, + u64 start, u32 len) { + struct btrfs_fs_info *fs_info = btrfs_sb(page->mapping->host->i_sb); + + ASSERT(page_offset(page) <= start && + start + len <= page_offset(page) + PAGE_SIZE); + if (uptodate) { - SetPageUptodate(page); + btrfs_page_set_uptodate(fs_info, page, start, len); } else { - ClearPageUptodate(page); - SetPageError(page); + btrfs_page_clear_uptodate(fs_info, page, start, len); + btrfs_page_set_error(fs_info, page, start, len); } - unlock_page(page); + + if (fs_info->sectorsize == PAGE_SIZE) + unlock_page(page); + /* Subpage locking will be handled in later patches */ } /* @@ -2984,7 +2993,7 @@ static void end_bio_extent_readpage(struct bio *bio) bio_offset += len; /* Update page status and unlock */ - endio_readpage_update_page_status(page, uptodate); + endio_readpage_update_page_status(page, uptodate, start, len); endio_readpage_release_extent(&processed, BTRFS_I(inode), start, end, uptodate); } From patchwork Sat Jan 16 07:15:30 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 12024661 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, 
INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 26952C433DB for ; Sat, 16 Jan 2021 07:17:51 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id D9C3B23AC6 for ; Sat, 16 Jan 2021 07:17:50 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726831AbhAPHRe (ORCPT ); Sat, 16 Jan 2021 02:17:34 -0500 Received: from mx2.suse.de ([195.135.220.15]:56144 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726774AbhAPHRe (ORCPT ); Sat, 16 Jan 2021 02:17:34 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1610781383; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=9vs/i6yNyAbx7v+kPD+/+rY6/TnnwFabTg4YyoXv5RU=; b=tvtCKtYitoT/ijn1OAA3cXf6cPZhWKhT/gMwHYwrMo5iOx8GPe4FEcoWjgfqaxdHyUrnVC o+U3+Lyi6owLVVCtRLRdHSPQNabYIG7J/SsrBDNAU3J7giTK6PZnQ9QAaikJKTw36t/NeG O9PywXEFl2DhgvcMFSaS+vp0Tvg7nbE= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id ED5EEB907 for ; Sat, 16 Jan 2021 07:16:22 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 15/18] btrfs: disk-io: introduce subpage metadata validation check Date: Sat, 16 Jan 2021 15:15:30 +0800 Message-Id: <20210116071533.105780-16-wqu@suse.com> X-Mailer: git-send-email 2.30.0 In-Reply-To: <20210116071533.105780-1-wqu@suse.com> References: <20210116071533.105780-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org For subpage metadata validation check, there are some 
differences:

- Read must finish in one bvec
  Since we're just reading one subpage range in one page, it should never be split into two bios nor two bvecs.

- How to grab the existing eb
  Instead of grabbing the eb using page->private, we have to search the radix tree, as we don't have any direct pointer at hand.

Signed-off-by: Qu Wenruo
--- fs/btrfs/disk-io.c | 57 ++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 57 insertions(+) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 5473bed6a7e8..7d2875c18958 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -591,6 +591,59 @@ static int validate_extent_buffer(struct extent_buffer *eb) return ret; } +static int validate_subpage_buffer(struct page *page, u64 start, u64 end, + int mirror) +{ + struct btrfs_fs_info *fs_info = btrfs_sb(page->mapping->host->i_sb); + struct extent_buffer *eb; + int reads_done; + int ret = 0; + + /* + * We don't allow bio merge for subpage metadata read, so we should + * only get one eb for each endio hook. + */ + ASSERT(end == start + fs_info->nodesize - 1); + ASSERT(PagePrivate(page)); + + eb = find_extent_buffer(fs_info, start); + /* + * When we are reading one tree block, eb must have been + * inserted into the radix tree. If not something is wrong. + */ + ASSERT(eb); + + reads_done = atomic_dec_and_test(&eb->io_pages); + /* Subpage read must finish in page read */ + ASSERT(reads_done); + + eb->read_mirror = mirror; + if (test_bit(EXTENT_BUFFER_READ_ERR, &eb->bflags)) { + ret = -EIO; + goto err; + } + ret = validate_extent_buffer(eb); + if (ret < 0) + goto err; + + if (test_and_clear_bit(EXTENT_BUFFER_READAHEAD, &eb->bflags)) + btree_readahead_hook(eb, ret); + + set_extent_buffer_uptodate(eb); + + free_extent_buffer(eb); + return ret; +err: + /* + * end_bio_extent_readpage decrements io_pages in case of error, + * make sure it has something to decrement. 
+ */ + atomic_inc(&eb->io_pages); + clear_extent_buffer_uptodate(eb); + free_extent_buffer(eb); + return ret; +} + int btrfs_validate_metadata_buffer(struct btrfs_io_bio *io_bio, struct page *page, u64 start, u64 end, int mirror) @@ -600,6 +653,10 @@ int btrfs_validate_metadata_buffer(struct btrfs_io_bio *io_bio, int reads_done; ASSERT(page->private); + + if (btrfs_sb(page->mapping->host->i_sb)->sectorsize < PAGE_SIZE) + return validate_subpage_buffer(page, start, end, mirror); + eb = (struct extent_buffer *)page->private; /* From patchwork Sat Jan 16 07:15:31 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 12024667 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 61E91C433DB for ; Sat, 16 Jan 2021 07:18:17 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 1AFF323AFB for ; Sat, 16 Jan 2021 07:18:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726864AbhAPHSB (ORCPT ); Sat, 16 Jan 2021 02:18:01 -0500 Received: from mx2.suse.de ([195.135.220.15]:56242 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725781AbhAPHSA (ORCPT ); Sat, 16 Jan 2021 02:18:00 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1610781385; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc: 
mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=tw+OsxQmg8nbTd7z1xiPfaoSu7x97nH9o9A0jG3lQgg=; b=I+5RXPXcp/LYUq4qbRCuSewE71odjRM7RyPcWfxmveiOyLG3uIR6LhTkl6l3gQz73jLmV8 U5zb2Xz7aLuoHkLrvjATCt0S56WNWw5p7eTjx0LPcUWXS7nWGYlXfMSJAr8HqnxFE6D8Xk RCmcuUmmuhOAaMactYHnbXEiZbB699Q= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 3F1C1B908 for ; Sat, 16 Jan 2021 07:16:25 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 16/18] btrfs: introduce btrfs_subpage for data inodes Date: Sat, 16 Jan 2021 15:15:31 +0800 Message-Id: <20210116071533.105780-17-wqu@suse.com> X-Mailer: git-send-email 2.30.0 In-Reply-To: <20210116071533.105780-1-wqu@suse.com> References: <20210116071533.105780-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org

To support subpage sector size, data pages also need extra info to track which sectors in a page are uptodate/dirty/...

This patch makes pages for data inodes get a btrfs_subpage structure attached, and detached when the page is freed.

This patch also slightly changes the timing of the set_page_extent_mapped() calls to make sure:

- We have page->mapping set
  page->mapping->host is used to grab btrfs_fs_info, thus we can only call this function after the page is mapped to an inode.

  One call site attaches pages to the inode manually, thus we have to modify the timing of set_page_extent_mapped() a little.

- As soon as possible, before other operations
  Since memory allocation can fail, we have to do extra error handling. Calling set_page_extent_mapped() as soon as possible can simplify the error handling for several call sites.

The idea is pretty much the same as iomap_page, but with more bitmaps for btrfs specific cases.
Currently the plan is to switch to iomap if iomap can provide sector-aligned writeback (only write back dirty sectors rather than the full page; data balance requires this feature).

So we will stick to the btrfs-specific bitmap for now.

Signed-off-by: Qu Wenruo
--- fs/btrfs/compression.c | 10 ++++++-- fs/btrfs/extent_io.c | 46 +++++++++++++++++++++++++++++++++---- fs/btrfs/extent_io.h | 3 ++- fs/btrfs/file.c | 24 ++++++++----------- fs/btrfs/free-space-cache.c | 15 +++++++++--- fs/btrfs/inode.c | 12 ++++++---- fs/btrfs/ioctl.c | 5 +++- fs/btrfs/reflink.c | 5 +++- fs/btrfs/relocation.c | 12 ++++++++-- 9 files changed, 99 insertions(+), 33 deletions(-) diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c index 5ae3fa0386b7..6d203acfdeb3 100644 --- a/fs/btrfs/compression.c +++ b/fs/btrfs/compression.c @@ -542,13 +542,19 @@ static noinline int add_ra_bio_pages(struct inode *inode, goto next; } - end = last_offset + PAGE_SIZE - 1; /* * at this point, we have a locked page in the page cache * for these bytes in the file. But, we have to make * sure they map to this compressed extent on disk. 
*/ - set_page_extent_mapped(page); + ret = set_page_extent_mapped(page); + if (ret < 0) { + unlock_page(page); + put_page(page); + break; + } + + end = last_offset + PAGE_SIZE - 1; lock_extent(tree, last_offset, end); read_lock(&em_tree->lock); em = lookup_extent_mapping(em_tree, last_offset, diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 35fbef15d84e..4bce03fed205 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -3194,10 +3194,39 @@ static int attach_extent_buffer_page(struct extent_buffer *eb, return 0; } -void set_page_extent_mapped(struct page *page) +int __must_check set_page_extent_mapped(struct page *page) { + struct btrfs_fs_info *fs_info; + + ASSERT(page->mapping); + + if (PagePrivate(page)) + return 0; + + fs_info = btrfs_sb(page->mapping->host->i_sb); + + if (fs_info->sectorsize < PAGE_SIZE) + return btrfs_attach_subpage(fs_info, page); + + attach_page_private(page, (void *)EXTENT_PAGE_PRIVATE); + return 0; + +} + +void clear_page_extent_mapped(struct page *page) +{ + struct btrfs_fs_info *fs_info; + + ASSERT(page->mapping); + if (!PagePrivate(page)) - attach_page_private(page, (void *)EXTENT_PAGE_PRIVATE); + return; + + fs_info = btrfs_sb(page->mapping->host->i_sb); + if (fs_info->sectorsize < PAGE_SIZE) + return btrfs_detach_subpage(fs_info, page); + + detach_page_private(page); } static struct extent_map * @@ -3254,7 +3283,12 @@ int btrfs_do_readpage(struct page *page, struct extent_map **em_cached, unsigned long this_bio_flag = 0; struct extent_io_tree *tree = &BTRFS_I(inode)->io_tree; - set_page_extent_mapped(page); + ret = set_page_extent_mapped(page); + if (ret < 0) { + unlock_extent(tree, start, end); + SetPageError(page); + goto out; + } if (!PageUptodate(page)) { if (cleancache_get_page(page) == 0) { @@ -3694,7 +3728,11 @@ static int __extent_writepage(struct page *page, struct writeback_control *wbc, flush_dcache_page(page); } - set_page_extent_mapped(page); + ret = set_page_extent_mapped(page); + if (ret < 0) { 
+ SetPageError(page); + goto done; + } if (!epd->extent_locked) { ret = writepage_delalloc(BTRFS_I(inode), page, wbc, start, diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h index bedf761a0300..357a3380cd42 100644 --- a/fs/btrfs/extent_io.h +++ b/fs/btrfs/extent_io.h @@ -178,7 +178,8 @@ int btree_write_cache_pages(struct address_space *mapping, void extent_readahead(struct readahead_control *rac); int extent_fiemap(struct btrfs_inode *inode, struct fiemap_extent_info *fieinfo, u64 start, u64 len); -void set_page_extent_mapped(struct page *page); +int __must_check set_page_extent_mapped(struct page *page); +void clear_page_extent_mapped(struct page *page); struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info, u64 start, u64 owner_root, int level); diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index d81ae1f518f2..63b290210eaa 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -1369,6 +1369,12 @@ static noinline int prepare_pages(struct inode *inode, struct page **pages, goto fail; } + err = set_page_extent_mapped(pages[i]); + if (err < 0) { + faili = i; + goto fail; + } + if (i == 0) err = prepare_uptodate_page(inode, pages[i], pos, force_uptodate); @@ -1453,23 +1459,11 @@ lock_and_cleanup_extent_if_need(struct btrfs_inode *inode, struct page **pages, } /* - * It's possible the pages are dirty right now, but we don't want - * to clean them yet because copy_from_user may catch a page fault - * and we might have to fall back to one page at a time. If that - * happens, we'll unlock these pages and we'd have a window where - * reclaim could sneak in and drop the once-dirty page on the floor - * without writing it. - * - * We have the pages locked and the extent range locked, so there's - * no way someone can start IO on any dirty pages in this range. - * - * We'll call btrfs_dirty_pages() later on, and that will flip around - * delalloc bits and dirty the pages as required. 
+ * We should be called after prepare_pages() which should have + * locked all pages in the range. */ - for (i = 0; i < num_pages; i++) { - set_page_extent_mapped(pages[i]); + for (i = 0; i < num_pages; i++) WARN_ON(!PageLocked(pages[i])); - } return ret; } diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c index fd6ddd6b8165..379bef967e1d 100644 --- a/fs/btrfs/free-space-cache.c +++ b/fs/btrfs/free-space-cache.c @@ -431,11 +431,22 @@ static int io_ctl_prepare_pages(struct btrfs_io_ctl *io_ctl, bool uptodate) int i; for (i = 0; i < io_ctl->num_pages; i++) { + int ret; + page = find_or_create_page(inode->i_mapping, i, mask); if (!page) { io_ctl_drop_pages(io_ctl); return -ENOMEM; } + + ret = set_page_extent_mapped(page); + if (ret < 0) { + unlock_page(page); + put_page(page); + io_ctl_drop_pages(io_ctl); + return -ENOMEM; + } + io_ctl->pages[i] = page; if (uptodate && !PageUptodate(page)) { btrfs_readpage(NULL, page); @@ -455,10 +466,8 @@ static int io_ctl_prepare_pages(struct btrfs_io_ctl *io_ctl, bool uptodate) } } - for (i = 0; i < io_ctl->num_pages; i++) { + for (i = 0; i < io_ctl->num_pages; i++) clear_page_dirty_for_io(io_ctl->pages[i]); - set_page_extent_mapped(io_ctl->pages[i]); - } return 0; } diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 1ab5cb89c530..a4c40a4b794f 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -4712,6 +4712,9 @@ int btrfs_truncate_block(struct btrfs_inode *inode, loff_t from, loff_t len, ret = -ENOMEM; goto out; } + ret = set_page_extent_mapped(page); + if (ret < 0) + goto out_unlock; if (!PageUptodate(page)) { ret = btrfs_readpage(NULL, page); @@ -4729,7 +4732,6 @@ int btrfs_truncate_block(struct btrfs_inode *inode, loff_t from, loff_t len, wait_on_page_writeback(page); lock_extent_bits(io_tree, block_start, block_end, &cached_state); - set_page_extent_mapped(page); ordered = btrfs_lookup_ordered_extent(inode, block_start); if (ordered) { @@ -8107,7 +8109,7 @@ static int __btrfs_releasepage(struct 
page *page, gfp_t gfp_flags) { int ret = try_release_extent_mapping(page, gfp_flags); if (ret == 1) - detach_page_private(page); + clear_page_extent_mapped(page); return ret; } @@ -8266,7 +8268,7 @@ static void btrfs_invalidatepage(struct page *page, unsigned int offset, } ClearPageChecked(page); - detach_page_private(page); + clear_page_extent_mapped(page); } /* @@ -8345,7 +8347,9 @@ vm_fault_t btrfs_page_mkwrite(struct vm_fault *vmf) wait_on_page_writeback(page); lock_extent_bits(io_tree, page_start, page_end, &cached_state); - set_page_extent_mapped(page); + ret2 = set_page_extent_mapped(page); + if (ret2 < 0) + goto out_unlock; /* * we can't set the delalloc bits if there are pending ordered diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c index 7f2935ea8d3a..50a9d784bdc2 100644 --- a/fs/btrfs/ioctl.c +++ b/fs/btrfs/ioctl.c @@ -1314,6 +1314,10 @@ static int cluster_pages_for_defrag(struct inode *inode, if (!page) break; + ret = set_page_extent_mapped(page); + if (ret < 0) + break; + page_start = page_offset(page); page_end = page_start + PAGE_SIZE - 1; while (1) { @@ -1435,7 +1439,6 @@ static int cluster_pages_for_defrag(struct inode *inode, for (i = 0; i < i_done; i++) { clear_page_dirty_for_io(pages[i]); ClearPageChecked(pages[i]); - set_page_extent_mapped(pages[i]); set_page_dirty(pages[i]); unlock_page(pages[i]); put_page(pages[i]); diff --git a/fs/btrfs/reflink.c b/fs/btrfs/reflink.c index b03e7891394e..b24396cf2f99 100644 --- a/fs/btrfs/reflink.c +++ b/fs/btrfs/reflink.c @@ -81,7 +81,10 @@ static int copy_inline_to_page(struct btrfs_inode *inode, goto out_unlock; } - set_page_extent_mapped(page); + ret = set_page_extent_mapped(page); + if (ret < 0) + goto out_unlock; + clear_extent_bit(&inode->io_tree, file_offset, range_end, EXTENT_DELALLOC | EXTENT_DO_ACCOUNTING | EXTENT_DEFRAG, 0, 0, NULL); diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c index 9f2289bcdde6..eb2f9da1e06d 100644 --- a/fs/btrfs/relocation.c +++ b/fs/btrfs/relocation.c @@ 
-2681,6 +2681,16 @@ static int relocate_file_extent_cluster(struct inode *inode, goto out; } } + ret = set_page_extent_mapped(page); + if (ret < 0) { + btrfs_delalloc_release_metadata(BTRFS_I(inode), + PAGE_SIZE, true); + btrfs_delalloc_release_extents(BTRFS_I(inode), + PAGE_SIZE); + unlock_page(page); + put_page(page); + goto out; + } if (PageReadahead(page)) { page_cache_async_readahead(inode->i_mapping, @@ -2708,8 +2718,6 @@ static int relocate_file_extent_cluster(struct inode *inode, lock_extent(&BTRFS_I(inode)->io_tree, page_start, page_end); - set_page_extent_mapped(page); - if (nr < cluster->nr && page_start + offset == cluster->boundary[nr]) { set_extent_bits(&BTRFS_I(inode)->io_tree, From patchwork Sat Jan 16 07:15:32 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 12024663 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3F566C433E0 for ; Sat, 16 Jan 2021 07:17:51 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 0A52523AFB for ; Sat, 16 Jan 2021 07:17:50 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726848AbhAPHRf (ORCPT ); Sat, 16 Jan 2021 02:17:35 -0500 Received: from mx2.suse.de ([195.135.220.15]:56142 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726825AbhAPHRf (ORCPT ); Sat, 16 Jan 2021 02:17:35 -0500 X-Virus-Scanned: by amavisd-new at 
test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1610781387; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ftEmcAZbK5lU8qY2RSHJljJaUYFTlSu/YeoGKDBN9J4=; b=GWnZczLTk5JYrg4MWWbChcYP9chGRd+NVLsXtGh0O3Lj1DvJIaTyvKWebvVdFniAKYCZSl fIMOe+qFuBmFraJ3GA9JJQcA1Mj3pYaiKh4w17goWyVJK02AWhOsjk3cB95nPqNsg+Stw0 frPPEyYXHvgFzwVPMQ7HUsbt9iPmGxg= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id B54D2AB7A for ; Sat, 16 Jan 2021 07:16:27 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 17/18] btrfs: integrate page status update for data read path into begin/end_page_read() Date: Sat, 16 Jan 2021 15:15:32 +0800 Message-Id: <20210116071533.105780-18-wqu@suse.com> X-Mailer: git-send-email 2.30.0 In-Reply-To: <20210116071533.105780-1-wqu@suse.com> References: <20210116071533.105780-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org In the btrfs data page read path, the page status update is handled in two different locations: btrfs_do_readpage() { while (cur <= end) { /* No need to read from disk */ if (HOLE/PREALLOC/INLINE) { memset(); set_extent_uptodate(); continue; } /* Read from disk */ ret = submit_extent_page(end_bio_extent_readpage); } } end_bio_extent_readpage() { endio_readpage_update_page_status(); } This is fine for the sectorsize == PAGE_SIZE case, as in the above loop we should only hit one branch and then exit. But for subpage, there is more work to be done in the page status update: - Page unlock condition Unlike the regular sectorsize == PAGE_SIZE case, we can no longer unlock a page unconditionally. Only the last reader of the page can unlock the page. This means we can unlock the page either in the while() loop or in the endio function. 
- Page uptodate condition Since we have multiple sectors to read for a page, we can only mark the full page uptodate if all sectors are uptodate. To handle both subpage and regular cases, introduce a pair of functions to help handle the page status update: - begin_data_page_read() For the regular case, it does nothing. For the subpage case, it updates the reader counter so that end_page_read() can later tell which reader is the last one and must unlock the page. - end_page_read() This is just endio_readpage_update_page_status() renamed. The original name is a little too long and too specific to endio. The only new trick added is the condition for page unlock. Now for subpage data, we unlock the page if we're the last reader. This not only provides the basis for subpage data read, but also hides the special handling of page read from the main read loop. Signed-off-by: Qu Wenruo --- fs/btrfs/extent_io.c | 38 +++++++++++++++++++---------- fs/btrfs/subpage.h | 57 +++++++++++++++++++++++++++++++++++--------- 2 files changed, 72 insertions(+), 23 deletions(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 4bce03fed205..6ae820144ec7 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -2839,8 +2839,17 @@ static void endio_readpage_release_extent(struct processed_extent *processed, processed->uptodate = uptodate; } -static void endio_readpage_update_page_status(struct page *page, bool uptodate, - u64 start, u32 len) +static void begin_data_page_read(struct btrfs_fs_info *fs_info, struct page *page) +{ + ASSERT(PageLocked(page)); + if (fs_info->sectorsize == PAGE_SIZE) + return; + + ASSERT(PagePrivate(page)); + btrfs_subpage_start_reader(fs_info, page, page_offset(page), PAGE_SIZE); +} + +static void end_page_read(struct page *page, bool uptodate, u64 start, u32 len) { struct btrfs_fs_info *fs_info = btrfs_sb(page->mapping->host->i_sb); @@ -2856,7 +2865,12 @@ static void endio_readpage_update_page_status(struct page *page, bool uptodate, if (fs_info->sectorsize == 
PAGE_SIZE) unlock_page(page); - /* Subpage locking will be handled in later patches */ + else if (is_data_inode(page->mapping->host)) + /* + * For subpage data, unlock the page if we're the last reader. + * For subpage metadata, page lock is not utilized for read. + */ + btrfs_subpage_end_reader(fs_info, page, start, len); } /* @@ -2993,7 +3007,7 @@ static void end_bio_extent_readpage(struct bio *bio) bio_offset += len; /* Update page status and unlock */ - endio_readpage_update_page_status(page, uptodate, start, len); + end_page_read(page, uptodate, start, len); endio_readpage_release_extent(&processed, BTRFS_I(inode), start, end, uptodate); } @@ -3267,6 +3281,7 @@ int btrfs_do_readpage(struct page *page, struct extent_map **em_cached, unsigned int read_flags, u64 *prev_em_start) { struct inode *inode = page->mapping->host; + struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); u64 start = page_offset(page); const u64 end = start + PAGE_SIZE - 1; u64 cur = start; @@ -3310,6 +3325,7 @@ int btrfs_do_readpage(struct page *page, struct extent_map **em_cached, kunmap_atomic(userpage); } } + begin_data_page_read(fs_info, page); while (cur <= end) { bool force_bio_submit = false; u64 disk_bytenr; @@ -3327,13 +3343,14 @@ int btrfs_do_readpage(struct page *page, struct extent_map **em_cached, &cached, GFP_NOFS); unlock_extent_cached(tree, cur, cur + iosize - 1, &cached); + end_page_read(page, true, cur, iosize); break; } em = __get_extent_map(inode, page, pg_offset, cur, end - cur + 1, em_cached); if (IS_ERR_OR_NULL(em)) { - SetPageError(page); unlock_extent(tree, cur, end); + end_page_read(page, false, cur, end + 1 - cur); break; } extent_offset = cur - em->start; @@ -3416,6 +3433,7 @@ int btrfs_do_readpage(struct page *page, struct extent_map **em_cached, &cached, GFP_NOFS); unlock_extent_cached(tree, cur, cur + iosize - 1, &cached); + end_page_read(page, true, cur, iosize); cur = cur + iosize; pg_offset += iosize; continue; @@ -3425,6 +3443,7 @@ int 
btrfs_do_readpage(struct page *page, struct extent_map **em_cached, EXTENT_UPTODATE, 1, NULL)) { check_page_uptodate(tree, page); unlock_extent(tree, cur, cur + iosize - 1); + end_page_read(page, true, cur, iosize); cur = cur + iosize; pg_offset += iosize; continue; @@ -3433,8 +3452,8 @@ int btrfs_do_readpage(struct page *page, struct extent_map **em_cached, * to date. Error out */ if (block_start == EXTENT_MAP_INLINE) { - SetPageError(page); unlock_extent(tree, cur, cur + iosize - 1); + end_page_read(page, false, cur, iosize); cur = cur + iosize; pg_offset += iosize; continue; @@ -3451,19 +3470,14 @@ int btrfs_do_readpage(struct page *page, struct extent_map **em_cached, nr++; *bio_flags = this_bio_flag; } else { - SetPageError(page); unlock_extent(tree, cur, cur + iosize - 1); + end_page_read(page, false, cur, iosize); goto out; } cur = cur + iosize; pg_offset += iosize; } out: - if (!nr) { - if (!PageError(page)) - SetPageUptodate(page); - unlock_page(page); - } return ret; } diff --git a/fs/btrfs/subpage.h b/fs/btrfs/subpage.h index 5da5441c08cb..b85d4ccd79da 100644 --- a/fs/btrfs/subpage.h +++ b/fs/btrfs/subpage.h @@ -29,6 +29,9 @@ struct btrfs_subpage { /* Structures only used by metadata */ bool under_alloc; /* Structures only used by data */ + struct { + atomic_t readers; + }; }; }; @@ -80,22 +83,13 @@ static inline void btrfs_page_end_meta_alloc(struct btrfs_fs_info *fs_info, int btrfs_attach_subpage(struct btrfs_fs_info *fs_info, struct page *page); void btrfs_detach_subpage(struct btrfs_fs_info *fs_info, struct page *page); -/* - * Convert the [start, start + len) range into a u16 bitmap - * - * E.g. if start == page_offset() + 16K, len = 16K, we get 0x00f0. 
- */ -static inline u16 btrfs_subpage_calc_bitmap(struct btrfs_fs_info *fs_info, - struct page *page, u64 start, u32 len) +static inline void btrfs_subpage_assert(struct btrfs_fs_info *fs_info, + struct page *page, u64 start, u32 len) { - int bit_start = offset_in_page(start) >> fs_info->sectorsize_bits; - int nbits = len >> fs_info->sectorsize_bits; - /* Basic checks */ ASSERT(PagePrivate(page) && page->private); ASSERT(IS_ALIGNED(start, fs_info->sectorsize) && IS_ALIGNED(len, fs_info->sectorsize)); - /* * The range check only works for mapped page, we can * still have unampped page like dummy extent buffer pages. @@ -103,6 +97,21 @@ static inline u16 btrfs_subpage_calc_bitmap(struct btrfs_fs_info *fs_info, if (page->mapping) ASSERT(page_offset(page) <= start && start + len <= page_offset(page) + PAGE_SIZE); +} + +/* + * Convert the [start, start + len) range into a u16 bitmap + * + * E.g. if start == page_offset() + 16K, len = 16K, we get 0x00f0. + */ +static inline u16 btrfs_subpage_calc_bitmap(struct btrfs_fs_info *fs_info, + struct page *page, u64 start, u32 len) +{ + int bit_start = offset_in_page(start) >> fs_info->sectorsize_bits; + int nbits = len >> fs_info->sectorsize_bits; + + btrfs_subpage_assert(fs_info, page, start, len); + /* * Here nbits can be 16, thus can go beyond u16 range. 
Here we make the * first left shift to be calculated in unsigned long (u32), then @@ -111,6 +120,32 @@ static inline u16 btrfs_subpage_calc_bitmap(struct btrfs_fs_info *fs_info, return (u16)(((1UL << nbits) - 1) << bit_start); } +static inline void btrfs_subpage_start_reader(struct btrfs_fs_info *fs_info, + struct page *page, u64 start, + u32 len) +{ + struct btrfs_subpage *subpage = (struct btrfs_subpage *)page->private; + int nbits = len >> fs_info->sectorsize_bits; + int ret; + + btrfs_subpage_assert(fs_info, page, start, len); + + ret = atomic_add_return(nbits, &subpage->readers); + ASSERT(ret == nbits); +} + +static inline void btrfs_subpage_end_reader(struct btrfs_fs_info *fs_info, + struct page *page, u64 start, u32 len) +{ + struct btrfs_subpage *subpage = (struct btrfs_subpage *)page->private; + int nbits = len >> fs_info->sectorsize_bits; + + btrfs_subpage_assert(fs_info, page, start, len); + ASSERT(atomic_read(&subpage->readers) >= nbits); + if (atomic_sub_and_test(nbits, &subpage->readers)) + unlock_page(page); +} + static inline void btrfs_subpage_set_uptodate(struct btrfs_fs_info *fs_info, struct page *page, u64 start, u32 len) { From patchwork Sat Jan 16 07:15:33 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 12024665 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3134CC433E0 for ; Sat, 16 Jan 2021 07:18:17 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) 
by mail.kernel.org (Postfix) with ESMTP id E971723AC6 for ; Sat, 16 Jan 2021 07:18:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726860AbhAPHSA (ORCPT ); Sat, 16 Jan 2021 02:18:00 -0500 Received: from mx2.suse.de ([195.135.220.15]:56244 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726825AbhAPHSA (ORCPT ); Sat, 16 Jan 2021 02:18:00 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1610781391; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=BW+Jj9D1+y42DvPT96GS/HqZgtXI/ZxjeKPWqtRhfFg=; b=JBWNeA7NBc65wvrTnMMbo/hAuEsrcpUuzCaTAhjmYS1zM4ENT+gVXPwUrppEN2mw4UvC/g n5wAOQyCdoadZBkD/n4qDSBaiqWm8LMKW7qpfjS5/7V+HoeRZ+e2RIIULeM3Jw7l5aL2kn DA3oeKtGgoQ59N3APWBbQf+JG3p8k+4= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id E5C16AC63 for ; Sat, 16 Jan 2021 07:16:30 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 18/18] btrfs: allow RO mount of 4K sector size fs on 64K page system Date: Sat, 16 Jan 2021 15:15:33 +0800 Message-Id: <20210116071533.105780-19-wqu@suse.com> X-Mailer: git-send-email 2.30.0 In-Reply-To: <20210116071533.105780-1-wqu@suse.com> References: <20210116071533.105780-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org This adds basic RO mount ability for 4K sector size on 64K page size systems. Currently we only plan to support 4K and 64K page sizes. 
Signed-off-by: Qu Wenruo --- fs/btrfs/disk-io.c | 24 +++++++++++++++++++++--- fs/btrfs/super.c | 7 +++++++ 2 files changed, 28 insertions(+), 3 deletions(-) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 7d2875c18958..be9de12d272b 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -2483,13 +2483,21 @@ static int validate_super(struct btrfs_fs_info *fs_info, btrfs_err(fs_info, "invalid sectorsize %llu", sectorsize); ret = -EINVAL; } - /* Only PAGE SIZE is supported yet */ - if (sectorsize != PAGE_SIZE) { + + /* + * For 4K page size, we only support 4K sector size. + * For 64K page size, we support RW for 64K sector size, and RO for + * 4K sector size. + */ + if ((SZ_4K == PAGE_SIZE && sectorsize != PAGE_SIZE) || + (SZ_64K == PAGE_SIZE && (sectorsize != SZ_4K && + sectorsize != SZ_64K))) { btrfs_err(fs_info, - "sectorsize %llu not supported yet, only support %lu", + "sectorsize %llu not supported yet for page size %lu", sectorsize, PAGE_SIZE); ret = -EINVAL; } + if (!is_power_of_2(nodesize) || nodesize < sectorsize || nodesize > BTRFS_MAX_METADATA_BLOCKSIZE) { btrfs_err(fs_info, "invalid nodesize %llu", nodesize); @@ -3248,6 +3256,16 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device goto fail_alloc; } + /* For 4K sector size support, it's only read-only yet */ + if (PAGE_SIZE == SZ_64K && sectorsize == SZ_4K) { + if (!sb_rdonly(sb) || btrfs_super_log_root(disk_super)) { + btrfs_err(fs_info, + "subpage sector size only support RO yet"); + err = -EINVAL; + goto fail_alloc; + } + } + ret = btrfs_init_workqueues(fs_info, fs_devices); if (ret) { err = ret; diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c index 12d7d3be7cd4..5bbc23597a93 100644 --- a/fs/btrfs/super.c +++ b/fs/btrfs/super.c @@ -2028,6 +2028,13 @@ static int btrfs_remount(struct super_block *sb, int *flags, char *data) ret = -EINVAL; goto restore; } + if (fs_info->sectorsize < PAGE_SIZE) { + btrfs_warn(fs_info, + "read-write mount is not yet allowed for 
sector size %u page size %lu", + fs_info->sectorsize, PAGE_SIZE); + ret = -EINVAL; + goto restore; + } /* * NOTE: when remounting with a change that does writes, don't