From patchwork Wed Nov 18 08:53:06 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Qu Wenruo
X-Patchwork-Id: 11914541
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-18.7 required=3.0 tests=BAYES_00,DKIM_SIGNED,
	DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,
	INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,
	USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
	by smtp.lore.kernel.org (Postfix) with ESMTP id C9F0BC5519F
	for ; Wed, 18 Nov 2020 08:53:31 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by mail.kernel.org (Postfix) with ESMTP id 6F6EB20719
	for ; Wed, 18 Nov 2020 08:53:29 +0000 (UTC)
Authentication-Results: mail.kernel.org;
	dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="Z8ZRWh4w"
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1726860AbgKRIx2 (ORCPT ); Wed, 18 Nov 2020 03:53:28 -0500
Received: from mx2.suse.de ([195.135.220.15]:47520 "EHLO mx2.suse.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1725964AbgKRIx2 (ORCPT ); Wed, 18 Nov 2020 03:53:28 -0500
X-Virus-Scanned: by amavisd-new at test-mx.suse.de
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1;
	t=1605689607;
	h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:
	mime-version:mime-version:
	content-transfer-encoding:content-transfer-encoding:
	in-reply-to:in-reply-to:references:references;
	bh=AJ5FDgM+IJav9Rao+mHDPtw0VBKfmTYwhA7oZlDtzyQ=;
	b=Z8ZRWh4wdQBp5buUEaydRfFR+W5hJOwaRJy4Ogk1Mk5zr4AfVNIGqN1Xe4xLAwrkeSwTCO
	47TKhXscXjAOYINUNgEywd42iFA/8YWMWfI7cniWV/VgJ/EpCgU+/aW64mbYw4ULXwiZhw
	ScFDbISxcKbE8+a3nfz2uZm0g0YirnM=
Received: from relay2.suse.de (unknown [195.135.221.27]) by
	mx2.suse.de (Postfix) with ESMTP id 85924AD2F
	for ; Wed, 18 Nov 2020 08:53:27 +0000 (UTC)
From: Qu Wenruo
To: linux-btrfs@vger.kernel.org
Subject: [PATCH 01/14] btrfs: extent_io: Use detach_page_private() for alloc_extent_buffer()
Date: Wed, 18 Nov 2020 16:53:06 +0800
Message-Id: <20201118085319.56668-2-wqu@suse.com>
X-Mailer: git-send-email 2.29.2
In-Reply-To: <20201118085319.56668-1-wqu@suse.com>
References: <20201118085319.56668-1-wqu@suse.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org

In alloc_extent_buffer(), after we get a page from the btree inode, we
check whether that page has a private pointer attached.

If attached, we check whether the existing extent buffer has proper refs.
If not (the eb is being freed), we detach that private eb pointer.

The point here is that we are detaching that eb pointer by calling:
- ClearPagePrivate()
- put_page()

The put_page() here is especially confusing, as it's dropping the ref
taken by attach_page_private().
Without knowing that, it looks like the put_page() is for the
find_or_create_page() call, confusing the reader.

Since we always modify page private with attach_page_private() and
detach_page_private(), the only open-coded detach_page_private() here is
really confusing.

Fix it by calling detach_page_private().

Signed-off-by: Qu Wenruo
Reviewed-by: Johannes Thumshirn
---
 fs/btrfs/extent_io.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index f305777ee1a3..55115f485d09 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -5310,14 +5310,13 @@ struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info,
 				goto free_eb;
 			}
 			exists = NULL;
+			WARN_ON(PageDirty(p));
 
 			/*
 			 * Do this so attach doesn't complain and we need to
 			 * drop the ref the old guy had.
 			 */
-			ClearPagePrivate(p);
-			WARN_ON(PageDirty(p));
-			put_page(p);
+			detach_page_private(page);
 		}
 		attach_extent_buffer_page(eb, p);
 		spin_unlock(&mapping->private_lock);

From patchwork Wed Nov 18 08:53:07 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Qu Wenruo
X-Patchwork-Id: 11914543
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-18.7 required=3.0 tests=BAYES_00,DKIM_SIGNED,
	DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,
	INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,
	USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 6AE97C56202
	for ; Wed, 18 Nov 2020 08:53:34 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by mail.kernel.org (Postfix) with ESMTP id 0D28F20719
	for ; Wed, 18 Nov 2020 08:53:33 +0000 (UTC)
Authentication-Results: mail.kernel.org;
	dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="tffQSsUR"
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1726998AbgKRIxc (ORCPT ); Wed, 18 Nov 2020 03:53:32 -0500
Received: from mx2.suse.de ([195.135.220.15]:47572 "EHLO mx2.suse.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1725964AbgKRIxc (ORCPT ); Wed, 18 Nov 2020 03:53:32 -0500
X-Virus-Scanned: by amavisd-new at test-mx.suse.de
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1;
	t=1605689610;
	h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:
	mime-version:mime-version:
	content-transfer-encoding:content-transfer-encoding:
	in-reply-to:in-reply-to:references:references;
	bh=K8tpplqD6qD6QSswY6SdMOGx3ZFJSlFmwmbwwL1xy5s=;
	b=tffQSsURdafsnvTtRagUQGyqRiVEj1iQKFgzLz44LdD1tsTRStMCl93ifAy748ipkDk1wF
	7E7XD8Mh9/blZI3OgaZRpV4mf13xQbGZ0Xqqgf2Z6k4QePcuxbjqFDvnqiHZ10zYMmpOHI
	jNTBtGkNdnIq/CUIhbQ1hDhiBymbQSQ=
Received: from relay2.suse.de (unknown [195.135.221.27])
	by mx2.suse.de (Postfix) with ESMTP id 8FC61AD2F
	for ; Wed, 18 Nov 2020 08:53:30 +0000 (UTC)
From: Qu Wenruo
To: linux-btrfs@vger.kernel.org
Subject: [PATCH 02/14] btrfs: extent_io: introduce a helper to grab an existing extent buffer from a page
Date: Wed, 18 Nov 2020 16:53:07 +0800
Message-Id: <20201118085319.56668-3-wqu@suse.com>
X-Mailer: git-send-email 2.29.2
In-Reply-To: <20201118085319.56668-1-wqu@suse.com>
References: <20201118085319.56668-1-wqu@suse.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org

This patch extracts the code that grabs an extent buffer from a page
into a helper, grab_extent_buffer_from_page().

This removes one indent level and provides a place for later expansion
for subpage support.

Signed-off-by: Qu Wenruo
Reviewed-by: Johannes Thumshirn
---
 fs/btrfs/extent_io.c | 60 ++++++++++++++++++++++++++------------------
 1 file changed, 36 insertions(+), 24 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 55115f485d09..759d2f2292ed 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -5249,6 +5249,36 @@ struct extent_buffer *alloc_test_extent_buffer(struct btrfs_fs_info *fs_info,
 }
 #endif
 
+static struct extent_buffer *grab_extent_buffer_from_page(struct page *page)
+{
+	struct extent_buffer *exists;
+
+	/* Page not yet attached to an extent buffer */
+	if (!PagePrivate(page))
+		return NULL;
+
+	/*
+	 * We could have already allocated an eb for this page
+	 * and attached one so lets see if we can get a ref on
+	 * the existing eb, and if we can we know it's good and
+	 * we can just return that one, else we know we can just
+	 * overwrite page->private.
+	 */
+	exists = (struct extent_buffer *)page->private;
+	if (atomic_inc_not_zero(&exists->refs)) {
+		mark_extent_buffer_accessed(exists, page);
+		return exists;
+	}
+
+	WARN_ON(PageDirty(page));
+	/*
+	 * The page belongs to an eb which is being freed.
+	 * Detach it from previous eb so that we can reuse it.
+	 */
+	detach_page_private(page);
+	return NULL;
+}
+
 struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info,
 					  u64 start, u64 owner_root, int level)
 {
@@ -5293,30 +5323,12 @@ struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info,
 		}
 
 		spin_lock(&mapping->private_lock);
-		if (PagePrivate(p)) {
-			/*
-			 * We could have already allocated an eb for this page
-			 * and attached one so lets see if we can get a ref on
-			 * the existing eb, and if we can we know it's good and
-			 * we can just return that one, else we know we can just
-			 * overwrite page->private.
-			 */
-			exists = (struct extent_buffer *)p->private;
-			if (atomic_inc_not_zero(&exists->refs)) {
-				spin_unlock(&mapping->private_lock);
-				unlock_page(p);
-				put_page(p);
-				mark_extent_buffer_accessed(exists, p);
-				goto free_eb;
-			}
-			exists = NULL;
-			WARN_ON(PageDirty(p));
-
-			/*
-			 * Do this so attach doesn't complain and we need to
-			 * drop the ref the old guy had.
-			 */
-			detach_page_private(page);
+		exists = grab_extent_buffer_from_page(p);
+		if (exists) {
+			spin_unlock(&mapping->private_lock);
+			unlock_page(p);
+			put_page(p);
+			goto free_eb;
 		}
 		attach_extent_buffer_page(eb, p);
 		spin_unlock(&mapping->private_lock);

From patchwork Wed Nov 18 08:53:08 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Qu Wenruo
X-Patchwork-Id: 11914545
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED,
	DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,
	INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT
	autolearn=ham autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 85C0BC5519F
	for ; Wed, 18 Nov 2020 08:53:36 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by mail.kernel.org (Postfix) with ESMTP id 30D2024248
	for ; Wed, 18 Nov 2020 08:53:36 +0000 (UTC)
Authentication-Results: mail.kernel.org;
	dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="cOAjxmRn"
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1727029AbgKRIxf (ORCPT ); Wed, 18 Nov 2020 03:53:35 -0500
Received: from mx2.suse.de ([195.135.220.15]:47624 "EHLO mx2.suse.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1725964AbgKRIxf (ORCPT ); Wed, 18 Nov 2020 03:53:35 -0500
X-Virus-Scanned: by amavisd-new at test-mx.suse.de
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1;
	t=1605689613;
	h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:
	mime-version:mime-version:
	content-transfer-encoding:content-transfer-encoding:
	in-reply-to:in-reply-to:references:references;
	bh=DUj0cWKiIEgRFvhOFz5LC2T6WZX++Fj1bHql6QSzjXk=;
	b=cOAjxmRndtR8uS/x0nVrUpTXXKecX9HpiMZ3h/ZTdeFPksy4q6t+Wlf08CjfyAUFNbbgCf
	nJ4CLffuy/EjD6ijMf3S1tyMu42FVbhelU1jJVIm/BWgui1KomAix31sEHFoLilpl5J0Fj
	3fhJ3GyTbT2pcp6qU0tbKCRSTFzCnto=
Received: from relay2.suse.de (unknown [195.135.221.27])
	by mx2.suse.de (Postfix) with ESMTP id AB0F6AD2F
	for ; Wed, 18 Nov 2020 08:53:33 +0000 (UTC)
From: Qu Wenruo
To: linux-btrfs@vger.kernel.org
Subject: [PATCH 03/14] btrfs: extent_io: introduce the skeleton of btrfs_subpage structure
Date: Wed, 18 Nov 2020 16:53:08 +0800
Message-Id: <20201118085319.56668-4-wqu@suse.com>
X-Mailer: git-send-email 2.29.2
In-Reply-To: <20201118085319.56668-1-wqu@suse.com>
References: <20201118085319.56668-1-wqu@suse.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org

For btrfs subpage support, we need a structure to record extra info for
a page, so that we can know things like which sectors in the page are
uptodate/dirty.

This patch introduces the skeleton structure for future btrfs subpage
support.

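The per-sector bookkeeping described above can be sketched in userspace C. This is only an illustration, not the kernel code: the 64K page and 4K sector geometry, the struct, and every demo_* name are assumptions made for the example, and the kernel side would use DECLARE_BITMAP() with the bitmap_*() helpers under a lock instead of a plain integer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed geometry for this sketch: 64K pages, 4K sectors, 16 bits. */
#define DEMO_SECTORSIZE_BITS	12u
#define DEMO_BITMAP_BITS	16u

/* Hypothetical per-page record: bit N set means sector N is uptodate. */
struct demo_subpage {
	uint16_t uptodate_bitmap;
};

/* Mask covering the sectors of [start, start + len) inside the page. */
static uint16_t demo_range_mask(uint64_t page_offset, uint64_t start,
				uint32_t len)
{
	unsigned int first, nbits;

	assert(start >= page_offset);
	first = (unsigned int)((start - page_offset) >> DEMO_SECTORSIZE_BITS);
	nbits = len >> DEMO_SECTORSIZE_BITS;
	assert(nbits > 0 && first + nbits <= DEMO_BITMAP_BITS);
	return (uint16_t)(((1u << nbits) - 1u) << first);
}

/* Mark every sector of the range as uptodate. */
static void demo_set_uptodate(struct demo_subpage *sp, uint64_t page_offset,
			      uint64_t start, uint32_t len)
{
	sp->uptodate_bitmap |= demo_range_mask(page_offset, start, len);
}

/* Mark every sector of the range as not uptodate. */
static void demo_clear_uptodate(struct demo_subpage *sp, uint64_t page_offset,
				uint64_t start, uint32_t len)
{
	sp->uptodate_bitmap &= (uint16_t)~demo_range_mask(page_offset, start, len);
}

/* The whole page counts as uptodate only when all 16 sectors are. */
static bool demo_page_uptodate(const struct demo_subpage *sp)
{
	return sp->uptodate_bitmap == 0xffffu;
}
```

A caller would pass the byte offset of the page plus the byte range of one tree block, so a 16K block starting 16K into the page touches bits 4 to 7 only.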
Signed-off-by: Qu Wenruo
---
 fs/btrfs/extent_io.c | 32 ++++++++++++++++++++++++++++++++
 fs/btrfs/extent_io.h |  8 ++++++++
 2 files changed, 40 insertions(+)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 759d2f2292ed..2eaf09ff59ca 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -5279,6 +5279,38 @@ static struct extent_buffer *grab_extent_buffer_from_page(struct page *page)
 	return NULL;
 }
 
+int btrfs_attach_subpage(struct btrfs_fs_info *fs_info, struct page *page)
+{
+	struct btrfs_subpage *subpage;
+
+	ASSERT(PageLocked(page));
+	/* Either not subpage, or the page already has private attached */
+	if (!btrfs_is_subpage(fs_info) || PagePrivate(page))
+		return 0;
+
+	subpage = kzalloc(sizeof(*subpage), GFP_NOFS);
+	if (!subpage)
+		return -ENOMEM;
+
+	spin_lock_init(&subpage->lock);
+	attach_page_private(page, subpage);
+	return 0;
+}
+
+void btrfs_detach_subpage(struct btrfs_fs_info *fs_info, struct page *page)
+{
+	struct btrfs_subpage *subpage;
+
+	/* Either not subpage, or already detached */
+	if (!btrfs_is_subpage(fs_info) || !PagePrivate(page))
+		return;
+
+	subpage = (struct btrfs_subpage *)detach_page_private(page);
+	ASSERT(subpage && bitmap_empty(subpage->tree_block_bitmap,
+			BTRFS_SUBPAGE_BITMAP_SIZE));
+	kfree(subpage);
+}
+
 struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info,
 					  u64 start, u64 owner_root, int level)
 {
diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
index 0123c75ee203..4251bef25aac 100644
--- a/fs/btrfs/extent_io.h
+++ b/fs/btrfs/extent_io.h
@@ -307,6 +307,14 @@ blk_status_t btrfs_submit_read_repair(struct inode *inode,
 				      u64 start, u64 end, int failed_mirror,
 				      submit_bio_hook_t *submit_bio_hook);
 
+#define BTRFS_SUBPAGE_BITMAP_SIZE	(SZ_64K / SZ_4K)
+struct btrfs_subpage {
+	spinlock_t lock;
+	DECLARE_BITMAP(tree_block_bitmap, BTRFS_SUBPAGE_BITMAP_SIZE);
+};
+
+int btrfs_attach_subpage(struct btrfs_fs_info *fs_info, struct page *page);
+void btrfs_detach_subpage(struct btrfs_fs_info *fs_info, struct page *page);
 #ifdef CONFIG_BTRFS_FS_RUN_SANITY_TESTS
 bool find_lock_delalloc_range(struct inode *inode,
 			     struct page *locked_page, u64 *start,

From patchwork Wed Nov 18 08:53:09 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Qu Wenruo
X-Patchwork-Id: 11914547
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-18.7 required=3.0 tests=BAYES_00,DKIM_SIGNED,
	DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,
	INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,
	USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 6D2E2C56202
	for ; Wed, 18 Nov 2020 08:53:38 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by mail.kernel.org (Postfix) with ESMTP id 1702220719
	for ; Wed, 18 Nov 2020 08:53:38 +0000 (UTC)
Authentication-Results: mail.kernel.org;
	dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="C8X4UIyx"
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1727045AbgKRIxh (ORCPT ); Wed, 18 Nov 2020 03:53:37 -0500
Received: from mx2.suse.de ([195.135.220.15]:47650 "EHLO mx2.suse.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1725964AbgKRIxh (ORCPT ); Wed, 18 Nov 2020 03:53:37 -0500
X-Virus-Scanned: by amavisd-new at test-mx.suse.de
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1;
	t=1605689615;
	h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:
	mime-version:mime-version:
	content-transfer-encoding:content-transfer-encoding:
	in-reply-to:in-reply-to:references:references;
	bh=wAYWSbxWl4uls7k5AsHrvLlNpSM0BJ7NbJvg/bcx3Zg=;
	b=C8X4UIyxIq7NpL5Ri8j+WSQyJmqDibI4dqBljjRjYJ4ZwVUfsHMeQCQUCE9SDq0AkFnPEe
	D7FmbC47+mrai4/J4AxDuGZdKsj7lXcNtEnSJJL5ed61BV2vo20z8fBWt5uW9iVX2338e8
	vzq4T9D7W0rbk9l6OssTRCFN9sWbJU4=
Received: from relay2.suse.de (unknown [195.135.221.27])
	by mx2.suse.de (Postfix) with ESMTP id 59F8FAE91
	for ; Wed, 18 Nov 2020 08:53:35 +0000 (UTC)
From: Qu Wenruo
To: linux-btrfs@vger.kernel.org
Subject: [PATCH 04/14] btrfs: extent_io: make attach_extent_buffer_page() to handle subpage case
Date: Wed, 18 Nov 2020 16:53:09 +0800
Message-Id: <20201118085319.56668-5-wqu@suse.com>
X-Mailer: git-send-email 2.29.2
In-Reply-To: <20201118085319.56668-1-wqu@suse.com>
References: <20201118085319.56668-1-wqu@suse.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org

For the subpage case, we need to allocate new memory for each metadata
page.

So we need to:
- Allow attach_extent_buffer_page() to return int
  To indicate allocation failure.

- Prealloc page->private for alloc_extent_buffer()
  We don't want to allocate memory with a spinlock held, so do the
  preallocation before we acquire the spinlock.

- Handle subpage and regular case differently in
  attach_extent_buffer_page()
  For the regular case, just do the usual thing.
  For the subpage case, allocate new memory and update the tree_block
  bitmap.

Signed-off-by: Qu Wenruo
---
 fs/btrfs/extent_io.c | 77 ++++++++++++++++++++++++++++++++++++--------
 1 file changed, 63 insertions(+), 14 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 2eaf09ff59ca..94101d1e04eb 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -3142,22 +3142,50 @@ static int submit_extent_page(unsigned int opf,
 	return ret;
 }
 
-static void attach_extent_buffer_page(struct extent_buffer *eb,
+static int attach_extent_buffer_page(struct extent_buffer *eb,
 				      struct page *page)
 {
-	/*
-	 * If the page is mapped to btree inode, we should hold the private
-	 * lock to prevent race.
-	 * For cloned or dummy extent buffers, their pages are not mapped and
-	 * will not race with any other ebs.
-	 */
-	if (page->mapping)
-		lockdep_assert_held(&page->mapping->private_lock);
+	struct btrfs_fs_info *fs_info = eb->fs_info;
+	struct btrfs_subpage *subpage;
+	int start;
+	int nbits;
+	int ret;
 
-	if (!PagePrivate(page))
-		attach_page_private(page, eb);
-	else
-		WARN_ON(page->private != (unsigned long)eb);
+	if (!btrfs_is_subpage(fs_info)) {
+		/*
+		 * If the page is mapped to btree inode, we should hold the
+		 * private lock to prevent race.
+		 * For cloned or dummy extent buffers, their pages are not
+		 * mapped and will not race with any other ebs.
+		 */
+		if (page->mapping)
+			lockdep_assert_held(&page->mapping->private_lock);
+
+		if (!PagePrivate(page))
+			attach_page_private(page, eb);
+		else
+			WARN_ON(page->private != (unsigned long)eb);
+		return 0;
+	}
+
+	/* Already mapped, just update the existing range */
+	if (PagePrivate(page))
+		goto update_bitmap;
+
+	/* Do new allocation to attach subpage */
+	ret = btrfs_attach_subpage(fs_info, page);
+	if (ret < 0)
+		return ret;
+
+update_bitmap:
+	start = (eb->start - page_offset(page)) >> fs_info->sectorsize_bits;
+	nbits = eb->len >> fs_info->sectorsize_bits;
+
+	subpage = (struct btrfs_subpage *)page->private;
+	spin_lock_bh(&subpage->lock);
+	bitmap_set(subpage->tree_block_bitmap, start, nbits);
+	spin_unlock_bh(&subpage->lock);
+	return 0;
 }
 
 void set_page_extent_mapped(struct page *page)
@@ -5065,12 +5093,19 @@ struct extent_buffer *btrfs_clone_extent_buffer(const struct extent_buffer *src)
 		return NULL;
 
 	for (i = 0; i < num_pages; i++) {
+		int ret;
+
 		p = alloc_page(GFP_NOFS);
 		if (!p) {
 			btrfs_release_extent_buffer(new);
 			return NULL;
 		}
-		attach_extent_buffer_page(new, p);
+		ret = attach_extent_buffer_page(new, p);
+		if (ret < 0) {
+			put_page(p);
+			btrfs_release_extent_buffer(new);
+			return NULL;
+		}
 		WARN_ON(PageDirty(p));
 		SetPageUptodate(p);
 		new->pages[i] = p;
@@ -5354,6 +5389,18 @@ struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info,
 			goto free_eb;
 		}
 
+		/*
+		 * Preallocate page->private for subpage case, so that
+		 * we won't allocate memory with private_lock held.
+		 */
+		ret = btrfs_attach_subpage(fs_info, p);
+		if (ret < 0) {
+			unlock_page(p);
+			put_page(p);
+			exists = ERR_PTR(-ENOMEM);
+			goto free_eb;
+		}
+
 		spin_lock(&mapping->private_lock);
 		exists = grab_extent_buffer_from_page(p);
 		if (exists) {
@@ -5362,8 +5409,10 @@ struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info,
 			put_page(p);
 			goto free_eb;
 		}
+		/* Should not fail, as we have attached the subpage already */
 		attach_extent_buffer_page(eb, p);
 		spin_unlock(&mapping->private_lock);
+
 		WARN_ON(PageDirty(p));
 		eb->pages[i] = p;
 		if (!PageUptodate(p))

From patchwork Wed Nov 18 08:53:10 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Qu Wenruo
X-Patchwork-Id: 11914549
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-18.7 required=3.0 tests=BAYES_00,DKIM_SIGNED,
	DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,
	INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,
	USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 0375CC63697
	for ; Wed, 18 Nov 2020 08:53:40 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by mail.kernel.org (Postfix) with ESMTP id 9DC4C20719
	for ; Wed, 18 Nov 2020 08:53:39 +0000 (UTC)
Authentication-Results: mail.kernel.org;
	dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="TKiSXSfU"
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1727120AbgKRIxi (ORCPT ); Wed, 18 Nov 2020 03:53:38 -0500
Received: from mx2.suse.de ([195.135.220.15]:47716 "EHLO mx2.suse.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1725964AbgKRIxi (ORCPT ); Wed, 18 Nov 2020
	03:53:38 -0500
X-Virus-Scanned: by amavisd-new at test-mx.suse.de
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1;
	t=1605689617;
	h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:
	mime-version:mime-version:
	content-transfer-encoding:content-transfer-encoding:
	in-reply-to:in-reply-to:references:references;
	bh=+hRBqJWdeoJm3C/t21Lef3o9zCWoAC6evNfOSfYUuHU=;
	b=TKiSXSfUd7q7zyUPx0lINpMkfhmvPXdn4u5lfH3KjnDn+ueDRh93PVy30IStsfK9LdosMW
	EGGdMyEl2HweJLvENhFDIfaCdD18FvXZLXbiYgfxhWAFk7LAytgwuIwul9Ty0ip8PubLKl
	b+28prBS6a9yWd1F1UecFqtQx20wxMo=
Received: from relay2.suse.de (unknown [195.135.221.27])
	by mx2.suse.de (Postfix) with ESMTP id 0AD95ABF4
	for ; Wed, 18 Nov 2020 08:53:37 +0000 (UTC)
From: Qu Wenruo
To: linux-btrfs@vger.kernel.org
Subject: [PATCH 05/14] btrfs: extent_io: make grab_extent_buffer_from_page() to handle subpage case
Date: Wed, 18 Nov 2020 16:53:10 +0800
Message-Id: <20201118085319.56668-6-wqu@suse.com>
X-Mailer: git-send-email 2.29.2
In-Reply-To: <20201118085319.56668-1-wqu@suse.com>
References: <20201118085319.56668-1-wqu@suse.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org

For the subpage case, grab_extent_buffer_from_page() can't really get an
extent buffer just from btrfs_subpage.

We do have btrfs_subpage::tree_block_bitmap, which can be used to work
out the bytenr of an existing extent buffer, and we could then do a radix
tree search to grab that existing eb.

However, we still do the radix tree insert check in
alloc_extent_buffer(), so we don't really need the extra hassle; just let
alloc_extent_buffer() handle an existing eb in the radix tree.

So for grab_extent_buffer_from_page(), always return NULL for the subpage
case.

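To make the rejected alternative concrete, here is a userspace sketch of how a bytenr could be worked out from a per-page tree block bitmap. It is not part of the patch: the helper name, the 4K sector and 64K page geometry, and the assumption that tree blocks are nodesize aligned inside the page are all made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_SECTORSIZE_BITS	12u	/* assumed 4K sectors */
#define DEMO_BITMAP_BITS	16u	/* assumed 64K page */

/*
 * Hypothetical helper: list the start bytenr of every tree block recorded
 * in a per-page bitmap, assuming each tree block covers nodesize bytes and
 * is nodesize aligned within the page.
 * Returns the number of bytenrs written to out (at most max).
 */
static size_t demo_bitmap_to_bytenrs(uint16_t tree_block_bitmap,
				     uint64_t page_offset, uint32_t nodesize,
				     uint64_t *out, size_t max)
{
	unsigned int step = nodesize >> DEMO_SECTORSIZE_BITS;
	size_t found = 0;
	unsigned int bit;

	assert(step > 0 && step <= DEMO_BITMAP_BITS);
	/* Only the first bit of each nodesize-sized slot needs checking. */
	for (bit = 0; bit + step <= DEMO_BITMAP_BITS && found < max; bit += step) {
		if (tree_block_bitmap & (1u << bit))
			out[found++] = page_offset +
				       ((uint64_t)bit << DEMO_SECTORSIZE_BITS);
	}
	return found;
}
```

Even with the bytenr in hand, each result would still need its own radix tree lookup, which is the extra work the message above chooses to skip.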
Signed-off-by: Qu Wenruo
---
 fs/btrfs/extent_io.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 94101d1e04eb..f424a26a695e 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -5284,10 +5284,19 @@ struct extent_buffer *alloc_test_extent_buffer(struct btrfs_fs_info *fs_info,
 }
 #endif
 
-static struct extent_buffer *grab_extent_buffer_from_page(struct page *page)
+static struct extent_buffer *grab_extent_buffer_from_page(
+		struct btrfs_fs_info *fs_info, struct page *page)
 {
 	struct extent_buffer *exists;
 
+	/*
+	 * For subpage case, we completely rely on radix tree to ensure we
+	 * don't try to insert two eb for the same bytenr.
+	 * So here we always return NULL and just continue.
+	 */
+	if (btrfs_is_subpage(fs_info))
+		return NULL;
+
 	/* Page not yet attached to an extent buffer */
 	if (!PagePrivate(page))
 		return NULL;
@@ -5402,7 +5411,7 @@ struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info,
 		}
 
 		spin_lock(&mapping->private_lock);
-		exists = grab_extent_buffer_from_page(p);
+		exists = grab_extent_buffer_from_page(fs_info, p);
 		if (exists) {
 			spin_unlock(&mapping->private_lock);
 			unlock_page(p);

From patchwork Wed Nov 18 08:53:11 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Qu Wenruo
X-Patchwork-Id: 11914551
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-18.7 required=3.0 tests=BAYES_00,DKIM_SIGNED,
	DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,
	INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,
	USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
	by smtp.lore.kernel.org (Postfix) with ESMTP id EDA55C56202
	for ; Wed, 18 Nov 2020 08:53:41 +0000 (UTC)
Received:
	from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by mail.kernel.org (Postfix) with ESMTP id 8EA22241A5
	for ; Wed, 18 Nov 2020 08:53:41 +0000 (UTC)
Authentication-Results: mail.kernel.org;
	dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="H2j4gxZp"
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1727293AbgKRIxk (ORCPT ); Wed, 18 Nov 2020 03:53:40 -0500
Received: from mx2.suse.de ([195.135.220.15]:47750 "EHLO mx2.suse.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1725964AbgKRIxk (ORCPT ); Wed, 18 Nov 2020 03:53:40 -0500
X-Virus-Scanned: by amavisd-new at test-mx.suse.de
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1;
	t=1605689619;
	h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:
	mime-version:mime-version:
	content-transfer-encoding:content-transfer-encoding:
	in-reply-to:in-reply-to:references:references;
	bh=klisPb1jEVYBo0v/m8llSMZzfCSmkqvGx70w4zEb97M=;
	b=H2j4gxZpAiMOYQbg5RbfLEc9tLKlfcJe7HUbdWmVrnvyHne4Gw/Xw3VnxUlmUeJQVpPqdV
	MwP76BBnb1ve2cmQnCsOV+c7pTg2Mq2xpzWUxk26KrexdgxIsxHg8XRFEMYrLZ+wp4eUEh
	G0s7robMd2Yk0GrTDdRhvZX2WeKH3/0=
Received: from relay2.suse.de (unknown [195.135.221.27])
	by mx2.suse.de (Postfix) with ESMTP id F14FEAD71
	for ; Wed, 18 Nov 2020 08:53:38 +0000 (UTC)
From: Qu Wenruo
To: linux-btrfs@vger.kernel.org
Subject: [PATCH 06/14] btrfs: extent_io: support subpage for extent buffer page release
Date: Wed, 18 Nov 2020 16:53:11 +0800
Message-Id: <20201118085319.56668-7-wqu@suse.com>
X-Mailer: git-send-email 2.29.2
In-Reply-To: <20201118085319.56668-1-wqu@suse.com>
References: <20201118085319.56668-1-wqu@suse.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org

In btrfs_release_extent_buffer_pages(), we need to add extra handling
for subpage.

To do so, introduce a new helper, detach_extent_buffer_page(), to do
different handling for regular and subpage cases.

For the subpage case, the new trick is to clear the range of the current
extent buffer, and detach page private if and only if we're the last
tree block of the page.

Signed-off-by: Qu Wenruo
---
 fs/btrfs/extent_io.c | 70 +++++++++++++++++++++++++++++++++-----------
 1 file changed, 53 insertions(+), 17 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index f424a26a695e..090acf0e6a59 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -4999,25 +4999,12 @@ int extent_buffer_under_io(const struct extent_buffer *eb)
 		test_bit(EXTENT_BUFFER_DIRTY, &eb->bflags));
 }
 
-/*
- * Release all pages attached to the extent buffer.
- */
-static void btrfs_release_extent_buffer_pages(struct extent_buffer *eb)
+static void detach_extent_buffer_page(struct extent_buffer *eb,
+				      struct page *page)
 {
-	int i;
-	int num_pages;
-	int mapped = !test_bit(EXTENT_BUFFER_UNMAPPED, &eb->bflags);
-
-	BUG_ON(extent_buffer_under_io(eb));
-
-	num_pages = num_extent_pages(eb);
-	for (i = 0; i < num_pages; i++) {
-		struct page *page = eb->pages[i];
+	struct btrfs_fs_info *fs_info = eb->fs_info;
 
-		if (!page)
-			continue;
-		if (mapped)
-			spin_lock(&page->mapping->private_lock);
+	if (!btrfs_is_subpage(fs_info)) {
 		/*
 		 * We do this since we'll remove the pages after we've
 		 * removed the eb from the radix tree, so we could race
@@ -5036,6 +5023,55 @@ static void btrfs_release_extent_buffer_pages(struct extent_buffer *eb)
 			 */
 			detach_page_private(page);
 		}
+	}
+
+	/*
+	 * For subpage case, clear the range in tree_block_bitmap,
+	 * and if we're the last one, detach private completely.
+	 */
+	if (PagePrivate(page)) {
+		struct btrfs_subpage *subpage;
+		int start = (eb->start - page_offset(page)) >>
+			    fs_info->sectorsize_bits;
+		int nbits = (eb->len) >> fs_info->sectorsize_bits;
+		bool last = false;
+
+		ASSERT(page_offset(page) <= eb->start &&
+		       eb->start + eb->len <= page_offset(page) + PAGE_SIZE);
+
+		subpage = (struct btrfs_subpage *)page->private;
+		spin_lock_bh(&subpage->lock);
+		bitmap_clear(subpage->tree_block_bitmap, start, nbits);
+		if (bitmap_empty(subpage->tree_block_bitmap,
+				 BTRFS_SUBPAGE_BITMAP_SIZE))
+			last = true;
+		spin_unlock_bh(&subpage->lock);
+		if (last)
+			btrfs_detach_subpage(fs_info, page);
+	}
+}
+
+/*
+ * Release all pages attached to the extent buffer.
+ */
+static void btrfs_release_extent_buffer_pages(struct extent_buffer *eb)
+{
+	int i;
+	int num_pages;
+	int mapped = !test_bit(EXTENT_BUFFER_UNMAPPED, &eb->bflags);
+
+	ASSERT(!extent_buffer_under_io(eb));
+
+	num_pages = num_extent_pages(eb);
+	for (i = 0; i < num_pages; i++) {
+		struct page *page = eb->pages[i];
+
+		if (!page)
+			continue;
+		if (mapped)
+			spin_lock(&page->mapping->private_lock);
+
+		detach_extent_buffer_page(eb, page);
 
 		if (mapped)
 			spin_unlock(&page->mapping->private_lock);

From patchwork Wed Nov 18 08:53:12 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Qu Wenruo
X-Patchwork-Id: 11914553
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-15.9 required=3.0 tests=BAYES_00,DKIM_SIGNED,
	DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,
	INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,
	UNWANTED_LANGUAGE_BODY,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham
	autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 5FD03C5519F
	for ; Wed, 18 Nov 2020 08:53:44 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by mail.kernel.org (Postfix) with ESMTP id F2C3E20719
	for ; Wed, 18 Nov 2020 08:53:43 +0000 (UTC)
Authentication-Results: mail.kernel.org;
	dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="pgzjp0Su"
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1727395AbgKRIxn (ORCPT ); Wed, 18 Nov 2020 03:53:43 -0500
Received: from mx2.suse.de ([195.135.220.15]:47760 "EHLO mx2.suse.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1725964AbgKRIxn (ORCPT ); Wed, 18 Nov 2020 03:53:43 -0500
X-Virus-Scanned: by amavisd-new at test-mx.suse.de
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1;
	t=1605689621;
	h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:
	mime-version:mime-version:
	content-transfer-encoding:content-transfer-encoding:
	in-reply-to:in-reply-to:references:references;
	bh=Uibu+OMBtN5wk/b/FCAKad3yQwQwR3rPRKBTmOUHEZE=;
	b=pgzjp0SuGUGmXKPQK82ZBLhTw0KzvijEoGlYel+d/KCcqZUyuczQMZE1LHui/pRbgR36Oh
	xIkE3A27J/67bZV6br7+Uueit68lCTU2amoOSb8LAnELgrxpiGwmw83WHCFN60NafmkV1i
	JsXb+ksaTYM8ZrCmdqJgon60pckHu4s=
Received: from relay2.suse.de (unknown [195.135.221.27])
	by mx2.suse.de (Postfix) with ESMTP id 3EEB5ABF4
	for ; Wed, 18 Nov 2020 08:53:41 +0000 (UTC)
From: Qu Wenruo
To: linux-btrfs@vger.kernel.org
Subject: [PATCH 07/14] btrfs: extent_io: make set/clear_extent_buffer_uptodate() to support subpage size
Date: Wed, 18 Nov 2020 16:53:12 +0800
Message-Id: <20201118085319.56668-8-wqu@suse.com>
X-Mailer: git-send-email 2.29.2
In-Reply-To: <20201118085319.56668-1-wqu@suse.com>
References: <20201118085319.56668-1-wqu@suse.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org

To support subpage size, these functions just need the following work:
- set/clear the uptodate bitmap
- set page Uptodate if the full range of the page is uptodate

Signed-off-by: Qu Wenruo
---
 fs/btrfs/extent_io.c | 40 ++++++++++++++++++++++++++++++++++++----
 fs/btrfs/extent_io.h |  1 +
 2 files changed, 37 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 090acf0e6a59..b3edd7fba5c8 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -5663,10 +5663,24 @@ bool set_extent_buffer_dirty(struct extent_buffer *eb)
 
 void clear_extent_buffer_uptodate(struct extent_buffer *eb)
 {
-	int i;
-	struct page *page;
+	struct btrfs_fs_info *fs_info = eb->fs_info;
+	struct page *page = eb->pages[0];
 	int num_pages;
+	int i;
+
+	if (btrfs_is_subpage(fs_info)) {
+		struct btrfs_subpage *subpage;
+		int bit_start = (eb->start - page_offset(page)) >>
+				fs_info->sectorsize_bits;
+		int nbits = fs_info->nodesize >>
+				fs_info->sectorsize_bits;
+		subpage = (struct btrfs_subpage *)page->private;
+
+		spin_lock_bh(&subpage->lock);
+		bitmap_clear(subpage->uptodate_bitmap, bit_start, nbits);
+		spin_unlock_bh(&subpage->lock);
+	}
 
 	clear_bit(EXTENT_BUFFER_UPTODATE, &eb->bflags);
 	num_pages = num_extent_pages(eb);
 	for (i = 0; i < num_pages; i++) {
@@ -5678,11 +5692,29 @@ void clear_extent_buffer_uptodate(struct extent_buffer *eb)
 
 void set_extent_buffer_uptodate(struct extent_buffer *eb)
 {
-	int i;
-	struct page *page;
+	struct btrfs_fs_info *fs_info = eb->fs_info;
+	struct page *page = eb->pages[0];
 	int num_pages;
+	int i;
 
 	set_bit(EXTENT_BUFFER_UPTODATE, &eb->bflags);
+	if (btrfs_is_subpage(fs_info)) {
+		struct btrfs_subpage *subpage;
+		int bit_start = (eb->start - page_offset(page)) >>
+				fs_info->sectorsize_bits;
+		int nbits = fs_info->nodesize >>
+				fs_info->sectorsize_bits;
+
+		subpage = (struct btrfs_subpage *)page->private;
+
+		spin_lock_bh(&subpage->lock);
+		bitmap_set(subpage->uptodate_bitmap, bit_start, nbits);
+		if (bitmap_full(subpage->uptodate_bitmap,
+				BTRFS_SUBPAGE_BITMAP_SIZE))
+			SetPageUptodate(page);
+		spin_unlock_bh(&subpage->lock);
+		return;
+	}
 	num_pages = num_extent_pages(eb);
 	for (i = 0; i < num_pages; i++) {
 		page = eb->pages[i];
diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
index 4251bef25aac..11e1e013cb8c 100644
--- a/fs/btrfs/extent_io.h
+++ b/fs/btrfs/extent_io.h
@@ -311,6 +311,7 @@ blk_status_t btrfs_submit_read_repair(struct inode *inode,
 struct btrfs_subpage {
 	spinlock_t lock;
 	DECLARE_BITMAP(tree_block_bitmap, BTRFS_SUBPAGE_BITMAP_SIZE);
+	DECLARE_BITMAP(uptodate_bitmap, BTRFS_SUBPAGE_BITMAP_SIZE);
 };
 
 int btrfs_attach_subpage(struct btrfs_fs_info *fs_info, struct page *page);

From patchwork Wed Nov 18 08:53:13 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Qu Wenruo
X-Patchwork-Id: 11914555
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-18.7 required=3.0 tests=BAYES_00,DKIM_SIGNED,
	DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,
	INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,
	USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 42650C63697
	for ; Wed, 18 Nov 2020 08:53:46 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by mail.kernel.org (Postfix) with ESMTP id E46E720719
	for ; Wed, 18 Nov 2020 08:53:45 +0000 (UTC)
Authentication-Results: mail.kernel.org;
	dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="qzmn8EiF"
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1727415AbgKRIxo (ORCPT ); Wed, 18 Nov 2020 03:53:44 -0500
Received: from mx2.suse.de ([195.135.220.15]:47778 "EHLO mx2.suse.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1725964AbgKRIxo (ORCPT ); Wed, 18 Nov 2020 03:53:44 -0500
X-Virus-Scanned: by amavisd-new at test-mx.suse.de
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1;
	t=1605689622;
h=from:from:reply-to:date:date:message-id:message-id:to:to:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=sZON5b6EiJQAQZoqvwG8jV0ASFtt60KuiyKi+yTLPFI=; b=qzmn8EiFw8hZYAEvkow1OsooWwqTqKRzHFSUlp55HXBRVFMoxqM5v8KZogQGHf/SzwUT0Z 5Bn4TWg4ZIHbu0ihRAy5qQ+GVgmWHZRymVwrtfQ5zKE/3KpZ8vlYeO6b/WoS8CgAvJXM9B OYAunlOg0f7hcuoTIDr0i/MbK8p4x5M= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id CA8C9AD2F for ; Wed, 18 Nov 2020 08:53:42 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH 08/14] btrfs: extent_io: implement try_release_extent_buffer() for subpage metadata support Date: Wed, 18 Nov 2020 16:53:13 +0800 Message-Id: <20201118085319.56668-9-wqu@suse.com> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20201118085319.56668-1-wqu@suse.com> References: <20201118085319.56668-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Unlike the original try_release_extent_buffer, try_release_subpage_extent_buffer() will iterate through btrfs_subpage::tree_block_bitmap, and try to release each extent buffer. 
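
For illustration only (not part of the patch), the bitmap walk described above can be modelled in userspace C. This is a minimal sketch assuming a 64K page, 4K sectors and 16K nodes. All sketch_* names and SKETCH_* constants are hypothetical, and the kernel's find_next_bit() is replaced with a plain loop:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed geometry, for illustration only: 64K page, 4K sectors, 16K nodes. */
#define SKETCH_SECTORSIZE_BITS 12
#define SKETCH_BITMAP_SIZE     16 /* sectors per page */
#define SKETCH_NODE_SECTORS    4  /* nodesize >> sectorsize_bits */

/* Userspace stand-in for find_next_bit(): first set bit at or after 'from'. */
static unsigned int sketch_find_next_bit(uint16_t bitmap, unsigned int size,
					 unsigned int from)
{
	for (unsigned int i = from; i < size; i++)
		if (bitmap & (1u << i))
			return i;
	return size;
}

/*
 * Walk the tree block bitmap of one page and record the start bytenr of
 * every tree block found. Returns the number of tree blocks recorded.
 */
static size_t sketch_collect_tree_blocks(uint16_t tree_block_bitmap,
					 uint64_t page_start,
					 uint64_t *starts, size_t max)
{
	unsigned int bit = 0;
	size_t found = 0;

	while (bit < SKETCH_BITMAP_SIZE && found < max) {
		bit = sketch_find_next_bit(tree_block_bitmap,
					   SKETCH_BITMAP_SIZE, bit);
		if (bit >= SKETCH_BITMAP_SIZE)
			break;
		starts[found++] = page_start +
				  ((uint64_t)bit << SKETCH_SECTORSIZE_BITS);
		/* Skip the remaining sectors of this tree block. */
		bit += SKETCH_NODE_SECTORS;
	}
	return found;
}
```

In the real function each start bytenr is then used to look up the extent buffer and try to release it; the sketch stops at computing the bytenrs.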
Signed-off-by: Qu Wenruo --- fs/btrfs/extent_io.c | 69 ++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 69 insertions(+) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index b3edd7fba5c8..28f35eb06bf8 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -6340,10 +6340,79 @@ void memmove_extent_buffer(const struct extent_buffer *dst, } } +static int try_release_subpage_extent_buffer(struct page *page) +{ + struct btrfs_fs_info *fs_info = btrfs_sb(page->mapping->host->i_sb); + u64 page_start = page_offset(page); + int bitmap_size = BTRFS_SUBPAGE_BITMAP_SIZE; + int bit_start = 0; + int ret; + + while (bit_start < bitmap_size) { + struct btrfs_subpage *subpage; + struct extent_buffer *eb; + u64 start; + + /* + * Make sure the page still has private, as previous run can + * detach the private + */ + spin_lock(&page->mapping->private_lock); + if (!PagePrivate(page)) { + spin_unlock(&page->mapping->private_lock); + break; + } + subpage = (struct btrfs_subpage *)page->private; + spin_unlock(&page->mapping->private_lock); + + spin_lock_bh(&subpage->lock); + bit_start = find_next_bit(subpage->tree_block_bitmap, + BTRFS_SUBPAGE_BITMAP_SIZE, bit_start); + spin_unlock_bh(&subpage->lock); + if (bit_start >= bitmap_size) + break; + start = bit_start * fs_info->sectorsize + page_start; + bit_start += fs_info->nodesize >> fs_info->sectorsize_bits; + /* + * Here we can't call find_extent_buffer() which will increase + * eb->refs. + */ + rcu_read_lock(); + eb = radix_tree_lookup(&fs_info->buffer_radix, + start >> fs_info->sectorsize_bits); + rcu_read_unlock(); + ASSERT(eb); + spin_lock(&eb->refs_lock); + if (atomic_read(&eb->refs) != 1 || extent_buffer_under_io(eb) || + !test_and_clear_bit(EXTENT_BUFFER_TREE_REF, &eb->bflags)) { + spin_unlock(&eb->refs_lock); + continue; + } + /* + * Here we don't care the return value, we will always check + * the page private at the end. + * And release_extent_buffer() will release the refs_lock. 
+ */ + release_extent_buffer(eb); + } + /* Finally to check if we have cleared page private */ + spin_lock(&page->mapping->private_lock); + if (!PagePrivate(page)) + ret = 1; + else + ret = 0; + spin_unlock(&page->mapping->private_lock); + return ret; + +} + int try_release_extent_buffer(struct page *page) { struct extent_buffer *eb; + if (btrfs_is_subpage(btrfs_sb(page->mapping->host->i_sb))) + return try_release_subpage_extent_buffer(page); + /* * We need to make sure nobody is attaching this page to an eb right * now. From patchwork Wed Nov 18 08:53:14 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11914557 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3FB56C5519F for ; Wed, 18 Nov 2020 08:53:48 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id DF94C241A5 for ; Wed, 18 Nov 2020 08:53:47 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="IE96hKPV" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727435AbgKRIxr (ORCPT ); Wed, 18 Nov 2020 03:53:47 -0500 Received: from mx2.suse.de ([195.135.220.15]:47818 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725964AbgKRIxq (ORCPT ); Wed, 18 Nov 2020 03:53:46 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/relaxed; d=suse.com; s=susede1; t=1605689625; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=bYs2DNck8XXzhJBLuS91GlFju/++E1u4OEs4AHhbeNc=; b=IE96hKPVlbXkZE3mlBBp6/kr7KWfTQ1ceBLaLhgwOgvJtLLF8czg28t9RuKTxqpeu9K4Fx UY2w6t+LVaybf99NKELgzUQ85nysi3xvX0fBFlS88ybhY4Cs/NCArAMg7QDEL9iXuOPyvY LFI4j986/IfJvx5oASj+8/7rGwphCSg= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 2A28EABF4 for ; Wed, 18 Nov 2020 08:53:45 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH 09/14] btrfs: extent_io: introduce read_extent_buffer_subpage() Date: Wed, 18 Nov 2020 16:53:14 +0800 Message-Id: <20201118085319.56668-10-wqu@suse.com> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20201118085319.56668-1-wqu@suse.com> References: <20201118085319.56668-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org

Introduce a new helper, read_extent_buffer_subpage(), to do the subpage
extent buffer read.

The differences between the regular and subpage routines are:

- No page locking
  Here we rely completely on extent locking.
  Page locking can greatly reduce concurrency: if we lock one page to
  read one extent buffer, all the other extent buffers in the same page
  have to wait.

- Extent uptodate condition
  Besides the existing PageUptodate() and EXTENT_BUFFER_UPTODATE checks,
  we also need to check btrfs_subpage::uptodate_bitmap.

- No page loop
  There is just one page, so there is no need to loop, which greatly
  simplifies the subpage routine.

This patch only implements the bio submit part; there is no endio support yet.
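
For illustration only (not part of the patch), the extent uptodate condition listed above can be modelled in userspace C. This is a minimal sketch assuming 4K sectors and a 16-bit bitmap per page. The sketch_* names are hypothetical, and like the patch it only tests the first sector bit of the extent buffer:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed for illustration: 4K sectors, at most 16 sectors per page. */
#define SKETCH_SECTORSIZE_BITS 12

struct sketch_eb_state {
	uint64_t eb_start;        /* logical bytenr of the extent buffer */
	uint64_t page_start;      /* offset of the page that holds it */
	bool eb_uptodate;         /* models EXTENT_BUFFER_UPTODATE */
	bool page_uptodate;       /* models PageUptodate() */
	uint16_t uptodate_bitmap; /* models btrfs_subpage::uptodate_bitmap */
};

/* Index of the first sector of the extent buffer inside its page. */
static unsigned int sketch_start_bit(const struct sketch_eb_state *s)
{
	return (unsigned int)((s->eb_start - s->page_start) >>
			      SKETCH_SECTORSIZE_BITS);
}

/* True if any of the three uptodate sources says no read is needed. */
static bool sketch_read_can_be_skipped(const struct sketch_eb_state *s)
{
	unsigned int bit = sketch_start_bit(s);

	assert(s->eb_start >= s->page_start && bit < 16);
	return s->eb_uptodate || s->page_uptodate ||
	       ((s->uptodate_bitmap >> bit) & 1u);
}
```

A caller would take the extent lock first and submit the read bio only when sketch_read_can_be_skipped() returns false.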
Signed-off-by: Qu Wenruo --- fs/btrfs/disk-io.c | 1 + fs/btrfs/extent_io.c | 72 ++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 73 insertions(+) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 8a558a43818d..b395daf62086 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -604,6 +604,7 @@ int btrfs_validate_metadata_buffer(struct btrfs_io_bio *io_bio, ASSERT(page->private); eb = (struct extent_buffer *)page->private; + /* * The pending IO might have been the only thing that kept this buffer * in memory. Make sure we have a ref for all this other checks diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 28f35eb06bf8..35aee688d6c1 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -5722,6 +5722,75 @@ void set_extent_buffer_uptodate(struct extent_buffer *eb) } } +static int read_extent_buffer_subpage(struct extent_buffer *eb, int wait, + int mirror_num) +{ + struct btrfs_fs_info *fs_info = eb->fs_info; + struct btrfs_subpage *subpage; + struct extent_io_tree *io_tree; + struct page *page = eb->pages[0]; + struct bio *bio = NULL; + int start = (eb->start - page_offset(page)) >> fs_info->sectorsize_bits; + int ret = 0; + + ASSERT(!test_bit(EXTENT_BUFFER_UNMAPPED, &eb->bflags)); + ASSERT(PagePrivate(page)); + subpage = (struct btrfs_subpage *)page->private; + io_tree = &BTRFS_I(fs_info->btree_inode)->io_tree; + + if (wait == WAIT_NONE) { + ret = try_lock_extent(io_tree, eb->start, + eb->start + eb->len - 1); + if (ret <= 0) + return ret; + } else { + ret = lock_extent(io_tree, eb->start, eb->start + eb->len - 1); + if (ret < 0) + return ret; + } + + ret = 0; + if (test_bit(EXTENT_BUFFER_UPTODATE, &eb->bflags) || + PageUptodate(page) || test_bit(start, subpage->uptodate_bitmap)) { + set_bit(EXTENT_BUFFER_UPTODATE, &eb->bflags); + unlock_extent(io_tree, eb->start, eb->start + eb->len - 1); + return ret; + } + + clear_bit(EXTENT_BUFFER_READ_ERR, &eb->bflags); + eb->read_mirror = 0; + atomic_set(&eb->io_pages, 1); + 
check_buffer_tree_ref(eb); + + ret = submit_extent_page(REQ_OP_READ | REQ_META, NULL, page, eb->start, + eb->len, eb->start - page_offset(page), &bio, + end_bio_extent_readpage, mirror_num, 0, 0, + true); + if (ret) { + /* + * In the endio function, if we hit something wrong we will + * increase the io_pages, so here we need to decrease it for error + * path. + */ + atomic_dec(&eb->io_pages); + } + if (bio) { + int tmp; + + tmp = submit_one_bio(bio, mirror_num, 0); + if (tmp < 0) + return tmp; + } + if (ret || wait != WAIT_COMPLETE) + return ret; + + wait_extent_bit(io_tree, eb->start, eb->start + eb->len - 1, + EXTENT_LOCKED); + if (!test_bit(EXTENT_BUFFER_UPTODATE, &eb->bflags)) + ret = -EIO; + return ret; +} + int read_extent_buffer_pages(struct extent_buffer *eb, int wait, int mirror_num) { int i; @@ -5738,6 +5807,9 @@ int read_extent_buffer_pages(struct extent_buffer *eb, int wait, int mirror_num) if (test_bit(EXTENT_BUFFER_UPTODATE, &eb->bflags)) return 0; + if (btrfs_is_subpage(eb->fs_info)) + return read_extent_buffer_subpage(eb, wait, mirror_num); + num_pages = num_extent_pages(eb); for (i = 0; i < num_pages; i++) { page = eb->pages[i]; From patchwork Wed Nov 18 08:53:15 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11914559 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.9 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, UNWANTED_LANGUAGE_BODY,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0C0B7C63798 for ; Wed, 18 Nov 2020 08:53:50 +0000 (UTC) Received: from vger.kernel.org 
(vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A7C1D20719 for ; Wed, 18 Nov 2020 08:53:49 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="slcgx4bU" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727449AbgKRIxs (ORCPT ); Wed, 18 Nov 2020 03:53:48 -0500 Received: from mx2.suse.de ([195.135.220.15]:47868 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725964AbgKRIxs (ORCPT ); Wed, 18 Nov 2020 03:53:48 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1605689626; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=kiPPDMMgipJI7Lx96z3mVMeHfhxE0jO+psrNZsOwyN4=; b=slcgx4bUJiINXkrbfPvGz6AquFLHhBmOEF1ft9QaWl7qvUPR2whUWZmgYQtFW1z5nTNFqD 5tbKWDpYx3pD9H/1G0almKwvoJ4nPtkFBB2tXSwo1zQGu9nz0t0I52W+po0TP4I7PtHJhP xXf9F6/fVVjANaTNidDjG8s6iaBKfXw= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id D3E8EAD2F for ; Wed, 18 Nov 2020 08:53:46 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH 10/14] btrfs: extent_io: make endio_readpage_update_page_status() to handle subpage case Date: Wed, 18 Nov 2020 16:53:15 +0800 Message-Id: <20201118085319.56668-11-wqu@suse.com> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20201118085319.56668-1-wqu@suse.com> References: <20201118085319.56668-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org To handle subpage status update, add the following new tricks: - Set btrfs_subpage::error_bitmap Now if we hit an error, we set the corresponding bits in error bitmap, then call ClearPageUptodate() and SetPageError(). 
- Update page status according to uptodate_bitmap
  Now we only SetPageUptodate() when the full page contains uptodate
  sectors.
  Also, if we cleared all error bits during read, then we also
  ClearPageError().

- No page unlock for metadata
  Since metadata doesn't utilize page locking at all, skip it for now.

Signed-off-by: Qu Wenruo
---
 fs/btrfs/extent_io.c | 56 +++++++++++++++++++++++++++++++++++++++-----
 fs/btrfs/extent_io.h | 1 +
 2 files changed, 51 insertions(+), 6 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 35aee688d6c1..236de0b6b20a 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -2847,15 +2847,59 @@ endio_readpage_release_extent(struct processed_extent *processed, processed->uptodate = uptodate; } -static void endio_readpage_update_page_status(struct page *page, bool uptodate) +static void endio_readpage_update_page_status(struct page *page, bool uptodate, + u64 start, u64 end) { - if (uptodate) { - SetPageUptodate(page); - } else { + struct btrfs_fs_info *fs_info = btrfs_sb(page->mapping->host->i_sb); + struct btrfs_subpage *subpage; + int bit_start; + int nbits; + bool all_uptodate = false; + bool no_error = false; + + ASSERT(page_offset(page) <= start && + end <= page_offset(page) + PAGE_SIZE - 1); + + if (!btrfs_is_subpage(fs_info)) { + if (uptodate) { + SetPageUptodate(page); + } else { + ClearPageUptodate(page); + SetPageError(page); + } + unlock_page(page); + return; + } + + ASSERT(PagePrivate(page) && page->private); + subpage = (struct btrfs_subpage *)page->private; + bit_start = (start - page_offset(page)) >> fs_info->sectorsize_bits; + nbits = fs_info->nodesize >> fs_info->sectorsize_bits; + + if (!uptodate) { + spin_lock_bh(&subpage->lock); + bitmap_set(subpage->error_bitmap, bit_start, nbits); + spin_unlock_bh(&subpage->lock); + ClearPageUptodate(page); SetPageError(page); + return; } - unlock_page(page); + + spin_lock_bh(&subpage->lock); + bitmap_set(subpage->uptodate_bitmap, bit_start, nbits); +
bitmap_clear(subpage->error_bitmap, bit_start, nbits); + if (bitmap_full(subpage->uptodate_bitmap, BTRFS_SUBPAGE_BITMAP_SIZE)) + all_uptodate = true; + if (bitmap_empty(subpage->error_bitmap, BTRFS_SUBPAGE_BITMAP_SIZE)) + no_error = true; + spin_unlock_bh(&subpage->lock); + + if (no_error) + ClearPageError(page); + if (all_uptodate) + SetPageUptodate(page); + return; } /* @@ -2985,7 +3029,7 @@ static void end_bio_extent_readpage(struct bio *bio) } bio_offset += len; - endio_readpage_update_page_status(page, uptodate); + endio_readpage_update_page_status(page, uptodate, start, end); endio_readpage_release_extent(&processed, BTRFS_I(inode), start, end, uptodate); } diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h index 11e1e013cb8c..b4d0e39ebceb 100644 --- a/fs/btrfs/extent_io.h +++ b/fs/btrfs/extent_io.h @@ -312,6 +312,7 @@ struct btrfs_subpage { spinlock_t lock; DECLARE_BITMAP(tree_block_bitmap, BTRFS_SUBPAGE_BITMAP_SIZE); DECLARE_BITMAP(uptodate_bitmap, BTRFS_SUBPAGE_BITMAP_SIZE); + DECLARE_BITMAP(error_bitmap, BTRFS_SUBPAGE_BITMAP_SIZE); }; int btrfs_attach_subpage(struct btrfs_fs_info *fs_info, struct page *page); From patchwork Wed Nov 18 08:53:16 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11914561 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CF157C56202 for ; Wed, 18 Nov 2020 08:53:51 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org 
(Postfix) with ESMTP id 77BE820719 for ; Wed, 18 Nov 2020 08:53:51 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="gnWoQwIx" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726504AbgKRIxu (ORCPT ); Wed, 18 Nov 2020 03:53:50 -0500 Received: from mx2.suse.de ([195.135.220.15]:47878 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725772AbgKRIxu (ORCPT ); Wed, 18 Nov 2020 03:53:50 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1605689628; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=maDN01w4egfhxhZj5odDhJWzJ6lGmlXG3Fhyckd9i5A=; b=gnWoQwIxlA9a2XGuoMXAt+MSkXLK044if3CzOijLMOjSyRRD2YVPLwGYE75rn9ZCmKHR8c pfEVxH4g4vw/zinueIbImmceV4cBbHBJwIRhBOtalTGUeylhQiA3QUOTXCHV3gF4S4xTb4 w7aC9vW5kDfAZFyxpMmFGzHaqoYx3l4= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id A8C30ABF4 for ; Wed, 18 Nov 2020 08:53:48 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH 11/14] btrfs: disk-io: introduce subpage metadata validation check Date: Wed, 18 Nov 2020 16:53:16 +0800 Message-Id: <20201118085319.56668-12-wqu@suse.com> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20201118085319.56668-1-wqu@suse.com> References: <20201118085319.56668-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org

For the subpage metadata validation check, there are some differences:

- Read must finish in one bvec
  Since we're just reading one subpage range in one page, it should
  never be split into two bios or two bvecs.
- How to grab the existing eb Instead of grabbing eb using page->private, we have to go search radix tree as we don't have any direct pointer at hand. Signed-off-by: Qu Wenruo --- fs/btrfs/disk-io.c | 82 ++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 82 insertions(+) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index b395daf62086..699b999c8ba3 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -593,6 +593,84 @@ static int validate_extent_buffer(struct extent_buffer *eb) return ret; } +static int validate_subpage_buffer(struct page *page, u64 start, u64 end, + int mirror) +{ + struct btrfs_fs_info *fs_info = btrfs_sb(page->mapping->host->i_sb); + struct extent_buffer *eb; + int reads_done; + int ret = 0; + + if (!IS_ALIGNED(start, fs_info->sectorsize) || + !IS_ALIGNED(end - start + 1, fs_info->sectorsize) || + !IS_ALIGNED(end - start + 1, fs_info->nodesize)) { + WARN_ON(IS_ENABLED(CONFIG_BTRFS_DEBUG)); + btrfs_err(fs_info, "invalid tree read bytenr"); + return -EUCLEAN; + } + + /* + * We don't allow bio merge for subpage metadata read, so we should + * only get one eb for each endio hook. + */ + ASSERT(end == start + fs_info->nodesize - 1); + ASSERT(PagePrivate(page)); + + rcu_read_lock(); + eb = radix_tree_lookup(&fs_info->buffer_radix, + start / fs_info->sectorsize); + rcu_read_unlock(); + + /* + * When we are reading one tree block, eb must have been + * inserted into the radix tree. If not something is wrong. + */ + if (!eb) { + WARN_ON(IS_ENABLED(CONFIG_BTRFS_DEBUG)); + btrfs_err(fs_info, + "can't find extent buffer for bytenr %llu", + start); + return -EUCLEAN; + } + /* + * The pending IO might have been the only thing that kept + * this buffer in memory. 
Make sure we have a ref for all + * this other checks + */ + atomic_inc(&eb->refs); + + reads_done = atomic_dec_and_test(&eb->io_pages); + /* Subpage read must finish in page read */ + ASSERT(reads_done); + + eb->read_mirror = mirror; + if (test_bit(EXTENT_BUFFER_READ_ERR, &eb->bflags)) { + ret = -EIO; + goto err; + } + ret = validate_extent_buffer(eb); + if (ret < 0) + goto err; + + if (test_and_clear_bit(EXTENT_BUFFER_READAHEAD, &eb->bflags)) + btree_readahead_hook(eb, ret); + + set_extent_buffer_uptodate(eb); + + free_extent_buffer(eb); + return ret; +err: + /* + * our io error hook is going to dec the io pages + * again, we have to make sure it has something to + * decrement + */ + atomic_inc(&eb->io_pages); + clear_extent_buffer_uptodate(eb); + free_extent_buffer(eb); + return ret; +} + int btrfs_validate_metadata_buffer(struct btrfs_io_bio *io_bio, struct page *page, u64 start, u64 end, int mirror) @@ -602,6 +680,10 @@ int btrfs_validate_metadata_buffer(struct btrfs_io_bio *io_bio, int reads_done; ASSERT(page->private); + + if (btrfs_is_subpage(btrfs_sb(page->mapping->host->i_sb))) + return validate_subpage_buffer(page, start, end, mirror); + eb = (struct extent_buffer *)page->private; From patchwork Wed Nov 18 08:53:17 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11914563 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1CEE5C5519F for ; Wed, 18 Nov 2020 08:53:54 +0000 (UTC) Received: from 
vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id B5DF520719 for ; Wed, 18 Nov 2020 08:53:53 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="Hk4uRcPf" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727473AbgKRIxw (ORCPT ); Wed, 18 Nov 2020 03:53:52 -0500 Received: from mx2.suse.de ([195.135.220.15]:47888 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725772AbgKRIxw (ORCPT ); Wed, 18 Nov 2020 03:53:52 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1605689630; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc: mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=AGgYqM4sh5iqAPoGDfcQql8JeD4VfBTdoN6LgNP+qMc=; b=Hk4uRcPfHGj7GT7l1WMiqKcaZIQDcsiMpy+S5gp1Up0d3xieKg0w9qmCDoCZDdo2ONZG8G WWc0CKv+nxXTvQJRc8OfghEK4y4fIScAl02+gDekFFjKlR5ntN/a/ZLEJ+pLf99ZRazkzz CFo29ETV+LLpyiXQi6SxNrsN3cL6i5k= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 8EB2DAD2F for ; Wed, 18 Nov 2020 08:53:50 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH 12/14] btrfs: introduce btrfs_subpage for data inodes Date: Wed, 18 Nov 2020 16:53:17 +0800 Message-Id: <20201118085319.56668-13-wqu@suse.com> X-Mailer: git-send-email 2.29.2 In-Reply-To: <20201118085319.56668-1-wqu@suse.com> References: <20201118085319.56668-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org To support subpage sector size, data also need extra info to make sure which sectors in a page are uptodate/dirty/... This patch will make pages for data inodes to get btrfs_subpage structure attached, and detached when the page is freed. 
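
For illustration only (not part of the patch), the attach/detach lifetime described above can be modelled in userspace C. This is a minimal sketch: the sketch_* names, the single bitmap, and the marker value are hypothetical stand-ins, and calloc()/free() stand in for the kernel allocator:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-page tracking structure (one bit per sector). */
struct sketch_subpage {
	uint16_t uptodate_bitmap;
};

/* Hypothetical stand-in for struct page and its private pointer. */
struct sketch_page {
	void *private_data;
};

/* Marker used when no per-sector tracking is needed (regular page size). */
#define SKETCH_PAGE_PRIVATE_MARKER ((void *)1)

/*
 * Attach private data to the page if it has none yet.
 * Returns 0 on success or -ENOMEM if the subpage allocation fails,
 * so callers must check the return value.
 */
static int sketch_set_page_mapped(struct sketch_page *page, bool subpage)
{
	struct sketch_subpage *sp;

	if (page->private_data)
		return 0;
	if (!subpage) {
		page->private_data = SKETCH_PAGE_PRIVATE_MARKER;
		return 0;
	}
	sp = calloc(1, sizeof(*sp));
	if (!sp)
		return -ENOMEM;
	page->private_data = sp;
	return 0;
}

/* Detach (and for subpage, free) the private data when the page is freed. */
static void sketch_clear_page_mapped(struct sketch_page *page, bool subpage)
{
	if (!page->private_data)
		return;
	if (subpage)
		free(page->private_data);
	page->private_data = NULL;
}
```

The point of the sketch is that attaching can now fail, which is why every caller has to handle an error return.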

This patch also slightly changes when set_page_extent_mapped() is called,
to make sure:

- We have page->mapping set
  page->mapping->host is used to grab btrfs_fs_info, thus we can only
  call this function after the page is mapped to an inode.

  One call site attaches pages to the inode manually, thus we have to
  modify the timing of set_page_extent_mapped() a little.

- It is called as soon as possible, before other operations
  Since memory allocation can fail, we have to do extra error handling.
  Calling set_page_extent_mapped() as soon as possible can simplify the
  error handling for several call sites.

The idea is pretty much the same as iomap_page, but with more bitmaps
for btrfs specific cases.

Currently the plan is to switch to iomap if iomap can provide sector
aligned writeback (only write back dirty sectors, not the full page;
data balance requires this feature).

So we will stick to the btrfs specific bitmap for now.

Signed-off-by: Qu Wenruo
---
 fs/btrfs/compression.c | 10 ++++++--
 fs/btrfs/extent_io.c | 47 +++++++++++++++++++++++++++++++++----
 fs/btrfs/extent_io.h | 3 ++-
 fs/btrfs/file.c | 10 +++++---
 fs/btrfs/free-space-cache.c | 15 +++++++++---
 fs/btrfs/inode.c | 12 ++++++----
 fs/btrfs/ioctl.c | 5 +++-
 fs/btrfs/reflink.c | 5 +++-
 fs/btrfs/relocation.c | 12 ++++++++--
 9 files changed, 98 insertions(+), 21 deletions(-)

diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c index 3fb6fde2ca13..f0b119a910a4 100644 --- a/fs/btrfs/compression.c +++ b/fs/btrfs/compression.c @@ -542,13 +542,19 @@ static noinline int add_ra_bio_pages(struct inode *inode, goto next; } - end = last_offset + PAGE_SIZE - 1; /* * at this point, we have a locked page in the page cache * for these bytes in the file. But, we have to make * sure they map to this compressed extent on disk.
 		 */
-		set_page_extent_mapped(page);
+		ret = set_page_extent_mapped(page);
+		if (ret < 0) {
+			unlock_page(page);
+			put_page(page);
+			break;
+		}
+
+		end = last_offset + PAGE_SIZE - 1;
 		lock_extent(tree, last_offset, end);
 		read_lock(&em_tree->lock);
 		em = lookup_extent_mapping(em_tree, last_offset,
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 236de0b6b20a..3d1dee27db8a 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -3232,10 +3232,40 @@ static int attach_extent_buffer_page(struct extent_buffer *eb,
 	return 0;
 }
 
-void set_page_extent_mapped(struct page *page)
+int __must_check set_page_extent_mapped(struct page *page)
 {
-	if (!PagePrivate(page))
+	struct btrfs_fs_info *fs_info;
+
+	ASSERT(page->mapping);
+
+	if (PagePrivate(page))
+		return 0;
+
+	fs_info = btrfs_sb(page->mapping->host->i_sb);
+	if (!btrfs_is_subpage(fs_info)) {
 		attach_page_private(page, (void *)EXTENT_PAGE_PRIVATE);
+		return 0;
+	}
+
+	return btrfs_attach_subpage(fs_info, page);
+}
+
+void clear_page_extent_mapped(struct page *page)
+{
+	struct btrfs_fs_info *fs_info;
+
+	ASSERT(page->mapping);
+
+	if (!PagePrivate(page))
+		return;
+
+	fs_info = btrfs_sb(page->mapping->host->i_sb);
+	if (!btrfs_is_subpage(fs_info)) {
+		detach_page_private(page);
+		return;
+	}
+
+	btrfs_detach_subpage(fs_info, page);
 }
 
 static struct extent_map *
@@ -3292,7 +3322,12 @@ int btrfs_do_readpage(struct page *page, struct extent_map **em_cached,
 	unsigned long this_bio_flag = 0;
 	struct extent_io_tree *tree = &BTRFS_I(inode)->io_tree;
 
-	set_page_extent_mapped(page);
+	ret = set_page_extent_mapped(page);
+	if (ret < 0) {
+		unlock_extent(tree, start, end);
+		SetPageError(page);
+		goto out;
+	}
 
 	if (!PageUptodate(page)) {
 		if (cleancache_get_page(page) == 0) {
@@ -3737,7 +3772,11 @@ static int __extent_writepage(struct page *page, struct writeback_control *wbc,
 		flush_dcache_page(page);
 	}
 
-	set_page_extent_mapped(page);
+	ret = set_page_extent_mapped(page);
+	if (ret < 0) {
+		SetPageError(page);
+		goto done;
+	}
 
 	if (!epd->extent_locked) {
 		ret = writepage_delalloc(BTRFS_I(inode), page, wbc, start,
diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
index b4d0e39ebceb..01ec178a1ab9 100644
--- a/fs/btrfs/extent_io.h
+++ b/fs/btrfs/extent_io.h
@@ -181,7 +181,8 @@ int btree_write_cache_pages(struct address_space *mapping,
 void extent_readahead(struct readahead_control *rac);
 int extent_fiemap(struct btrfs_inode *inode, struct fiemap_extent_info *fieinfo,
 		  u64 start, u64 len);
-void set_page_extent_mapped(struct page *page);
+int __must_check set_page_extent_mapped(struct page *page);
+void clear_page_extent_mapped(struct page *page);
 
 struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info,
 					  u64 start, u64 owner_root, int level);
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 69147091f219..41188b751808 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1370,6 +1370,12 @@ static noinline int prepare_pages(struct inode *inode, struct page **pages,
 			goto fail;
 		}
 
+		err = set_page_extent_mapped(pages[i]);
+		if (err < 0) {
+			faili = i;
+			goto fail;
+		}
+
 		if (i == 0)
 			err = prepare_uptodate_page(inode, pages[i], pos,
 						    force_uptodate);
@@ -1467,10 +1473,8 @@ lock_and_cleanup_extent_if_need(struct btrfs_inode *inode, struct page **pages,
 	 * We'll call btrfs_dirty_pages() later on, and that will flip around
 	 * delalloc bits and dirty the pages as required.
 	 */
-	for (i = 0; i < num_pages; i++) {
-		set_page_extent_mapped(pages[i]);
+	for (i = 0; i < num_pages; i++)
 		WARN_ON(!PageLocked(pages[i]));
-	}
 
 	return ret;
 }
diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
index 58bd2d3e54db..115e2a7fe74a 100644
--- a/fs/btrfs/free-space-cache.c
+++ b/fs/btrfs/free-space-cache.c
@@ -385,11 +385,22 @@ static int io_ctl_prepare_pages(struct btrfs_io_ctl *io_ctl, bool uptodate)
 	int i;
 
 	for (i = 0; i < io_ctl->num_pages; i++) {
+		int ret;
+
 		page = find_or_create_page(inode->i_mapping, i, mask);
 		if (!page) {
 			io_ctl_drop_pages(io_ctl);
 			return -ENOMEM;
 		}
+
+		ret = set_page_extent_mapped(page);
+		if (ret < 0) {
+			unlock_page(page);
+			put_page(page);
+			io_ctl_drop_pages(io_ctl);
+			return -ENOMEM;
+		}
+
 		io_ctl->pages[i] = page;
 		if (uptodate && !PageUptodate(page)) {
 			btrfs_readpage(NULL, page);
@@ -409,10 +420,8 @@ static int io_ctl_prepare_pages(struct btrfs_io_ctl *io_ctl, bool uptodate)
 		}
 	}
 
-	for (i = 0; i < io_ctl->num_pages; i++) {
+	for (i = 0; i < io_ctl->num_pages; i++)
 		clear_page_dirty_for_io(io_ctl->pages[i]);
-		set_page_extent_mapped(io_ctl->pages[i]);
-	}
 
 	return 0;
 }
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 750aa3770d8f..b9918214cd23 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -4717,6 +4717,9 @@ int btrfs_truncate_block(struct inode *inode, loff_t from, loff_t len,
 		ret = -ENOMEM;
 		goto out;
 	}
+	ret = set_page_extent_mapped(page);
+	if (ret < 0)
+		goto out_unlock;
 
 	if (!PageUptodate(page)) {
 		ret = btrfs_readpage(NULL, page);
@@ -4734,7 +4737,6 @@ int btrfs_truncate_block(struct inode *inode, loff_t from, loff_t len,
 	wait_on_page_writeback(page);
 
 	lock_extent_bits(io_tree, block_start, block_end, &cached_state);
-	set_page_extent_mapped(page);
 
 	ordered = btrfs_lookup_ordered_extent(BTRFS_I(inode), block_start);
 	if (ordered) {
@@ -8118,7 +8120,7 @@ static int __btrfs_releasepage(struct page *page, gfp_t gfp_flags)
 {
 	int ret = try_release_extent_mapping(page, gfp_flags);
 	if (ret == 1)
-		detach_page_private(page);
+		clear_page_extent_mapped(page);
 	return ret;
 }
 
@@ -8277,7 +8279,7 @@ static void btrfs_invalidatepage(struct page *page, unsigned int offset,
 	}
 
 	ClearPageChecked(page);
-	detach_page_private(page);
+	clear_page_extent_mapped(page);
 }
 
 /*
@@ -8356,7 +8358,9 @@ vm_fault_t btrfs_page_mkwrite(struct vm_fault *vmf)
 	wait_on_page_writeback(page);
 
 	lock_extent_bits(io_tree, page_start, page_end, &cached_state);
-	set_page_extent_mapped(page);
+	ret = set_page_extent_mapped(page);
+	if (ret < 0)
+		goto out_unlock;
 
 	/*
 	 * we can't set the delalloc bits if there are pending ordered
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 2904f92c3813..56cc26d0e6db 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -1307,6 +1307,10 @@ static int cluster_pages_for_defrag(struct inode *inode,
 		if (!page)
 			break;
 
+		ret = set_page_extent_mapped(page);
+		if (ret < 0)
+			break;
+
 		page_start = page_offset(page);
 		page_end = page_start + PAGE_SIZE - 1;
 		while (1) {
@@ -1428,7 +1432,6 @@ static int cluster_pages_for_defrag(struct inode *inode,
 	for (i = 0; i < i_done; i++) {
 		clear_page_dirty_for_io(pages[i]);
 		ClearPageChecked(pages[i]);
-		set_page_extent_mapped(pages[i]);
 		set_page_dirty(pages[i]);
 		unlock_page(pages[i]);
 		put_page(pages[i]);
diff --git a/fs/btrfs/reflink.c b/fs/btrfs/reflink.c
index 4bbc5f52b752..6f20536494e8 100644
--- a/fs/btrfs/reflink.c
+++ b/fs/btrfs/reflink.c
@@ -81,7 +81,10 @@ static int copy_inline_to_page(struct btrfs_inode *inode,
 		goto out_unlock;
 	}
 
-	set_page_extent_mapped(page);
+	ret = set_page_extent_mapped(page);
+	if (ret < 0)
+		goto out_unlock;
+
 	clear_extent_bit(&inode->io_tree, file_offset, range_end,
 			 EXTENT_DELALLOC | EXTENT_DO_ACCOUNTING | EXTENT_DEFRAG,
 			 0, 0, NULL);
diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index c5774a8e6ff7..c353b85f7027 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -2699,6 +2699,16 @@ static int relocate_file_extent_cluster(struct inode *inode,
 				goto out;
 			}
 		}
+		ret = set_page_extent_mapped(page);
+		if (ret < 0) {
+			btrfs_delalloc_release_metadata(BTRFS_I(inode),
+							PAGE_SIZE, true);
+			btrfs_delalloc_release_extents(BTRFS_I(inode),
+							PAGE_SIZE);
+			unlock_page(page);
+			put_page(page);
+			goto out;
+		}
 
 		if (PageReadahead(page)) {
 			page_cache_async_readahead(inode->i_mapping,
@@ -2726,8 +2736,6 @@ static int relocate_file_extent_cluster(struct inode *inode,
 
 		lock_extent(&BTRFS_I(inode)->io_tree, page_start, page_end);
 
-		set_page_extent_mapped(page);
-
 		if (nr < cluster->nr &&
 		    page_start + offset == cluster->boundary[nr]) {
 			set_extent_bits(&BTRFS_I(inode)->io_tree,

From patchwork Wed Nov 18 08:53:18 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Qu Wenruo
X-Patchwork-Id: 11914565
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-18.7 required=3.0 tests=BAYES_00,DKIM_SIGNED,
	DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,
	INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,
	USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
	by smtp.lore.kernel.org (Postfix) with ESMTP id B04C4C63777
	for ; Wed, 18 Nov 2020 08:53:55 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by mail.kernel.org (Postfix) with ESMTP id 547A020719
	for ; Wed, 18 Nov 2020 08:53:55 +0000 (UTC)
Authentication-Results: mail.kernel.org;
	dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com
	header.b="ujy1QW2r"
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1727479AbgKRIxy (ORCPT );
	Wed, 18 Nov 2020 03:53:54 -0500
Received: from mx2.suse.de ([195.135.220.15]:47908 "EHLO mx2.suse.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1727263AbgKRIxy (ORCPT );
	Wed, 18 Nov 2020 03:53:54 -0500
X-Virus-Scanned: by amavisd-new at test-mx.suse.de
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1;
	t=1605689632;
	h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:
	mime-version:mime-version:
	content-transfer-encoding:content-transfer-encoding:
	in-reply-to:in-reply-to:references:references;
	bh=uo+SVRPicMQyS4HCR5gkngoKafxbEVBkHafBfAFsYzk=;
	b=ujy1QW2r9RwjoeE96er0mfBOzCKa/w7EkDIeOrQl1sLpasp0gJvlXUVJQZnV4DeKE0L7/1
	CKamAEuptl02DY7nQ+k8NR/l7yqFgrHDcy3+0zL7diO+QI0x/Bc0b1MGjGAsAWxqTG/mvP
	XInXpSDDA4AKwSD0w7K7skIncVbUsnA=
Received: from relay2.suse.de (unknown [195.135.221.27])
	by mx2.suse.de (Postfix) with ESMTP id 424E1AD71
	for ; Wed, 18 Nov 2020 08:53:52 +0000 (UTC)
From: Qu Wenruo
To: linux-btrfs@vger.kernel.org
Subject: [PATCH 13/14] btrfs: integrate page status update for read path into
	begin/end_page_read()
Date: Wed, 18 Nov 2020 16:53:18 +0800
Message-Id: <20201118085319.56668-14-wqu@suse.com>
X-Mailer: git-send-email 2.29.2
In-Reply-To: <20201118085319.56668-1-wqu@suse.com>
References: <20201118085319.56668-1-wqu@suse.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org

In the btrfs data page read path, the page status update is handled in
two different locations:

  btrfs_do_readpage()
  {
	while (cur <= end) {
		/* No need to read from disk */
		if (HOLE/PREALLOC/INLINE) {
			memset();
			set_extent_uptodate();
			continue;
		}
		/* Read from disk */
		ret = submit_extent_page(end_bio_extent_readpage);
	}
  }

  end_bio_extent_readpage()
  {
	endio_readpage_update_page_status();
  }

This is fine for the sectorsize == PAGE_SIZE case, as in the above loop
we should only hit one branch and then exit.

But for subpage, there is more work to be done in the page status update:

- Page unlock condition
  Unlike the regular page size == sectorsize case, we can no longer just
  unlock a page.
  Only the last reader of the page can unlock the page.
  This means we can unlock the page either in the while() loop, or in
  the endio function.

- Page uptodate condition
  Since we have multiple sectors to read for a page, we can only mark
  the full page uptodate if all sectors are uptodate.

To handle both subpage and regular cases, introduce a pair of functions
to help handle the page status update:

- begin_page_read()
  For the regular case, it does nothing.
  For the subpage case, it updates the reader counter so that a later
  end_page_read() can know who is the last one to unlock the page.

- end_page_read()
  This is just endio_readpage_update_page_status() renamed.
  The original name is a little too long and too specific for endio.

  The only new trick added is the condition for page unlock.
  Now for subpage data, we unlock the page if we're the last reader.

This not only provides the basis for subpage data read, but also hides
the special handling of page read from the main read loop.

Signed-off-by: Qu Wenruo
---
 fs/btrfs/extent_io.c | 45 +++++++++++++++++++++++++++++++-------------
 fs/btrfs/extent_io.h |  1 +
 2 files changed, 33 insertions(+), 13 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 3d1dee27db8a..0b484df67dc3 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -2847,8 +2847,19 @@ endio_readpage_release_extent(struct processed_extent *processed,
 	processed->uptodate = uptodate;
 }
 
-static void endio_readpage_update_page_status(struct page *page, bool uptodate,
-					      u64 start, u64 end)
+static void begin_page_read(struct btrfs_fs_info *fs_info, struct page *page)
+{
+	struct btrfs_subpage *subpage;
+
+	if (!btrfs_is_subpage(fs_info))
+		return;
+
+	ASSERT(PagePrivate(page) && page->private);
+	subpage = (struct btrfs_subpage *)page->private;
+	atomic_set(&subpage->readers, PAGE_SIZE >> fs_info->sectorsize_bits);
+}
+
+static void end_page_read(struct page *page, bool uptodate, u64 start, u64 end)
 {
 	struct btrfs_fs_info *fs_info = btrfs_sb(page->mapping->host->i_sb);
 	struct btrfs_subpage *subpage;
@@ -2874,7 +2885,7 @@ static void endio_readpage_update_page_status(struct page *page, bool uptodate,
 	ASSERT(PagePrivate(page) && page->private);
 	subpage = (struct btrfs_subpage *)page->private;
 	bit_start = (start - page_offset(page)) >> fs_info->sectorsize_bits;
-	nbits = fs_info->nodesize >> fs_info->sectorsize_bits;
+	nbits = (end + 1 - start) >> fs_info->sectorsize_bits;
 
 	if (!uptodate) {
 		spin_lock_bh(&subpage->lock);
@@ -2899,7 +2910,14 @@ static void endio_readpage_update_page_status(struct page *page, bool uptodate,
 		ClearPageError(page);
 	if (all_uptodate)
 		SetPageUptodate(page);
-	return;
+
+	/*
+	 * For data, we still do page unlock, but that only happens when we're
+	 * the last reader of the page.
+	 */
+	if (page->mapping->host != fs_info->btree_inode &&
+	    atomic_sub_and_test(nbits, &subpage->readers))
+		unlock_page(page);
 }
 
 /*
@@ -3029,7 +3047,7 @@ static void end_bio_extent_readpage(struct bio *bio)
 		}
 		bio_offset += len;
 
-		endio_readpage_update_page_status(page, uptodate, start, end);
+		end_page_read(page, uptodate, start, end);
 		endio_readpage_release_extent(&processed, BTRFS_I(inode),
 					      start, end, uptodate);
 	}
@@ -3306,6 +3324,7 @@ int btrfs_do_readpage(struct page *page, struct extent_map **em_cached,
 		      unsigned int read_flags, u64 *prev_em_start)
 {
 	struct inode *inode = page->mapping->host;
+	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
 	u64 start = page_offset(page);
 	const u64 end = start + PAGE_SIZE - 1;
 	u64 cur = start;
@@ -3349,6 +3368,7 @@ int btrfs_do_readpage(struct page *page, struct extent_map **em_cached,
 			kunmap_atomic(userpage);
 		}
 	}
+	begin_page_read(fs_info, page);
 	while (cur <= end) {
 		bool force_bio_submit = false;
 		u64 offset;
@@ -3366,13 +3386,14 @@ int btrfs_do_readpage(struct page *page, struct extent_map **em_cached,
 					    &cached, GFP_NOFS);
 			unlock_extent_cached(tree, cur,
 					     cur + iosize - 1, &cached);
+			end_page_read(page, true, cur, cur + iosize - 1);
 			break;
 		}
 		em = __get_extent_map(inode, page, pg_offset, cur,
 				      end - cur + 1, em_cached);
 		if (IS_ERR_OR_NULL(em)) {
-			SetPageError(page);
 			unlock_extent(tree, cur, end);
+			end_page_read(page, false, cur, end);
 			break;
 		}
 		extent_offset = cur - em->start;
@@ -3455,6 +3476,7 @@ int btrfs_do_readpage(struct page *page, struct extent_map **em_cached,
 					    &cached, GFP_NOFS);
 			unlock_extent_cached(tree, cur,
 					     cur + iosize - 1, &cached);
+			end_page_read(page, true, cur, cur + iosize - 1);
 			cur = cur + iosize;
 			pg_offset += iosize;
 			continue;
@@ -3464,6 +3486,7 @@ int btrfs_do_readpage(struct page *page, struct extent_map **em_cached,
 				   EXTENT_UPTODATE, 1, NULL)) {
 			check_page_uptodate(tree, page);
 			unlock_extent(tree, cur, cur + iosize - 1);
+			end_page_read(page, true, cur, cur + iosize - 1);
 			cur = cur + iosize;
 			pg_offset += iosize;
 			continue;
@@ -3472,8 +3495,8 @@ int btrfs_do_readpage(struct page *page, struct extent_map **em_cached,
 		 * to date. Error out
 		 */
 		if (block_start == EXTENT_MAP_INLINE) {
-			SetPageError(page);
 			unlock_extent(tree, cur, cur + iosize - 1);
+			end_page_read(page, false, cur, cur + iosize - 1);
 			cur = cur + iosize;
 			pg_offset += iosize;
 			continue;
@@ -3490,19 +3513,14 @@ int btrfs_do_readpage(struct page *page, struct extent_map **em_cached,
 			nr++;
 			*bio_flags = this_bio_flag;
 		} else {
-			SetPageError(page);
 			unlock_extent(tree, cur, cur + iosize - 1);
+			end_page_read(page, false, cur, cur + iosize - 1);
 			goto out;
 		}
 		cur = cur + iosize;
 		pg_offset += iosize;
 	}
 out:
-	if (!nr) {
-		if (!PageError(page))
-			SetPageUptodate(page);
-		unlock_page(page);
-	}
 	return ret;
 }
 
@@ -5456,6 +5474,7 @@ int btrfs_attach_subpage(struct btrfs_fs_info *fs_info, struct page *page)
 		return -ENOMEM;
 
 	spin_lock_init(&subpage->lock);
+	atomic_set(&subpage->readers, 0);
 	attach_page_private(page, subpage);
 	return 0;
 }
diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
index 01ec178a1ab9..e050490056a6 100644
--- a/fs/btrfs/extent_io.h
+++ b/fs/btrfs/extent_io.h
@@ -314,6 +314,7 @@ struct btrfs_subpage {
 	DECLARE_BITMAP(tree_block_bitmap, BTRFS_SUBPAGE_BITMAP_SIZE);
 	DECLARE_BITMAP(uptodate_bitmap, BTRFS_SUBPAGE_BITMAP_SIZE);
 	DECLARE_BITMAP(error_bitmap, BTRFS_SUBPAGE_BITMAP_SIZE);
+	atomic_t readers;
 };
 
 int btrfs_attach_subpage(struct btrfs_fs_info *fs_info, struct page *page);

From patchwork Wed Nov 18 08:53:19 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Qu Wenruo
X-Patchwork-Id: 11914567
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-18.7 required=3.0 tests=BAYES_00,DKIM_SIGNED,
	DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,
	INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,
	USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
	by smtp.lore.kernel.org (Postfix) with ESMTP id D14EDC5519F
	for ; Wed, 18 Nov 2020 08:53:56 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by mail.kernel.org (Postfix) with ESMTP id 760B520639
	for ; Wed, 18 Nov 2020 08:53:56 +0000 (UTC)
Authentication-Results: mail.kernel.org;
	dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com
	header.b="jeE/BcqX"
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1727484AbgKRIxz (ORCPT );
	Wed, 18 Nov 2020 03:53:55 -0500
Received: from mx2.suse.de ([195.135.220.15]:47926 "EHLO mx2.suse.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1725772AbgKRIxz (ORCPT );
	Wed, 18 Nov 2020 03:53:55 -0500
X-Virus-Scanned: by amavisd-new at test-mx.suse.de
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1;
	t=1605689633;
	h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:
	mime-version:mime-version:
	content-transfer-encoding:content-transfer-encoding:
	in-reply-to:in-reply-to:references:references;
	bh=4h65N+nkrAIJrcum+w+4MnEMeJBZ3uZ60Tr5SJSQL0Y=;
	b=jeE/BcqX7rrC+y3rG/8p/N/6s+oWcxCweGVY4af2fpfWmLMP7eGflBssLUPo3IdTPBlRCQ
	3sy4zLAFQix6BLSABe6CjKIcXdhmUns+CT1ePT8qUuSm/nZN1XRP+fYbzXag7+9VRbJ/5n
	NOTjdBAcrReCbbgQL/6ojF7DwbmRMxI=
Received: from relay2.suse.de (unknown [195.135.221.27])
	by mx2.suse.de (Postfix) with ESMTP id DC2E7AD2F
	for ; Wed, 18 Nov 2020 08:53:53 +0000 (UTC)
From: Qu Wenruo
To: linux-btrfs@vger.kernel.org
Subject: [PATCH 14/14] btrfs: allow RO mount of 4K sector size fs on 64K page
	system
Date: Wed, 18 Nov 2020 16:53:19 +0800
Message-Id: <20201118085319.56668-15-wqu@suse.com>
X-Mailer: git-send-email 2.29.2
In-Reply-To: <20201118085319.56668-1-wqu@suse.com>
References: <20201118085319.56668-1-wqu@suse.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org

This adds the basic RO mount ability for 4K sector size on 64K page
systems.

Currently we only plan to support 4K and 64K page systems.

Signed-off-by: Qu Wenruo
---
 fs/btrfs/disk-io.c | 24 +++++++++++++++++++++---
 fs/btrfs/super.c   |  7 +++++++
 2 files changed, 28 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 699b999c8ba3..32bf623e3646 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -2524,13 +2524,21 @@ static int validate_super(struct btrfs_fs_info *fs_info,
 		btrfs_err(fs_info, "invalid sectorsize %llu", sectorsize);
 		ret = -EINVAL;
 	}
-	/* Only PAGE SIZE is supported yet */
-	if (sectorsize != PAGE_SIZE) {
+
+	/*
+	 * For 4K page size, we only support 4K sector size.
+	 * For 64K page size, we support RW for 64K sector size, and RO for
+	 * 4K sector size.
+	 */
+	if ((SZ_4K == PAGE_SIZE && sectorsize != PAGE_SIZE) ||
+	    (SZ_64K == PAGE_SIZE && (sectorsize != SZ_4K &&
+				     sectorsize != SZ_64K))) {
 		btrfs_err(fs_info,
-			"sectorsize %llu not supported yet, only support %lu",
+			"sectorsize %llu not supported yet for page size %lu",
 			sectorsize, PAGE_SIZE);
 		ret = -EINVAL;
 	}
+
 	if (!is_power_of_2(nodesize) || nodesize < sectorsize ||
 	    nodesize > BTRFS_MAX_METADATA_BLOCKSIZE) {
 		btrfs_err(fs_info, "invalid nodesize %llu", nodesize);
@@ -3182,6 +3190,16 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
 		goto fail_alloc;
 	}
 
+	/* For 4K sector size support, it's only read-only yet */
+	if (PAGE_SIZE == SZ_64K && sectorsize == SZ_4K) {
+		if (!sb_rdonly(sb) || btrfs_super_log_root(disk_super)) {
+			btrfs_err(fs_info,
+				"subpage sector size only support RO yet");
+			err = -EINVAL;
+			goto fail_alloc;
+		}
+	}
+
 	ret = btrfs_init_workqueues(fs_info, fs_devices);
 	if (ret) {
 		err = ret;
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 6693cfc14dfd..5338d3a60e9b 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1970,6 +1970,13 @@ static int btrfs_remount(struct super_block *sb, int *flags, char *data)
 			ret = -EINVAL;
 			goto restore;
 		}
+		if (btrfs_is_subpage(fs_info)) {
+			btrfs_warn(fs_info,
+	"read-write mount is not yet allowed for sector size %u page size %lu",
+				   fs_info->sectorsize, PAGE_SIZE);
+			ret = -EINVAL;
+			goto restore;
+		}
 
 		ret = btrfs_cleanup_fs_roots(fs_info);
 		if (ret)