From patchwork Wed Oct 21 06:24:47 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848317 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 40864C56201 for ; Wed, 21 Oct 2020 06:26:04 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id DD18D22249 for ; Wed, 21 Oct 2020 06:26:03 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="oFeS3o4F" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440698AbgJUG0C (ORCPT ); Wed, 21 Oct 2020 02:26:02 -0400 Received: from mx2.suse.de ([195.135.220.15]:42424 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2440496AbgJUG0C (ORCPT ); Wed, 21 Oct 2020 02:26:02 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261560; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=DrRbpVF/5vthz1rTFsGhZ0kkdAn5jEC49CjFpBGtrEo=; b=oFeS3o4F0ema0/TcfFrQb23TuMxwnSTxlOVAYPVE6j0gqJrFfkNrNp3d/4dFjloDlkMIQf LzYkFAvoQsz97rgeK+E5UHmpSRXfz0hJlQK5xaQsdGgTEUaEQBzTuNmcFuRZXk4JQ92mHO 8yC+SgOyX/vO+lMVxU9ojg4L9F1QW4Q= Received: from relay2.suse.de (unknown 
[195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id D635BAC35 for ; Wed, 21 Oct 2020 06:26:00 +0000 (UTC)
From: Qu Wenruo
To: linux-btrfs@vger.kernel.org
Subject: [PATCH v4 01/68] btrfs: extent-io-tests: remove invalid tests
Date: Wed, 21 Oct 2020 14:24:47 +0800
Message-Id: <20201021062554.68132-2-wqu@suse.com>
X-Mailer: git-send-email 2.28.0
In-Reply-To: <20201021062554.68132-1-wqu@suse.com>
References: <20201021062554.68132-1-wqu@suse.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org

In extent-io-tests, there are two invalid tests:

- Invalid nodesize for test_eb_bitmaps()
  Instead of using the sectorsize and nodesize combination passed in,
  we always use a hand-crafted nodesize.
  Although there is an extra check for 64K page size, we can still hit a
  case where PAGE_SIZE == 32K, which gives a 128K nodesize. That is
  larger than the maximum valid node size.

  Thankfully most machines use either 4K or 64K page size, so we haven't
  hit such a case yet.

- Invalid extent buffer bytenr
  For 64K page size, the only combination we're going to test is
  sectorsize = nodesize = 64K.
  In that case, we'll try to create an extent buffer with a 32K bytenr,
  which is not aligned to sectorsize and is thus invalid.

This patch fixes both problems by:

- Honoring the sectorsize/nodesize combination
  Now we won't bother to hand-craft a strange length and use it as
  nodesize.

- Using sectorsize as the extent buffer start for the 2nd run
  This tests the case where the extent buffer is aligned to sectorsize
  but not always aligned to nodesize.
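
For illustration only, the two problems above can be reproduced with plain arithmetic. This is a minimal userspace sketch, not kernel code: MAX_METADATA_BLOCKSIZE is assumed to mirror BTRFS_MAX_METADATA_BLOCKSIZE (64K), and old_test_len(), nodesize_valid() and is_aligned() are hypothetical helpers that exist only in this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: mirrors BTRFS_MAX_METADATA_BLOCKSIZE (64K). */
#define MAX_METADATA_BLOCKSIZE 65536u

/* The length the old test hand-crafted, ignoring the nodesize passed in. */
static uint32_t old_test_len(uint32_t sectorsize)
{
	return (sectorsize < MAX_METADATA_BLOCKSIZE) ? sectorsize * 4
						     : sectorsize;
}

/* A node size above the maximum metadata block size is invalid. */
static bool nodesize_valid(uint32_t nodesize)
{
	return nodesize <= MAX_METADATA_BLOCKSIZE;
}

/* True if value is a multiple of align (align must be a power of two). */
static bool is_aligned(uint64_t value, uint32_t align)
{
	return (value & ((uint64_t)align - 1)) == 0;
}
```

With sectorsize = 32K, old_test_len() yields 128K, which nodesize_valid() rejects. With sectorsize = nodesize = 64K, the old second-run start of nodesize / 2 = 32K fails is_aligned() against sectorsize, while a start of sectorsize always passes it.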
Signed-off-by: Qu Wenruo --- fs/btrfs/tests/extent-io-tests.c | 26 +++++++++++--------------- 1 file changed, 11 insertions(+), 15 deletions(-) diff --git a/fs/btrfs/tests/extent-io-tests.c b/fs/btrfs/tests/extent-io-tests.c index df7ce874a74b..73e96d505f4f 100644 --- a/fs/btrfs/tests/extent-io-tests.c +++ b/fs/btrfs/tests/extent-io-tests.c @@ -379,54 +379,50 @@ static int __test_eb_bitmaps(unsigned long *bitmap, struct extent_buffer *eb, static int test_eb_bitmaps(u32 sectorsize, u32 nodesize) { struct btrfs_fs_info *fs_info; - unsigned long len; unsigned long *bitmap = NULL; struct extent_buffer *eb = NULL; int ret; test_msg("running extent buffer bitmap tests"); - /* - * In ppc64, sectorsize can be 64K, thus 4 * 64K will be larger than - * BTRFS_MAX_METADATA_BLOCKSIZE. - */ - len = (sectorsize < BTRFS_MAX_METADATA_BLOCKSIZE) - ? sectorsize * 4 : sectorsize; - - fs_info = btrfs_alloc_dummy_fs_info(len, len); + fs_info = btrfs_alloc_dummy_fs_info(nodesize, sectorsize); if (!fs_info) { test_std_err(TEST_ALLOC_FS_INFO); return -ENOMEM; } - bitmap = kmalloc(len, GFP_KERNEL); + bitmap = kmalloc(nodesize, GFP_KERNEL); if (!bitmap) { test_err("couldn't allocate test bitmap"); ret = -ENOMEM; goto out; } - eb = __alloc_dummy_extent_buffer(fs_info, 0, len); + eb = __alloc_dummy_extent_buffer(fs_info, 0, nodesize); if (!eb) { test_std_err(TEST_ALLOC_ROOT); ret = -ENOMEM; goto out; } - ret = __test_eb_bitmaps(bitmap, eb, len); + ret = __test_eb_bitmaps(bitmap, eb, nodesize); if (ret) goto out; - /* Do it over again with an extent buffer which isn't page-aligned. */ free_extent_buffer(eb); - eb = __alloc_dummy_extent_buffer(fs_info, nodesize / 2, len); + + /* + * Test again for case where the tree block is sectorsize aligned but + * not nodesize aligned. 
+ */ + eb = __alloc_dummy_extent_buffer(fs_info, sectorsize, nodesize); if (!eb) { test_std_err(TEST_ALLOC_ROOT); ret = -ENOMEM; goto out; } - ret = __test_eb_bitmaps(bitmap, eb, len); + ret = __test_eb_bitmaps(bitmap, eb, nodesize); out: free_extent_buffer(eb); kfree(bitmap); From patchwork Wed Oct 21 06:24:48 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848311 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 190B4C4363A for ; Wed, 21 Oct 2020 06:26:09 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id B3DA622249 for ; Wed, 21 Oct 2020 06:26:08 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="bcBKlCKt" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440705AbgJUG0H (ORCPT ); Wed, 21 Oct 2020 02:26:07 -0400 Received: from mx2.suse.de ([195.135.220.15]:42504 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2440496AbgJUG0H (ORCPT ); Wed, 21 Oct 2020 02:26:07 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261565; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: 
in-reply-to:in-reply-to:references:references; bh=4PpVl209z3KEQlPi7nxsOc6ZqHyPbggMt6Dqzr4kkZc=; b=bcBKlCKtknkfJ7QsL4SHzQ3fcM6/iKu5BedON4w8V9feU+QYReSiZmhB+XJQFqv/hrKUdk 86KxUcafceXsllAdM0iGMFhzS+fFLamh0qWrVEfXLok2bAXwDws7B57YxDW9fQ6+wV0+LL CEQFgcadZq11SmD3mpyLNM+916llMzQ=
Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 9194CACC5; Wed, 21 Oct 2020 06:26:05 +0000 (UTC)
From: Qu Wenruo
To: linux-btrfs@vger.kernel.org
Cc: Goldwyn Rodrigues , Goldwyn Rodrigues
Subject: [PATCH v4 02/68] btrfs: use iosize while reading compressed pages
Date: Wed, 21 Oct 2020 14:24:48 +0800
Message-Id: <20201021062554.68132-3-wqu@suse.com>
X-Mailer: git-send-email 2.28.0
In-Reply-To: <20201021062554.68132-1-wqu@suse.com>
References: <20201021062554.68132-1-wqu@suse.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org

From: Goldwyn Rodrigues

While using compression, a submitted bio is mapped to a compressed bio,
which performs the read from disk, decompresses the data, and returns the
uncompressed data to the original bio. The original bio must reflect the
uncompressed size (iosize) of the I/O to be performed; otherwise the page
only gets the compressed on-disk length of data (disk_io_size). The
compressed bio checks the extent map and gets the correct length while
performing the I/O from disk.

This came up in the subpage work, when only the compressed length of the
original bio was filled in the page. It worked correctly for
pagesize == sectorsize because both compressed and uncompressed data are
at pagesize boundaries and would end up filling the requested page.
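
As a rough illustration of the difference between the two lengths, here is a standalone sketch under stated assumptions. struct extent_map_sketch, read_iosize() and old_disk_io_size() are hypothetical names used only here; they model the arithmetic in __do_readpage() as described above and are not the kernel implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Round x up to a multiple of a. */
#define ALIGN_UP(x, a) ((((x) + (a) - 1) / (a)) * (a))

/* Hypothetical, reduced stand-in for the fields of an extent map. */
struct extent_map_sketch {
	uint64_t start;      /* logical start in the file */
	uint64_t len;        /* uncompressed length */
	uint64_t block_len;  /* length on disk (compressed if compressed) */
	bool compressed;
};

static uint64_t min_u64(uint64_t a, uint64_t b)
{
	return a < b ? a : b;
}

/* Uncompressed bytes needed to fill the page range [cur, end]. */
static uint64_t read_iosize(const struct extent_map_sketch *em,
			    uint64_t cur, uint64_t end, uint64_t blocksize)
{
	uint64_t em_end = em->start + em->len;
	uint64_t iosize = min_u64(em_end - cur, end - cur + 1);

	return ALIGN_UP(iosize, blocksize);
}

/* The length the old code attached to the original bio. */
static uint64_t old_disk_io_size(const struct extent_map_sketch *em,
				 uint64_t iosize)
{
	return em->compressed ? em->block_len : iosize;
}
```

For a 64K page with 4K blocks and a 64K extent compressed to 4K on disk, read_iosize() gives 64K while old_disk_io_size() gives only 4K, so the page would not be fully covered. For a 4K page over the same kind of extent, both values happen to be 4K, which matches the observation that the old code worked when pagesize == sectorsize.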
Signed-off-by: Goldwyn Rodrigues --- fs/btrfs/extent_io.c | 10 +++------- 1 file changed, 3 insertions(+), 7 deletions(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index a940edb1e64f..64f7f61ce718 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -3162,7 +3162,6 @@ static int __do_readpage(struct page *page, int nr = 0; size_t pg_offset = 0; size_t iosize; - size_t disk_io_size; size_t blocksize = inode->i_sb->s_blocksize; unsigned long this_bio_flag = 0; struct extent_io_tree *tree = &BTRFS_I(inode)->io_tree; @@ -3228,13 +3227,10 @@ static int __do_readpage(struct page *page, iosize = min(extent_map_end(em) - cur, end - cur + 1); cur_end = min(extent_map_end(em) - 1, end); iosize = ALIGN(iosize, blocksize); - if (this_bio_flag & EXTENT_BIO_COMPRESSED) { - disk_io_size = em->block_len; + if (this_bio_flag & EXTENT_BIO_COMPRESSED) offset = em->block_start; - } else { + else offset = em->block_start + extent_offset; - disk_io_size = iosize; - } block_start = em->block_start; if (test_bit(EXTENT_FLAG_PREALLOC, &em->flags)) block_start = EXTENT_MAP_HOLE; @@ -3323,7 +3319,7 @@ static int __do_readpage(struct page *page, } ret = submit_extent_page(REQ_OP_READ | read_flags, NULL, - page, offset, disk_io_size, + page, offset, iosize, pg_offset, bio, end_bio_extent_readpage, mirror_num, *bio_flags, From patchwork Wed Oct 21 06:24:49 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848319 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org 
[198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 38AC9C561F8 for ; Wed, 21 Oct 2020 06:26:11 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id D131422251 for ; Wed, 21 Oct 2020 06:26:10 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="PFMECzF/" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440709AbgJUG0J (ORCPT ); Wed, 21 Oct 2020 02:26:09 -0400 Received: from mx2.suse.de ([195.135.220.15]:42516 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2440706AbgJUG0J (ORCPT ); Wed, 21 Oct 2020 02:26:09 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261567; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=CkscHhzEETPJWamFzUduOECkO2dzdk7UoVoARX8lg0Q=; b=PFMECzF/FJT3UL0KCYpdkwC7X2vM0vhgGlYkUDcZ/voCKstIpK2js1Pp+KgFUPrT/22oFw hE48j2XMiLgeASs8HOAah5kHy0WRy6uLyGWKZQjF9q7lA4I/GiCzwdHWbaa4BcgERhLUJa Fs/EJRFZMzdiN5bLKTD+QVaiMkK2xzw= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 61CFAAC12 for ; Wed, 21 Oct 2020 06:26:07 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 03/68] btrfs: extent_io: fix the comment on lock_extent_buffer_for_io(). Date: Wed, 21 Oct 2020 14:24:49 +0800 Message-Id: <20201021062554.68132-4-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org The return value of that function is completely wrong. 

That function only returns 0 if the extent buffer doesn't need to be
submitted. The "ret = 1" and "ret = 0" are determined by the return value
of "test_and_clear_bit(EXTENT_BUFFER_DIRTY, &eb->bflags)".

If we get ret == 1, it's because the extent buffer is dirty, so we set
its status to EXTENT_BUFFER_WRITEBACK and continue to page locking.

If we get ret == 0, the extent buffer was not dirty to begin with, so we
don't need to write it back.

The caller also follows this: in btree_write_cache_pages(), if
lock_extent_buffer_for_io() returns 0, we just skip the extent buffer
completely.

So the comment is completely wrong.

Since we're here, also change the description a little. The write bio
flushing won't be visible to the caller, thus it's not a major feature.
In the main description, only describe the locking part to make the
point clearer.

Fixes: 2e3c25136adf ("btrfs: extent_io: add proper error handling to lock_extent_buffer_for_io()")
Signed-off-by: Qu Wenruo
---
 fs/btrfs/extent_io.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 64f7f61ce718..a64d88163f3b 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -3688,11 +3688,14 @@ static void end_extent_buffer_writeback(struct extent_buffer *eb)
 }

 /*
- * Lock eb pages and flush the bio if we can't the locks
+ * Lock extent buffer status and pages for write back.
  *
- * Return 0 if nothing went wrong
- * Return >0 is same as 0, except bio is not submitted
- * Return <0 if something went wrong, no page is locked
+ * May try to flush write bio if we can't get the lock.
+ *
+ * Return 0 if the extent buffer doesn't need to be submitted.
+ * (E.g. the extent buffer is not dirty)
+ * Return >0 if the extent buffer is submitted to bio.
+ * Return <0 if something went wrong, no page is locked.
*/ static noinline_for_stack int lock_extent_buffer_for_io(struct extent_buffer *eb, struct extent_page_data *epd) From patchwork Wed Oct 21 06:24:50 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848321 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 46A51C4363A for ; Wed, 21 Oct 2020 06:26:12 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id D562822249 for ; Wed, 21 Oct 2020 06:26:11 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="If+RLOlW" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440712AbgJUG0K (ORCPT ); Wed, 21 Oct 2020 02:26:10 -0400 Received: from mx2.suse.de ([195.135.220.15]:42526 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2440710AbgJUG0K (ORCPT ); Wed, 21 Oct 2020 02:26:10 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261569; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ubi6dxuYF1ruxqVWZUNpXQle0M+Nuln4OPBkB7debVE=; b=If+RLOlWSlk9dT88xR0DBliVfzuP7hlMXRj2MHxFqcAKfC8lSM9u/KzPylnxR9xCsf8qtw 
vXUe2G5PyMPKEzurHsRwr2SKhfejt0EPEfSpF0rNO5c6/OFWJx3VS7fbA0eZVksuANXTzw gUkpLNdclkTI9Z1TTeugDIvYL/qJL2w= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 4F894AC35 for ; Wed, 21 Oct 2020 06:26:09 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 04/68] btrfs: extent_io: update the comment for find_first_extent_bit() Date: Wed, 21 Oct 2020 14:24:50 +0800 Message-Id: <20201021062554.68132-5-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org The pitfall here is, if the parameter @bits has multiple bits set, we will return the first range which just has one of the specified bits set. This is a little tricky if we want an exact match. Anyway, update the comment to inform the callers. Signed-off-by: Qu Wenruo --- fs/btrfs/extent_io.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index a64d88163f3b..2980e8384e74 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -1554,11 +1554,12 @@ find_first_extent_bit_state(struct extent_io_tree *tree, } /* - * find the first offset in the io tree with 'bits' set. zero is - * returned if we find something, and *start_ret and *end_ret are - * set to reflect the state struct that was found. + * Find the first offset in the io tree with one or more @bits set. * - * If nothing was found, 1 is returned. If found something, return 0. + * NOTE: If @bits are multiple bits, any bit of @bits will meet the match. + * + * Return 0 if we find something, and update @start_ret and @end_ret. + * Return 1 if we found nothing. 
*/ int find_first_extent_bit(struct extent_io_tree *tree, u64 start, u64 *start_ret, u64 *end_ret, unsigned bits, From patchwork Wed Oct 21 06:24:51 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848323 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A6F2DC4363A for ; Wed, 21 Oct 2020 06:26:14 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 5C83822249 for ; Wed, 21 Oct 2020 06:26:14 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="HUAo9+Ak" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440715AbgJUG0N (ORCPT ); Wed, 21 Oct 2020 02:26:13 -0400 Received: from mx2.suse.de ([195.135.220.15]:42536 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2440710AbgJUG0N (ORCPT ); Wed, 21 Oct 2020 02:26:13 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261571; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=7PrMMvEhLabVCCzuTsnuZg8Sq7AdfYmwb86BWJMBfN4=; b=HUAo9+AkkYTPWxjx/MKphZxNln/BIAdH2Kn6eBtrlopULnvmkJPZ4P1r8UAO2pP5ng2d69 
WCJsOc2xBFZJFweYcN5ftFNzdn+P8WtTesTmw8T8M09wLrbi3jGNser7020/Lr+/U/PnzD gHEg76cHJP71K2QyBprr2cicZ4YfGAE=
Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 13CAFAC1D for ; Wed, 21 Oct 2020 06:26:11 +0000 (UTC)
From: Qu Wenruo
To: linux-btrfs@vger.kernel.org
Subject: [PATCH v4 05/68] btrfs: extent_io: sink the @failed_start parameter for set_extent_bit()
Date: Wed, 21 Oct 2020 14:24:51 +0800
Message-Id: <20201021062554.68132-6-wqu@suse.com>
X-Mailer: git-send-email 2.28.0
In-Reply-To: <20201021062554.68132-1-wqu@suse.com>
References: <20201021062554.68132-1-wqu@suse.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org

The @failed_start parameter is only paired with @exclusive_bits, and
those parameters are only used for the EXTENT_LOCKED bit, which has its
own wrapper, lock_extent_bits().

Thus for regular set_extent_bit() calls, @failed_start makes no sense,
so just sink the parameter.

Also, since @failed_start and @exclusive_bits are used in pairs, add an
extra assert to make it more obvious.
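
The pairing described above can be sketched in isolation. The following is a toy userspace model, not the kernel code: struct toy_range, the TOY_* bits and the toy_*() functions are hypothetical names for this example, and a single range with one state word stands in for the whole io tree.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_LOCKED 0x1u
#define TOY_DIRTY  0x2u

/* Hypothetical stand-in for one extent state: a start offset and its bits. */
struct toy_range {
	uint64_t start;
	unsigned int state;
};

/* Inner helper: @failed_start is only meaningful with @exclusive_bits. */
static int toy_set_bit_inner(struct toy_range *r, unsigned int bits,
			     unsigned int exclusive_bits,
			     uint64_t *failed_start)
{
	/* Both set or both unset, never one without the other. */
	assert((exclusive_bits != 0) == (failed_start != NULL));

	if (r->state & exclusive_bits) {
		/* Someone already holds an exclusive bit: report where. */
		*failed_start = r->start;
		return -EEXIST;
	}
	r->state |= bits;
	return 0;
}

/* Regular callers never need @failed_start, so the wrapper sinks it. */
static int toy_set_bit(struct toy_range *r, unsigned int bits)
{
	return toy_set_bit_inner(r, bits, 0, NULL);
}

/* Only the lock wrapper passes the exclusive bit and @failed_start. */
static int toy_lock(struct toy_range *r, uint64_t *failed_start)
{
	return toy_set_bit_inner(r, TOY_LOCKED, TOY_LOCKED, failed_start);
}
```

Keeping the extra parameters out of the common wrapper means a regular caller cannot pass a stray @failed_start by accident, and the assert catches any inner call that sets one of the pair without the other.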
Signed-off-by: Qu Wenruo --- fs/btrfs/extent-io-tree.h | 18 ++++++++---------- fs/btrfs/extent_io.c | 12 ++++++++---- fs/btrfs/file.c | 6 +++--- fs/btrfs/inode.c | 3 +-- 4 files changed, 20 insertions(+), 19 deletions(-) diff --git a/fs/btrfs/extent-io-tree.h b/fs/btrfs/extent-io-tree.h index 219a09a2b734..9a60d8426796 100644 --- a/fs/btrfs/extent-io-tree.h +++ b/fs/btrfs/extent-io-tree.h @@ -153,15 +153,15 @@ static inline int clear_extent_bits(struct extent_io_tree *tree, u64 start, int set_record_extent_bits(struct extent_io_tree *tree, u64 start, u64 end, unsigned bits, struct extent_changeset *changeset); int set_extent_bit(struct extent_io_tree *tree, u64 start, u64 end, - unsigned bits, u64 *failed_start, - struct extent_state **cached_state, gfp_t mask); + unsigned bits, struct extent_state **cached_state, + gfp_t mask); int set_extent_bits_nowait(struct extent_io_tree *tree, u64 start, u64 end, unsigned bits); static inline int set_extent_bits(struct extent_io_tree *tree, u64 start, u64 end, unsigned bits) { - return set_extent_bit(tree, start, end, bits, NULL, NULL, GFP_NOFS); + return set_extent_bit(tree, start, end, bits, NULL, GFP_NOFS); } static inline int clear_extent_uptodate(struct extent_io_tree *tree, u64 start, @@ -174,8 +174,7 @@ static inline int clear_extent_uptodate(struct extent_io_tree *tree, u64 start, static inline int set_extent_dirty(struct extent_io_tree *tree, u64 start, u64 end, gfp_t mask) { - return set_extent_bit(tree, start, end, EXTENT_DIRTY, NULL, - NULL, mask); + return set_extent_bit(tree, start, end, EXTENT_DIRTY, NULL, mask); } static inline int clear_extent_dirty(struct extent_io_tree *tree, u64 start, @@ -196,7 +195,7 @@ static inline int set_extent_delalloc(struct extent_io_tree *tree, u64 start, { return set_extent_bit(tree, start, end, EXTENT_DELALLOC | EXTENT_UPTODATE | extra_bits, - NULL, cached_state, GFP_NOFS); + cached_state, GFP_NOFS); } static inline int set_extent_defrag(struct extent_io_tree *tree, u64 start, 
@@ -204,20 +203,19 @@ static inline int set_extent_defrag(struct extent_io_tree *tree, u64 start, { return set_extent_bit(tree, start, end, EXTENT_DELALLOC | EXTENT_UPTODATE | EXTENT_DEFRAG, - NULL, cached_state, GFP_NOFS); + cached_state, GFP_NOFS); } static inline int set_extent_new(struct extent_io_tree *tree, u64 start, u64 end) { - return set_extent_bit(tree, start, end, EXTENT_NEW, NULL, NULL, - GFP_NOFS); + return set_extent_bit(tree, start, end, EXTENT_NEW, NULL, GFP_NOFS); } static inline int set_extent_uptodate(struct extent_io_tree *tree, u64 start, u64 end, struct extent_state **cached_state, gfp_t mask) { - return set_extent_bit(tree, start, end, EXTENT_UPTODATE, NULL, + return set_extent_bit(tree, start, end, EXTENT_UPTODATE, cached_state, mask); } diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 2980e8384e74..ca219c42ddc6 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -980,6 +980,10 @@ __set_extent_bit(struct extent_io_tree *tree, u64 start, u64 end, btrfs_debug_check_extent_io_range(tree, start, end); trace_btrfs_set_extent_bit(tree, start, end - start + 1, bits); + if (exclusive_bits) + ASSERT(failed_start); + else + ASSERT(!failed_start); again: if (!prealloc && gfpflags_allow_blocking(mask)) { /* @@ -1180,11 +1184,11 @@ __set_extent_bit(struct extent_io_tree *tree, u64 start, u64 end, } int set_extent_bit(struct extent_io_tree *tree, u64 start, u64 end, - unsigned bits, u64 * failed_start, - struct extent_state **cached_state, gfp_t mask) + unsigned bits, struct extent_state **cached_state, + gfp_t mask) { - return __set_extent_bit(tree, start, end, bits, 0, failed_start, - cached_state, mask, NULL); + return __set_extent_bit(tree, start, end, bits, 0, NULL, cached_state, + mask, NULL); } diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index 4507c3d09399..d3766d2bb8d6 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -481,8 +481,8 @@ static int btrfs_find_new_delalloc_bytes(struct btrfs_inode *inode, ret = 
set_extent_bit(&inode->io_tree, search_start, search_start + em_len - 1, - EXTENT_DELALLOC_NEW, - NULL, cached_state, GFP_NOFS); + EXTENT_DELALLOC_NEW, cached_state, + GFP_NOFS); next: search_start = extent_map_end(em); free_extent_map(em); @@ -1830,7 +1830,7 @@ static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb, set_extent_bit(&BTRFS_I(inode)->io_tree, lockstart, lockend, EXTENT_NORESERVE, NULL, - NULL, GFP_NOFS); + GFP_NOFS); } btrfs_drop_pages(pages, num_pages); diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 9570458aa847..1d2fe21489ca 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -4619,8 +4619,7 @@ int btrfs_truncate_block(struct inode *inode, loff_t from, loff_t len, if (only_release_metadata) set_extent_bit(&BTRFS_I(inode)->io_tree, block_start, - block_end, EXTENT_NORESERVE, NULL, NULL, - GFP_NOFS); + block_end, EXTENT_NORESERVE, NULL, GFP_NOFS); out_unlock: if (ret) { From patchwork Wed Oct 21 06:24:52 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848315 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E5DB5C561F8 for ; Wed, 21 Oct 2020 06:26:15 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 990E822249 for ; Wed, 21 Oct 2020 06:26:15 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="UQ5Ign3m" Received: 
(majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440717AbgJUG0O (ORCPT ); Wed, 21 Oct 2020 02:26:14 -0400
Received: from mx2.suse.de ([195.135.220.15]:42596 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2440713AbgJUG0O (ORCPT ); Wed, 21 Oct 2020 02:26:14 -0400
X-Virus-Scanned: by amavisd-new at test-mx.suse.de
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261572; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=WyxgVmfSnqy3nIsaxpTFC1EaBL9RBW0/X/z/r0RhDOE=; b=UQ5Ign3mpCFLA5TcCIhdSoxVdIfMgftanMVRZTNpB/XJjAD4TvkZnNp2JU/BA1ohEFzjB7 7ip7sTe1bKo8nJcAHuawQXoenF3v6d+MKnwwU4iKKpR49PkJiERKyDJY0qNphGqV7dt5Sk 1fiDjoj6lGRtQEaqLknZBrAjJRUACR4=
Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id D6D77AC8C for ; Wed, 21 Oct 2020 06:26:12 +0000 (UTC)
From: Qu Wenruo
To: linux-btrfs@vger.kernel.org
Subject: [PATCH v4 06/68] btrfs: make btree inode io_tree has its special owner
Date: Wed, 21 Oct 2020 14:24:52 +0800
Message-Id: <20201021062554.68132-7-wqu@suse.com>
X-Mailer: git-send-email 2.28.0
In-Reply-To: <20201021062554.68132-1-wqu@suse.com>
References: <20201021062554.68132-1-wqu@suse.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org

The btree inode's io_tree is pretty special compared to every other inode
extent io tree: although it has a btrfs inode, it doesn't have the
track_uptodate bit set to true, and it never has ordered extents.

Since it's so special, add a new owner value for it to make debugging a
little easier.
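
A new owner value has to be added in two places: the enum and the table that maps it to a name for tracing. As a hedged aside, the sketch below shows one generic way to keep such a pair in sync in standalone C, using an X-macro. It is not how the kernel trace header is written; IO_TREE_OWNERS, io_tree_owner_name() and IO_TREE_OWNER_COUNT are hypothetical names for this example, and only a subset of the owners is listed.

```c
/* Single list of owners; the enum and the name table both derive from it. */
#define IO_TREE_OWNERS(X)		\
	X(IO_TREE_FS_PINNED_EXTENTS)	\
	X(IO_TREE_FS_EXCLUDED_EXTENTS)	\
	X(IO_TREE_BTREE_INODE_IO)	\
	X(IO_TREE_INODE_IO)

enum io_tree_owner {
#define X(name) name,
	IO_TREE_OWNERS(X)
#undef X
	IO_TREE_OWNER_COUNT	/* number of owners, not an owner itself */
};

/* Map an owner value to a printable name for debug output. */
static const char *io_tree_owner_name(enum io_tree_owner owner)
{
	static const char *const names[] = {
#define X(name) #name,
		IO_TREE_OWNERS(X)
#undef X
	};

	if ((unsigned int)owner >= IO_TREE_OWNER_COUNT)
		return "UNKNOWN";
	return names[owner];
}
```

Because the names are generated from the same list as the enum, a debug line for the btree inode's tree prints a label that is distinct from the one for regular inodes, and the two definitions cannot drift apart.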
Signed-off-by: Qu Wenruo --- fs/btrfs/disk-io.c | 2 +- fs/btrfs/extent-io-tree.h | 1 + include/trace/events/btrfs.h | 1 + 3 files changed, 3 insertions(+), 1 deletion(-) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index f6bba7eb1fa1..be6edbd34934 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -2116,7 +2116,7 @@ static void btrfs_init_btree_inode(struct btrfs_fs_info *fs_info) RB_CLEAR_NODE(&BTRFS_I(inode)->rb_node); extent_io_tree_init(fs_info, &BTRFS_I(inode)->io_tree, - IO_TREE_INODE_IO, inode); + IO_TREE_BTREE_INODE_IO, inode); BTRFS_I(inode)->io_tree.track_uptodate = false; extent_map_tree_init(&BTRFS_I(inode)->extent_tree); diff --git a/fs/btrfs/extent-io-tree.h b/fs/btrfs/extent-io-tree.h index 9a60d8426796..92caa1190ca8 100644 --- a/fs/btrfs/extent-io-tree.h +++ b/fs/btrfs/extent-io-tree.h @@ -40,6 +40,7 @@ struct io_failure_record; enum { IO_TREE_FS_PINNED_EXTENTS, IO_TREE_FS_EXCLUDED_EXTENTS, + IO_TREE_BTREE_INODE_IO, IO_TREE_INODE_IO, IO_TREE_INODE_IO_FAILURE, IO_TREE_RELOC_BLOCKS, diff --git a/include/trace/events/btrfs.h b/include/trace/events/btrfs.h index 863335ecb7e8..89397605e465 100644 --- a/include/trace/events/btrfs.h +++ b/include/trace/events/btrfs.h @@ -79,6 +79,7 @@ struct btrfs_space_info; #define IO_TREE_OWNER \ EM( IO_TREE_FS_PINNED_EXTENTS, "PINNED_EXTENTS") \ EM( IO_TREE_FS_EXCLUDED_EXTENTS, "EXCLUDED_EXTENTS") \ + EM( IO_TREE_BTREE_INODE_IO, "BTRFS_INODE_IO") \ EM( IO_TREE_INODE_IO, "INODE_IO") \ EM( IO_TREE_INODE_IO_FAILURE, "INODE_IO_FAILURE") \ EM( IO_TREE_RELOC_BLOCKS, "RELOC_BLOCKS") \ From patchwork Wed Oct 21 06:24:53 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848329 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, 
DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id EAFCAC4363A for ; Wed, 21 Oct 2020 06:26:18 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 8498F22249 for ; Wed, 21 Oct 2020 06:26:18 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="EhbnpHn6" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440720AbgJUG0R (ORCPT ); Wed, 21 Oct 2020 02:26:17 -0400 Received: from mx2.suse.de ([195.135.220.15]:42612 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2440718AbgJUG0R (ORCPT ); Wed, 21 Oct 2020 02:26:17 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261575; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=/984dPw0TZ6e0w9hf/o8ydoO/aSED0BXaz0rRgHbCy4=; b=EhbnpHn6EZvAjljp2SFqMYjgiPR95Bvp/sdbP73Ma2CjD7tqeX2sRk9AeNhAqqsVC0r8J3 SpnRIaG5MiYPVSPFQWmMCaV9PKoeH8haEZRNUh4XcwqY+Bx6M+0pU2z6OiEpcfDp0qURQM wmGo829mtjRusOQrIFtzc4YPsAqoz5A= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 44036AC1D for ; Wed, 21 Oct 2020 06:26:15 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 07/68] btrfs: disk-io: replace @fs_info and @private_data with @inode for btrfs_wq_submit_bio() Date: Wed, 21 Oct 2020 14:24:53 +0800 Message-Id: <20201021062554.68132-8-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: 
<20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org All callers of btrfs_wq_submit_bio() pass a struct inode as @private_data, so there is no need for @private_data to be (void *); just replace it with "struct inode *inode". Since we can extract fs_info from struct inode, also remove the @fs_info parameter. While we're here, also replace all the (void *private_data) with (struct inode *inode). Signed-off-by: Qu Wenruo Reviewed-by: Goldwyn Rodrigues --- fs/btrfs/disk-io.c | 21 +++++++++++---------- fs/btrfs/disk-io.h | 8 ++++---- fs/btrfs/extent_io.h | 2 +- fs/btrfs/inode.c | 21 +++++++++------------ 4 files changed, 25 insertions(+), 27 deletions(-) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index be6edbd34934..b7436ab7bba9 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -110,7 +110,7 @@ static void btrfs_free_csum_hash(struct btrfs_fs_info *fs_info) * just before they are sent down the IO stack. 
*/ struct async_submit_bio { - void *private_data; + struct inode *inode; struct bio *bio; extent_submit_bio_start_t *submit_bio_start; int mirror_num; @@ -746,7 +746,7 @@ static void run_one_async_start(struct btrfs_work *work) blk_status_t ret; async = container_of(work, struct async_submit_bio, work); - ret = async->submit_bio_start(async->private_data, async->bio, + ret = async->submit_bio_start(async->inode, async->bio, async->bio_offset); if (ret) async->status = ret; @@ -767,7 +767,7 @@ static void run_one_async_done(struct btrfs_work *work) blk_status_t ret; async = container_of(work, struct async_submit_bio, work); - inode = async->private_data; + inode = async->inode; /* If an error occurred we just want to clean up the bio and move on */ if (async->status) { @@ -797,18 +797,19 @@ static void run_one_async_free(struct btrfs_work *work) kfree(async); } -blk_status_t btrfs_wq_submit_bio(struct btrfs_fs_info *fs_info, struct bio *bio, +blk_status_t btrfs_wq_submit_bio(struct inode *inode, struct bio *bio, int mirror_num, unsigned long bio_flags, - u64 bio_offset, void *private_data, + u64 bio_offset, extent_submit_bio_start_t *submit_bio_start) { + struct btrfs_fs_info *fs_info = BTRFS_I(inode)->root->fs_info; struct async_submit_bio *async; async = kmalloc(sizeof(*async), GFP_NOFS); if (!async) return BLK_STS_RESOURCE; - async->private_data = private_data; + async->inode = inode; async->bio = bio; async->mirror_num = mirror_num; async->submit_bio_start = submit_bio_start; @@ -845,8 +846,8 @@ static blk_status_t btree_csum_one_bio(struct bio *bio) return errno_to_blk_status(ret); } -static blk_status_t btree_submit_bio_start(void *private_data, struct bio *bio, - u64 bio_offset) +static blk_status_t btree_submit_bio_start(struct inode *inode, struct bio *bio, + u64 bio_offset) { /* * when we're called for a write, we're already in the async @@ -893,8 +894,8 @@ static blk_status_t btree_submit_bio_hook(struct inode *inode, struct bio *bio, * kthread helpers 
are used to submit writes so that * checksumming can happen in parallel across all CPUs */ - ret = btrfs_wq_submit_bio(fs_info, bio, mirror_num, 0, - 0, inode, btree_submit_bio_start); + ret = btrfs_wq_submit_bio(inode, bio, mirror_num, 0, + 0, btree_submit_bio_start); } if (ret) diff --git a/fs/btrfs/disk-io.h b/fs/btrfs/disk-io.h index 00dc39d47ed3..2d564e9223e2 100644 --- a/fs/btrfs/disk-io.h +++ b/fs/btrfs/disk-io.h @@ -105,10 +105,10 @@ int btrfs_read_buffer(struct extent_buffer *buf, u64 parent_transid, int level, struct btrfs_key *first_key); blk_status_t btrfs_bio_wq_end_io(struct btrfs_fs_info *info, struct bio *bio, enum btrfs_wq_endio_type metadata); -blk_status_t btrfs_wq_submit_bio(struct btrfs_fs_info *fs_info, struct bio *bio, - int mirror_num, unsigned long bio_flags, - u64 bio_offset, void *private_data, - extent_submit_bio_start_t *submit_bio_start); +blk_status_t btrfs_wq_submit_bio(struct inode *inode, struct bio *bio, + int mirror_num, unsigned long bio_flags, + u64 bio_offset, + extent_submit_bio_start_t *submit_bio_start); blk_status_t btrfs_submit_bio_done(void *private_data, struct bio *bio, int mirror_num); int btrfs_init_log_root_tree(struct btrfs_trans_handle *trans, diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h index 30794ae58498..3c9252b429e0 100644 --- a/fs/btrfs/extent_io.h +++ b/fs/btrfs/extent_io.h @@ -71,7 +71,7 @@ typedef blk_status_t (submit_bio_hook_t)(struct inode *inode, struct bio *bio, int mirror_num, unsigned long bio_flags); -typedef blk_status_t (extent_submit_bio_start_t)(void *private_data, +typedef blk_status_t (extent_submit_bio_start_t)(struct inode *inode, struct bio *bio, u64 bio_offset); struct extent_io_ops { diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 1d2fe21489ca..2a56d3b8eff4 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -2157,11 +2157,9 @@ int btrfs_bio_fits_in_stripe(struct page *page, size_t size, struct bio *bio, * At IO completion time the cums attached on the ordered 
extent record * are inserted into the btree */ -static blk_status_t btrfs_submit_bio_start(void *private_data, struct bio *bio, - u64 bio_offset) +static blk_status_t btrfs_submit_bio_start(struct inode *inode, struct bio *bio, + u64 bio_offset) { - struct inode *inode = private_data; - return btrfs_csum_one_bio(BTRFS_I(inode), bio, 0, 0); } @@ -2221,8 +2219,8 @@ static blk_status_t btrfs_submit_bio_hook(struct inode *inode, struct bio *bio, if (root->root_key.objectid == BTRFS_DATA_RELOC_TREE_OBJECTID) goto mapit; /* we're doing a write, do the async checksumming */ - ret = btrfs_wq_submit_bio(fs_info, bio, mirror_num, bio_flags, - 0, inode, btrfs_submit_bio_start); + ret = btrfs_wq_submit_bio(inode, bio, mirror_num, bio_flags, + 0, btrfs_submit_bio_start); goto out; } else if (!skip_sum) { ret = btrfs_csum_one_bio(BTRFS_I(inode), bio, 0, 0); @@ -7615,11 +7613,10 @@ static void __endio_write_update_ordered(struct btrfs_inode *inode, } } -static blk_status_t btrfs_submit_bio_start_direct_io(void *private_data, - struct bio *bio, u64 offset) +static blk_status_t btrfs_submit_bio_start_direct_io(struct inode *inode, + struct bio *bio, + u64 offset) { - struct inode *inode = private_data; - return btrfs_csum_one_bio(BTRFS_I(inode), bio, offset, 1); } @@ -7670,8 +7667,8 @@ static inline blk_status_t btrfs_submit_dio_bio(struct bio *bio, goto map; if (write && async_submit) { - ret = btrfs_wq_submit_bio(fs_info, bio, 0, 0, - file_offset, inode, + ret = btrfs_wq_submit_bio(inode, bio, 0, 0, + file_offset, btrfs_submit_bio_start_direct_io); goto err; } else if (write) { From patchwork Wed Oct 21 06:24:54 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848313 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, 
DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9DEA7C561F8 for ; Wed, 21 Oct 2020 06:26:21 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 38C3222249 for ; Wed, 21 Oct 2020 06:26:21 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="Gk0Do8xh" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440724AbgJUG0U (ORCPT ); Wed, 21 Oct 2020 02:26:20 -0400 Received: from mx2.suse.de ([195.135.220.15]:42666 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2440718AbgJUG0U (ORCPT ); Wed, 21 Oct 2020 02:26:20 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261578; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=XahuQLSM/YwpnVxOSMLRABkqOmE0io4O3KhBLzNqUoY=; b=Gk0Do8xhQWZGBpv1sNEmMlXMrvtEvz/KUdt+ff2EUn7S+xJha+f/m1gg/n5v5WNLABAGFv F3VxL/KybaQLV4ksmGFpvQ3sh1ymTZcbJXgnVgwy5lQf5YJgSXBnfKeds/Vgv/H0YfKdEl UfxfCHkBcRipurtbG0llkwRC38+M64M= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 3A41EAC1D for ; Wed, 21 Oct 2020 06:26:18 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 08/68] btrfs: inode: sink parameter @start and @len for check_data_csum() Date: Wed, 21 Oct 2020 14:24:54 +0800 Message-Id: <20201021062554.68132-9-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: 
<20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org For check_data_csum(), the page we're using comes directly from the inode mapping, thus it has a valid page_offset(). We can use (page_offset() + pgoff) to replace the @start parameter completely, while @len should always be sectorsize. While we're here, also add some comments, as there is quite some confusion around words like start/offset, which don't say whether they mean file_offset or logical bytenr. This should not affect the existing behavior, as for the current sectorsize == PAGE_SIZE case, @pgoff should always be 0, and @len is always PAGE_SIZE (or sectorsize from the dio read path). Signed-off-by: Qu Wenruo Reviewed-by: Goldwyn Rodrigues --- fs/btrfs/inode.c | 27 +++++++++++++++++++-------- 1 file changed, 19 insertions(+), 8 deletions(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 2a56d3b8eff4..24fbf2c46e56 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -2791,17 +2791,30 @@ void btrfs_writepage_endio_finish_ordered(struct page *page, u64 start, btrfs_queue_work(wq, &ordered_extent->work); } +/* + * Verify the checksum of one sector of uncompressed data. + * + * @inode: The inode. + * @io_bio: The btrfs_io_bio which contains the csum. + * @icsum: The csum offset (by number of sectors). + * @page: The page where the data to be verified is. + * @pgoff: The offset inside the page. + * + * The length of such check is always one sector size. 
+ */ static int check_data_csum(struct inode *inode, struct btrfs_io_bio *io_bio, - int icsum, struct page *page, int pgoff, u64 start, - size_t len) + int icsum, struct page *page, int pgoff) { struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); SHASH_DESC_ON_STACK(shash, fs_info->csum_shash); char *kaddr; + u32 len = fs_info->sectorsize; u16 csum_size = btrfs_super_csum_size(fs_info->super_copy); u8 *csum_expected; u8 csum[BTRFS_CSUM_SIZE]; + ASSERT(pgoff + len <= PAGE_SIZE); + csum_expected = ((u8 *)io_bio->csum) + icsum * csum_size; kaddr = kmap_atomic(page); @@ -2815,8 +2828,8 @@ static int check_data_csum(struct inode *inode, struct btrfs_io_bio *io_bio, kunmap_atomic(kaddr); return 0; zeroit: - btrfs_print_data_csum_error(BTRFS_I(inode), start, csum, csum_expected, - io_bio->mirror_num); + btrfs_print_data_csum_error(BTRFS_I(inode), page_offset(page) + pgoff, + csum, csum_expected, io_bio->mirror_num); if (io_bio->device) btrfs_dev_stat_inc_and_print(io_bio->device, BTRFS_DEV_STAT_CORRUPTION_ERRS); @@ -2855,8 +2868,7 @@ static int btrfs_readpage_end_io_hook(struct btrfs_io_bio *io_bio, } phy_offset >>= inode->i_sb->s_blocksize_bits; - return check_data_csum(inode, io_bio, phy_offset, page, offset, start, - (size_t)(end - start + 1)); + return check_data_csum(inode, io_bio, phy_offset, page, offset); } /* @@ -7542,8 +7554,7 @@ static blk_status_t btrfs_check_read_dio_bio(struct inode *inode, ASSERT(pgoff < PAGE_SIZE); if (uptodate && (!csum || !check_data_csum(inode, io_bio, icsum, - bvec.bv_page, pgoff, - start, sectorsize))) { + bvec.bv_page, pgoff))) { clean_io_failure(fs_info, failure_tree, io_tree, start, bvec.bv_page, btrfs_ino(BTRFS_I(inode)), From patchwork Wed Oct 21 06:24:55 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848325 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org 
X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 537DEC56201 for ; Wed, 21 Oct 2020 06:26:23 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 030DD22249 for ; Wed, 21 Oct 2020 06:26:22 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="GXf9blHI" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440727AbgJUG0W (ORCPT ); Wed, 21 Oct 2020 02:26:22 -0400 Received: from mx2.suse.de ([195.135.220.15]:42696 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2440723AbgJUG0V (ORCPT ); Wed, 21 Oct 2020 02:26:21 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261579; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=XugAJXTVbcMduxwoky4Xcurh78SnufiC6dAqK5DT3wY=; b=GXf9blHICbI2b7RNXRSNZdkjQxI1t8An3LqJw807UXFGUjgsKayizBrb14/PaG5irLAxcZ sV0er9Zgk9N3DAXhpxQ8tWxhnrvrmkGBXLfD4Zg2yhCz1P1oEtfv+poCd+ZLzkbPGkOmap FF54x407iIZvl9LT33OiFBLWEOcB65I= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id CFF12AC48 for ; Wed, 21 Oct 2020 06:26:19 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 09/68] btrfs: extent_io: unexport extent_invalidatepage() Date: Wed, 21 Oct 2020 14:24:55 +0800 Message-Id: 
<20201021062554.68132-10-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Function extent_invalidatepage() has a single caller, btree_invalidatepage(). Just unexport this function and move it to disk-io.c. Since it always returns 0, also change its return type to void. Signed-off-by: Qu Wenruo --- fs/btrfs/disk-io.c | 23 +++++++++++++++++++++++ fs/btrfs/extent-io-tree.h | 2 -- fs/btrfs/extent_io.c | 24 ------------------------ 3 files changed, 23 insertions(+), 26 deletions(-) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index b7436ab7bba9..c81b7e53149c 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -966,6 +966,29 @@ static int btree_releasepage(struct page *page, gfp_t gfp_flags) return try_release_extent_buffer(page); } +/* + * basic invalidatepage code, this waits on any locked or writeback + * ranges corresponding to the page, and then deletes any extent state + * records from the tree + */ +static void extent_invalidatepage(struct extent_io_tree *tree, + struct page *page, unsigned long offset) +{ + struct extent_state *cached_state = NULL; + u64 start = page_offset(page); + u64 end = start + PAGE_SIZE - 1; + size_t blocksize = page->mapping->host->i_sb->s_blocksize; + + start += ALIGN(offset, blocksize); + if (start > end) + return; + + lock_extent_bits(tree, start, end, &cached_state); + wait_on_page_writeback(page); + clear_extent_bit(tree, start, end, EXTENT_LOCKED | EXTENT_DELALLOC | + EXTENT_DO_ACCOUNTING, 1, 1, &cached_state); +} + static void btree_invalidatepage(struct page *page, unsigned int offset, unsigned int length) { diff --git a/fs/btrfs/extent-io-tree.h b/fs/btrfs/extent-io-tree.h index 92caa1190ca8..3aaf83376797 100644 --- a/fs/btrfs/extent-io-tree.h +++ b/fs/btrfs/extent-io-tree.h @@ -227,8 +227,6 @@ void find_first_clear_extent_bit(struct extent_io_tree *tree, u64 start, u64 *start_ret, u64 *end_ret, 
unsigned bits); int find_contiguous_extent_bit(struct extent_io_tree *tree, u64 start, u64 *start_ret, u64 *end_ret, unsigned bits); -int extent_invalidatepage(struct extent_io_tree *tree, - struct page *page, unsigned long offset); bool btrfs_find_delalloc_range(struct extent_io_tree *tree, u64 *start, u64 *end, u64 max_bytes, struct extent_state **cached_state); diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index ca219c42ddc6..3f95c67f0c92 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -4409,30 +4409,6 @@ void extent_readahead(struct readahead_control *rac) } } -/* - * basic invalidatepage code, this waits on any locked or writeback - * ranges corresponding to the page, and then deletes any extent state - * records from the tree - */ -int extent_invalidatepage(struct extent_io_tree *tree, - struct page *page, unsigned long offset) -{ - struct extent_state *cached_state = NULL; - u64 start = page_offset(page); - u64 end = start + PAGE_SIZE - 1; - size_t blocksize = page->mapping->host->i_sb->s_blocksize; - - start += ALIGN(offset, blocksize); - if (start > end) - return 0; - - lock_extent_bits(tree, start, end, &cached_state); - wait_on_page_writeback(page); - clear_extent_bit(tree, start, end, EXTENT_LOCKED | EXTENT_DELALLOC | - EXTENT_DO_ACCOUNTING, 1, 1, &cached_state); - return 0; -} - /* * a helper for releasepage, this tests for areas of the page that * are locked or under IO and drops the related state bits if it is safe From patchwork Wed Oct 21 06:24:56 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848327 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, 
MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id BEDE3C561F8 for ; Wed, 21 Oct 2020 06:26:24 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 71D0022249 for ; Wed, 21 Oct 2020 06:26:24 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="ZmQGzxZu" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440729AbgJUG0X (ORCPT ); Wed, 21 Oct 2020 02:26:23 -0400 Received: from mx2.suse.de ([195.135.220.15]:42716 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2440725AbgJUG0X (ORCPT ); Wed, 21 Oct 2020 02:26:23 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261581; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=MkJnHkjRGv5xVQjiOoPJ4yNmrnt5iUJgNIk6BvQvbns=; b=ZmQGzxZuRbAfnQDZRW2wxgrUE+L5Or+tD8wjC8R+9+cnxSUsZql2UePxj1smk8TATPpYIe pdiNjS2Rq/TYAZtG4pXJXv3vpzlxbUfkUDxvGJi8iTneq3shaZCMNHGj6hTae7GwbTEsfC a/3GGR9w7gVGMpc84nhRjaHO9UCP5vw= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 8ED1BAC1D for ; Wed, 21 Oct 2020 06:26:21 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 10/68] btrfs: extent_io: remove the forward declaration and rename __process_pages_contig Date: Wed, 21 Oct 2020 14:24:56 +0800 Message-Id: <20201021062554.68132-11-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: 
<20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org There is no need for a forward declaration of __process_pages_contig(), so move the function before its first caller. While we are here, also remove the "__" prefix, since it carries no special meaning. Signed-off-by: Qu Wenruo --- fs/btrfs/extent_io.c | 180 +++++++++++++++++++++++-------------------- 1 file changed, 95 insertions(+), 85 deletions(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 3f95c67f0c92..d5e03977c9c8 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -1814,10 +1814,98 @@ bool btrfs_find_delalloc_range(struct extent_io_tree *tree, u64 *start, return found; } -static int __process_pages_contig(struct address_space *mapping, - struct page *locked_page, - pgoff_t start_index, pgoff_t end_index, - unsigned long page_ops, pgoff_t *index_ret); +/* + * A helper to update contiguous pages status according to @page_ops. + * + * @mapping: The address space of the pages + * @locked_page: The already locked page. Mostly for inline extent + * handling + * @start_index: The start page index. + * @end_index: The last page index. + * @page_ops: The operations to be done + * @index_ret: The last handled page index (for error case) + * + * Return 0 if every page is handled properly. + * Return <0 if something went wrong, and update @index_ret. 
+ */ +static int process_pages_contig(struct address_space *mapping, + struct page *locked_page, + pgoff_t start_index, pgoff_t end_index, + unsigned long page_ops, pgoff_t *index_ret) +{ + unsigned long nr_pages = end_index - start_index + 1; + unsigned long pages_locked = 0; + pgoff_t index = start_index; + struct page *pages[16]; + unsigned ret; + int err = 0; + int i; + + if (page_ops & PAGE_LOCK) { + ASSERT(page_ops == PAGE_LOCK); + ASSERT(index_ret && *index_ret == start_index); + } + + if ((page_ops & PAGE_SET_ERROR) && nr_pages > 0) + mapping_set_error(mapping, -EIO); + + while (nr_pages > 0) { + ret = find_get_pages_contig(mapping, index, + min_t(unsigned long, + nr_pages, ARRAY_SIZE(pages)), pages); + if (ret == 0) { + /* + * Only if we're going to lock these pages, + * can we find nothing at @index. + */ + ASSERT(page_ops & PAGE_LOCK); + err = -EAGAIN; + goto out; + } + + for (i = 0; i < ret; i++) { + if (page_ops & PAGE_SET_PRIVATE2) + SetPagePrivate2(pages[i]); + + if (locked_page && pages[i] == locked_page) { + put_page(pages[i]); + pages_locked++; + continue; + } + if (page_ops & PAGE_CLEAR_DIRTY) + clear_page_dirty_for_io(pages[i]); + if (page_ops & PAGE_SET_WRITEBACK) + set_page_writeback(pages[i]); + if (page_ops & PAGE_SET_ERROR) + SetPageError(pages[i]); + if (page_ops & PAGE_END_WRITEBACK) + end_page_writeback(pages[i]); + if (page_ops & PAGE_UNLOCK) + unlock_page(pages[i]); + if (page_ops & PAGE_LOCK) { + lock_page(pages[i]); + if (!PageDirty(pages[i]) || + pages[i]->mapping != mapping) { + unlock_page(pages[i]); + for (; i < ret; i++) + put_page(pages[i]); + err = -EAGAIN; + goto out; + } + } + put_page(pages[i]); + pages_locked++; + } + nr_pages -= ret; + index += ret; + cond_resched(); + } +out: + if (err && index_ret) + *index_ret = start_index + pages_locked - 1; + return err; +} + static noinline void __unlock_for_delalloc(struct inode *inode, struct page *locked_page, @@ -1830,7 +1918,7 @@ static noinline void 
__unlock_for_delalloc(struct inode *inode, if (index == locked_page->index && end_index == index) return; - __process_pages_contig(inode->i_mapping, locked_page, index, end_index, + process_pages_contig(inode->i_mapping, locked_page, index, end_index, PAGE_UNLOCK, NULL); } @@ -1848,7 +1936,7 @@ static noinline int lock_delalloc_pages(struct inode *inode, if (index == locked_page->index && index == end_index) return 0; - ret = __process_pages_contig(inode->i_mapping, locked_page, index, + ret = process_pages_contig(inode->i_mapping, locked_page, index, end_index, PAGE_LOCK, &index_ret); if (ret == -EAGAIN) __unlock_for_delalloc(inode, locked_page, delalloc_start, @@ -1945,84 +2033,6 @@ noinline_for_stack bool find_lock_delalloc_range(struct inode *inode, return found; } -static int __process_pages_contig(struct address_space *mapping, - struct page *locked_page, - pgoff_t start_index, pgoff_t end_index, - unsigned long page_ops, pgoff_t *index_ret) -{ - unsigned long nr_pages = end_index - start_index + 1; - unsigned long pages_locked = 0; - pgoff_t index = start_index; - struct page *pages[16]; - unsigned ret; - int err = 0; - int i; - - if (page_ops & PAGE_LOCK) { - ASSERT(page_ops == PAGE_LOCK); - ASSERT(index_ret && *index_ret == start_index); - } - - if ((page_ops & PAGE_SET_ERROR) && nr_pages > 0) - mapping_set_error(mapping, -EIO); - - while (nr_pages > 0) { - ret = find_get_pages_contig(mapping, index, - min_t(unsigned long, - nr_pages, ARRAY_SIZE(pages)), pages); - if (ret == 0) { - /* - * Only if we're going to lock these pages, - * can we find nothing at @index. 
- */ - ASSERT(page_ops & PAGE_LOCK); - err = -EAGAIN; - goto out; - } - - for (i = 0; i < ret; i++) { - if (page_ops & PAGE_SET_PRIVATE2) - SetPagePrivate2(pages[i]); - - if (locked_page && pages[i] == locked_page) { - put_page(pages[i]); - pages_locked++; - continue; - } - if (page_ops & PAGE_CLEAR_DIRTY) - clear_page_dirty_for_io(pages[i]); - if (page_ops & PAGE_SET_WRITEBACK) - set_page_writeback(pages[i]); - if (page_ops & PAGE_SET_ERROR) - SetPageError(pages[i]); - if (page_ops & PAGE_END_WRITEBACK) - end_page_writeback(pages[i]); - if (page_ops & PAGE_UNLOCK) - unlock_page(pages[i]); - if (page_ops & PAGE_LOCK) { - lock_page(pages[i]); - if (!PageDirty(pages[i]) || - pages[i]->mapping != mapping) { - unlock_page(pages[i]); - for (; i < ret; i++) - put_page(pages[i]); - err = -EAGAIN; - goto out; - } - } - put_page(pages[i]); - pages_locked++; - } - nr_pages -= ret; - index += ret; - cond_resched(); - } -out: - if (err && index_ret) - *index_ret = start_index + pages_locked - 1; - return err; -} - void extent_clear_unlock_delalloc(struct btrfs_inode *inode, u64 start, u64 end, struct page *locked_page, unsigned clear_bits, @@ -2030,7 +2040,7 @@ void extent_clear_unlock_delalloc(struct btrfs_inode *inode, u64 start, u64 end, { clear_extent_bit(&inode->io_tree, start, end, clear_bits, 1, 0, NULL); - __process_pages_contig(inode->vfs_inode.i_mapping, locked_page, + process_pages_contig(inode->vfs_inode.i_mapping, locked_page, start >> PAGE_SHIFT, end >> PAGE_SHIFT, page_ops, NULL); } From patchwork Wed Oct 21 06:24:57 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848331 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, 
MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 95D20C561F8 for ; Wed, 21 Oct 2020 06:26:27 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 338B222249 for ; Wed, 21 Oct 2020 06:26:27 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="DzQ/MA81" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440732AbgJUG0Y (ORCPT ); Wed, 21 Oct 2020 02:26:24 -0400 Received: from mx2.suse.de ([195.135.220.15]:42762 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2440725AbgJUG0Y (ORCPT ); Wed, 21 Oct 2020 02:26:24 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261583; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=wfoeGy0ClMq40LYE9sYe+LOw8Am2amNkICcMqUWZmlc=; b=DzQ/MA81u50p5msaboFhNKyrTArRizl4irI3PmnVKXnEhM4zs5JcC+ZKErISpmkSZw3e2+ xN6PhZY44QDIVuMLJ2lUrvREjRMhyyZ46yzODl7HS5Yretd456vTarxxogy6N5JWdJGi9i mFpCDjuou3eAups7oQYRaHT8buUx9yE= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 545CDAC35 for ; Wed, 21 Oct 2020 06:26:23 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 11/68] btrfs: extent_io: rename pages_locked in process_pages_contig() Date: Wed, 21 Oct 2020 14:24:57 +0800 Message-Id: <20201021062554.68132-12-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> 
MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Function process_pages_contig() does not only handle page locking but also other operations. So rename the local variable pages_locked to pages_processed to reduce confusion. Signed-off-by: Qu Wenruo --- fs/btrfs/extent_io.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index d5e03977c9c8..f20b8e886724 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -1834,7 +1834,7 @@ static int process_pages_contig(struct address_space *mapping, unsigned long page_ops, pgoff_t *index_ret) { unsigned long nr_pages = end_index - start_index + 1; - unsigned long pages_locked = 0; + unsigned long pages_processed = 0; pgoff_t index = start_index; struct page *pages[16]; unsigned ret; @@ -1869,7 +1869,7 @@ static int process_pages_contig(struct address_space *mapping, if (locked_page && pages[i] == locked_page) { put_page(pages[i]); - pages_locked++; + pages_processed++; continue; } if (page_ops & PAGE_CLEAR_DIRTY) @@ -1894,7 +1894,7 @@ static int process_pages_contig(struct address_space *mapping, } } put_page(pages[i]); - pages_locked++; + pages_processed++; } nr_pages -= ret; index += ret; @@ -1902,7 +1902,7 @@ static int process_pages_contig(struct address_space *mapping, } out: if (err && index_ret) - *index_ret = start_index + pages_locked - 1; + *index_ret = start_index + pages_processed - 1; return err; } From patchwork Wed Oct 21 06:24:58 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848337 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, 
MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B0EE4C561F8 for ; Wed, 21 Oct 2020 06:26:29 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 482FB22249 for ; Wed, 21 Oct 2020 06:26:29 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="BZMbBVoV" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440735AbgJUG02 (ORCPT ); Wed, 21 Oct 2020 02:26:28 -0400 Received: from mx2.suse.de ([195.135.220.15]:42782 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2440725AbgJUG01 (ORCPT ); Wed, 21 Oct 2020 02:26:27 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261585; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=5Sct496M1vc0JrFl6TRtnbRDLnUxfT7e2s6vXQodJyM=; b=BZMbBVoVNKlM12CjESGHU2KOptCpgiOlHZ6Yv7n2+g8yZtFUlc0qjl+wMdag9/v9buBydv dAxe0IZN6E4CbPAaTlb4uGHssi1PkP3SY52g2EXErdctbjMB4LZQneDrSAH8rrbkN80SdJ NN/YICu91XoKShbZPIDPqKWTORXALPk= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 70F34AC1D for ; Wed, 21 Oct 2020 06:26:25 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 12/68] btrfs: extent_io: only require sector size alignment for page read Date: Wed, 21 Oct 2020 14:24:58 +0800 Message-Id: <20201021062554.68132-13-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> 
MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org If we read a partial page, btrfs warns about it, as our reads/writes are always done in sector size, which currently equals page size. But for the incoming subpage RO support, our data read is only aligned to sectorsize, which can be smaller than page size. So change the warning condition to check against sectorsize. The behavior is unchanged for the regular sectorsize == PAGE_SIZE case, and no error is reported for subpage reads. Also, pass the proper start/end with bv_offset for check_data_csum() to handle. Signed-off-by: Qu Wenruo --- fs/btrfs/extent_io.c | 34 ++++++++++++++++++---------------- 1 file changed, 18 insertions(+), 16 deletions(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index f20b8e886724..ce5b23169e47 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -2834,6 +2834,7 @@ static void end_bio_extent_readpage(struct bio *bio) struct page *page = bvec->bv_page; struct inode *inode = page->mapping->host; struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); + u32 sectorsize = fs_info->sectorsize; bool data_inode = btrfs_ino(BTRFS_I(inode)) != BTRFS_BTREE_INODE_OBJECTID; @@ -2844,24 +2845,25 @@ static void end_bio_extent_readpage(struct bio *bio) tree = &BTRFS_I(inode)->io_tree; failure_tree = &BTRFS_I(inode)->io_failure_tree; - /* We always issue full-page reads, but if some block + /* + * We always issue full-sector reads, but if some block * in a page fails to read, blk_update_request() will * advance bv_offset and adjust bv_len to compensate. - * Print a warning for nonzero offsets, and an error - * if they don't add up to a full page. 
*/ - if (bvec->bv_offset || bvec->bv_len != PAGE_SIZE) { - if (bvec->bv_offset + bvec->bv_len != PAGE_SIZE) - btrfs_err(fs_info, - "partial page read in btrfs with offset %u and length %u", - bvec->bv_offset, bvec->bv_len); - else - btrfs_info(fs_info, - "incomplete page read in btrfs with offset %u and length %u", - bvec->bv_offset, bvec->bv_len); - } - - start = page_offset(page); - end = start + bvec->bv_offset + bvec->bv_len - 1; + * Print a warning for unaligned offsets, and an error + * if they don't add up to a full sector. + */ + if (!IS_ALIGNED(bvec->bv_offset, sectorsize)) + btrfs_err(fs_info, + "partial page read in btrfs with offset %u and length %u", + bvec->bv_offset, bvec->bv_len); + else if (!IS_ALIGNED(bvec->bv_offset + bvec->bv_len, + sectorsize)) + btrfs_info(fs_info, + "incomplete page read in btrfs with offset %u and length %u", + bvec->bv_offset, bvec->bv_len); + + start = page_offset(page) + bvec->bv_offset; + end = start + bvec->bv_len - 1; len = bvec->bv_len; mirror = io_bio->mirror_num; From patchwork Wed Oct 21 06:24:59 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848333 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5EF72C4363A for ; Wed, 21 Oct 2020 06:26:30 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 0868922249 for ; Wed, 21 Oct 2020 06:26:29 +0000 (UTC) Authentication-Results: mail.kernel.org; 
dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="OpF+nhVM" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440737AbgJUG03 (ORCPT ); Wed, 21 Oct 2020 02:26:29 -0400 Received: from mx2.suse.de ([195.135.220.15]:42810 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2440733AbgJUG02 (ORCPT ); Wed, 21 Oct 2020 02:26:28 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261587; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=xs3+0yW28lA5Gb5G7SsbaX6OiwE0YoaOmCrHiObhQbg=; b=OpF+nhVMNh+ipETvQeEGHl/zWBsMLs9Mn9W4VVezWjnpuIQCqsyDyPgwd9HOaqy0f5Pj7o 90rAprFbDOj0nIeMDt6G9MTQnY+AgUEf48tT+OsfCqQ0Z6hS+eWR+3UTMeIOS1WLayyb6D wqbGZ/GTDIt4ERYAWgAeFudUjROGI9Y= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 1C535AC35 for ; Wed, 21 Oct 2020 06:26:27 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 13/68] btrfs: extent_io: remove the extent_start/extent_len for end_bio_extent_readpage() Date: Wed, 21 Oct 2020 14:24:59 +0800 Message-Id: <20201021062554.68132-14-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org In end_bio_extent_readpage() we had a strange dance around extent_start/extent_len. The truth is, no matter what we do with those two variables, the end result is the same: clear the EXTENT_LOCKED bit and, if needed, set the EXTENT_UPTODATE bit for the io_tree. This doesn't need the complex dance; we can do it simply by calling endio_readpage_release_extent() for each bvec. 
This greatly streamlines the code. Signed-off-by: Qu Wenruo --- fs/btrfs/extent_io.c | 30 ++---------------------------- 1 file changed, 2 insertions(+), 28 deletions(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index ce5b23169e47..3819bf7505e3 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -2791,11 +2791,10 @@ static void end_bio_extent_writepage(struct bio *bio) } static void -endio_readpage_release_extent(struct extent_io_tree *tree, u64 start, u64 len, +endio_readpage_release_extent(struct extent_io_tree *tree, u64 start, u64 end, int uptodate) { struct extent_state *cached = NULL; - u64 end = start + len - 1; if (uptodate && tree->track_uptodate) set_extent_uptodate(tree, start, end, &cached, GFP_ATOMIC); @@ -2823,8 +2822,6 @@ static void end_bio_extent_readpage(struct bio *bio) u64 start; u64 end; u64 len; - u64 extent_start = 0; - u64 extent_len = 0; int mirror; int ret; struct bvec_iter_all iter_all; @@ -2932,32 +2929,9 @@ static void end_bio_extent_readpage(struct bio *bio) unlock_page(page); offset += len; - if (unlikely(!uptodate)) { - if (extent_len) { - endio_readpage_release_extent(tree, - extent_start, - extent_len, 1); - extent_start = 0; - extent_len = 0; - } - endio_readpage_release_extent(tree, start, - end - start + 1, 0); - } else if (!extent_len) { - extent_start = start; - extent_len = end + 1 - start; - } else if (extent_start + extent_len == start) { - extent_len += end + 1 - start; - } else { - endio_readpage_release_extent(tree, extent_start, - extent_len, uptodate); - extent_start = start; - extent_len = end + 1 - start; - } + endio_readpage_release_extent(tree, start, end, uptodate); } - if (extent_len) - endio_readpage_release_extent(tree, extent_start, extent_len, - uptodate); btrfs_io_bio_free_csum(io_bio); bio_put(bio); } From patchwork Wed Oct 21 06:25:00 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 
11848335 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B4B19C56201 for ; Wed, 21 Oct 2020 06:26:31 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 69B9022249 for ; Wed, 21 Oct 2020 06:26:31 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="lHxhjbFD" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440740AbgJUG0a (ORCPT ); Wed, 21 Oct 2020 02:26:30 -0400 Received: from mx2.suse.de ([195.135.220.15]:42846 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2440738AbgJUG0a (ORCPT ); Wed, 21 Oct 2020 02:26:30 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261588; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=i8J86bq7GsmfMIy3AGHqzxW7QwXQBwnFu65/lrhid9w=; b=lHxhjbFDYzvRLJWwuMoKrOchkyw6hoNN89MDSHU6qr2yX1j7oOEIWUeHhXZsxyXd2PZ1Sq WyZwooTAsTySdzZDho5jsD/Aw9rWvxcM8aCkWwwEh5IF2cYn8/DuBjZ+5vfAfBzZOEVDNb UtjRLXgnf3Q5lubC1/Txs9V/HjuYF1Q= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id B7147AC1D for ; Wed, 21 Oct 2020 06:26:28 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 14/68] btrfs: 
extent_io: integrate page status update into endio_readpage_release_extent() Date: Wed, 21 Oct 2020 14:25:00 +0800 Message-Id: <20201021062554.68132-15-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org In end_bio_extent_readpage(), we set the page uptodate or error according to the bio status. However, that assumes all submitted reads are page sized. To support cases like subpage read, we should only set the whole page uptodate if all data in the page has been read from disk. This patch integrates the page status update into endio_readpage_release_extent() for end_bio_extent_readpage(). Now endio_readpage_release_extent() sets the page uptodate if either: - start/end covers the full page This is the existing behavior. - the whole page range is already uptodate This adds the support for subpage read. For the error path, we always clear the page uptodate bit and set the page error bit. 
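The two uptodate conditions and the error path described above can be modelled outside the kernel. The following is only a rough userspace sketch, not the patch's code: struct demo_page, its per-sector bitmap and demo_release_range() are hypothetical names invented for illustration (the real code tracks ranges with the EXTENT_UPTODATE bit in the extent io tree), and the sketch assumes start/end are sector aligned and lie inside one page with at most 32 sectors.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one page; the bitmap stands in for the io_tree. */
struct demo_page {
	uint64_t start;        /* file offset of the first byte in the page */
	uint32_t page_size;    /* e.g. 65536 */
	uint32_t sectorsize;   /* e.g. 4096, may be smaller than page_size */
	uint32_t uptodate_map; /* bit i set: sector i of the page was read */
	bool uptodate;         /* models PageUptodate */
	bool error;            /* models PageError */
};

/*
 * Finish a read of the inclusive byte range [start, end] inside the page.
 * Assumption: the range is sector aligned and does not leave the page.
 */
static void demo_release_range(struct demo_page *p, uint64_t start,
			       uint64_t end, bool read_ok)
{
	uint64_t page_end = p->start + p->page_size - 1;
	uint32_t nr_sectors = p->page_size / p->sectorsize;
	uint32_t full = nr_sectors >= 32 ? UINT32_MAX : (1u << nr_sectors) - 1;
	uint32_t first = (uint32_t)((start - p->start) / p->sectorsize);
	uint32_t last = (uint32_t)((end - p->start) / p->sectorsize);
	uint32_t i;

	if (!read_ok) {
		/* Any error in the page range invalidates the whole page. */
		for (i = first; i <= last; i++)
			p->uptodate_map &= ~(1u << i);
		p->uptodate = false;
		p->error = true;
		return;
	}

	for (i = first; i <= last; i++)
		p->uptodate_map |= 1u << i;

	/* Either this read covers the full page, or every sector is now read. */
	if ((start <= p->start && end >= page_end) || p->uptodate_map == full)
		p->uptodate = true;
}
```

With a 64K page and 4K sectors, reading the first 32K leaves the page not uptodate, and only the second 32K read flips it, which is the subpage behavior the commit message describes.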
Signed-off-by: Qu Wenruo --- fs/btrfs/extent_io.c | 39 +++++++++++++++++++++++++++++---------- 1 file changed, 29 insertions(+), 10 deletions(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 3819bf7505e3..ec0f1fb01a0f 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -2791,13 +2791,36 @@ static void end_bio_extent_writepage(struct bio *bio) } static void -endio_readpage_release_extent(struct extent_io_tree *tree, u64 start, u64 end, - int uptodate) +endio_readpage_release_extent(struct extent_io_tree *tree, struct page *page, + u64 start, u64 end, int uptodate) { struct extent_state *cached = NULL; - if (uptodate && tree->track_uptodate) - set_extent_uptodate(tree, start, end, &cached, GFP_ATOMIC); + if (uptodate) { + u64 page_start = page_offset(page); + u64 page_end = page_offset(page) + PAGE_SIZE - 1; + + if (tree->track_uptodate) { + /* + * The tree has EXTENT_UPTODATE bit tracking, update + * extent io tree, and use it to update the page if + * needed. 
+ */ + set_extent_uptodate(tree, start, end, &cached, + GFP_NOFS); + check_page_uptodate(tree, page); + } else if ((start <= page_start && end >= page_end)) { + /* We have covered the full page, set it uptodate */ + SetPageUptodate(page); + } + } else if (!uptodate){ + if (tree->track_uptodate) + clear_extent_uptodate(tree, start, end, &cached); + + /* Any error in the page range would invalid the uptodate bit */ + ClearPageUptodate(page); + SetPageError(page); + } unlock_extent_cached_atomic(tree, start, end, &cached); } @@ -2921,15 +2944,11 @@ static void end_bio_extent_readpage(struct bio *bio) off = offset_in_page(i_size); if (page->index == end_index && off) zero_user_segment(page, off, PAGE_SIZE); - SetPageUptodate(page); - } else { - ClearPageUptodate(page); - SetPageError(page); } - unlock_page(page); offset += len; - endio_readpage_release_extent(tree, start, end, uptodate); + endio_readpage_release_extent(tree, page, start, end, uptodate); + unlock_page(page); } btrfs_io_bio_free_csum(io_bio); From patchwork Wed Oct 21 06:25:01 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848339 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 17DCDC561F8 for ; Wed, 21 Oct 2020 06:26:34 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9A3F622249 for ; Wed, 21 Oct 2020 06:26:33 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass 
(1024-bit key) header.d=suse.com header.i=@suse.com header.b="DeFBXxus" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440742AbgJUG0c (ORCPT ); Wed, 21 Oct 2020 02:26:32 -0400 Received: from mx2.suse.de ([195.135.220.15]:42880 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2440738AbgJUG0c (ORCPT ); Wed, 21 Oct 2020 02:26:32 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261590; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=tUTopscJhsVHaT3njNYY2hsgvNXAhR3XRRsgSZd+mJQ=; b=DeFBXxusXGAygsCEcoPFtNk+m2eq6I0m2Axdj2U45pUFC2WRAxPuMhKVQYK2kUfDHdPubp AOoK5kMox6LlEYDtnJn2nASGGJsumjswU0MANapbvwsTlyhKjtWVMvQqIJ1U6JeVeXE3AP vOfE45OQ/5yPOykDUU5uDKWFdAl441o= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 62281AC1D for ; Wed, 21 Oct 2020 06:26:30 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 15/68] btrfs: extent_io: rename page_size to io_size in submit_extent_page() Date: Wed, 21 Oct 2020 14:25:01 +0800 Message-Id: <20201021062554.68132-16-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org The variable @page_size of submit_extent_page() is not bound to the page size. It can already be smaller than PAGE_SIZE, so rename it to io_size to reduce confusion. This is especially important for later subpage support. 
Signed-off-by: Qu Wenruo --- fs/btrfs/extent_io.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index ec0f1fb01a0f..5842d3522865 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -3047,7 +3047,7 @@ static int submit_extent_page(unsigned int opf, { int ret = 0; struct bio *bio; - size_t page_size = min_t(size_t, size, PAGE_SIZE); + size_t io_size = min_t(size_t, size, PAGE_SIZE); sector_t sector = offset >> 9; struct extent_io_tree *tree = &BTRFS_I(page->mapping->host)->io_tree; @@ -3064,12 +3064,12 @@ static int submit_extent_page(unsigned int opf, contig = bio_end_sector(bio) == sector; ASSERT(tree->ops); - if (btrfs_bio_fits_in_stripe(page, page_size, bio, bio_flags)) + if (btrfs_bio_fits_in_stripe(page, io_size, bio, bio_flags)) can_merge = false; if (prev_bio_flags != bio_flags || !contig || !can_merge || force_bio_submit || - bio_add_page(bio, page, page_size, pg_offset) < page_size) { + bio_add_page(bio, page, io_size, pg_offset) < io_size) { ret = submit_one_bio(bio, mirror_num, prev_bio_flags); if (ret < 0) { *bio_ret = NULL; @@ -3078,13 +3078,13 @@ static int submit_extent_page(unsigned int opf, bio = NULL; } else { if (wbc) - wbc_account_cgroup_owner(wbc, page, page_size); + wbc_account_cgroup_owner(wbc, page, io_size); return 0; } } bio = btrfs_bio_alloc(offset); - bio_add_page(bio, page, page_size, pg_offset); + bio_add_page(bio, page, io_size, pg_offset); bio->bi_end_io = end_io_func; bio->bi_private = tree; bio->bi_write_hint = page->mapping->host->i_write_hint; @@ -3095,7 +3095,7 @@ static int submit_extent_page(unsigned int opf, bdev = BTRFS_I(page->mapping->host)->root->fs_info->fs_devices->latest_bdev; bio_set_dev(bio, bdev); wbc_init_bio(wbc, bio); - wbc_account_cgroup_owner(wbc, page, page_size); + wbc_account_cgroup_owner(wbc, page, io_size); } *bio_ret = bio; From patchwork Wed Oct 21 06:25:02 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 
1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848343 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4353CC56201 for ; Wed, 21 Oct 2020 06:26:35 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E120822249 for ; Wed, 21 Oct 2020 06:26:34 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="ikyMjfaX" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440745AbgJUG0d (ORCPT ); Wed, 21 Oct 2020 02:26:33 -0400 Received: from mx2.suse.de ([195.135.220.15]:42898 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2440738AbgJUG0d (ORCPT ); Wed, 21 Oct 2020 02:26:33 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261592; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Lm95CuHINRPYUG6rieMPjVrDRkTEKBHaPGxfx8YE7W8=; b=ikyMjfaXI2wcRjqbvcIV83NYF7OhE/Tk7wdEvJjEztU9mppuYlxSJ4c3TcG94KwKA0QSPl Ggl11kAB14fD9QJt1wza/CI6kXb72SorWwBkiAwW/aYsHVjUIotg+aIJCFXVPWmvNhLJlK uR23s+g6UzOaXM/a1D770uXnIoabPLg= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 66917AC35; Wed, 21 Oct 2020 06:26:32 +0000 (UTC) 
From: Qu Wenruo To: linux-btrfs@vger.kernel.org Cc: Nikolay Borisov Subject: [PATCH v4 16/68] btrfs: extent_io: add assert_spin_locked() for attach_extent_buffer_page() Date: Wed, 21 Oct 2020 14:25:02 +0800 Message-Id: <20201021062554.68132-17-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org attach_extent_buffer_page() is called either from btrfs_clone_extent_buffer(), attaching anonymous pages, or from alloc_extent_buffer(), attaching btree_inode pages. For the latter case, we should hold page->mapping->private_lock to avoid racing on page->private. Add assert_spin_locked() if we're calling from alloc_extent_buffer(). Signed-off-by: Qu Wenruo Reviewed-by: Nikolay Borisov --- fs/btrfs/extent_io.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 5842d3522865..8bf38948bd37 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -3106,6 +3106,15 @@ static int submit_extent_page(unsigned int opf, static void attach_extent_buffer_page(struct extent_buffer *eb, struct page *page) { + /* + * If the page is mapped to btree inode, we should hold the private + * lock to prevent race. + * For cloned or dummy extent buffers, their pages are not mapped and + * will not race with any other ebs. 
+ */ + if (page->mapping) + assert_spin_locked(&page->mapping->private_lock); + if (!PagePrivate(page)) attach_page_private(page, eb); else From patchwork Wed Oct 21 06:25:03 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848341 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 17DC7C561F8 for ; Wed, 21 Oct 2020 06:26:37 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id C540A22249 for ; Wed, 21 Oct 2020 06:26:36 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="YiWIATgX" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440748AbgJUG0g (ORCPT ); Wed, 21 Oct 2020 02:26:36 -0400 Received: from mx2.suse.de ([195.135.220.15]:42938 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2440746AbgJUG0f (ORCPT ); Wed, 21 Oct 2020 02:26:35 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261594; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=08vX3OsbP4cRIH4PqVdGijiNbykNGElajfWY6H+enR8=; b=YiWIATgXUHTxWBxW/DVPMYxcLhX3mYl6EDkj+VWSxPnElWJipKbPpFO4HR4+GtwIZzBGVl 
nZLMfo5tkWUNEPk32Dezz285lEccnyrhiWHXcffp9iiDhr5WO79o4+soJGVF+xBV4qx+kU VliUmuKhi0Qw7JmOiVd9jV5HNy5JP0w= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 26D12AC35 for ; Wed, 21 Oct 2020 06:26:34 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 17/68] btrfs: extent_io: extract the btree page submission code into its own helper function Date: Wed, 21 Oct 2020 14:25:03 +0800 Message-Id: <20201021062554.68132-18-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org In btree_write_cache_pages() we have a btree page submission routine buried deep inside a nested loop. This patch extracts that part of the code into a helper function, submit_btree_page(), to do the same work. Also, since submit_btree_page() can now return >0 for successful extent buffer submission, remove the "ASSERT(ret <= 0);" line. Signed-off-by: Qu Wenruo --- fs/btrfs/extent_io.c | 116 +++++++++++++++++++++++++------------------ 1 file changed, 69 insertions(+), 47 deletions(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 8bf38948bd37..0d5d0581af06 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -3984,10 +3984,75 @@ static noinline_for_stack int write_one_eb(struct extent_buffer *eb, return ret; } +/* + * A helper to submit a btree page. + * + * This function is not always submitting the page, as we only submit the full + * extent buffer in a batch. + * + * @page: The btree page + * @prev_eb: Previous extent buffer, to determine if we need to submit + * this page. + * + * Return >0 if we have submitted the extent buffer successfully. + * Return 0 if we don't need to do anything for the page. + * Return <0 for fatal error. 
+ */ +static int submit_btree_page(struct page *page, struct writeback_control *wbc, + struct extent_page_data *epd, + struct extent_buffer **prev_eb) +{ + struct address_space *mapping = page->mapping; + struct extent_buffer *eb; + int ret; + + if (!PagePrivate(page)) + return 0; + + spin_lock(&mapping->private_lock); + if (!PagePrivate(page)) { + spin_unlock(&mapping->private_lock); + return 0; + } + + eb = (struct extent_buffer *)page->private; + + /* + * Shouldn't happen and normally this would be a BUG_ON but no sense + * in crashing the users box for something we can survive anyway. + */ + if (WARN_ON(!eb)) { + spin_unlock(&mapping->private_lock); + return 0; + } + + if (eb == *prev_eb) { + spin_unlock(&mapping->private_lock); + return 0; + } + ret = atomic_inc_not_zero(&eb->refs); + spin_unlock(&mapping->private_lock); + if (!ret) + return 0; + + *prev_eb = eb; + + ret = lock_extent_buffer_for_io(eb, epd); + if (ret <= 0) { + free_extent_buffer(eb); + return ret; + } + ret = write_one_eb(eb, wbc, epd); + free_extent_buffer(eb); + if (ret < 0) + return ret; + return 1; +} + int btree_write_cache_pages(struct address_space *mapping, struct writeback_control *wbc) { - struct extent_buffer *eb, *prev_eb = NULL; + struct extent_buffer *prev_eb = NULL; struct extent_page_data epd = { .bio = NULL, .extent_locked = 0, @@ -4033,55 +4098,13 @@ int btree_write_cache_pages(struct address_space *mapping, for (i = 0; i < nr_pages; i++) { struct page *page = pvec.pages[i]; - if (!PagePrivate(page)) - continue; - - spin_lock(&mapping->private_lock); - if (!PagePrivate(page)) { - spin_unlock(&mapping->private_lock); - continue; - } - - eb = (struct extent_buffer *)page->private; - - /* - * Shouldn't happen and normally this would be a BUG_ON - * but no sense in crashing the users box for something - * we can survive anyway. 
- */ - if (WARN_ON(!eb)) { - spin_unlock(&mapping->private_lock); - continue; - } - - if (eb == prev_eb) { - spin_unlock(&mapping->private_lock); - continue; - } - - ret = atomic_inc_not_zero(&eb->refs); - spin_unlock(&mapping->private_lock); - if (!ret) - continue; - - prev_eb = eb; - ret = lock_extent_buffer_for_io(eb, &epd); - if (!ret) { - free_extent_buffer(eb); + ret = submit_btree_page(page, wbc, &epd, &prev_eb); + if (ret == 0) continue; - } else if (ret < 0) { - done = 1; - free_extent_buffer(eb); - break; - } - - ret = write_one_eb(eb, wbc, &epd); - if (ret) { + if (ret < 0) { done = 1; - free_extent_buffer(eb); break; } - free_extent_buffer(eb); /* * the filesystem may choose to bump up nr_to_write. @@ -4102,7 +4125,6 @@ int btree_write_cache_pages(struct address_space *mapping, index = 0; goto retry; } - ASSERT(ret <= 0); if (ret < 0) { end_write_bio(&epd, ret); return ret; From patchwork Wed Oct 21 06:25:04 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848345 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5F7F5C4363A for ; Wed, 21 Oct 2020 06:26:39 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 0DF7922249 for ; Wed, 21 Oct 2020 06:26:39 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="F5uO7Ge6" Received: (majordomo@vger.kernel.org) by 
vger.kernel.org via listexpand id S2440750AbgJUG0i (ORCPT ); Wed, 21 Oct 2020 02:26:38 -0400 Received: from mx2.suse.de ([195.135.220.15]:42948 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2408709AbgJUG0h (ORCPT ); Wed, 21 Oct 2020 02:26:37 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261596; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=2vdyEZwJ7LfD4eWqr8u00vys8zxlpS2fn0cgrxYhVFs=; b=F5uO7Ge6k7XsFmhjLIFxDOPnXxVHBomhqFN0Z4HulkJAy07+/OmrcCxbEchzYH2T/ykPmq gOnKIQMX3/9jbNvS1g/vs2QlVVjN0b0FbR2zopgGwS8kVDNt7vlpekCMxG5qmhC3pw6/BY 7J6UZwyY4pYWtkEcbr7me4D8alSYlmQ= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id E94A0AC1D for ; Wed, 21 Oct 2020 06:26:35 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 18/68] btrfs: extent_io: calculate inline extent buffer page size based on page size Date: Wed, 21 Oct 2020 14:25:04 +0800 Message-Id: <20201021062554.68132-19-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Btrfs only supports 64K as the max node size, thus on a 4K page system, we would have at most 16 pages for one extent buffer. On a system using 64K page size, we would have just one single page. Since we always use 16 entries for extent_buffer::pages[], systems using 64K pages waste memory on the 15 array slots which will never be utilized. So this patch changes how the extent_buffer::pages[] array size is calculated: it is now derived from BTRFS_MAX_METADATA_BLOCKSIZE and PAGE_SIZE. 
For systems using 4K page size, it will stay 16 pages. For systems using 64K page size, it will be just 1 page. Signed-off-by: Qu Wenruo --- fs/btrfs/extent_io.c | 6 +++--- fs/btrfs/extent_io.h | 8 +++++--- 2 files changed, 8 insertions(+), 6 deletions(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 0d5d0581af06..6e33fa1645c3 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -5020,9 +5020,9 @@ __alloc_extent_buffer(struct btrfs_fs_info *fs_info, u64 start, /* * Sanity checks, currently the maximum is 64k covered by 16x 4k pages */ - BUILD_BUG_ON(BTRFS_MAX_METADATA_BLOCKSIZE - > MAX_INLINE_EXTENT_BUFFER_SIZE); - BUG_ON(len > MAX_INLINE_EXTENT_BUFFER_SIZE); + BUILD_BUG_ON(BTRFS_MAX_METADATA_BLOCKSIZE > + INLINE_EXTENT_BUFFER_PAGES * PAGE_SIZE); + BUG_ON(len > BTRFS_MAX_METADATA_BLOCKSIZE); #ifdef CONFIG_BTRFS_DEBUG eb->spinning_writers = 0; diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h index 3c9252b429e0..e588b3100ede 100644 --- a/fs/btrfs/extent_io.h +++ b/fs/btrfs/extent_io.h @@ -85,9 +85,11 @@ struct extent_io_ops { int mirror); }; - -#define INLINE_EXTENT_BUFFER_PAGES 16 -#define MAX_INLINE_EXTENT_BUFFER_SIZE (INLINE_EXTENT_BUFFER_PAGES * PAGE_SIZE) +/* + * The SZ_64K is BTRFS_MAX_METADATA_BLOCKSIZE, here just to avoid circle + * including "ctree.h". 
+ */ +#define INLINE_EXTENT_BUFFER_PAGES (SZ_64K / PAGE_SIZE) struct extent_buffer { u64 start; unsigned long len; From patchwork Wed Oct 21 06:25:05 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848349 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4346FC4363A for ; Wed, 21 Oct 2020 06:26:42 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id D8F7222249 for ; Wed, 21 Oct 2020 06:26:41 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="dz/SEozr" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440753AbgJUG0k (ORCPT ); Wed, 21 Oct 2020 02:26:40 -0400 Received: from mx2.suse.de ([195.135.220.15]:42998 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2408709AbgJUG0k (ORCPT ); Wed, 21 Oct 2020 02:26:40 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261599; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=jXW3Kjt7B4nsuMWk/PyGljcZSq6QqbxJ3d+n0lWHhLY=; b=dz/SEozrR5XmE4aiwtHJ7tdHigjRGs361GaFt2NtArRV0ZVUX87KJaTQbfJRhy74j0XKT4 
BI7JYBZxsFYFGMiZdzpM4wTx926UXDizOgnB+oWoSv47RfGwA2Or4H+GpqSaI6mrLEpbha NFsHSCTaG4ggE9NVhf2ZCtMVeqvFoUs= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id F1DD3AC1D; Wed, 21 Oct 2020 06:26:38 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Cc: Nikolay Borisov Subject: [PATCH v4 19/68] btrfs: extent_io: make btrfs_fs_info::buffer_radix to take sector size devided values Date: Wed, 21 Oct 2020 14:25:05 +0800 Message-Id: <20201021062554.68132-20-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org

For subpage sector size support, one page can contain multiple tree blocks, so we can no longer use (eb->start >> PAGE_SHIFT) as the radix tree index; otherwise a lookup can easily return an extent buffer that does not belong to the requested bytenr.

This patch uses (extent_buffer::start / sectorsize) as the radix tree index, so that we get the correct extent buffer with subpage size support, while keeping the behavior the same for regular sector size.
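To illustrate the index collision described above, here is a minimal userspace sketch in C. The helpers `index_by_page()` and `index_by_sector()` are hypothetical names used only for this illustration; they mirror the old (start >> PAGE_SHIFT) and new (start / sectorsize) expressions, with the page shift and sector size passed in as arguments rather than taken from the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: old scheme, one radix tree slot per page. */
static uint64_t index_by_page(uint64_t start, unsigned int page_shift)
{
	return start >> page_shift;
}

/* Hypothetical helper: new scheme, one radix tree slot per sector. */
static uint64_t index_by_sector(uint64_t start, uint32_t sectorsize)
{
	assert(sectorsize != 0);
	return start / sectorsize;
}
```

With 64K pages (shift 16) and 4K sectors, tree blocks at bytenr 0 and 16K share page index 0 but get sector indexes 0 and 4. With 4K pages and 4K sectors both helpers return the same value, which is why regular sector size is unaffected.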
Signed-off-by: Qu Wenruo Reviewed-by: Nikolay Borisov --- fs/btrfs/extent_io.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 6e33fa1645c3..4ac315d8753f 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -5158,7 +5158,7 @@ struct extent_buffer *find_extent_buffer(struct btrfs_fs_info *fs_info, rcu_read_lock(); eb = radix_tree_lookup(&fs_info->buffer_radix, - start >> PAGE_SHIFT); + start / fs_info->sectorsize); if (eb && atomic_inc_not_zero(&eb->refs)) { rcu_read_unlock(); /* @@ -5210,7 +5210,7 @@ struct extent_buffer *alloc_test_extent_buffer(struct btrfs_fs_info *fs_info, } spin_lock(&fs_info->buffer_lock); ret = radix_tree_insert(&fs_info->buffer_radix, - start >> PAGE_SHIFT, eb); + start / fs_info->sectorsize, eb); spin_unlock(&fs_info->buffer_lock); radix_tree_preload_end(); if (ret == -EEXIST) { @@ -5318,7 +5318,7 @@ struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info, spin_lock(&fs_info->buffer_lock); ret = radix_tree_insert(&fs_info->buffer_radix, - start >> PAGE_SHIFT, eb); + start / fs_info->sectorsize, eb); spin_unlock(&fs_info->buffer_lock); radix_tree_preload_end(); if (ret == -EEXIST) { @@ -5374,7 +5374,7 @@ static int release_extent_buffer(struct extent_buffer *eb) spin_lock(&fs_info->buffer_lock); radix_tree_delete(&fs_info->buffer_radix, - eb->start >> PAGE_SHIFT); + eb->start / fs_info->sectorsize); spin_unlock(&fs_info->buffer_lock); } else { spin_unlock(&eb->refs_lock); From patchwork Wed Oct 21 06:25:06 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848351 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, 
MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4E090C561F8 for ; Wed, 21 Oct 2020 06:26:45 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id EC5F622249 for ; Wed, 21 Oct 2020 06:26:44 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="DOPZ1lhS" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440756AbgJUG0o (ORCPT ); Wed, 21 Oct 2020 02:26:44 -0400 Received: from mx2.suse.de ([195.135.220.15]:43036 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2408709AbgJUG0o (ORCPT ); Wed, 21 Oct 2020 02:26:44 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261602; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=OglTn2D1BP/D3KSqQOu+najGHjAbuRY4Xjxwjjkrzgw=; b=DOPZ1lhS3R4pjoFrUgOXEj1UpEKN4Fpf7034LhBA9e2/hjabHXXdQ7e92kpVoMZiwefFnU nR/NnI7w//1dg20OrY2aklN/gfyDbTPLW8kLvIwNVgKHeLEYl4QYC9jx9UCqVA1bWKi47g 9Ga1pjqMG3o0cCjd0eP+nnSEmjSqfdk= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 3DAF3AC1D for ; Wed, 21 Oct 2020 06:26:42 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 20/68] btrfs: extent_io: sink less common parameters for __set_extent_bit() Date: Wed, 21 Oct 2020 14:25:06 +0800 Message-Id: <20201021062554.68132-21-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> 
MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org

For __set_extent_bit(), the following parameters are rarely needed by most callers:

- exclusive_bits
- failed_start
  Paired together for EXTENT_LOCKED usage.

- extent_changeset
  For qgroup usage.

As a common design principle, less common parameters should have default values, and only callers that really need them should set them to non-default values.

Sink those parameters into a new structure, extent_io_extra_options, so that most callers don't need to bother with these less used parameters, and later expansion becomes easier.

Signed-off-by: Qu Wenruo
---
fs/btrfs/extent-io-tree.h | 22 ++++++++++++++ fs/btrfs/extent_io.c | 61 ++++++++++++++++++++++++--------------- 2 files changed, 59 insertions(+), 24 deletions(-)
diff --git a/fs/btrfs/extent-io-tree.h b/fs/btrfs/extent-io-tree.h index 3aaf83376797..dfbb65ac9c8c 100644 --- a/fs/btrfs/extent-io-tree.h +++ b/fs/btrfs/extent-io-tree.h @@ -82,6 +82,28 @@ struct extent_state { #endif }; +/* + * Extra options for extent io tree operations. + * + * All of these options are initialized to 0/false/NULL by default, + * and most callers should utilize the wrappers other than the extra options. + */ +struct extent_io_extra_options { + /* + * For __set_extent_bit(), to return -EEXIST when hit an extent with + * @excl_bits set, and update @excl_failed_start. + * Utilized by EXTENT_LOCKED wrappers. + */ + u32 excl_bits; + u64 excl_failed_start; + + /* + * For __set/__clear_extent_bit() to record how many bytes is modified. + * For qgroup related functions.
+ */ + struct extent_changeset *changeset; +}; + int __init extent_state_cache_init(void); void __cold extent_state_cache_exit(void); diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 4ac315d8753f..5f899b27962b 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -29,6 +29,7 @@ static struct kmem_cache *extent_state_cache; static struct kmem_cache *extent_buffer_cache; static struct bio_set btrfs_bioset; +static struct extent_io_extra_options default_opts = { 0 }; static inline bool extent_state_in_tree(const struct extent_state *state) { return !RB_EMPTY_NODE(&state->rb_node); @@ -952,10 +953,10 @@ static void cache_state(struct extent_state *state, } /* - * set some bits on a range in the tree. This may require allocations or + * Set some bits on a range in the tree. This may require allocations or * sleeping, so the gfp mask is used to indicate what is allowed. * - * If any of the exclusive bits are set, this will fail with -EEXIST if some + * If *any* of the exclusive bits are set, this will fail with -EEXIST if some * part of the range already has the desired bits set. The start of the * existing range is returned in failed_start in this case. 
* @@ -964,26 +965,30 @@ static void cache_state(struct extent_state *state, static int __must_check __set_extent_bit(struct extent_io_tree *tree, u64 start, u64 end, - unsigned bits, unsigned exclusive_bits, - u64 *failed_start, struct extent_state **cached_state, - gfp_t mask, struct extent_changeset *changeset) + unsigned bits, struct extent_state **cached_state, + gfp_t mask, struct extent_io_extra_options *extra_opts) { struct extent_state *state; struct extent_state *prealloc = NULL; struct rb_node *node; struct rb_node **p; struct rb_node *parent; + struct extent_changeset *changeset; int err = 0; + u32 exclusive_bits; + u64 *failed_start; u64 last_start; u64 last_end; btrfs_debug_check_extent_io_range(tree, start, end); trace_btrfs_set_extent_bit(tree, start, end - start + 1, bits); - if (exclusive_bits) - ASSERT(failed_start); - else - ASSERT(!failed_start); + if (!extra_opts) + extra_opts = &default_opts; + exclusive_bits = extra_opts->excl_bits; + failed_start = &extra_opts->excl_failed_start; + changeset = extra_opts->changeset; + again: if (!prealloc && gfpflags_allow_blocking(mask)) { /* @@ -1187,7 +1192,7 @@ int set_extent_bit(struct extent_io_tree *tree, u64 start, u64 end, unsigned bits, struct extent_state **cached_state, gfp_t mask) { - return __set_extent_bit(tree, start, end, bits, 0, NULL, cached_state, + return __set_extent_bit(tree, start, end, bits, cached_state, mask, NULL); } @@ -1414,6 +1419,10 @@ int convert_extent_bit(struct extent_io_tree *tree, u64 start, u64 end, int set_record_extent_bits(struct extent_io_tree *tree, u64 start, u64 end, unsigned bits, struct extent_changeset *changeset) { + struct extent_io_extra_options extra_opts = { + .changeset = changeset, + }; + /* * We don't support EXTENT_LOCKED yet, as current changeset will * record any bits changed, so for EXTENT_LOCKED case, it will @@ -1422,15 +1431,14 @@ int set_record_extent_bits(struct extent_io_tree *tree, u64 start, u64 end, */ BUG_ON(bits & EXTENT_LOCKED); - 
return __set_extent_bit(tree, start, end, bits, 0, NULL, NULL, GFP_NOFS, - changeset); + return __set_extent_bit(tree, start, end, bits, NULL, GFP_NOFS, + &extra_opts); } int set_extent_bits_nowait(struct extent_io_tree *tree, u64 start, u64 end, unsigned bits) { - return __set_extent_bit(tree, start, end, bits, 0, NULL, NULL, - GFP_NOWAIT, NULL); + return __set_extent_bit(tree, start, end, bits, NULL, GFP_NOWAIT, NULL); } int clear_extent_bit(struct extent_io_tree *tree, u64 start, u64 end, @@ -1461,16 +1469,18 @@ int clear_record_extent_bits(struct extent_io_tree *tree, u64 start, u64 end, int lock_extent_bits(struct extent_io_tree *tree, u64 start, u64 end, struct extent_state **cached_state) { + struct extent_io_extra_options extra_opts = { + .excl_bits = EXTENT_LOCKED, + }; int err; - u64 failed_start; while (1) { err = __set_extent_bit(tree, start, end, EXTENT_LOCKED, - EXTENT_LOCKED, &failed_start, - cached_state, GFP_NOFS, NULL); + cached_state, GFP_NOFS, &extra_opts); if (err == -EEXIST) { - wait_extent_bit(tree, failed_start, end, EXTENT_LOCKED); - start = failed_start; + wait_extent_bit(tree, extra_opts.excl_failed_start, end, + EXTENT_LOCKED); + start = extra_opts.excl_failed_start; } else break; WARN_ON(start > end); @@ -1480,14 +1490,17 @@ int lock_extent_bits(struct extent_io_tree *tree, u64 start, u64 end, int try_lock_extent(struct extent_io_tree *tree, u64 start, u64 end) { + struct extent_io_extra_options extra_opts = { + .excl_bits = EXTENT_LOCKED, + }; int err; - u64 failed_start; - err = __set_extent_bit(tree, start, end, EXTENT_LOCKED, EXTENT_LOCKED, - &failed_start, NULL, GFP_NOFS, NULL); + err = __set_extent_bit(tree, start, end, EXTENT_LOCKED, + NULL, GFP_NOFS, &extra_opts); if (err == -EEXIST) { - if (failed_start > start) - clear_extent_bit(tree, start, failed_start - 1, + if (extra_opts.excl_failed_start > start) + clear_extent_bit(tree, start, + extra_opts.excl_failed_start - 1, EXTENT_LOCKED, 1, 0, NULL); return 0; } From patchwork 
Wed Oct 21 06:25:07 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848353 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0814CC4363A for ; Wed, 21 Oct 2020 06:26:47 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id B1DC822249 for ; Wed, 21 Oct 2020 06:26:46 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="jBVZhdse" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440759AbgJUG0q (ORCPT ); Wed, 21 Oct 2020 02:26:46 -0400 Received: from mx2.suse.de ([195.135.220.15]:43090 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2408709AbgJUG0p (ORCPT ); Wed, 21 Oct 2020 02:26:45 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261604; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=yAsvUnTjhihyWg22r8hxrNfUDHPJhWIb8XUBYQF7rOw=; b=jBVZhdseOZ5ajHdoQwfxIdyUlkTiWtpq26PCpDPQGSykpoyPju5NKyly6/9Exm101L4JLD AyCZ1IKJr89pNQY2HuhezAkwmSeNzbcdJIAvb93XXqH0ziu6ZCMYCwLtuXuWw1mAwxTmws z+t7VQ6ndFRZdevNTZ5lCF+gfY6h3HA= Received: from relay2.suse.de (unknown [195.135.221.27]) by 
mx2.suse.de (Postfix) with ESMTP id E2E89AC35 for ; Wed, 21 Oct 2020 06:26:43 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 21/68] btrfs: extent_io: sink less common parameters for __clear_extent_bit() Date: Wed, 21 Oct 2020 14:25:07 +0800 Message-Id: <20201021062554.68132-22-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org

The following parameters are less commonly used for __clear_extent_bit():

- wake
  To wake up the waiters

- delete
  For cleanup cases, to remove the extent state regardless of its state

- changeset
  Only utilized for qgroup

Sink them into the extent_io_extra_options structure.

For most callers, which don't care about these options, we simply drop some parameters without any other impact. For callers that do care about these options, we slightly increase the stack usage, as the extent_io_extra_options structure has extra members used only by __set_extent_bit().

Signed-off-by: Qu Wenruo
---
fs/btrfs/extent-io-tree.h | 30 +++++++++++++++++++------- fs/btrfs/extent_io.c | 45 ++++++++++++++++++++++++++++----------- fs/btrfs/extent_map.c | 2 +- 3 files changed, 56 insertions(+), 21 deletions(-)
diff --git a/fs/btrfs/extent-io-tree.h b/fs/btrfs/extent-io-tree.h index dfbb65ac9c8c..2893573eb556 100644 --- a/fs/btrfs/extent-io-tree.h +++ b/fs/btrfs/extent-io-tree.h @@ -102,6 +102,15 @@ struct extent_io_extra_options { * For qgroup related functions. */ struct extent_changeset *changeset; + + /* + * For __clear_extent_bit(). + * @wake: Wake up the waiters. Mostly for EXTENT_LOCKED case + * @delete: Delete the extent regardless of its state. Mostly for + * cleanup.
+ */ + bool wake; + bool delete; }; int __init extent_state_cache_init(void); @@ -139,9 +148,8 @@ int clear_extent_bit(struct extent_io_tree *tree, u64 start, u64 end, unsigned bits, int wake, int delete, struct extent_state **cached); int __clear_extent_bit(struct extent_io_tree *tree, u64 start, u64 end, - unsigned bits, int wake, int delete, - struct extent_state **cached, gfp_t mask, - struct extent_changeset *changeset); + unsigned bits, struct extent_state **cached_state, + gfp_t mask, struct extent_io_extra_options *extra_opts); static inline int unlock_extent(struct extent_io_tree *tree, u64 start, u64 end) { @@ -151,15 +159,21 @@ static inline int unlock_extent(struct extent_io_tree *tree, u64 start, u64 end) static inline int unlock_extent_cached(struct extent_io_tree *tree, u64 start, u64 end, struct extent_state **cached) { - return __clear_extent_bit(tree, start, end, EXTENT_LOCKED, 1, 0, cached, - GFP_NOFS, NULL); + struct extent_io_extra_options extra_opts = { + .wake = true, + }; + return __clear_extent_bit(tree, start, end, EXTENT_LOCKED, cached, + GFP_NOFS, &extra_opts); } static inline int unlock_extent_cached_atomic(struct extent_io_tree *tree, u64 start, u64 end, struct extent_state **cached) { - return __clear_extent_bit(tree, start, end, EXTENT_LOCKED, 1, 0, cached, - GFP_ATOMIC, NULL); + struct extent_io_extra_options extra_opts = { + .wake = true, + }; + return __clear_extent_bit(tree, start, end, EXTENT_LOCKED, cached, + GFP_ATOMIC, &extra_opts); } static inline int clear_extent_bits(struct extent_io_tree *tree, u64 start, @@ -190,7 +204,7 @@ static inline int set_extent_bits(struct extent_io_tree *tree, u64 start, static inline int clear_extent_uptodate(struct extent_io_tree *tree, u64 start, u64 end, struct extent_state **cached_state) { - return __clear_extent_bit(tree, start, end, EXTENT_UPTODATE, 0, 0, + return __clear_extent_bit(tree, start, end, EXTENT_UPTODATE, cached_state, GFP_NOFS, NULL); } diff --git a/fs/btrfs/extent_io.c 
b/fs/btrfs/extent_io.c index 5f899b27962b..98b114becd52 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -688,26 +688,38 @@ static void extent_io_tree_panic(struct extent_io_tree *tree, int err) * or inserting elements in the tree, so the gfp mask is used to * indicate which allocations or sleeping are allowed. * - * pass 'wake' == 1 to kick any sleepers, and 'delete' == 1 to remove - * the given range from the tree regardless of state (ie for truncate). + * extra_opts::wake: To kick any sleepers. + * extra_opts::delete: To remove the given range regardless of state + * (ie for truncate) + * extra_opts::changeset: To record how many bytes are modified and + * which ranges are modified. (for qgroup) * - * the range [start, end] is inclusive. + * The range [start, end] is inclusive. * - * This takes the tree lock, and returns 0 on success and < 0 on error. + * Returns 0 on success + * No error can be returned yet, the ENOMEM for memory is handled by BUG_ON(). */ int __clear_extent_bit(struct extent_io_tree *tree, u64 start, u64 end, - unsigned bits, int wake, int delete, - struct extent_state **cached_state, - gfp_t mask, struct extent_changeset *changeset) + unsigned bits, struct extent_state **cached_state, + gfp_t mask, struct extent_io_extra_options *extra_opts) { + struct extent_changeset *changeset; struct extent_state *state; struct extent_state *cached; struct extent_state *prealloc = NULL; struct rb_node *node; + bool wake; + bool delete; u64 last_end; int err; int clear = 0; + if (!extra_opts) + extra_opts = &default_opts; + changeset = extra_opts->changeset; + wake = extra_opts->wake; + delete = extra_opts->delete; + btrfs_debug_check_extent_io_range(tree, start, end); trace_btrfs_clear_extent_bit(tree, start, end - start + 1, bits); @@ -1445,21 +1457,30 @@ int clear_extent_bit(struct extent_io_tree *tree, u64 start, u64 end, unsigned bits, int wake, int delete, struct extent_state **cached) { - return __clear_extent_bit(tree, start, end, 
bits, wake, delete, - cached, GFP_NOFS, NULL); + struct extent_io_extra_options extra_opts = { + .wake = wake, + .delete = delete, + }; + + return __clear_extent_bit(tree, start, end, bits, + cached, GFP_NOFS, &extra_opts); } int clear_record_extent_bits(struct extent_io_tree *tree, u64 start, u64 end, unsigned bits, struct extent_changeset *changeset) { + struct extent_io_extra_options extra_opts = { + .changeset = changeset, + }; + /* * Don't support EXTENT_LOCKED case, same reason as * set_record_extent_bits(). */ BUG_ON(bits & EXTENT_LOCKED); - return __clear_extent_bit(tree, start, end, bits, 0, 0, NULL, GFP_NOFS, - changeset); + return __clear_extent_bit(tree, start, end, bits, NULL, GFP_NOFS, + &extra_opts); } /* @@ -4479,7 +4500,7 @@ static int try_release_extent_state(struct extent_io_tree *tree, */ ret = __clear_extent_bit(tree, start, end, ~(EXTENT_LOCKED | EXTENT_NODATASUM), - 0, 0, NULL, mask, NULL); + NULL, mask, NULL); /* if clear_extent_bit failed for enomem reasons, * we can't allow the release to continue. 
diff --git a/fs/btrfs/extent_map.c b/fs/btrfs/extent_map.c index bd6229fb2b6f..95651ddbb3a7 100644 --- a/fs/btrfs/extent_map.c +++ b/fs/btrfs/extent_map.c @@ -380,7 +380,7 @@ static void extent_map_device_clear_bits(struct extent_map *em, unsigned bits) __clear_extent_bit(&device->alloc_state, stripe->physical, stripe->physical + stripe_size - 1, bits, - 0, 0, NULL, GFP_NOWAIT, NULL); + NULL, GFP_NOWAIT, NULL); } } From patchwork Wed Oct 21 06:25:08 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848347 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A389EC561F8 for ; Wed, 21 Oct 2020 06:26:48 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4779C22249 for ; Wed, 21 Oct 2020 06:26:48 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="oXgyXtDh" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440761AbgJUG0r (ORCPT ); Wed, 21 Oct 2020 02:26:47 -0400 Received: from mx2.suse.de ([195.135.220.15]:43100 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1731630AbgJUG0r (ORCPT ); Wed, 21 Oct 2020 02:26:47 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261606; 
h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ktf6xVJ3A6pAe8XjPeZRKQjD59dj9Vy8Xu06DjXEJpE=; b=oXgyXtDhUtFauWW7468JdTuU5NhyGUS6iiQHAU0YurXniktEj48I7R3nxpb3p3h6xckxpA bRsd3MI7aD1XWw63ifjc+w1DYbMgfxcgIb9sqDhr4WO/4R8bSL7GJYDevVU0/hHV87ner8 eYwqI+8aSdS6Ta25bGMyNRNPLvwLCMM= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 10A4BAC1D for ; Wed, 21 Oct 2020 06:26:46 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 22/68] btrfs: disk_io: grab fs_info from extent_buffer::fs_info directly for btrfs_mark_buffer_dirty() Date: Wed, 21 Oct 2020 14:25:08 +0800 Message-Id: <20201021062554.68132-23-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Since commit f28491e0a6c4 ("Btrfs: move the extent buffer radix tree into the fs_info"), fs_info can be grabbed from extent_buffer directly. So use that extent_buffer::fs_info directly in btrfs_mark_buffer_dirty() to make things a little easier. 
Signed-off-by: Qu Wenruo Reviewed-by: Goldwyn Rodrigues --- fs/btrfs/disk-io.c | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index c81b7e53149c..58928076d08d 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -4190,8 +4190,7 @@ int btrfs_buffer_uptodate(struct extent_buffer *buf, u64 parent_transid, void btrfs_mark_buffer_dirty(struct extent_buffer *buf) { - struct btrfs_fs_info *fs_info; - struct btrfs_root *root; + struct btrfs_fs_info *fs_info = buf->fs_info; u64 transid = btrfs_header_generation(buf); int was_dirty; @@ -4204,8 +4203,6 @@ void btrfs_mark_buffer_dirty(struct extent_buffer *buf) if (unlikely(test_bit(EXTENT_BUFFER_UNMAPPED, &buf->bflags))) return; #endif - root = BTRFS_I(buf->pages[0]->mapping->host)->root; - fs_info = root->fs_info; btrfs_assert_tree_locked(buf); if (transid != fs_info->generation) WARN(1, KERN_CRIT "btrfs transid mismatch buffer %llu, found %llu running %llu\n", From patchwork Wed Oct 21 06:25:09 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848355 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9CD76C561F8 for ; Wed, 21 Oct 2020 06:26:51 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 3A14422249 for ; Wed, 21 Oct 2020 06:26:51 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com 
header.i=@suse.com header.b="GjZJUXWe" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440764AbgJUG0u (ORCPT ); Wed, 21 Oct 2020 02:26:50 -0400 Received: from mx2.suse.de ([195.135.220.15]:43164 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1731630AbgJUG0u (ORCPT ); Wed, 21 Oct 2020 02:26:50 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261608; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=E4RSujhSu4Qafq4unefxVwBtBbLSONfki7HDYZW7Opo=; b=GjZJUXWeA7/vHisodO0AjuxeRlEXTquEgLC7DL3fJBCftBZMlHKfJHPqLbj6We/88Fi4cb NOWcx5MzuCWR98zaZdqaBhmE2njC0ryUXEENad5jA5D6jGAcrM2DCD3w3oVJkBvWg2mEYd dW2lbwUN7BndhCRD8MP5mdCmXV3uR7I= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 747D9AC1D; Wed, 21 Oct 2020 06:26:48 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Cc: Goldwyn Rodrigues , Nikolay Borisov Subject: [PATCH v4 23/68] btrfs: disk-io: make csum_tree_block() handle sectorsize smaller than page size Date: Wed, 21 Oct 2020 14:25:09 +0800 Message-Id: <20201021062554.68132-24-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org

For subpage size support, we only need to handle the first page.

To make the code work for both cases, we modify the following behaviors:

- num_pages calculation
  Instead of "nodesize >> PAGE_SHIFT", we use "DIV_ROUND_UP(nodesize, PAGE_SIZE)". This ensures we get at least one page for subpage size support, while still getting the same result for regular page size.
- The length for the first run
  Instead of PAGE_SIZE - BTRFS_CSUM_SIZE, we use min(PAGE_SIZE, nodesize) - BTRFS_CSUM_SIZE. This handles both cases correctly.

- The start location of the first run
  Instead of always using BTRFS_CSUM_SIZE as the csum start position, add offset_in_page(eb->start) to get the proper offset for both cases.

Signed-off-by: Goldwyn Rodrigues
Signed-off-by: Qu Wenruo
Reviewed-by: Nikolay Borisov
---
fs/btrfs/disk-io.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 58928076d08d..55bb4f2def3c 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -257,16 +257,16 @@ struct extent_map *btree_get_extent(struct btrfs_inode *inode, static void csum_tree_block(struct extent_buffer *buf, u8 *result) { struct btrfs_fs_info *fs_info = buf->fs_info; - const int num_pages = fs_info->nodesize >> PAGE_SHIFT; + const int num_pages = DIV_ROUND_UP(fs_info->nodesize, PAGE_SIZE); SHASH_DESC_ON_STACK(shash, fs_info->csum_shash); char *kaddr; int i; shash->tfm = fs_info->csum_shash; crypto_shash_init(shash); - kaddr = page_address(buf->pages[0]); + kaddr = page_address(buf->pages[0]) + offset_in_page(buf->start); crypto_shash_update(shash, kaddr + BTRFS_CSUM_SIZE, - PAGE_SIZE - BTRFS_CSUM_SIZE); + min_t(u32, PAGE_SIZE, fs_info->nodesize) - BTRFS_CSUM_SIZE); for (i = 1; i < num_pages; i++) { kaddr = page_address(buf->pages[i]);
From patchwork Wed Oct 21 06:25:10 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848361 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham 
autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8253FC4363A for ; Wed, 21 Oct 2020 06:26:53 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 367AF22249 for ; Wed, 21 Oct 2020 06:26:53 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="jz7KyUos" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440767AbgJUG0w (ORCPT ); Wed, 21 Oct 2020 02:26:52 -0400 Received: from mx2.suse.de ([195.135.220.15]:43214 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1731630AbgJUG0w (ORCPT ); Wed, 21 Oct 2020 02:26:52 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261610; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=cawPiZZ87qrJlrJzuQaVsonurC0ql6oGPeNQMYAhc00=; b=jz7KyUosSAdVVY0vnYpsodIZVqk29wIj4HtXwZtdl52JQMCoKdVesJmyRvdY/LCfNX+vIX t/Opsdja1GDcUxQwe4i/HQ0+CTjtsYDFyfQ9ba1PkXPzTrA1O4SEJ046wSH3oUfMiNGAMr piqJyac3+uOGrfrRtn1PQPwy6wCzMUQ= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 45188AC35 for ; Wed, 21 Oct 2020 06:26:50 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 24/68] btrfs: disk-io: extract the extent buffer verification from btree_readpage_end_io_hook() Date: Wed, 21 Oct 2020 14:25:10 +0800 Message-Id: <20201021062554.68132-25-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: 
linux-btrfs@vger.kernel.org Currently btree_readpage_end_io_hook() only needs to handle one extent buffer as currently one page only maps to one extent buffer. But for incoming subpage support, one page can be mapped to multiple extent buffers, thus we can no longer use current code. This refactor would allow us to call btrfs_check_extent_buffer() on all involved extent buffers at btree_readpage_end_io_hook() and other locations. Signed-off-by: Qu Wenruo --- fs/btrfs/disk-io.c | 78 ++++++++++++++++++++++++++-------------------- 1 file changed, 44 insertions(+), 34 deletions(-) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 55bb4f2def3c..ee2a6d480a7d 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -574,60 +574,37 @@ static int check_tree_block_fsid(struct extent_buffer *eb) return ret; } -static int btree_readpage_end_io_hook(struct btrfs_io_bio *io_bio, - u64 phy_offset, struct page *page, - u64 start, u64 end, int mirror) +/* Do basic extent buffer check at read time */ +static int btrfs_check_extent_buffer(struct extent_buffer *eb) { - u64 found_start; - int found_level; - struct extent_buffer *eb; - struct btrfs_fs_info *fs_info; + struct btrfs_fs_info *fs_info = eb->fs_info; u16 csum_size; - int ret = 0; + u64 found_start; + u8 found_level; u8 result[BTRFS_CSUM_SIZE]; - int reads_done; - - if (!page->private) - goto out; + int ret = 0; - eb = (struct extent_buffer *)page->private; - fs_info = eb->fs_info; csum_size = btrfs_super_csum_size(fs_info->super_copy); - /* the pending IO might have been the only thing that kept this buffer - * in memory. 
Make sure we have a ref for all this other checks - */ - atomic_inc(&eb->refs); - - reads_done = atomic_dec_and_test(&eb->io_pages); - if (!reads_done) - goto err; - - eb->read_mirror = mirror; - if (test_bit(EXTENT_BUFFER_READ_ERR, &eb->bflags)) { - ret = -EIO; - goto err; - } - found_start = btrfs_header_bytenr(eb); if (found_start != eb->start) { btrfs_err_rl(fs_info, "bad tree block start, want %llu have %llu", eb->start, found_start); ret = -EIO; - goto err; + goto out; } if (check_tree_block_fsid(eb)) { btrfs_err_rl(fs_info, "bad fsid on block %llu", eb->start); ret = -EIO; - goto err; + goto out; } found_level = btrfs_header_level(eb); if (found_level >= BTRFS_MAX_LEVEL) { btrfs_err(fs_info, "bad tree block level %d on %llu", (int)btrfs_header_level(eb), eb->start); ret = -EIO; - goto err; + goto out; } btrfs_set_buffer_lockdep_class(btrfs_header_owner(eb), @@ -647,7 +624,7 @@ static int btree_readpage_end_io_hook(struct btrfs_io_bio *io_bio, fs_info->sb->s_id, eb->start, val, found, btrfs_header_level(eb)); ret = -EUCLEAN; - goto err; + goto out; } /* @@ -669,6 +646,40 @@ static int btree_readpage_end_io_hook(struct btrfs_io_bio *io_bio, btrfs_err(fs_info, "block=%llu read time tree block corruption detected", eb->start); +out: + return ret; +} + +static int btree_readpage_end_io_hook(struct btrfs_io_bio *io_bio, + u64 phy_offset, struct page *page, + u64 start, u64 end, int mirror) +{ + struct extent_buffer *eb; + int ret = 0; + bool reads_done; + + /* Metadata pages that goes through IO should all have private set */ + ASSERT(PagePrivate(page) && page->private); + eb = (struct extent_buffer *)page->private; + + /* + * The pending IO might have been the only thing that kept this buffer + * in memory. 
Make sure we have a ref for all this other checks + */ + atomic_inc(&eb->refs); + + reads_done = atomic_dec_and_test(&eb->io_pages); + if (!reads_done) + goto err; + + eb->read_mirror = mirror; + if (test_bit(EXTENT_BUFFER_READ_ERR, &eb->bflags)) { + ret = -EIO; + goto err; + } + + ret = btrfs_check_extent_buffer(eb); + err: if (reads_done && test_and_clear_bit(EXTENT_BUFFER_READAHEAD, &eb->bflags)) @@ -684,7 +695,6 @@ static int btree_readpage_end_io_hook(struct btrfs_io_bio *io_bio, clear_extent_buffer_uptodate(eb); } free_extent_buffer(eb); -out: return ret; } From patchwork Wed Oct 21 06:25:11 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848359 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 57D14C4363A for ; Wed, 21 Oct 2020 06:26:56 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 141C522249 for ; Wed, 21 Oct 2020 06:26:56 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="B0/r6Cgx" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440770AbgJUG0z (ORCPT ); Wed, 21 Oct 2020 02:26:55 -0400 Received: from mx2.suse.de ([195.135.220.15]:43234 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1731630AbgJUG0z (ORCPT ); Wed, 21 Oct 2020 02:26:55 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261614; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=2grTSz56albd0BUDyalScJ+xvMEWdyaflq7+LcK8vwg=; b=B0/r6Cgxcfd6g0/jAlSjCFmGQXR48H6qpsTSlCdsR5dSQF4C3/snnDS6KJEyH56tC+Cckq 3bWkDsKoT5ecWe6e/j8d/EvbvdTCd3bU/IuBuPTsrDM0wIwdbJgqHD2GtmnR6RGJEFE8MW JxPurkquHtv5Ib85Vz5G8eYTfldKThI= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 16AB9AC1D for ; Wed, 21 Oct 2020 06:26:54 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 25/68] btrfs: disk-io: accept bvec directly for csum_dirty_buffer() Date: Wed, 21 Oct 2020 14:25:11 +0800 Message-Id: <20201021062554.68132-26-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Currently csum_dirty_buffer() uses page to grab extent buffer, but that only works for regular sector size == PAGE_SIZE case. For subpage we need page + page_offset to grab extent buffer. This patch will change csum_dirty_buffer() to accept bvec directly so that we can extract both page and page_offset for later subpage support. 
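
As a rough illustration of why the page alone is not enough for subpage, the sketch below computes the logical start of one bio segment from the page's byte offset plus the segment's offset inside that page. The types and helpers (demo_page, demo_bvec, demo_segment_start) are hypothetical stand-ins and not kernel structures; only the "page + page_offset" idea is taken from the description above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a page: only its byte offset in the btree inode. */
struct demo_page {
	uint64_t file_offset;
};

/* Hypothetical stand-in for a bio_vec: a page plus offset/length inside it. */
struct demo_bvec {
	const struct demo_page *page;
	uint32_t offset;
	uint32_t len;
};

/* Logical start of the byte range covered by this segment. */
static uint64_t demo_segment_start(const struct demo_bvec *bvec)
{
	return bvec->page->file_offset + bvec->offset;
}

/* Two 16K tree blocks sharing one 64K page resolve to different starts. */
static void demo_segment_self_check(void)
{
	struct demo_page page = { .file_offset = 65536 };
	struct demo_bvec first = { &page, 0, 16384 };
	struct demo_bvec second = { &page, 16384, 16384 };

	assert(demo_segment_start(&first) == 65536);
	assert(demo_segment_start(&second) == 81920);
}
```

With sectorsize == PAGE_SIZE the in-page offset is always 0, so the page by itself identifies the extent buffer; with several tree blocks per page the offset is what tells them apart.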
Signed-off-by: Qu Wenruo --- fs/btrfs/disk-io.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index ee2a6d480a7d..b34a3f312e0c 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -495,13 +495,14 @@ static int btree_read_extent_buffer_pages(struct extent_buffer *eb, * we only fill in the checksum field in the first page of a multi-page block */ -static int csum_dirty_buffer(struct btrfs_fs_info *fs_info, struct page *page) +static int csum_dirty_buffer(struct btrfs_fs_info *fs_info, struct bio_vec *bvec) { + struct extent_buffer *eb; + struct page *page = bvec->bv_page; u64 start = page_offset(page); u64 found_start; u8 result[BTRFS_CSUM_SIZE]; u16 csum_size = btrfs_super_csum_size(fs_info->super_copy); - struct extent_buffer *eb; int ret; eb = (struct extent_buffer *)page->private; @@ -848,7 +849,7 @@ static blk_status_t btree_csum_one_bio(struct bio *bio) ASSERT(!bio_flagged(bio, BIO_CLONED)); bio_for_each_segment_all(bvec, bio, iter_all) { root = BTRFS_I(bvec->bv_page->mapping->host)->root; - ret = csum_dirty_buffer(root->fs_info, bvec->bv_page); + ret = csum_dirty_buffer(root->fs_info, bvec); if (ret) break; } From patchwork Wed Oct 21 06:25:12 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848357 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 03D07C4363A for ; Wed, 21 Oct 2020 06:27:01 +0000 (UTC) Received: from 
vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A125222249 for ; Wed, 21 Oct 2020 06:27:00 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="htGv9dFy" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440773AbgJUG07 (ORCPT ); Wed, 21 Oct 2020 02:26:59 -0400 Received: from mx2.suse.de ([195.135.220.15]:43568 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1731630AbgJUG07 (ORCPT ); Wed, 21 Oct 2020 02:26:59 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261617; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=x0hTzaWvJOVg+KlsXb8kgInKkCDhXFvNHeDA9RQUfjM=; b=htGv9dFy77g4uNgbW7235LdLa3TS8uoqcnbPJhhwAkzp70BOi8np02WWcJJsoqOzyfx3Kf 9BlKzb9jZIKtaQIlJODlNO587ItGj2frh+tIaMq0i+G6+TY8DEI9DLXJhO/h0Oj3FfE4CY nvsjp5Ig5WM8oKtgQaWpYHSF9Y6SFxY= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 7C538AC1D; Wed, 21 Oct 2020 06:26:57 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Cc: Goldwyn Rodrigues Subject: [PATCH v4 26/68] btrfs: inode: make btrfs_readpage_end_io_hook() follow sector size Date: Wed, 21 Oct 2020 14:25:12 +0800 Message-Id: <20201021062554.68132-27-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Currently btrfs_readpage_end_io_hook() just pass the whole page to check_data_csum(), which is fine since we only support sectorsize == PAGE_SIZE. 
To support subpage, we need to honor per-sector checksum verification, just as we already do in the dio read path.

This patch does the csum verification in a for loop, starting with pg_off == start - page_offset(page) and advancing by sectorsize on each iteration.

For the sectorsize == PAGE_SIZE case, pg_off will always be 0 and the loop finishes after a single iteration. For subpage, the loop visits every sector in the range.

Signed-off-by: Goldwyn Rodrigues
Signed-off-by: Qu Wenruo
---
 fs/btrfs/inode.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 24fbf2c46e56..f22ee5d3c105 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -2849,9 +2849,12 @@ static int btrfs_readpage_end_io_hook(struct btrfs_io_bio *io_bio,
 				      u64 start, u64 end, int mirror)
 {
 	size_t offset = start - page_offset(page);
+	size_t pg_off;
 	struct inode *inode = page->mapping->host;
 	struct extent_io_tree *io_tree = &BTRFS_I(inode)->io_tree;
 	struct btrfs_root *root = BTRFS_I(inode)->root;
+	u32 sectorsize = root->fs_info->sectorsize;
+	bool found_err = false;
 
 	if (PageChecked(page)) {
 		ClearPageChecked(page);
@@ -2868,7 +2871,17 @@ static int btrfs_readpage_end_io_hook(struct btrfs_io_bio *io_bio,
 	}
 
 	phy_offset >>= inode->i_sb->s_blocksize_bits;
-	return check_data_csum(inode, io_bio, phy_offset, page, offset);
+	for (pg_off = offset; pg_off < end - page_offset(page);
+	     pg_off += sectorsize, phy_offset++) {
+		int ret;
+
+		ret = check_data_csum(inode, io_bio, phy_offset, page, pg_off);
+		if (ret < 0)
+			found_err = true;
+	}
+	if (found_err)
+		return -EIO;
+	return 0;
 }
 
 /*

From patchwork Wed Oct 21 06:25:13 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Qu Wenruo
X-Patchwork-Id: 11848365
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-12.8 required=3.0
tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 14F80C4363A for ; Wed, 21 Oct 2020 06:27:03 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id B457022249 for ; Wed, 21 Oct 2020 06:27:02 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="sMC0yKND" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440776AbgJUG1C (ORCPT ); Wed, 21 Oct 2020 02:27:02 -0400 Received: from mx2.suse.de ([195.135.220.15]:43610 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1731630AbgJUG1B (ORCPT ); Wed, 21 Oct 2020 02:27:01 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261620; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=9tM6N148Ox/Nh5KjVgE5urdKTEGEu8tgI2CnSGEo4UU=; b=sMC0yKNDWN2dMyD9LHwHjQbXNGtkkSsLl5b20Sr6spx7I2En/bLgDYoIp6oalrUrrT3N+z sl0YY/Nkf/hsiRb3We8gBYa0YOvRY/Ms7+4wT98u12m1cLm6IGGxJgxC6SHHrxS6h4Bqg5 YbkJe8fQnkiGLe+E0vL9RmS04MWGH1E= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 4F457AC35 for ; Wed, 21 Oct 2020 06:27:00 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 27/68] btrfs: introduce a helper to determine if the sectorsize is smaller than PAGE_SIZE Date: Wed, 21 Oct 2020 14:25:13 +0800 Message-Id: <20201021062554.68132-28-wqu@suse.com> X-Mailer: 
git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Just to save us several letters for the incoming patches. Signed-off-by: Qu Wenruo --- fs/btrfs/ctree.h | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 9a72896bed2e..e3501dad88e2 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -3532,6 +3532,11 @@ static inline int btrfs_defrag_cancelled(struct btrfs_fs_info *fs_info) return signal_pending(current); } +static inline bool btrfs_is_subpage(struct btrfs_fs_info *fs_info) +{ + return (fs_info->sectorsize < PAGE_SIZE); +} + #define in_range(b, first, len) ((b) >= (first) && (b) < (first) + (len)) /* Sanity test specific functions */ From patchwork Wed Oct 21 06:25:14 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848369 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E5BD9C561F8 for ; Wed, 21 Oct 2020 06:27:05 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 8384B22249 for ; Wed, 21 Oct 2020 06:27:05 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="Cd+ENuNo" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440780AbgJUG1E 
(ORCPT ); Wed, 21 Oct 2020 02:27:04 -0400
Received: from mx2.suse.de ([195.135.220.15]:43652 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1731630AbgJUG1E (ORCPT ); Wed, 21 Oct 2020 02:27:04 -0400
X-Virus-Scanned: by amavisd-new at test-mx.suse.de
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261622; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=wFyIRPEHArhl359hA+i5LbeTIDNEcMD4Zy8q8/tiflY=; b=Cd+ENuNojJ15xdOHitf3qigr+rUwSWCj+VJErlf8ULU0i3usP7DckG50KTkzGiKoNgjUEJ lqFwAuQEqkIdP3FrOr0K8PxaHriGRsvGN8UhoXPFOcH4lZeOGIFnElYH2/h0PdqXxYcGxK 1JsFLHDPvyqYyAN9zGIfMxrzlH8RSww=
Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 2BDA7AC1D for ; Wed, 21 Oct 2020 06:27:02 +0000 (UTC)
From: Qu Wenruo
To: linux-btrfs@vger.kernel.org
Subject: [PATCH v4 28/68] btrfs: extent_io: allow find_first_extent_bit() to find a range with exact bits match
Date: Wed, 21 Oct 2020 14:25:14 +0800
Message-Id: <20201021062554.68132-29-wqu@suse.com>
X-Mailer: git-send-email 2.28.0
In-Reply-To: <20201021062554.68132-1-wqu@suse.com>
References: <20201021062554.68132-1-wqu@suse.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org

Currently if we pass multiple @bits to find_first_extent_bit(), it returns the first range with one or more bits matching @bits.

This is fine for current code, since most callers do their own extra checks, and all existing callers only call it with 1 or 2 bits.

But for the incoming subpage support, we want the ability to return a range with an exact match, so that the caller can skip some extra checks.

So this patch adds a new bool parameter, @exact_match, to find_first_extent_bit() and its callees.
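
To make the difference between the two matching modes concrete, here is a minimal sketch on a plain bitmask. The DEMO_BIT_* values and demo_match_bits() are hypothetical names used only for illustration; they are not the kernel's EXTENT_* flags or helpers. Note that "exact" here is assumed to mean "all requested bits are set"; a state carrying additional bits still matches.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit values for illustration; not the kernel's EXTENT_* flags. */
#define DEMO_BIT_DIRTY    (1U << 0)
#define DEMO_BIT_UPTODATE (1U << 1)
#define DEMO_BIT_LOCKED   (1U << 2)

/* Apply the any-bit or all-bits rule to a state bitmask. */
static bool demo_match_bits(unsigned state, unsigned bits, bool exact_match)
{
	if (exact_match)
		return (state & bits) == bits;	/* every requested bit is set */
	return (state & bits) != 0;		/* at least one requested bit is set */
}

static void demo_match_self_check(void)
{
	unsigned want = DEMO_BIT_DIRTY | DEMO_BIT_UPTODATE;

	/* Only DIRTY set: matches in "any" mode, not in "exact" mode. */
	assert(demo_match_bits(DEMO_BIT_DIRTY, want, false));
	assert(!demo_match_bits(DEMO_BIT_DIRTY, want, true));
	/* Both requested bits set, plus an unrelated one: matches in both modes. */
	assert(demo_match_bits(want | DEMO_BIT_LOCKED, want, true));
	/* None of the requested bits set: matches in neither mode. */
	assert(!demo_match_bits(DEMO_BIT_LOCKED, want, false));
}
```

A caller that needs both bits present can then rely on the lookup itself instead of re-checking the returned range.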
Currently all callers just pass 'false' to the new parameter, thus no functional change is introduced. Signed-off-by: Qu Wenruo --- fs/btrfs/block-group.c | 2 +- fs/btrfs/disk-io.c | 4 ++-- fs/btrfs/extent-io-tree.h | 2 +- fs/btrfs/extent-tree.c | 2 +- fs/btrfs/extent_io.c | 42 +++++++++++++++++++++++++------------ fs/btrfs/free-space-cache.c | 2 +- fs/btrfs/relocation.c | 2 +- fs/btrfs/transaction.c | 4 ++-- fs/btrfs/volumes.c | 2 +- 9 files changed, 39 insertions(+), 23 deletions(-) diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c index ea8aaf36647e..7e6ab6b765f6 100644 --- a/fs/btrfs/block-group.c +++ b/fs/btrfs/block-group.c @@ -461,7 +461,7 @@ u64 add_new_free_space(struct btrfs_block_group *block_group, u64 start, u64 end ret = find_first_extent_bit(&info->excluded_extents, start, &extent_start, &extent_end, EXTENT_DIRTY | EXTENT_UPTODATE, - NULL); + false, NULL); if (ret) break; diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index b34a3f312e0c..1ca121ca28aa 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -4516,7 +4516,7 @@ static int btrfs_destroy_marked_extents(struct btrfs_fs_info *fs_info, while (1) { ret = find_first_extent_bit(dirty_pages, start, &start, &end, - mark, NULL); + mark, false, NULL); if (ret) break; @@ -4556,7 +4556,7 @@ static int btrfs_destroy_pinned_extent(struct btrfs_fs_info *fs_info, */ mutex_lock(&fs_info->unused_bg_unpin_mutex); ret = find_first_extent_bit(unpin, 0, &start, &end, - EXTENT_DIRTY, &cached_state); + EXTENT_DIRTY, false, &cached_state); if (ret) { mutex_unlock(&fs_info->unused_bg_unpin_mutex); break; diff --git a/fs/btrfs/extent-io-tree.h b/fs/btrfs/extent-io-tree.h index 2893573eb556..48fdaf5f3a19 100644 --- a/fs/btrfs/extent-io-tree.h +++ b/fs/btrfs/extent-io-tree.h @@ -258,7 +258,7 @@ static inline int set_extent_uptodate(struct extent_io_tree *tree, u64 start, int find_first_extent_bit(struct extent_io_tree *tree, u64 start, u64 *start_ret, u64 *end_ret, unsigned bits, - struct 
extent_state **cached_state); + bool exact_match, struct extent_state **cached_state); void find_first_clear_extent_bit(struct extent_io_tree *tree, u64 start, u64 *start_ret, u64 *end_ret, unsigned bits); int find_contiguous_extent_bit(struct extent_io_tree *tree, u64 start, diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index e9eedc053fc5..406329dabb48 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -2880,7 +2880,7 @@ int btrfs_finish_extent_commit(struct btrfs_trans_handle *trans) mutex_lock(&fs_info->unused_bg_unpin_mutex); ret = find_first_extent_bit(unpin, 0, &start, &end, - EXTENT_DIRTY, &cached_state); + EXTENT_DIRTY, false, &cached_state); if (ret) { mutex_unlock(&fs_info->unused_bg_unpin_mutex); break; diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 98b114becd52..37c721294ffe 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -1559,13 +1559,27 @@ void extent_range_redirty_for_io(struct inode *inode, u64 start, u64 end) } } -/* find the first state struct with 'bits' set after 'start', and - * return it. tree->lock must be held. NULL will returned if - * nothing was found after 'start' +static bool match_extent_state(struct extent_state *state, unsigned bits, + bool exact_match) +{ + if (exact_match) + return ((state->state & bits) == bits); + return (state->state & bits); +} + +/* + * Find the first state struct with @bits set after @start. + * + * NOTE: tree->lock must be hold. + * + * @exact_match: Do we need to have all @bits set, or just any of + * the @bits. + * + * Return NULL if we can't find a match. 
*/ static struct extent_state * find_first_extent_bit_state(struct extent_io_tree *tree, - u64 start, unsigned bits) + u64 start, unsigned bits, bool exact_match) { struct rb_node *node; struct extent_state *state; @@ -1580,7 +1594,8 @@ find_first_extent_bit_state(struct extent_io_tree *tree, while (1) { state = rb_entry(node, struct extent_state, rb_node); - if (state->end >= start && (state->state & bits)) + if (state->end >= start && + match_extent_state(state, bits, exact_match)) return state; node = rb_next(node); @@ -1601,7 +1616,7 @@ find_first_extent_bit_state(struct extent_io_tree *tree, */ int find_first_extent_bit(struct extent_io_tree *tree, u64 start, u64 *start_ret, u64 *end_ret, unsigned bits, - struct extent_state **cached_state) + bool exact_match, struct extent_state **cached_state) { struct extent_state *state; int ret = 1; @@ -1611,7 +1626,8 @@ int find_first_extent_bit(struct extent_io_tree *tree, u64 start, state = *cached_state; if (state->end == start - 1 && extent_state_in_tree(state)) { while ((state = next_state(state)) != NULL) { - if (state->state & bits) + if (match_extent_state(state, bits, + exact_match)) goto got_it; } free_extent_state(*cached_state); @@ -1622,7 +1638,7 @@ int find_first_extent_bit(struct extent_io_tree *tree, u64 start, *cached_state = NULL; } - state = find_first_extent_bit_state(tree, start, bits); + state = find_first_extent_bit_state(tree, start, bits, exact_match); got_it: if (state) { cache_state_if_flags(state, cached_state, 0); @@ -1657,7 +1673,7 @@ int find_contiguous_extent_bit(struct extent_io_tree *tree, u64 start, int ret = 1; spin_lock(&tree->lock); - state = find_first_extent_bit_state(tree, start, bits); + state = find_first_extent_bit_state(tree, start, bits, false); if (state) { *start_ret = state->start; *end_ret = state->end; @@ -2443,9 +2459,8 @@ int clean_io_failure(struct btrfs_fs_info *fs_info, goto out; spin_lock(&io_tree->lock); - state = find_first_extent_bit_state(io_tree, - 
failrec->start, - EXTENT_LOCKED); + state = find_first_extent_bit_state(io_tree, failrec->start, + EXTENT_LOCKED, false); spin_unlock(&io_tree->lock); if (state && state->start <= failrec->start && @@ -2481,7 +2496,8 @@ void btrfs_free_io_failure_record(struct btrfs_inode *inode, u64 start, u64 end) return; spin_lock(&failure_tree->lock); - state = find_first_extent_bit_state(failure_tree, start, EXTENT_DIRTY); + state = find_first_extent_bit_state(failure_tree, start, EXTENT_DIRTY, + false); while (state) { if (state->start > end) break; diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c index dc82fd0c80cb..1533df86536b 100644 --- a/fs/btrfs/free-space-cache.c +++ b/fs/btrfs/free-space-cache.c @@ -1093,7 +1093,7 @@ static noinline_for_stack int write_pinned_extent_entries( while (start < block_group->start + block_group->length) { ret = find_first_extent_bit(unpin, start, &extent_start, &extent_end, - EXTENT_DIRTY, NULL); + EXTENT_DIRTY, false, NULL); if (ret) return 0; diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c index 4ba1ab9cc76d..77a7e35a500c 100644 --- a/fs/btrfs/relocation.c +++ b/fs/btrfs/relocation.c @@ -3153,7 +3153,7 @@ int find_next_extent(struct reloc_control *rc, struct btrfs_path *path, ret = find_first_extent_bit(&rc->processed_blocks, key.objectid, &start, &end, - EXTENT_DIRTY, NULL); + EXTENT_DIRTY, false, NULL); if (ret == 0 && start <= key.objectid) { btrfs_release_path(path); diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c index 20c6ac1a5de7..5b3444641ea5 100644 --- a/fs/btrfs/transaction.c +++ b/fs/btrfs/transaction.c @@ -974,7 +974,7 @@ int btrfs_write_marked_extents(struct btrfs_fs_info *fs_info, atomic_inc(&BTRFS_I(fs_info->btree_inode)->sync_writers); while (!find_first_extent_bit(dirty_pages, start, &start, &end, - mark, &cached_state)) { + mark, false, &cached_state)) { bool wait_writeback = false; err = convert_extent_bit(dirty_pages, start, end, @@ -1029,7 +1029,7 @@ static int 
__btrfs_wait_marked_extents(struct btrfs_fs_info *fs_info, u64 end; while (!find_first_extent_bit(dirty_pages, start, &start, &end, - EXTENT_NEED_WAIT, &cached_state)) { + EXTENT_NEED_WAIT, false, &cached_state)) { /* * Ignore -ENOMEM errors returned by clear_extent_bit(). * When committing the transaction, we'll remove any entries diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 214856c4ccb1..c54329e92ced 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -1382,7 +1382,7 @@ static bool contains_pending_extent(struct btrfs_device *device, u64 *start, if (!find_first_extent_bit(&device->alloc_state, *start, &physical_start, &physical_end, - CHUNK_ALLOCATED, NULL)) { + CHUNK_ALLOCATED, false, NULL)) { if (in_range(physical_start, *start, len) || in_range(*start, physical_start, From patchwork Wed Oct 21 06:25:15 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848363 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A185BC4363A for ; Wed, 21 Oct 2020 06:27:06 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4968822249 for ; Wed, 21 Oct 2020 06:27:06 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="O8D0XuSb" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440782AbgJUG1F (ORCPT ); Wed, 21 Oct 2020 02:27:05 -0400 
Received: from mx2.suse.de ([195.135.220.15]:43704 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2440777AbgJUG1E (ORCPT ); Wed, 21 Oct 2020 02:27:04 -0400
X-Virus-Scanned: by amavisd-new at test-mx.suse.de
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261623; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=xZs56ZrWz1mX7a6TkMnY5loO3liu+lj87R4D6B7Q+uU=; b=O8D0XuSb9p/CvuXgyK1sO5zMHBVmPDhxsbCqpDtrZndfuZTlozfCVZ7GHOsoDlEtuOeI0/ NfiTYkjPy05pN+zla1Ir2PA3Y+dbkN1QtLgA3l2o7QJGMaskaCeq0Eb8j4W3kP+He4as6n o7uSZer3/IX1k2brt6TT+cKlhzeTqTs=
Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id C8758AC35 for ; Wed, 21 Oct 2020 06:27:03 +0000 (UTC)
From: Qu Wenruo
To: linux-btrfs@vger.kernel.org
Subject: [PATCH v4 29/68] btrfs: extent_io: don't allow tree block to cross page boundary for subpage support
Date: Wed, 21 Oct 2020 14:25:15 +0800
Message-Id: <20201021062554.68132-30-wqu@suse.com>
X-Mailer: git-send-email 2.28.0
In-Reply-To: <20201021062554.68132-1-wqu@suse.com>
References: <20201021062554.68132-1-wqu@suse.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org

As a preparation for subpage sector size support (allowing a filesystem with sector size smaller than page size to be mounted), if the sector size is smaller than page size, we don't allow a tree block to be read if it crosses a 64K(*) boundary.

The 64K is selected because:
- We are only going to support 64K page size for subpage for now
- 64K is also the max node size btrfs supports

This ensures that tree blocks are always contained in one page on a system with 64K page size, which greatly simplifies the handling. Otherwise we would need complex multi-page handling for tree blocks.
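
The boundary rule above can be sketched as a comparison between the page holding the first byte of a tree block and the page holding its last byte. This is only an illustration under the assumption of a 64K page size; DEMO_PAGE_SIZE, demo_round_down() and demo_crosses_page() are hypothetical names, not kernel helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE 65536ULL	/* assumed 64K page size */

/* Round value down to a multiple of align (align must be non-zero). */
static uint64_t demo_round_down(uint64_t value, uint64_t align)
{
	return value - (value % align);
}

/* True if [start, start + len) touches more than one page; len must be > 0. */
static bool demo_crosses_page(uint64_t start, uint64_t len)
{
	return demo_round_down(start, DEMO_PAGE_SIZE) !=
	       demo_round_down(start + len - 1, DEMO_PAGE_SIZE);
}

static void demo_crosses_self_check(void)
{
	/* 16K block fully inside the first 64K page. */
	assert(!demo_crosses_page(16384, 16384));
	/* 16K block starting 4K before the 64K boundary. */
	assert(demo_crosses_page(61440, 16384));
	/* 64K block aligned to 64K fills exactly one page. */
	assert(!demo_crosses_page(65536, 65536));
}
```

The second case is the layout that would be rejected: the block starts at 60K and ends at 76K, so its bytes live in two different 64K pages.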
Currently the only way to create such tree blocks crossing 64K boundary is by btrfs-convert, which will get fixed soon and doesn't get wide-spread usage. Signed-off-by: Qu Wenruo --- fs/btrfs/extent_io.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 37c721294ffe..6f41371290e2 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -5298,6 +5298,13 @@ struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info, btrfs_err(fs_info, "bad tree block start %llu", start); return ERR_PTR(-EINVAL); } + if (btrfs_is_subpage(fs_info) && round_down(start, PAGE_SIZE) != + round_down(start + len - 1, PAGE_SIZE)) { + btrfs_err(fs_info, + "tree block crosses page boundary, start %llu nodesize %lu", + start, len); + return ERR_PTR(-EINVAL); + } eb = find_extent_buffer(fs_info, start); if (eb) From patchwork Wed Oct 21 06:25:16 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848377 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B2D5CC561F8 for ; Wed, 21 Oct 2020 06:27:08 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 5826922249 for ; Wed, 21 Oct 2020 06:27:08 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="L8fU3tYP" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id 
S2440786AbgJUG1H (ORCPT ); Wed, 21 Oct 2020 02:27:07 -0400 Received: from mx2.suse.de ([195.135.220.15]:43744 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2440783AbgJUG1G (ORCPT ); Wed, 21 Oct 2020 02:27:06 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261625; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=+4NijXaoGWKrqkulDOUvaZmN5yE7iYY8ZfK84iLI580=; b=L8fU3tYP8AR2zB5w47fxIvU0BXQI4Hq2AG/S64OTcUlzDgJRVSFo/Zg0BjgsDRnp7Z183r 7tLEZVmjyh5/uM5kb6r6/idWxM8Coy4JY5SEUQ16g6Heb9SjSg2wSLpKiRTV6+Kbaz8fCD GAEejq62btjXsuclbvhfCJKkgWLmIrA= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id AB14CAC1D for ; Wed, 21 Oct 2020 06:27:05 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 30/68] btrfs: extent_io: update num_extent_pages() to support subpage sized extent buffer Date: Wed, 21 Oct 2020 14:25:16 +0800 Message-Id: <20201021062554.68132-31-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org For subpage sized extent buffer, we have ensured no extent buffer will cross page boundary, thus we would only need one page for any extent buffer. This patch will update the function num_extent_pages() to handle such case. Now num_extent_pages() would return 1 instead of for subpage sized extent buffer. 
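A userspace sketch of the two calculations, for comparison. The sketch_* names are hypothetical and the page shift is passed as a parameter instead of using the kernel's PAGE_SHIFT; this is not the kernel code itself.

```c
#include <assert.h>
#include <stdint.h>

/* Round x up to a multiple of align; align must be a power of two. */
static uint64_t sketch_round_up(uint64_t x, uint64_t align)
{
	return (x + align - 1) & ~(align - 1);
}

/* Old calculation: count every page touched by [start, start + len). */
static unsigned int sketch_pages_old(uint64_t start, uint64_t len,
				     unsigned int page_shift)
{
	uint64_t page_size = UINT64_C(1) << page_shift;

	return (unsigned int)((sketch_round_up(start + len, page_size) >>
			       page_shift) - (start >> page_shift));
}

/*
 * New calculation: depends on the length only. It is valid only while
 * no extent buffer smaller than a page crosses a page boundary.
 */
static unsigned int sketch_pages_new(uint64_t len, unsigned int page_shift)
{
	uint64_t page_size = UINT64_C(1) << page_shift;

	assert(len > 0);
	return (unsigned int)(sketch_round_up(len, page_size) >> page_shift);
}
```

Both agree for a 16K buffer on 4K pages (4 pages) and for a 16K buffer inside one 64K page (1 page). They differ only for a buffer that straddles a page boundary, which the previous patch rejects.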
Signed-off-by: Qu Wenruo --- fs/btrfs/extent_io.h | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h index e588b3100ede..552afc1c0bbc 100644 --- a/fs/btrfs/extent_io.h +++ b/fs/btrfs/extent_io.h @@ -229,8 +229,15 @@ void wait_on_extent_buffer_writeback(struct extent_buffer *eb); static inline int num_extent_pages(const struct extent_buffer *eb) { - return (round_up(eb->start + eb->len, PAGE_SIZE) >> PAGE_SHIFT) - - (eb->start >> PAGE_SHIFT); + /* + * For sectorsize == PAGE_SIZE case, since eb is always aligned to + * sectorsize, it's just eb->len >> PAGE_SHIFT. + * + * For sectorsize < PAGE_SIZE case, we only want to support 64K + * PAGE_SIZE, and have ensured all tree blocks won't cross page boundary. + * So in that case we always get 1 page. + */ + return (round_up(eb->len, PAGE_SIZE) >> PAGE_SHIFT); } static inline int extent_buffer_uptodate(const struct extent_buffer *eb) From patchwork Wed Oct 21 06:25:17 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848367 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 18054C4363A for ; Wed, 21 Oct 2020 06:27:12 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9CF7B22249 for ; Wed, 21 Oct 2020 06:27:11 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com
header.b="f7Y5d7vV" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440792AbgJUG1L (ORCPT ); Wed, 21 Oct 2020 02:27:11 -0400 Received: from mx2.suse.de ([195.135.220.15]:43764 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2440783AbgJUG1K (ORCPT ); Wed, 21 Oct 2020 02:27:10 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261628; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=WIM/Vg1zXhj5rgqwOiKXLmFhyArsYyHo7pCpYHFCu/g=; b=f7Y5d7vVvGOLy/4R7JB17bZNE0qwK1tMPzG+YVnOl1HP+KBjAIRrFYiqCw87w50PTe+i8l xJc/6J8LU0vjbZ223QkwW/WRlLZsI9/ngLWcfRz5uSYWztlFotZM+NgkJ8uRETWhFI1Mia FeVSD784EvlVkgNFYXBIiFzIBaQaHwY= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id DCAB1AC1D; Wed, 21 Oct 2020 06:27:07 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Cc: Goldwyn Rodrigues Subject: [PATCH v4 31/68] btrfs: handle sectorsize < PAGE_SIZE case for extent buffer accessors Date: Wed, 21 Oct 2020 14:25:17 +0800 Message-Id: <20201021062554.68132-32-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org To support sectorsize < PAGE_SIZE case, we need to take extra care for extent buffer accessors. Since sectorsize is smaller than PAGE_SIZE, one page can contain multiple tree blocks, we must use eb->start to determine the real offset to read/write for extent buffer accessors. This patch introduces two helpers to do these: - get_eb_page_index() This is to calculate the index to access extent_buffer::pages. 
It's just a simple wrapper around "start >> PAGE_SHIFT". For sectorsize == PAGE_SIZE case, nothing is changed. For sectorsize < PAGE_SIZE case, we always get index as 0, and the existing page shift works also fine. - get_eb_page_offset() This is to calculate the offset to access extent_buffer::pages. This needs to take extent_buffer::start into consideration. For sectorsize == PAGE_SIZE case, extent_buffer::start is always aligned to PAGE_SIZE, thus adding extent_buffer::start to offset_in_page() won't change the result. For sectorsize < PAGE_SIZE case, adding extent_buffer::start gives us the correct offset to access. This patch will touch the following parts to cover all extent buffer accessors: - BTRFS_SETGET_HEADER_FUNCS() - read_extent_buffer() - read_extent_buffer_to_user() - memcmp_extent_buffer() - write_extent_buffer_chunk_tree_uuid() - write_extent_buffer_fsid() - write_extent_buffer() - memzero_extent_buffer() - copy_extent_buffer_full() - copy_extent_buffer() - memcpy_extent_buffer() - memmove_extent_buffer() - btrfs_get_token_##bits() - btrfs_get_##bits() - btrfs_set_token_##bits() - btrfs_set_##bits() - generic_bin_search() Signed-off-by: Goldwyn Rodrigues Signed-off-by: Qu Wenruo --- fs/btrfs/ctree.c | 5 ++-- fs/btrfs/ctree.h | 38 ++++++++++++++++++++++-- fs/btrfs/extent_io.c | 66 ++++++++++++++++++++++++----------------- fs/btrfs/struct-funcs.c | 18 ++++++----- 4 files changed, 88 insertions(+), 39 deletions(-) diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c index cd392da69b81..0f6944a3a836 100644 --- a/fs/btrfs/ctree.c +++ b/fs/btrfs/ctree.c @@ -1712,10 +1712,11 @@ static noinline int generic_bin_search(struct extent_buffer *eb, oip = offset_in_page(offset); if (oip + key_size <= PAGE_SIZE) { - const unsigned long idx = offset >> PAGE_SHIFT; + const unsigned long idx = get_eb_page_index(offset); char *kaddr = page_address(eb->pages[idx]); - tmp = (struct btrfs_disk_key *)(kaddr + oip); + tmp = (struct btrfs_disk_key *)(kaddr + + 
get_eb_page_offset(eb, offset)); } else { read_extent_buffer(eb, &unaligned, offset, key_size); tmp = &unaligned; diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index e3501dad88e2..0c3ea3599dc7 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -1448,14 +1448,15 @@ static inline void btrfs_set_token_##name(struct btrfs_map_token *token,\ #define BTRFS_SETGET_HEADER_FUNCS(name, type, member, bits) \ static inline u##bits btrfs_##name(const struct extent_buffer *eb) \ { \ - const type *p = page_address(eb->pages[0]); \ + const type *p = page_address(eb->pages[0]) + \ + offset_in_page(eb->start); \ u##bits res = le##bits##_to_cpu(p->member); \ return res; \ } \ static inline void btrfs_set_##name(const struct extent_buffer *eb, \ u##bits val) \ { \ - type *p = page_address(eb->pages[0]); \ + type *p = page_address(eb->pages[0]) + offset_in_page(eb->start); \ p->member = cpu_to_le##bits(val); \ } @@ -3241,6 +3242,39 @@ static inline void assertfail(const char *expr, const char* file, int line) { } #define ASSERT(expr) (void)(expr) #endif +/* + * Get the correct offset inside the page of extent buffer. + * + * Will handle both sectorsize == PAGE_SIZE and sectorsize < PAGE_SIZE cases. + * + * @eb: The target extent buffer + * @offset_in_eb: The offset inside the extent buffer + */ +static inline size_t get_eb_page_offset(const struct extent_buffer *eb, + unsigned long offset_in_eb) +{ + /* + * For sectorsize == PAGE_SIZE case, eb->start will always be aligned + * to PAGE_SIZE, thus adding it won't cause any difference. + * + * For sectorsize < PAGE_SIZE, we must only read the data that belongs to + * the eb, thus we have to take the eb->start into consideration. + */ + return offset_in_page(offset_in_eb + eb->start); +} + +static inline unsigned long get_eb_page_index(unsigned long offset_in_eb) +{ + /* + * For sectorsize == PAGE_SIZE case, plain >> PAGE_SHIFT is enough.
+ * + * For sectorsize < PAGE_SIZE case, we only support 64K PAGE_SIZE, + * and has ensured all tree blocks are contained in one page, thus + * we always get index == 0. + */ + return offset_in_eb >> PAGE_SHIFT; +} + /* * Use that for functions that are conditionally exported for sanity tests but * otherwise static diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 6f41371290e2..ea248e2689c9 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -5703,7 +5703,7 @@ void read_extent_buffer(const struct extent_buffer *eb, void *dstv, struct page *page; char *kaddr; char *dst = (char *)dstv; - unsigned long i = start >> PAGE_SHIFT; + unsigned long i = get_eb_page_index(start); if (start + len > eb->len) { WARN(1, KERN_ERR "btrfs bad mapping eb start %llu len %lu, wanted %lu %lu\n", @@ -5712,7 +5712,7 @@ void read_extent_buffer(const struct extent_buffer *eb, void *dstv, return; } - offset = offset_in_page(start); + offset = get_eb_page_offset(eb, start); while (len > 0) { page = eb->pages[i]; @@ -5737,13 +5737,13 @@ int read_extent_buffer_to_user_nofault(const struct extent_buffer *eb, struct page *page; char *kaddr; char __user *dst = (char __user *)dstv; - unsigned long i = start >> PAGE_SHIFT; + unsigned long i = get_eb_page_index(start); int ret = 0; WARN_ON(start > eb->len); WARN_ON(start + len > eb->start + eb->len); - offset = offset_in_page(start); + offset = get_eb_page_offset(eb, start); while (len > 0) { page = eb->pages[i]; @@ -5772,13 +5772,13 @@ int memcmp_extent_buffer(const struct extent_buffer *eb, const void *ptrv, struct page *page; char *kaddr; char *ptr = (char *)ptrv; - unsigned long i = start >> PAGE_SHIFT; + unsigned long i = get_eb_page_index(start); int ret = 0; WARN_ON(start > eb->len); WARN_ON(start + len > eb->start + eb->len); - offset = offset_in_page(start); + offset = get_eb_page_offset(eb, start); while (len > 0) { page = eb->pages[i]; @@ -5804,7 +5804,7 @@ void write_extent_buffer_chunk_tree_uuid(const struct 
extent_buffer *eb, char *kaddr; WARN_ON(!PageUptodate(eb->pages[0])); - kaddr = page_address(eb->pages[0]); + kaddr = page_address(eb->pages[0]) + get_eb_page_offset(eb, 0); memcpy(kaddr + offsetof(struct btrfs_header, chunk_tree_uuid), srcv, BTRFS_FSID_SIZE); } @@ -5814,7 +5814,7 @@ void write_extent_buffer_fsid(const struct extent_buffer *eb, const void *srcv) char *kaddr; WARN_ON(!PageUptodate(eb->pages[0])); - kaddr = page_address(eb->pages[0]); + kaddr = page_address(eb->pages[0]) + get_eb_page_offset(eb, 0); memcpy(kaddr + offsetof(struct btrfs_header, fsid), srcv, BTRFS_FSID_SIZE); } @@ -5827,12 +5827,12 @@ void write_extent_buffer(const struct extent_buffer *eb, const void *srcv, struct page *page; char *kaddr; char *src = (char *)srcv; - unsigned long i = start >> PAGE_SHIFT; + unsigned long i = get_eb_page_index(start); WARN_ON(start > eb->len); WARN_ON(start + len > eb->start + eb->len); - offset = offset_in_page(start); + offset = get_eb_page_offset(eb, start); while (len > 0) { page = eb->pages[i]; @@ -5856,12 +5856,12 @@ void memzero_extent_buffer(const struct extent_buffer *eb, unsigned long start, size_t offset; struct page *page; char *kaddr; - unsigned long i = start >> PAGE_SHIFT; + unsigned long i = get_eb_page_index(start); WARN_ON(start > eb->len); WARN_ON(start + len > eb->start + eb->len); - offset = offset_in_page(start); + offset = get_eb_page_offset(eb, start); while (len > 0) { page = eb->pages[i]; @@ -5885,10 +5885,22 @@ void copy_extent_buffer_full(const struct extent_buffer *dst, ASSERT(dst->len == src->len); - num_pages = num_extent_pages(dst); - for (i = 0; i < num_pages; i++) - copy_page(page_address(dst->pages[i]), - page_address(src->pages[i])); + if (dst->fs_info->sectorsize == PAGE_SIZE) { + num_pages = num_extent_pages(dst); + for (i = 0; i < num_pages; i++) + copy_page(page_address(dst->pages[i]), + page_address(src->pages[i])); + } else { + unsigned long src_index = get_eb_page_index(0); + unsigned long dst_index = 
get_eb_page_index(0); + size_t src_offset = get_eb_page_offset(src, 0); + size_t dst_offset = get_eb_page_offset(dst, 0); + + ASSERT(src_index == 0 && dst_index == 0); + memcpy(page_address(dst->pages[dst_index]) + dst_offset, + page_address(src->pages[src_index]) + src_offset, + src->len); + } } void copy_extent_buffer(const struct extent_buffer *dst, @@ -5901,11 +5913,11 @@ void copy_extent_buffer(const struct extent_buffer *dst, size_t offset; struct page *page; char *kaddr; - unsigned long i = dst_offset >> PAGE_SHIFT; + unsigned long i = get_eb_page_index(dst_offset); WARN_ON(src->len != dst_len); - offset = offset_in_page(dst_offset); + offset = get_eb_page_offset(dst, dst_offset); while (len > 0) { page = dst->pages[i]; @@ -5949,7 +5961,7 @@ static inline void eb_bitmap_offset(const struct extent_buffer *eb, * the bitmap item in the extent buffer + the offset of the byte in the * bitmap item. */ - offset = start + byte_offset; + offset = start + offset_in_page(eb->start) + byte_offset; *page_index = offset >> PAGE_SHIFT; *page_offset = offset_in_page(offset); @@ -6113,11 +6125,11 @@ void memcpy_extent_buffer(const struct extent_buffer *dst, } while (len > 0) { - dst_off_in_page = offset_in_page(dst_offset); - src_off_in_page = offset_in_page(src_offset); + dst_off_in_page = get_eb_page_offset(dst, dst_offset); + src_off_in_page = get_eb_page_offset(dst, src_offset); - dst_i = dst_offset >> PAGE_SHIFT; - src_i = src_offset >> PAGE_SHIFT; + dst_i = get_eb_page_index(dst_offset); + src_i = get_eb_page_index(src_offset); cur = min(len, (unsigned long)(PAGE_SIZE - src_off_in_page)); @@ -6163,11 +6175,11 @@ void memmove_extent_buffer(const struct extent_buffer *dst, return; } while (len > 0) { - dst_i = dst_end >> PAGE_SHIFT; - src_i = src_end >> PAGE_SHIFT; + dst_i = get_eb_page_index(dst_end); + src_i = get_eb_page_index(src_end); - dst_off_in_page = offset_in_page(dst_end); - src_off_in_page = offset_in_page(src_end); + dst_off_in_page = get_eb_page_offset(dst, 
dst_end); + src_off_in_page = get_eb_page_offset(dst, src_end); cur = min_t(unsigned long, len, src_off_in_page + 1); cur = min(cur, dst_off_in_page + 1); diff --git a/fs/btrfs/struct-funcs.c b/fs/btrfs/struct-funcs.c index 079b059818e9..769901c2b3c9 100644 --- a/fs/btrfs/struct-funcs.c +++ b/fs/btrfs/struct-funcs.c @@ -67,8 +67,9 @@ u##bits btrfs_get_token_##bits(struct btrfs_map_token *token, \ const void *ptr, unsigned long off) \ { \ const unsigned long member_offset = (unsigned long)ptr + off; \ - const unsigned long idx = member_offset >> PAGE_SHIFT; \ - const unsigned long oip = offset_in_page(member_offset); \ + const unsigned long idx = get_eb_page_index(member_offset); \ + const unsigned long oip = get_eb_page_offset(token->eb, \ + member_offset); \ const int size = sizeof(u##bits); \ u8 lebytes[sizeof(u##bits)]; \ const int part = PAGE_SIZE - oip; \ @@ -95,8 +96,8 @@ u##bits btrfs_get_##bits(const struct extent_buffer *eb, \ const void *ptr, unsigned long off) \ { \ const unsigned long member_offset = (unsigned long)ptr + off; \ - const unsigned long oip = offset_in_page(member_offset); \ - const unsigned long idx = member_offset >> PAGE_SHIFT; \ + const unsigned long oip = get_eb_page_offset(eb, member_offset);\ + const unsigned long idx = get_eb_page_index(member_offset); \ char *kaddr = page_address(eb->pages[idx]); \ const int size = sizeof(u##bits); \ const int part = PAGE_SIZE - oip; \ @@ -116,8 +117,9 @@ void btrfs_set_token_##bits(struct btrfs_map_token *token, \ u##bits val) \ { \ const unsigned long member_offset = (unsigned long)ptr + off; \ - const unsigned long idx = member_offset >> PAGE_SHIFT; \ - const unsigned long oip = offset_in_page(member_offset); \ + const unsigned long idx = get_eb_page_index(member_offset); \ + const unsigned long oip = get_eb_page_offset(token->eb, \ + member_offset); \ const int size = sizeof(u##bits); \ u8 lebytes[sizeof(u##bits)]; \ const int part = PAGE_SIZE - oip; \ @@ -146,8 +148,8 @@ void 
btrfs_set_##bits(const struct extent_buffer *eb, void *ptr, \ unsigned long off, u##bits val) \ { \ const unsigned long member_offset = (unsigned long)ptr + off; \ - const unsigned long oip = offset_in_page(member_offset); \ - const unsigned long idx = member_offset >> PAGE_SHIFT; \ + const unsigned long oip = get_eb_page_offset(eb, member_offset);\ + const unsigned long idx = get_eb_page_index(member_offset); \ char *kaddr = page_address(eb->pages[idx]); \ const int size = sizeof(u##bits); \ const int part = PAGE_SIZE - oip; \ From patchwork Wed Oct 21 06:25:18 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848371 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D6FF1C561F8 for ; Wed, 21 Oct 2020 06:27:12 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 7CEE322249 for ; Wed, 21 Oct 2020 06:27:12 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="iOHHw0m2" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440790AbgJUG1K (ORCPT ); Wed, 21 Oct 2020 02:27:10 -0400 Received: from mx2.suse.de ([195.135.220.15]:43792 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2440787AbgJUG1K (ORCPT ); Wed, 21 Oct 2020 02:27:10 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261629; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=nATJwzB7HNMv3p8xpmRB7SDjAV7YkGig36/MnT0PnCc=; b=iOHHw0m2wiHaReFK+UOdRPq4lZD1vXcfptpo5IYPmmfg6cYPq8e0Ko6p8OGLcMc0ViEUaX X5ZQUXq2J0PRsIZiRcTH/tWwTrc5p1DXjcVvct4RCRwTrDtiT0s6KjcQVPLNkPmdMypmIk pL+PUCwyTkMILlaH2933GqU8IYIRBoc= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 8D118AC12 for ; Wed, 21 Oct 2020 06:27:09 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 32/68] btrfs: disk-io: only clear EXTENT_LOCK bit for extent_invalidatepage() Date: Wed, 21 Oct 2020 14:25:18 +0800 Message-Id: <20201021062554.68132-33-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org extent_invalidatepage() calls clear_extent_bit() with delete == 1, which tries to clear all existing bits. This is currently fine, since the btree io tree only utilizes the EXTENT_LOCKED bit. But this could be a problem for later subpage support, which will utilize extra io tree bits to represent extra info. This patch just converts that clear_extent_bit() call to unlock_extent_cached(). As only the EXTENT_LOCKED bit is utilized for the btree io tree, this doesn't change the behavior, but provides a much cleaner basis for incoming subpage support.
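The difference between the two approaches can be modelled on a plain bitmask. This is a rough userspace sketch, not the io tree code: the sketch_* names and bit values are hypothetical, and SKETCH_EXTRA merely stands in for whatever extra bit subpage support may add later.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit values, for illustration only. */
#define SKETCH_LOCKED (UINT32_C(1) << 0)
#define SKETCH_EXTRA  (UINT32_C(1) << 1) /* stands in for a future bit */

/* Models clearing with "delete" set: every bit in the state goes away. */
static uint32_t sketch_clear_all(uint32_t state)
{
	(void)state;
	return 0;
}

/* Models a plain unlock: only the lock bit is cleared. */
static uint32_t sketch_unlock(uint32_t state)
{
	assert(state & SKETCH_LOCKED);
	return state & ~SKETCH_LOCKED;
}
```

As long as only the lock bit is ever set, both helpers give the same result; once a second bit exists, only the unlock variant preserves it.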
Signed-off-by: Qu Wenruo --- fs/btrfs/disk-io.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 1ca121ca28aa..10bdb0a8a92f 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -996,8 +996,13 @@ static void extent_invalidatepage(struct extent_io_tree *tree, lock_extent_bits(tree, start, end, &cached_state); wait_on_page_writeback(page); - clear_extent_bit(tree, start, end, EXTENT_LOCKED | EXTENT_DELALLOC | - EXTENT_DO_ACCOUNTING, 1, 1, &cached_state); + + /* + * Currently for btree io tree, only EXTENT_LOCKED is utilized, + * so here we only need to unlock the extent range to free any + * existing extent state. + */ + unlock_extent_cached(tree, start, end, &cached_state); } static void btree_invalidatepage(struct page *page, unsigned int offset, From patchwork Wed Oct 21 06:25:19 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848375 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B0529C4363A for ; Wed, 21 Oct 2020 06:27:15 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 52FE522249 for ; Wed, 21 Oct 2020 06:27:15 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="LUSZIqGg" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440796AbgJUG1O (ORCPT ); Wed, 21 Oct 
2020 02:27:14 -0400 Received: from mx2.suse.de ([195.135.220.15]:43824 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2440783AbgJUG1O (ORCPT ); Wed, 21 Oct 2020 02:27:14 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261631; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Kpk4D88XxJBekl1asCbC2SL1fHRN1rz1ZqlTtNCu+Ko=; b=LUSZIqGgO2iWc+Sw2pmg35tr96BD0+cIWX8W94Pu3HEZZYY4HIF2R1N/iPIJp4t1DqUCZX UKHFmyfp1g4T0rEkMEuHWiZEWp6IgMRrl5czd0cyCjY0l1go+/s/Yqi6z/IzendsQUrHdu gZ4PyfS/qQJoS8a9/NARr4bxme27f2I= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 6A731AC12 for ; Wed, 21 Oct 2020 06:27:11 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 33/68] btrfs: extent-io: make type of extent_state::state to be at least 32 bits Date: Wed, 21 Oct 2020 14:25:19 +0800 Message-Id: <20201021062554.68132-34-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Currently we use 'unsigned' for extent_state::state, which is only guaranteed to be at least 16 bits wide. But for incoming subpage support, we are going to introduce more bits to at least match the following page bits: - PageUptodate - PagePrivate2 Thus we will go beyond 16 bits. To support this, make extent_state::state at least 32 bits wide; to be explicit about the maximum number of supported bits, use "u32". This doesn't increase the memory usage for x86_64, but may affect other architectures.
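To illustrate why the width matters, here is a small userspace sketch. The sketch_* names and SKETCH_BIT_16 are hypothetical, and uint32_t stands in for the kernel's u32; it only shows what happens to a bit above position 15 when the state is stored in 16 bits.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_BIT_15 (UINT32_C(1) << 15) /* highest bit that fits in 16 bits */
#define SKETCH_BIT_16 (UINT32_C(1) << 16) /* hypothetical bit beyond 16 bits */

/* Set the given bits in a 32-bit state word. */
static uint32_t sketch_set_bits(uint32_t state, uint32_t bits)
{
	return state | bits;
}

/* Return non-zero if all of the given bits are set in state. */
static int sketch_test_bits(uint32_t state, uint32_t bits)
{
	assert(bits != 0);
	return (state & bits) == bits;
}

/* Model storing the state in a type that is only 16 bits wide. */
static uint32_t sketch_store_16(uint32_t state)
{
	return (uint16_t)state;
}
```

A bit at position 16 survives in the 32-bit word but is silently dropped by the 16-bit store, which is the situation a fixed-width type rules out.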
Signed-off-by: Qu Wenruo --- fs/btrfs/extent-io-tree.h | 37 ++++++++++++++++------------- fs/btrfs/extent_io.c | 50 +++++++++++++++++++-------------------- fs/btrfs/extent_io.h | 2 +- 3 files changed, 45 insertions(+), 44 deletions(-) diff --git a/fs/btrfs/extent-io-tree.h b/fs/btrfs/extent-io-tree.h index 48fdaf5f3a19..176e0e8e1f7c 100644 --- a/fs/btrfs/extent-io-tree.h +++ b/fs/btrfs/extent-io-tree.h @@ -22,6 +22,10 @@ struct io_failure_record; #define EXTENT_QGROUP_RESERVED (1U << 12) #define EXTENT_CLEAR_DATA_RESV (1U << 13) #define EXTENT_DELALLOC_NEW (1U << 14) + +/* For subpage btree io tree, to indicate there is an extent buffer */ +#define EXTENT_HAS_TREE_BLOCK (1U << 15) + #define EXTENT_DO_ACCOUNTING (EXTENT_CLEAR_META_RESV | \ EXTENT_CLEAR_DATA_RESV) #define EXTENT_CTLBITS (EXTENT_DO_ACCOUNTING) @@ -73,7 +77,7 @@ struct extent_state { /* ADD NEW ELEMENTS AFTER THIS */ wait_queue_head_t wq; refcount_t refs; - unsigned state; + u32 state; struct io_failure_record *failrec; @@ -136,19 +140,19 @@ void __cold extent_io_exit(void); u64 count_range_bits(struct extent_io_tree *tree, u64 *start, u64 search_end, - u64 max_bytes, unsigned bits, int contig); + u64 max_bytes, u32 bits, int contig); void free_extent_state(struct extent_state *state); int test_range_bit(struct extent_io_tree *tree, u64 start, u64 end, - unsigned bits, int filled, + u32 bits, int filled, struct extent_state *cached_state); int clear_record_extent_bits(struct extent_io_tree *tree, u64 start, u64 end, - unsigned bits, struct extent_changeset *changeset); + u32 bits, struct extent_changeset *changeset); int clear_extent_bit(struct extent_io_tree *tree, u64 start, u64 end, - unsigned bits, int wake, int delete, + u32 bits, int wake, int delete, struct extent_state **cached); int __clear_extent_bit(struct extent_io_tree *tree, u64 start, u64 end, - unsigned bits, struct extent_state **cached_state, + u32 bits, struct extent_state **cached_state, gfp_t mask, struct extent_io_extra_options 
*extra_opts); static inline int unlock_extent(struct extent_io_tree *tree, u64 start, u64 end) @@ -177,7 +181,7 @@ static inline int unlock_extent_cached_atomic(struct extent_io_tree *tree, } static inline int clear_extent_bits(struct extent_io_tree *tree, u64 start, - u64 end, unsigned bits) + u64 end, u32 bits) { int wake = 0; @@ -188,15 +192,14 @@ static inline int clear_extent_bits(struct extent_io_tree *tree, u64 start, } int set_record_extent_bits(struct extent_io_tree *tree, u64 start, u64 end, - unsigned bits, struct extent_changeset *changeset); + u32 bits, struct extent_changeset *changeset); int set_extent_bit(struct extent_io_tree *tree, u64 start, u64 end, - unsigned bits, struct extent_state **cached_state, - gfp_t mask); + u32 bits, struct extent_state **cached_state, gfp_t mask); int set_extent_bits_nowait(struct extent_io_tree *tree, u64 start, u64 end, - unsigned bits); + u32 bits); static inline int set_extent_bits(struct extent_io_tree *tree, u64 start, - u64 end, unsigned bits) + u64 end, u32 bits) { return set_extent_bit(tree, start, end, bits, NULL, GFP_NOFS); } @@ -223,11 +226,11 @@ static inline int clear_extent_dirty(struct extent_io_tree *tree, u64 start, } int convert_extent_bit(struct extent_io_tree *tree, u64 start, u64 end, - unsigned bits, unsigned clear_bits, + u32 bits, u32 clear_bits, struct extent_state **cached_state); static inline int set_extent_delalloc(struct extent_io_tree *tree, u64 start, - u64 end, unsigned int extra_bits, + u64 end, u32 extra_bits, struct extent_state **cached_state) { return set_extent_bit(tree, start, end, @@ -257,12 +260,12 @@ static inline int set_extent_uptodate(struct extent_io_tree *tree, u64 start, } int find_first_extent_bit(struct extent_io_tree *tree, u64 start, - u64 *start_ret, u64 *end_ret, unsigned bits, + u64 *start_ret, u64 *end_ret, u32 bits, bool exact_match, struct extent_state **cached_state); void find_first_clear_extent_bit(struct extent_io_tree *tree, u64 start, - u64 *start_ret, 
u64 *end_ret, unsigned bits); + u64 *start_ret, u64 *end_ret, u32 bits); int find_contiguous_extent_bit(struct extent_io_tree *tree, u64 start, - u64 *start_ret, u64 *end_ret, unsigned bits); + u64 *start_ret, u64 *end_ret, u32 bits); bool btrfs_find_delalloc_range(struct extent_io_tree *tree, u64 *start, u64 *end, u64 max_bytes, struct extent_state **cached_state); diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index ea248e2689c9..a7e4d3c65162 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -143,7 +143,7 @@ struct extent_page_data { unsigned int sync_io:1; }; -static int add_extent_changeset(struct extent_state *state, unsigned bits, +static int add_extent_changeset(struct extent_state *state, u32 bits, struct extent_changeset *changeset, int set) { @@ -531,7 +531,7 @@ static void merge_state(struct extent_io_tree *tree, } static void set_state_bits(struct extent_io_tree *tree, - struct extent_state *state, unsigned *bits, + struct extent_state *state, u32 *bits, struct extent_changeset *changeset); /* @@ -548,7 +548,7 @@ static int insert_state(struct extent_io_tree *tree, struct extent_state *state, u64 start, u64 end, struct rb_node ***p, struct rb_node **parent, - unsigned *bits, struct extent_changeset *changeset) + u32 *bits, struct extent_changeset *changeset) { struct rb_node *node; @@ -629,11 +629,11 @@ static struct extent_state *next_state(struct extent_state *state) */ static struct extent_state *clear_state_bit(struct extent_io_tree *tree, struct extent_state *state, - unsigned *bits, int wake, + u32 *bits, int wake, struct extent_changeset *changeset) { struct extent_state *next; - unsigned bits_to_clear = *bits & ~EXTENT_CTLBITS; + u32 bits_to_clear = *bits & ~EXTENT_CTLBITS; int ret; if ((bits_to_clear & EXTENT_DIRTY) && (state->state & EXTENT_DIRTY)) { @@ -700,7 +700,7 @@ static void extent_io_tree_panic(struct extent_io_tree *tree, int err) * No error can be returned yet, the ENOMEM for memory is handled by BUG_ON(). 
*/ int __clear_extent_bit(struct extent_io_tree *tree, u64 start, u64 end, - unsigned bits, struct extent_state **cached_state, + u32 bits, struct extent_state **cached_state, gfp_t mask, struct extent_io_extra_options *extra_opts) { struct extent_changeset *changeset; @@ -881,7 +881,7 @@ static void wait_on_state(struct extent_io_tree *tree, * The tree lock is taken by this function */ static void wait_extent_bit(struct extent_io_tree *tree, u64 start, u64 end, - unsigned long bits) + u32 bits) { struct extent_state *state; struct rb_node *node; @@ -928,9 +928,9 @@ static void wait_extent_bit(struct extent_io_tree *tree, u64 start, u64 end, static void set_state_bits(struct extent_io_tree *tree, struct extent_state *state, - unsigned *bits, struct extent_changeset *changeset) + u32 *bits, struct extent_changeset *changeset) { - unsigned bits_to_set = *bits & ~EXTENT_CTLBITS; + u32 bits_to_set = *bits & ~EXTENT_CTLBITS; int ret; if (tree->private_data && is_data_inode(tree->private_data)) @@ -977,7 +977,7 @@ static void cache_state(struct extent_state *state, static int __must_check __set_extent_bit(struct extent_io_tree *tree, u64 start, u64 end, - unsigned bits, struct extent_state **cached_state, + u32 bits, struct extent_state **cached_state, gfp_t mask, struct extent_io_extra_options *extra_opts) { struct extent_state *state; @@ -1201,8 +1201,7 @@ __set_extent_bit(struct extent_io_tree *tree, u64 start, u64 end, } int set_extent_bit(struct extent_io_tree *tree, u64 start, u64 end, - unsigned bits, struct extent_state **cached_state, - gfp_t mask) + u32 bits, struct extent_state **cached_state, gfp_t mask) { return __set_extent_bit(tree, start, end, bits, cached_state, mask, NULL); @@ -1228,7 +1227,7 @@ int set_extent_bit(struct extent_io_tree *tree, u64 start, u64 end, * All allocations are done with GFP_NOFS. 
*/ int convert_extent_bit(struct extent_io_tree *tree, u64 start, u64 end, - unsigned bits, unsigned clear_bits, + u32 bits, u32 clear_bits, struct extent_state **cached_state) { struct extent_state *state; @@ -1429,7 +1428,7 @@ int convert_extent_bit(struct extent_io_tree *tree, u64 start, u64 end, /* wrappers around set/clear extent bit */ int set_record_extent_bits(struct extent_io_tree *tree, u64 start, u64 end, - unsigned bits, struct extent_changeset *changeset) + u32 bits, struct extent_changeset *changeset) { struct extent_io_extra_options extra_opts = { .changeset = changeset, @@ -1448,13 +1447,13 @@ int set_record_extent_bits(struct extent_io_tree *tree, u64 start, u64 end, } int set_extent_bits_nowait(struct extent_io_tree *tree, u64 start, u64 end, - unsigned bits) + u32 bits) { return __set_extent_bit(tree, start, end, bits, NULL, GFP_NOWAIT, NULL); } int clear_extent_bit(struct extent_io_tree *tree, u64 start, u64 end, - unsigned bits, int wake, int delete, + u32 bits, int wake, int delete, struct extent_state **cached) { struct extent_io_extra_options extra_opts = { @@ -1467,7 +1466,7 @@ int clear_extent_bit(struct extent_io_tree *tree, u64 start, u64 end, } int clear_record_extent_bits(struct extent_io_tree *tree, u64 start, u64 end, - unsigned bits, struct extent_changeset *changeset) + u32 bits, struct extent_changeset *changeset) { struct extent_io_extra_options extra_opts = { .changeset = changeset, @@ -1559,7 +1558,7 @@ void extent_range_redirty_for_io(struct inode *inode, u64 start, u64 end) } } -static bool match_extent_state(struct extent_state *state, unsigned bits, +static bool match_extent_state(struct extent_state *state, u32 bits, bool exact_match) { if (exact_match) @@ -1579,7 +1578,7 @@ static bool match_extent_state(struct extent_state *state, unsigned bits, */ static struct extent_state * find_first_extent_bit_state(struct extent_io_tree *tree, - u64 start, unsigned bits, bool exact_match) + u64 start, u32 bits, bool exact_match) { 
struct rb_node *node; struct extent_state *state; @@ -1615,7 +1614,7 @@ find_first_extent_bit_state(struct extent_io_tree *tree, * Return 1 if we found nothing. */ int find_first_extent_bit(struct extent_io_tree *tree, u64 start, - u64 *start_ret, u64 *end_ret, unsigned bits, + u64 *start_ret, u64 *end_ret, u32 bits, bool exact_match, struct extent_state **cached_state) { struct extent_state *state; @@ -1667,7 +1666,7 @@ int find_first_extent_bit(struct extent_io_tree *tree, u64 start, * returned will be the full contiguous area with the bits set. */ int find_contiguous_extent_bit(struct extent_io_tree *tree, u64 start, - u64 *start_ret, u64 *end_ret, unsigned bits) + u64 *start_ret, u64 *end_ret, u32 bits) { struct extent_state *state; int ret = 1; @@ -1704,7 +1703,7 @@ int find_contiguous_extent_bit(struct extent_io_tree *tree, u64 start, * trim @end_ret to the appropriate size. */ void find_first_clear_extent_bit(struct extent_io_tree *tree, u64 start, - u64 *start_ret, u64 *end_ret, unsigned bits) + u64 *start_ret, u64 *end_ret, u32 bits) { struct extent_state *state; struct rb_node *node, *prev = NULL, *next; @@ -2085,8 +2084,7 @@ noinline_for_stack bool find_lock_delalloc_range(struct inode *inode, void extent_clear_unlock_delalloc(struct btrfs_inode *inode, u64 start, u64 end, struct page *locked_page, - unsigned clear_bits, - unsigned long page_ops) + u32 clear_bits, unsigned long page_ops) { clear_extent_bit(&inode->io_tree, start, end, clear_bits, 1, 0, NULL); @@ -2102,7 +2100,7 @@ void extent_clear_unlock_delalloc(struct btrfs_inode *inode, u64 start, u64 end, */ u64 count_range_bits(struct extent_io_tree *tree, u64 *start, u64 search_end, u64 max_bytes, - unsigned bits, int contig) + u32 bits, int contig) { struct rb_node *node; struct extent_state *state; @@ -2222,7 +2220,7 @@ struct io_failure_record *get_state_failrec(struct extent_io_tree *tree, u64 sta * range is found set. 
*/ int test_range_bit(struct extent_io_tree *tree, u64 start, u64 end, - unsigned bits, int filled, struct extent_state *cached) + u32 bits, int filled, struct extent_state *cached) { struct extent_state *state = NULL; struct rb_node *node; diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h index 552afc1c0bbc..602d6568c8ea 100644 --- a/fs/btrfs/extent_io.h +++ b/fs/btrfs/extent_io.h @@ -288,7 +288,7 @@ void extent_range_clear_dirty_for_io(struct inode *inode, u64 start, u64 end); void extent_range_redirty_for_io(struct inode *inode, u64 start, u64 end); void extent_clear_unlock_delalloc(struct btrfs_inode *inode, u64 start, u64 end, struct page *locked_page, - unsigned bits_to_clear, + u32 bits_to_clear, unsigned long page_ops); struct bio *btrfs_bio_alloc(u64 first_byte); struct bio *btrfs_io_bio_alloc(unsigned int nr_iovecs); From patchwork Wed Oct 21 06:25:20 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848373 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CA8E7C561F8 for ; Wed, 21 Oct 2020 06:27:16 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 7C2A322249 for ; Wed, 21 Oct 2020 06:27:16 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="NeZeXYV4" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440800AbgJUG1P (ORCPT ); 
Wed, 21 Oct 2020 02:27:15 -0400 Received: from mx2.suse.de ([195.135.220.15]:43844 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2440793AbgJUG1P (ORCPT ); Wed, 21 Oct 2020 02:27:15 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261633; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=FLGKZpaAbbPVK4VM1PuL7b7jn4nOQwMa4IKYX1IdJfI=; b=NeZeXYV4bAPx4fUiWkPFfayxpg6kh6O6BDPpssGl98G3n0ruLMrN03Rijnx+ceQfOu/VWn mdT5LjBL76y23HnE3wEGnTl5ig6zRYOCvDMKdzBGDTgwf1VFhMDNdRkRjvYaQB9ISRj83a MqI2WEhEjCVIEMyytf5tftAkkoEwYlc= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 42952AC1D for ; Wed, 21 Oct 2020 06:27:13 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 34/68] btrfs: extent_io: use extent_io_tree to handle subpage extent buffer allocation Date: Wed, 21 Oct 2020 14:25:20 +0800 Message-Id: <20201021062554.68132-35-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Currently btrfs uses page::private as an indicator of who owns the extent buffer. This method won't work for subpage support, as one page can contain several tree blocks (up to 16 for 4K node size and 64K page size). Instead, here we use the btree extent io tree to track them. For the btree io tree, we introduce a new bit, EXTENT_HAS_TREE_BLOCK, to indicate that we have an in-tree extent buffer for the range. This affects the following functions: - alloc_extent_buffer() Now for subpage we never use page->private to grab an existing eb.
Instead, we rely on the extra safety net in alloc_extent_buffer() to detect two callers on the same eb. - btrfs_release_extent_buffer_pages() Now for subpage, we clear the EXTENT_HAS_TREE_BLOCK bit first, then check if the remaining range in the page has the EXTENT_HAS_TREE_BLOCK bit. If not, then clear the private bit for the page. - attach_extent_buffer_page() Now we set the EXTENT_HAS_TREE_BLOCK bit for the new extent buffer to be attached, and set the page private, with NULL as page::private. Signed-off-by: Qu Wenruo --- fs/btrfs/btrfs_inode.h | 12 ++++++ fs/btrfs/extent-io-tree.h | 2 +- fs/btrfs/extent_io.c | 80 ++++++++++++++++++++++++++++++++++++++- 3 files changed, 91 insertions(+), 3 deletions(-) diff --git a/fs/btrfs/btrfs_inode.h b/fs/btrfs/btrfs_inode.h index c47b6c6fea9f..cff818e0c406 100644 --- a/fs/btrfs/btrfs_inode.h +++ b/fs/btrfs/btrfs_inode.h @@ -217,6 +217,18 @@ static inline struct btrfs_inode *BTRFS_I(const struct inode *inode) return container_of(inode, struct btrfs_inode, vfs_inode); } +static inline struct btrfs_fs_info *page_to_fs_info(struct page *page) +{ + ASSERT(page->mapping); + return BTRFS_I(page->mapping->host)->root->fs_info; +} + +static inline struct extent_io_tree +*info_to_btree_io_tree(struct btrfs_fs_info *fs_info) +{ + return &BTRFS_I(fs_info->btree_inode)->io_tree; +} + static inline unsigned long btrfs_inode_hash(u64 objectid, const struct btrfs_root *root) { diff --git a/fs/btrfs/extent-io-tree.h b/fs/btrfs/extent-io-tree.h index 176e0e8e1f7c..bdafac1bd15f 100644 --- a/fs/btrfs/extent-io-tree.h +++ b/fs/btrfs/extent-io-tree.h @@ -23,7 +23,7 @@ struct io_failure_record; #define EXTENT_CLEAR_DATA_RESV (1U << 13) #define EXTENT_DELALLOC_NEW (1U << 14) -/* For subpage btree io tree, to indicate there is an extent buffer */ +/* For subpage btree io tree, indicates there is an in-tree extent buffer */ #define EXTENT_HAS_TREE_BLOCK (1U << 15) #define EXTENT_DO_ACCOUNTING (EXTENT_CLEAR_META_RESV | \ diff --git a/fs/btrfs/extent_io.c
b/fs/btrfs/extent_io.c index a7e4d3c65162..d899a75db977 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -3163,6 +3163,18 @@ static void attach_extent_buffer_page(struct extent_buffer *eb, if (page->mapping) assert_spin_locked(&page->mapping->private_lock); + if (btrfs_is_subpage(eb->fs_info) && page->mapping) { + struct extent_io_tree *io_tree = + info_to_btree_io_tree(eb->fs_info); + + if (!PagePrivate(page)) + attach_page_private(page, NULL); + + set_extent_bit(io_tree, eb->start, eb->start + eb->len - 1, + EXTENT_HAS_TREE_BLOCK, NULL, GFP_ATOMIC); + return; + } + if (!PagePrivate(page)) attach_page_private(page, eb); else @@ -4984,6 +4996,36 @@ int extent_buffer_under_io(const struct extent_buffer *eb) test_bit(EXTENT_BUFFER_DIRTY, &eb->bflags)); } +static void detach_extent_buffer_subpage(struct extent_buffer *eb) +{ + struct btrfs_fs_info *fs_info = eb->fs_info; + struct extent_io_tree *io_tree = info_to_btree_io_tree(fs_info); + struct page *page = eb->pages[0]; + bool mapped = !test_bit(EXTENT_BUFFER_UNMAPPED, &eb->bflags); + int ret; + + if (!page) + return; + + if (mapped) + spin_lock(&page->mapping->private_lock); + + __clear_extent_bit(io_tree, eb->start, eb->start + eb->len - 1, + EXTENT_HAS_TREE_BLOCK, NULL, GFP_ATOMIC, NULL); + + /* Test if we still have other extent buffer in the page range */ + ret = test_range_bit(io_tree, round_down(eb->start, PAGE_SIZE), + round_down(eb->start, PAGE_SIZE) + PAGE_SIZE - 1, + EXTENT_HAS_TREE_BLOCK, 0, NULL); + if (!ret) + detach_page_private(eb->pages[0]); + if (mapped) + spin_unlock(&page->mapping->private_lock); + + /* One for when we allocated the page */ + put_page(page); +} + /* * Release all pages attached to the extent buffer. 
*/ @@ -4995,6 +5037,9 @@ static void btrfs_release_extent_buffer_pages(struct extent_buffer *eb) BUG_ON(extent_buffer_under_io(eb)); + if (btrfs_is_subpage(eb->fs_info) && mapped) + return detach_extent_buffer_subpage(eb); + num_pages = num_extent_pages(eb); for (i = 0; i < num_pages; i++) { struct page *page = eb->pages[i]; @@ -5289,6 +5334,7 @@ struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info, struct extent_buffer *exists = NULL; struct page *p; struct address_space *mapping = fs_info->btree_inode->i_mapping; + bool subpage = btrfs_is_subpage(fs_info); int uptodate = 1; int ret; @@ -5321,7 +5367,12 @@ struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info, } spin_lock(&mapping->private_lock); - if (PagePrivate(p)) { + /* + * Subpage support doesn't use page::private at all, so we + * completely rely on the radix insert lock to prevent two + * ebs allocated for the same bytenr. + */ + if (PagePrivate(p) && !subpage) { /* * We could have already allocated an eb for this page * and attached one so lets see if we can get a ref on @@ -5362,8 +5413,21 @@ struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info, * we could crash. */ } - if (uptodate) + if (uptodate) { set_bit(EXTENT_BUFFER_UPTODATE, &eb->bflags); + } else if (subpage) { + /* + * For subpage, we must check extent_io_tree to get if the eb + * is really uptodate, as the page uptodate is only set if the + * whole page is uptodate. + * We can still have uptodate range in the page. 
+ */ + struct extent_io_tree *io_tree = info_to_btree_io_tree(fs_info); + + if (test_range_bit(io_tree, eb->start, eb->start + eb->len - 1, + EXTENT_UPTODATE, 1, NULL)) + set_bit(EXTENT_BUFFER_UPTODATE, &eb->bflags); + } again: ret = radix_tree_preload(GFP_NOFS); if (ret) { @@ -5402,6 +5466,18 @@ struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info, if (eb->pages[i]) unlock_page(eb->pages[i]); } + /* + * For subpage case, btrfs_release_extent_buffer() will clear the + * EXTENT_HAS_TREE_BLOCK bit if there is a page. + * + * Since we're here because we hit a race with another caller, who + * succeeded in inserting the eb, we shouldn't clear that + * EXTENT_HAS_TREE_BLOCK bit. So here we cleanup the page manually. + */ + if (subpage) { + put_page(eb->pages[0]); + eb->pages[i] = NULL; + } btrfs_release_extent_buffer(eb); return exists; From patchwork Wed Oct 21 06:25:21 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848379 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id AE44FC56201 for ; Wed, 21 Oct 2020 06:27:18 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 46C9422249 for ; Wed, 21 Oct 2020 06:27:18 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="SMH8JlHh" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id 
S2440803AbgJUG1R (ORCPT ); Wed, 21 Oct 2020 02:27:17 -0400 Received: from mx2.suse.de ([195.135.220.15]:43862 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2440798AbgJUG1Q (ORCPT ); Wed, 21 Oct 2020 02:27:16 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261635; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ntl2tibyHsHwynEGHyQmLvgX3LrTBVLXGip2iIzJ3FI=; b=SMH8JlHh+nw13nBtkuURuqUsZyDZKrEbyBIkW7FheUAewFWFkOiJ/0QvyVcnGFCnJmP2dW v4D2BLwj9al+8HdbQK6fs3h8q8yGcZxDnRcPHZB5hyluN/eDZT9IuDkKBSef4IcKFSaMV+ i5q2546zePX/x6kKclEQxjlY6vUHu5M= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id F1630AC12 for ; Wed, 21 Oct 2020 06:27:14 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 35/68] btrfs: extent_io: make set/clear_extent_buffer_uptodate() to support subpage size Date: Wed, 21 Oct 2020 14:25:21 +0800 Message-Id: <20201021062554.68132-36-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org To support subpage size, these two functions only need the following work: - set/clear the EXTENT_UPTODATE bits for io_tree - set page Uptodate if the full range of the page is uptodate Signed-off-by: Qu Wenruo --- fs/btrfs/extent_io.c | 28 ++++++++++++++++++++++++++-- 1 file changed, 26 insertions(+), 2 deletions(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index d899a75db977..1e959e6e8ce8 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -5631,10 +5631,18 @@ bool set_extent_buffer_dirty(struct extent_buffer *eb) void
clear_extent_buffer_uptodate(struct extent_buffer *eb) { int i; - struct page *page; + struct page *page = eb->pages[0]; int num_pages; clear_bit(EXTENT_BUFFER_UPTODATE, &eb->bflags); + + if (btrfs_is_subpage(eb->fs_info) && page->mapping) { + struct extent_io_tree *io_tree = + info_to_btree_io_tree(eb->fs_info); + + clear_extent_uptodate(io_tree, eb->start, + eb->start + eb->len - 1, NULL); + } num_pages = num_extent_pages(eb); for (i = 0; i < num_pages; i++) { page = eb->pages[i]; @@ -5646,10 +5654,26 @@ void clear_extent_buffer_uptodate(struct extent_buffer *eb) void set_extent_buffer_uptodate(struct extent_buffer *eb) { int i; - struct page *page; + struct page *page = eb->pages[0]; int num_pages; set_bit(EXTENT_BUFFER_UPTODATE, &eb->bflags); + + if (btrfs_is_subpage(eb->fs_info) && page->mapping) { + struct extent_state *cached = NULL; + struct extent_io_tree *io_tree = + info_to_btree_io_tree(eb->fs_info); + u64 page_start = page_offset(page); + u64 page_end = page_offset(page) + PAGE_SIZE - 1; + + set_extent_uptodate(io_tree, eb->start, eb->start + eb->len - 1, + &cached, GFP_NOFS); + if (test_range_bit(io_tree, page_start, page_end, + EXTENT_UPTODATE, 1, cached)) + SetPageUptodate(page); + free_extent_state(cached); + return; + } num_pages = num_extent_pages(eb); for (i = 0; i < num_pages; i++) { page = eb->pages[i]; From patchwork Wed Oct 21 06:25:22 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848383 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org 
[198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 17B26C561F8 for ; Wed, 21 Oct 2020 06:27:20 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id C091B22249 for ; Wed, 21 Oct 2020 06:27:19 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="uN/ZGpf9" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440806AbgJUG1T (ORCPT ); Wed, 21 Oct 2020 02:27:19 -0400 Received: from mx2.suse.de ([195.135.220.15]:43892 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2440802AbgJUG1S (ORCPT ); Wed, 21 Oct 2020 02:27:18 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261637; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=wfzgZfpuqC40mp71PSIP/hR8jxVXp76IWffKRQwphv0=; b=uN/ZGpf9mlBOmQLk6zCKUsLJ1F8OvomGa72o0Tz6Sgr2UW3BP98WbUiOpct5MC+NPJvmj4 IX/ML7HlUs8jgWcDYqtCoG2eFVXPt5TGAuvoCqeQab+gicik+yGOvxwi1f3E4WHQpF4kxf vwPsw9+9C4UbQczjXq6Tip6FymOj980= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id E86D2AC35 for ; Wed, 21 Oct 2020 06:27:16 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 36/68] btrfs: extent_io: make the assert test on page uptodate able to handle subpage Date: Wed, 21 Oct 2020 14:25:22 +0800 Message-Id: <20201021062554.68132-37-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org There are quite a few assert tests on page uptodate in the extent buffer write accessors.
They ensure the destination page is already uptodate. This is fine for regular sector size case, but not for subpage case, as for subpage we only mark the page uptodate if the page contains no hole and all its extent buffers are uptodate. So instead of checking PageUptodate(), for subpage case we check EXTENT_UPTODATE bit for the range covered by the extent buffer. To make the check more elegant, introduce a helper, assert_eb_range_uptodate() to do the check for both subpage and regular sector size cases. The following functions are involved: - write_extent_buffer_chunk_tree_uuid() - write_extent_buffer_fsid() - write_extent_buffer() - memzero_extent_buffer() - copy_extent_buffer() - extent_buffer_test_bit() - extent_buffer_bitmap_set() - extent_buffer_bitmap_clear() Signed-off-by: Qu Wenruo --- fs/btrfs/extent_io.c | 44 ++++++++++++++++++++++++++++++++++---------- 1 file changed, 34 insertions(+), 10 deletions(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 1e959e6e8ce8..dcc7d4602cea 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -5896,12 +5896,36 @@ int memcmp_extent_buffer(const struct extent_buffer *eb, const void *ptrv, return ret; } +/* + * A helper to ensure that the extent buffer is uptodate. + * + * For regular sector size == PAGE_SIZE case, check if @page is uptodate. + * For subpage case, check if the range covered by the eb has EXTENT_UPTODATE. + */ +static void assert_eb_range_uptodate(const struct extent_buffer *eb, + struct page *page) +{ + struct btrfs_fs_info *fs_info = eb->fs_info; + + if (btrfs_is_subpage(fs_info) && page->mapping) { + struct extent_io_tree *io_tree = info_to_btree_io_tree(fs_info); + + /* For subpage and mapped eb, check the EXTENT_UPTODATE bit. 
*/ + WARN_ON(!test_range_bit(io_tree, eb->start, + eb->start + eb->len - 1, EXTENT_UPTODATE, 1, + NULL)); + } else { + /* For regular eb or dummy eb, check the page status directly */ + WARN_ON(!PageUptodate(page)); + } +} + void write_extent_buffer_chunk_tree_uuid(const struct extent_buffer *eb, const void *srcv) { char *kaddr; - WARN_ON(!PageUptodate(eb->pages[0])); + assert_eb_range_uptodate(eb, eb->pages[0]); kaddr = page_address(eb->pages[0]) + get_eb_page_offset(eb, 0); memcpy(kaddr + offsetof(struct btrfs_header, chunk_tree_uuid), srcv, BTRFS_FSID_SIZE); @@ -5911,7 +5935,7 @@ void write_extent_buffer_fsid(const struct extent_buffer *eb, const void *srcv) { char *kaddr; - WARN_ON(!PageUptodate(eb->pages[0])); + assert_eb_range_uptodate(eb, eb->pages[0]); kaddr = page_address(eb->pages[0]) + get_eb_page_offset(eb, 0); memcpy(kaddr + offsetof(struct btrfs_header, fsid), srcv, BTRFS_FSID_SIZE); @@ -5934,7 +5958,7 @@ void write_extent_buffer(const struct extent_buffer *eb, const void *srcv, while (len > 0) { page = eb->pages[i]; - WARN_ON(!PageUptodate(page)); + assert_eb_range_uptodate(eb, page); cur = min(len, PAGE_SIZE - offset); kaddr = page_address(page); @@ -5963,7 +5987,7 @@ void memzero_extent_buffer(const struct extent_buffer *eb, unsigned long start, while (len > 0) { page = eb->pages[i]; - WARN_ON(!PageUptodate(page)); + assert_eb_range_uptodate(eb, page); cur = min(len, PAGE_SIZE - offset); kaddr = page_address(page); @@ -6019,7 +6043,7 @@ void copy_extent_buffer(const struct extent_buffer *dst, while (len > 0) { page = dst->pages[i]; - WARN_ON(!PageUptodate(page)); + assert_eb_range_uptodate(dst, page); cur = min(len, (unsigned long)(PAGE_SIZE - offset)); @@ -6081,7 +6105,7 @@ int extent_buffer_test_bit(const struct extent_buffer *eb, unsigned long start, eb_bitmap_offset(eb, start, nr, &i, &offset); page = eb->pages[i]; - WARN_ON(!PageUptodate(page)); + assert_eb_range_uptodate(eb, page); kaddr = page_address(page); return 1U & (kaddr[offset] >> (nr 
& (BITS_PER_BYTE - 1))); } @@ -6106,7 +6130,7 @@ void extent_buffer_bitmap_set(const struct extent_buffer *eb, unsigned long star eb_bitmap_offset(eb, start, pos, &i, &offset); page = eb->pages[i]; - WARN_ON(!PageUptodate(page)); + assert_eb_range_uptodate(eb, page); kaddr = page_address(page); while (len >= bits_to_set) { @@ -6117,7 +6141,7 @@ void extent_buffer_bitmap_set(const struct extent_buffer *eb, unsigned long star if (++offset >= PAGE_SIZE && len > 0) { offset = 0; page = eb->pages[++i]; - WARN_ON(!PageUptodate(page)); + assert_eb_range_uptodate(eb, page); kaddr = page_address(page); } } @@ -6149,7 +6173,7 @@ void extent_buffer_bitmap_clear(const struct extent_buffer *eb, eb_bitmap_offset(eb, start, pos, &i, &offset); page = eb->pages[i]; - WARN_ON(!PageUptodate(page)); + assert_eb_range_uptodate(eb, page); kaddr = page_address(page); while (len >= bits_to_clear) { @@ -6160,7 +6184,7 @@ void extent_buffer_bitmap_clear(const struct extent_buffer *eb, if (++offset >= PAGE_SIZE && len > 0) { offset = 0; page = eb->pages[++i]; - WARN_ON(!PageUptodate(page)); + assert_eb_range_uptodate(eb, page); kaddr = page_address(page); } } From patchwork Wed Oct 21 06:25:23 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848391 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 286DAC561F8 for ; Wed, 21 Oct 2020 06:27:22 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org 
(Postfix) with ESMTP id C02D920790 for ; Wed, 21 Oct 2020 06:27:21 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="qtPb46yl" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440807AbgJUG1V (ORCPT ); Wed, 21 Oct 2020 02:27:21 -0400 Received: from mx2.suse.de ([195.135.220.15]:43862 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2440805AbgJUG1U (ORCPT ); Wed, 21 Oct 2020 02:27:20 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261638; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=v6+Z4XnRVG2xQngEzqbT8NgtEb3+oT44JPfM1xSLBaQ=; b=qtPb46ylD/UZu9Ca2fyQoAd/JuGq38NmDUqxl7EOL5fjH/p4shXt8w3fAV2t/ygciCcv+w MgZHUOooFzX3OB1NroDMoQWQ7RC6tQ0W4rFY3NAwXg49s0p29DvkT7JW8B3SZFG5tr3l2e oZlFEx6igTy7BZdIJjroim8iVRiq4M4= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id A81C2AC1D for ; Wed, 21 Oct 2020 06:27:18 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 37/68] btrfs: extent_io: implement subpage metadata read and its endio function Date: Wed, 21 Oct 2020 14:25:23 +0800 Message-Id: <20201021062554.68132-38-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org For subpage metadata read, since we rely completely on the io tree rather than page bits, its read submission and endio functions are different from the regular page size case.
For the submission part: - Do extent locking/waiting Instead of page locking, we do extent io tree locking, which provides subpage granularity locking. Since we no longer rely on full page locking, in theory we can submit parallel metadata reads even if they are in the same page. - Submit extent page directly To simplify the process, as each metadata read is always contained in one page. For the endio part: - Do extent locking The same as the submission part: instead of page locking, only rely on extent io tree locking. This behavior has a small problem: extent locking/waiting all allocate memory, thus they can all fail. Currently we rely on the BUG_ON() in various set_extent_bits() calls. But once we start handling the errors from them, this approach will make it more complex to pass all the ENOMEM errors upwards. Signed-off-by: Qu Wenruo --- fs/btrfs/disk-io.c | 81 ++++++++++++++++++++++++++++++++++++++++++++ fs/btrfs/extent_io.c | 74 ++++++++++++++++++++++++++++++++++++++++ 2 files changed, 155 insertions(+) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 10bdb0a8a92f..89021e552da0 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -651,6 +651,84 @@ static int btrfs_check_extent_buffer(struct extent_buffer *eb) return ret; } +static int btree_read_subpage_endio_hook(struct page *page, u64 start, u64 end, + int mirror) +{ + struct btrfs_fs_info *fs_info = page_to_fs_info(page); + struct extent_buffer *eb; + int reads_done; + int ret = 0; + + if (!IS_ALIGNED(start, fs_info->sectorsize) || + !IS_ALIGNED(end - start + 1, fs_info->sectorsize) || + !IS_ALIGNED(end - start + 1, fs_info->nodesize)) { + WARN_ON(IS_ENABLED(CONFIG_BTRFS_DEBUG)); + btrfs_err(fs_info, "invalid tree read bytenr"); + return -EUCLEAN; + } + + /* + * We don't allow bio merge for subpage metadata read, so we should + * only get one eb for each endio hook.
+ */ + ASSERT(end == start + fs_info->nodesize - 1); + ASSERT(PagePrivate(page)); + + rcu_read_lock(); + eb = radix_tree_lookup(&fs_info->buffer_radix, + start / fs_info->sectorsize); + rcu_read_unlock(); + + /* + * When we are reading one tree block, eb must have been + * inserted into the radix tree. If not something is wrong. + */ + if (!eb) { + WARN_ON(IS_ENABLED(CONFIG_BTRFS_DEBUG)); + btrfs_err(fs_info, + "can't find extent buffer for bytenr %llu", + start); + return -EUCLEAN; + } + /* + * The pending IO might have been the only thing that kept + * this buffer in memory. Make sure we have a ref for all + * this other checks + */ + atomic_inc(&eb->refs); + + reads_done = atomic_dec_and_test(&eb->io_pages); + /* Subpage read must finish in page read */ + ASSERT(reads_done); + + eb->read_mirror= mirror; + if (test_bit(EXTENT_BUFFER_READ_ERR, &eb->bflags)) { + ret = -EIO; + goto err; + } + ret = btrfs_check_extent_buffer(eb); + if (ret < 0) + goto err; + + if (test_and_clear_bit(EXTENT_BUFFER_READAHEAD, &eb->bflags)) + btree_readahead_hook(eb, ret); + + set_extent_buffer_uptodate(eb); + + free_extent_buffer(eb); + return ret; +err: + /* + * our io error hook is going to dec the io pages + * again, we have to make sure it has something to + * decrement + */ + atomic_inc(&eb->io_pages); + clear_extent_buffer_uptodate(eb); + free_extent_buffer(eb); + return ret; +} + static int btree_readpage_end_io_hook(struct btrfs_io_bio *io_bio, u64 phy_offset, struct page *page, u64 start, u64 end, int mirror) @@ -659,6 +737,9 @@ static int btree_readpage_end_io_hook(struct btrfs_io_bio *io_bio, int ret = 0; bool reads_done; + if (btrfs_is_subpage(page_to_fs_info(page))) + return btree_read_subpage_endio_hook(page, start, end, mirror); + /* Metadata pages that goes through IO should all have private set */ ASSERT(PagePrivate(page) && page->private); eb = (struct extent_buffer *)page->private; diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 
dcc7d4602cea..2f9609d35f0c 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -3111,6 +3111,15 @@ static int submit_extent_page(unsigned int opf, else contig = bio_end_sector(bio) == sector; + /* + * For subpage metadata read, never merge request, so that + * we get endio hook called on each metadata read. + */ + if (btrfs_is_subpage(page_to_fs_info(page)) && + tree->owner == IO_TREE_BTREE_INODE_IO && + (opf & REQ_OP_MASK) == REQ_OP_READ) + ASSERT(force_bio_submit); + ASSERT(tree->ops); if (btrfs_bio_fits_in_stripe(page, io_size, bio, bio_flags)) can_merge = false; @@ -5681,6 +5690,68 @@ void set_extent_buffer_uptodate(struct extent_buffer *eb) } } +static int read_extent_buffer_subpage(struct extent_buffer *eb, int wait, + int mirror_num) +{ + struct btrfs_fs_info *fs_info = eb->fs_info; + struct extent_io_tree *io_tree = info_to_btree_io_tree(fs_info); + struct page *page = eb->pages[0]; + struct bio *bio = NULL; + int ret = 0; + + ASSERT(!test_bit(EXTENT_BUFFER_UNMAPPED, &eb->bflags)); + + if (wait == WAIT_NONE) { + ret = try_lock_extent(io_tree, eb->start, + eb->start + eb->len - 1); + if (ret <= 0) + return ret; + } else { + ret = lock_extent(io_tree, eb->start, eb->start + eb->len - 1); + if (ret < 0) + return ret; + } + + ret = 0; + if (test_bit(EXTENT_BUFFER_UPTODATE, &eb->bflags) || + PageUptodate(page) || + test_range_bit(io_tree, eb->start, eb->start + eb->len - 1, + EXTENT_UPTODATE, 1, NULL)) { + set_bit(EXTENT_BUFFER_UPTODATE, &eb->bflags); + unlock_extent(io_tree, eb->start, eb->start + eb->len - 1); + return ret; + } + atomic_set(&eb->io_pages, 1); + + ret = submit_extent_page(REQ_OP_READ | REQ_META, NULL, page, eb->start, + eb->len, eb->start - page_offset(page), &bio, + end_bio_extent_readpage, mirror_num, 0, 0, + true); + if (ret) { + /* + * In the endio function, if we hit something wrong we will + * increase the io_pages, so here we need to decrease it for error + * path.
+ */ + atomic_dec(&eb->io_pages); + } + if (bio) { + int tmp; + + tmp = submit_one_bio(bio, mirror_num, 0); + if (tmp < 0) + return tmp; + } + if (ret || wait != WAIT_COMPLETE) + return ret; + + wait_extent_bit(io_tree, eb->start, eb->start + eb->len - 1, + EXTENT_LOCKED); + if (!test_bit(EXTENT_BUFFER_UPTODATE, &eb->bflags)) + ret = -EIO; + return ret; +} + int read_extent_buffer_pages(struct extent_buffer *eb, int wait, int mirror_num) { int i; @@ -5697,6 +5768,9 @@ int read_extent_buffer_pages(struct extent_buffer *eb, int wait, int mirror_num) if (test_bit(EXTENT_BUFFER_UPTODATE, &eb->bflags)) return 0; + if (btrfs_is_subpage(eb->fs_info)) + return read_extent_buffer_subpage(eb, wait, mirror_num); + num_pages = num_extent_pages(eb); for (i = 0; i < num_pages; i++) { page = eb->pages[i]; From patchwork Wed Oct 21 06:25:24 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848385 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id EB0B6C56201 for ; Wed, 21 Oct 2020 06:27:23 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9193420790 for ; Wed, 21 Oct 2020 06:27:23 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="btnijf03" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440809AbgJUG1W (ORCPT ); Wed, 21 Oct 2020 02:27:22 -0400 Received: 
from mx2.suse.de ([195.135.220.15]:43966 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2440808AbgJUG1W (ORCPT ); Wed, 21 Oct 2020 02:27:22 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261640; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=e+xXPfiXumYbmvBYiA2E0ETkq2njf5glPziPuvE59I4=; b=btnijf03hiIpav/xLGrsyOXSb7T8N1is/i5THoCC0q0GVkqCPFS8r5rphu0EqKTIqC6dv2 8Fd/lCzEvtScrQjU4wFMkmdzNpHsEFJgIl3XSetlh8FJCt3Et132AB74kTVnGWiM8wOJJJ V58w++T4feleQAclUdMjBjtlWwazK0c= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 96E83AC1D for ; Wed, 21 Oct 2020 06:27:20 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 38/68] btrfs: extent_io: implement try_release_extent_buffer() for subpage metadata support Date: Wed, 21 Oct 2020 14:25:24 +0800 Message-Id: <20201021062554.68132-39-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org For try_release_extent_buffer(), we just iterate through all the ranges with EXTENT_HAS_TREE_BLOCK set, and try freeing each extent buffer. Also introduce a helper, find_first_subpage_eb(), to find the first eb in the range. This helper will also be utilized by later subpage patches.
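The release loop described above can be sketched in userspace C. This is only an illustration under stated assumptions, not the kernel code: struct toy_eb, toy_find_first_eb() and toy_try_release_page() are hypothetical names, a plain array stands in for both the extent io tree and the radix tree, and the in_tree flag models the range bit that marks an in-tree block. The sketch assumes every buffer has a non-zero length.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an extent buffer; not the kernel structure. */
struct toy_eb {
	uint64_t start;   /* byte offset of the tree block */
	uint32_t len;     /* block length (nodesize), assumed non-zero */
	int refs;         /* 1 means only the tree itself holds a reference */
	bool under_io;    /* block is being read or written */
	bool in_tree;     /* models the "has tree block" range bit */
};

/* Return the lowest in-tree buffer overlapping [start, end], or NULL. */
static struct toy_eb *toy_find_first_eb(struct toy_eb *ebs, size_t nr,
					uint64_t start, uint64_t end)
{
	struct toy_eb *found = NULL;

	for (size_t i = 0; i < nr; i++) {
		struct toy_eb *eb = &ebs[i];

		assert(eb->len > 0);
		if (!eb->in_tree || eb->start > end ||
		    eb->start + eb->len - 1 < start)
			continue;
		if (!found || eb->start < found->start)
			found = eb;
	}
	return found;
}

/* Return 1 if the page no longer hosts any tree block, 0 otherwise. */
static int toy_try_release_page(struct toy_eb *ebs, size_t nr,
				uint64_t page_start, uint32_t page_size)
{
	const uint64_t end = page_start + page_size - 1;
	uint64_t cur = page_start;

	assert(page_size > 0);
	while (cur <= end) {
		struct toy_eb *eb = toy_find_first_eb(ebs, nr, cur, end);

		if (!eb)
			break;
		/* Always advance past this buffer, released or not. */
		cur = eb->start + eb->len;
		/* Skip buffers that someone else still uses. */
		if (eb->refs != 1 || eb->under_io)
			continue;
		eb->refs = 0;
		eb->in_tree = false;
	}
	/* The page is releasable only if no tree block remains in it. */
	return toy_find_first_eb(ebs, nr, page_start, end) == NULL;
}
```

As in the patch, the return value of each individual release does not matter; the final range check alone decides whether the page can go.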
Signed-off-by: Qu Wenruo --- fs/btrfs/disk-io.c | 6 ++++ fs/btrfs/extent_io.c | 83 ++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 89 insertions(+) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 89021e552da0..efbe12e4f952 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -1047,6 +1047,12 @@ static int btree_writepages(struct address_space *mapping, static int btree_readpage(struct file *file, struct page *page) { + /* + * For subpage, we don't support VFS to call btree_readpages(), + * directly. + */ + if (btrfs_is_subpage(page_to_fs_info(page))) + return -ENOTTY; return extent_read_full_page(page, btree_get_extent, 0); } diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 2f9609d35f0c..6a34b33be1fc 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -2772,6 +2772,48 @@ blk_status_t btrfs_submit_read_repair(struct inode *inode, return status; } +/* + * A helper for locate subpage extent buffer. + * + * NOTE: returned extent buffer won't has its ref increased. + * + * @extra_bits: Extra bits to match. + * The returned eb range will match all extra_bits. + * + * Return 0 if we found one extent buffer and record it in @eb_ret. + * Return 1 if there is no extent buffer in the range. + */ +static int find_first_subpage_eb(struct btrfs_fs_info *fs_info, + struct extent_buffer **eb_ret, u64 start, + u64 end, u32 extra_bits) +{ + struct extent_io_tree *io_tree = info_to_btree_io_tree(fs_info); + u64 found_start; + u64 found_end; + int ret; + + ASSERT(btrfs_is_subpage(fs_info) && eb_ret); + + ret = find_first_extent_bit(io_tree, start, &found_start, &found_end, + EXTENT_HAS_TREE_BLOCK | extra_bits, true, NULL); + if (ret > 0 || found_start > end) + return 1; + + /* found_start can be smaller than start */ + start = max(start, found_start); + + /* + * Here we can't call find_extent_buffer() which will increase + * eb->refs. 
+ */ + rcu_read_lock(); + *eb_ret = radix_tree_lookup(&fs_info->buffer_radix, + start / fs_info->sectorsize); + rcu_read_unlock(); + ASSERT(*eb_ret); + return 0; +} + /* lots and lots of room for performance fixes in the end_bio funcs */ void end_extent_writepage(struct page *page, int err, u64 start, u64 end) @@ -6389,10 +6431,51 @@ void memmove_extent_buffer(const struct extent_buffer *dst, } } +static int try_release_subpage_eb(struct page *page) +{ + struct btrfs_fs_info *fs_info = page_to_fs_info(page); + struct extent_io_tree *io_tree = info_to_btree_io_tree(fs_info); + u64 cur = page_offset(page); + u64 end = page_offset(page) + PAGE_SIZE - 1; + int ret; + + while (cur <= end) { + struct extent_buffer *eb; + + ret = find_first_subpage_eb(fs_info, &eb, cur, end, 0); + if (ret > 0) + break; + + cur = eb->start + eb->len; + + spin_lock(&eb->refs_lock); + if (atomic_read(&eb->refs) != 1 || extent_buffer_under_io(eb) || + !test_and_clear_bit(EXTENT_BUFFER_TREE_REF, &eb->bflags)) { + spin_unlock(&eb->refs_lock); + continue; + } + /* + * Here we don't care the return value, we will always check + * the EXTENT_HAS_TREE_BLOCK bit at the end. + */ + release_extent_buffer(eb); + } + + /* Finally check if there is any EXTENT_HAS_TREE_BLOCK bit remaining */ + if (test_range_bit(io_tree, page_offset(page), end, + EXTENT_HAS_TREE_BLOCK, 0, NULL)) + ret = 0; + else + ret = 1; + return ret; +} + int try_release_extent_buffer(struct page *page) { struct extent_buffer *eb; + if (btrfs_is_subpage(page_to_fs_info(page))) + return try_release_subpage_eb(page); /* * We need to make sure nobody is attaching this page to an eb right * now. 
From patchwork Wed Oct 21 06:25:25 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848387 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6DA1BC4363A for ; Wed, 21 Oct 2020 06:27:28 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 0BFF521D43 for ; Wed, 21 Oct 2020 06:27:27 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="pClc9fsF" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440812AbgJUG11 (ORCPT ); Wed, 21 Oct 2020 02:27:27 -0400 Received: from mx2.suse.de ([195.135.220.15]:43996 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2440808AbgJUG1Z (ORCPT ); Wed, 21 Oct 2020 02:27:25 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261643; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Z73rGowfb+sS0owO39aX70dawhN8LD7wCaX353w+TJI=; b=pClc9fsFT9vedr1iWZayIbvOGyS0cBHFW/rmtlk+Cre79/aO9yVK825WTZMIE4WnCT7vxn qUInWmVVWXmGsLFIEGBItP0B9oDaCUHux7f3zB/JeAMlsRXyWQ8FAs/Kgvet79eT7+xV30 sqSwyQTs3F1m+9K8+ZzZtPqHxnYKvL0= Received: from relay2.suse.de (unknown 
[195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 61C7AAC1D for ; Wed, 21 Oct 2020 06:27:23 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 39/68] btrfs: extent_io: extra the core of test_range_bit() into test_range_bit_nolock() Date: Wed, 21 Oct 2020 14:25:25 +0800 Message-Id: <20201021062554.68132-40-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org This allows later function to utilize test_range_bit_nolock() with caller handling the lock. Signed-off-by: Qu Wenruo --- fs/btrfs/extent_io.c | 32 ++++++++++++++++++++++---------- 1 file changed, 22 insertions(+), 10 deletions(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 6a34b33be1fc..37593b599522 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -2213,20 +2213,16 @@ struct io_failure_record *get_state_failrec(struct extent_io_tree *tree, u64 sta return failrec; } -/* - * searches a range in the state tree for a given mask. - * If 'filled' == 1, this returns 1 only if every extent in the tree - * has the bits set. Otherwise, 1 is returned if any bit in the - * range is found set. 
- */ -int test_range_bit(struct extent_io_tree *tree, u64 start, u64 end, - u32 bits, int filled, struct extent_state *cached) +static int test_range_bit_nolock(struct extent_io_tree *tree, u64 start, + u64 end, u32 bits, int filled, + struct extent_state *cached) { struct extent_state *state = NULL; struct rb_node *node; int bitset = 0; - spin_lock(&tree->lock); + assert_spin_locked(&tree->lock); + if (cached && extent_state_in_tree(cached) && cached->start <= start && cached->end > start) node = &cached->rb_node; @@ -2265,10 +2261,26 @@ int test_range_bit(struct extent_io_tree *tree, u64 start, u64 end, break; } } - spin_unlock(&tree->lock); return bitset; } +/* + * searches a range in the state tree for a given mask. + * If 'filled' == 1, this returns 1 only if every extent in the tree + * has the bits set. Otherwise, 1 is returned if any bit in the + * range is found set. + */ +int test_range_bit(struct extent_io_tree *tree, u64 start, u64 end, + u32 bits, int filled, struct extent_state *cached) +{ + int ret; + + spin_lock(&tree->lock); + ret = test_range_bit_nolock(tree, start, end, bits, filled, cached); + spin_unlock(&tree->lock); + return ret; +} + /* * helper function to set a given page up to date if all the * extents in the tree for that page are up to date From patchwork Wed Oct 21 06:25:26 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848393 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 
4934EC4363A for ; Wed, 21 Oct 2020 06:27:30 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E7BD521D43 for ; Wed, 21 Oct 2020 06:27:29 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="Gujcd2Xf" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440813AbgJUG13 (ORCPT ); Wed, 21 Oct 2020 02:27:29 -0400 Received: from mx2.suse.de ([195.135.220.15]:44016 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2436568AbgJUG12 (ORCPT ); Wed, 21 Oct 2020 02:27:28 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261646; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=dDuFan7Aeq1BHQlsx4hjW1K4v3F3c0HWDMwkUlRqFBU=; b=Gujcd2XfE6X9AgLsDImKN2XRg4DGilo5HZA8rmdegIBplNP7A2LQY/v2GfMZRdX9kjyyg0 by/rhfSFAkiqkhekPab2Lm75wWXEdxJv9CYPk5/APGCv72ub8ln2/I7c1s/L6gzGXmfslZ sIhzeA5cHHetBNznW8CS+jKuVP5IAek= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 55AACAC12 for ; Wed, 21 Oct 2020 06:27:26 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 40/68] btrfs: extent_io: introduce EXTENT_READ_SUBMITTED to handle subpage data read Date: Wed, 21 Oct 2020 14:25:26 +0800 Message-Id: <20201021062554.68132-41-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org In end_bio_extent_readpage(), we will unlock the page for each segment, this is fine for regular sectorsize == PAGE_SIZE case. 
But for subpage size case, we may have several bio segments for the same page, and unlock the page unconditionally could easily screw up the locking. To address the problem: - Introduce a new bit, EXTENT_READ_SUBMITTED Now for subpage data read, each submitted read bio will have its range with EXTENT_READ_SUBMITTED set. - Set the EXTENT_READ_SUBMITTED in __do_readpage() Set the full page with EXTENT_READ_SUBMITTED set. - Clear and test if we're the last owner of EXTENT_READ_SUBMITTED in end_bio_extent_readpage() and __do_readpage() This ensures that no matter who finishes filling the page, the last owner will unlock the page. This is quite different from regular sectorsize case, where one page either get unlocked in __do_readpage() or in end_bio_extent_readpage(). Signed-off-by: Qu Wenruo --- fs/btrfs/extent-io-tree.h | 22 ++++++++ fs/btrfs/extent_io.c | 115 +++++++++++++++++++++++++++++++++++--- 2 files changed, 129 insertions(+), 8 deletions(-) diff --git a/fs/btrfs/extent-io-tree.h b/fs/btrfs/extent-io-tree.h index bdafac1bd15f..d3b21c732634 100644 --- a/fs/btrfs/extent-io-tree.h +++ b/fs/btrfs/extent-io-tree.h @@ -26,6 +26,15 @@ struct io_failure_record; /* For subpage btree io tree, indicates there is an in-tree extent buffer */ #define EXTENT_HAS_TREE_BLOCK (1U << 15) +/* + * For subpage data io tree, indicates there is an read bio submitted. + * The last one to clear the bit in the page will be responsible to unlock + * the containg page. + * + * TODO: Remove this if we use iomap for data read. + */ +#define EXTENT_READ_SUBMITTED (1U << 16) + #define EXTENT_DO_ACCOUNTING (EXTENT_CLEAR_META_RESV | \ EXTENT_CLEAR_DATA_RESV) #define EXTENT_CTLBITS (EXTENT_DO_ACCOUNTING) @@ -115,6 +124,19 @@ struct extent_io_extra_options { */ bool wake; bool delete; + + /* + * For __clear_extent_bit(), to skip the spin lock and rely on caller + * for the lock. + * This allows the caller to do test-and-clear in a spinlock. 
+ */ + bool skip_lock; + + /* + * For __clear_extent_bit(), paired with skip_lock, to provide the + * preallocated extent_state. + */ + struct extent_state *prealloc; }; int __init extent_state_cache_init(void); diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 37593b599522..5254a4ce2598 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -710,6 +710,7 @@ int __clear_extent_bit(struct extent_io_tree *tree, u64 start, u64 end, struct rb_node *node; bool wake; bool delete; + bool skip_lock; u64 last_end; int err; int clear = 0; @@ -719,8 +720,13 @@ int __clear_extent_bit(struct extent_io_tree *tree, u64 start, u64 end, changeset = extra_opts->changeset; wake = extra_opts->wake; delete = extra_opts->delete; + skip_lock = extra_opts->skip_lock; - btrfs_debug_check_extent_io_range(tree, start, end); + if (skip_lock) + ASSERT(!gfpflags_allow_blocking(mask)); + + if (!skip_lock) + btrfs_debug_check_extent_io_range(tree, start, end); trace_btrfs_clear_extent_bit(tree, start, end - start + 1, bits); if (bits & EXTENT_DELALLOC) @@ -742,8 +748,11 @@ int __clear_extent_bit(struct extent_io_tree *tree, u64 start, u64 end, */ prealloc = alloc_extent_state(mask); } + if (!prealloc && skip_lock) + prealloc = extra_opts->prealloc; - spin_lock(&tree->lock); + if (!skip_lock) + spin_lock(&tree->lock); if (cached_state) { cached = *cached_state; @@ -848,15 +857,20 @@ int __clear_extent_bit(struct extent_io_tree *tree, u64 start, u64 end, search_again: if (start > end) goto out; - spin_unlock(&tree->lock); - if (gfpflags_allow_blocking(mask)) - cond_resched(); + if (!skip_lock) { + spin_unlock(&tree->lock); + if (gfpflags_allow_blocking(mask)) + cond_resched(); + } goto again; out: - spin_unlock(&tree->lock); + if (!skip_lock) + spin_unlock(&tree->lock); if (prealloc) free_extent_state(prealloc); + if (skip_lock) + extra_opts->prealloc = NULL; return 0; @@ -2926,6 +2940,70 @@ endio_readpage_release_extent(struct extent_io_tree *tree, struct page *page, 
unlock_extent_cached_atomic(tree, start, end, &cached); } +/* + * Finish the read and unlock the page if needed. + * + * For regular sectorsize == PAGE_SIZE case, just unlock the page. + * For subpage case, clear the EXTENT_READ_SUBMITTED bit, then if and + * only if we're the last EXTENT_READ_SUBMITTED of the page. + */ +static void finish_and_unlock_read_page(struct btrfs_fs_info *fs_info, + struct extent_io_tree *tree, u64 start, u64 end, + struct page *page, bool in_endio_context) +{ + struct extent_io_extra_options extra_opts = { + .skip_lock = true, + }; + u64 page_start = round_down(start, PAGE_SIZE); + u64 page_end = page_start + PAGE_SIZE - 1; + bool metadata = (tree->owner == IO_TREE_BTREE_INODE_IO); + bool has_bit = true; + bool last_owner = false; + + /* + * For subpage metadata, we don't lock page for read/write at all, + * just exit. + */ + if (btrfs_is_subpage(fs_info) && metadata) + return; + + /* For regular sector size, we need to unlock the full page for endio */ + if (!btrfs_is_subpage(fs_info)) { + /* + * This function can be called in __do_readpage(), in that case we + * shouldn't unlock the page. + */ + if (in_endio_context) + unlock_page(page); + return; + } + + /* + * The remaining case is subpage data read, which we need to update + * EXTENT_READ_SUBMITTED and unlock the page for the last reader. 
+ */ + ASSERT(end <= page_end); + + /* Will be freed in __clear_extent_bit() */ + extra_opts.prealloc = alloc_extent_state(GFP_NOFS); + + spin_lock(&tree->lock); + /* Check if we have the bit first */ + if (IS_ENABLED(CONFIG_BTRFS_DEBUG)) { + has_bit = test_range_bit_nolock(tree, start, end, + EXTENT_READ_SUBMITTED, 1, NULL); + WARN_ON(!has_bit); + } + + __clear_extent_bit(tree, start, end, EXTENT_READ_SUBMITTED, NULL, + GFP_ATOMIC, &extra_opts); + last_owner = !test_range_bit_nolock(tree, page_start, page_end, + EXTENT_READ_SUBMITTED, 0, NULL); + spin_unlock(&tree->lock); + if (has_bit && last_owner) + unlock_page(page); +} + /* * after a readpage IO is done, we need to: * clear the uptodate bits on error @@ -3050,7 +3128,7 @@ static void end_bio_extent_readpage(struct bio *bio) offset += len; endio_readpage_release_extent(tree, page, start, end, uptodate); - unlock_page(page); + finish_and_unlock_read_page(fs_info, tree, start, end, page, true); } btrfs_io_bio_free_csum(io_bio); @@ -3277,6 +3355,7 @@ __get_extent_map(struct inode *inode, struct page *page, size_t pg_offset, } return em; } + /* * basic readpage implementation. 
Locked extent state structs are inserted * into the tree that are removed when the IO is done (by the end_io @@ -3292,6 +3371,7 @@ static int __do_readpage(struct page *page, u64 *prev_em_start) { struct inode *inode = page->mapping->host; + struct btrfs_fs_info *fs_info = BTRFS_I(inode)->root->fs_info; u64 start = page_offset(page); const u64 end = start + PAGE_SIZE - 1; u64 cur = start; @@ -3330,6 +3410,9 @@ static int __do_readpage(struct page *page, kunmap_atomic(userpage); } } + + if (btrfs_is_subpage(fs_info)) + set_extent_bits(tree, start, end, EXTENT_READ_SUBMITTED); while (cur <= end) { bool force_bio_submit = false; u64 offset; @@ -3347,6 +3430,8 @@ static int __do_readpage(struct page *page, &cached, GFP_NOFS); unlock_extent_cached(tree, cur, cur + iosize - 1, &cached); + finish_and_unlock_read_page(fs_info, tree, cur, + cur + iosize - 1, page, false); break; } em = __get_extent_map(inode, page, pg_offset, cur, @@ -3354,6 +3439,8 @@ static int __do_readpage(struct page *page, if (IS_ERR_OR_NULL(em)) { SetPageError(page); unlock_extent(tree, cur, end); + finish_and_unlock_read_page(fs_info, tree, cur, + cur + iosize - 1, page, false); break; } extent_offset = cur - em->start; @@ -3436,6 +3523,8 @@ static int __do_readpage(struct page *page, &cached, GFP_NOFS); unlock_extent_cached(tree, cur, cur + iosize - 1, &cached); + finish_and_unlock_read_page(fs_info, tree, cur, + cur + iosize - 1, page, false); cur = cur + iosize; pg_offset += iosize; continue; @@ -3445,6 +3534,8 @@ static int __do_readpage(struct page *page, EXTENT_UPTODATE, 1, NULL)) { check_page_uptodate(tree, page); unlock_extent(tree, cur, cur + iosize - 1); + finish_and_unlock_read_page(fs_info, tree, cur, + cur + iosize - 1, page, false); cur = cur + iosize; pg_offset += iosize; continue; @@ -3455,6 +3546,8 @@ static int __do_readpage(struct page *page, if (block_start == EXTENT_MAP_INLINE) { SetPageError(page); unlock_extent(tree, cur, cur + iosize - 1); + 
finish_and_unlock_read_page(fs_info, tree, cur, + cur + iosize - 1, page, false); cur = cur + iosize; pg_offset += iosize; continue; @@ -3482,7 +3575,13 @@ static int __do_readpage(struct page *page, if (!nr) { if (!PageError(page)) SetPageUptodate(page); - unlock_page(page); + /* + * Subpage case will unlock the page in + * finish_and_unlock_read_page() according to the + * EXTENT_READ_SUBMITTED status. + */ + if (!btrfs_is_subpage(fs_info)) + unlock_page(page); } return ret; } From patchwork Wed Oct 21 06:25:27 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848381 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D3FD6C4363A for ; Wed, 21 Oct 2020 06:27:32 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 6911321D43 for ; Wed, 21 Oct 2020 06:27:32 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="G6ykUonD" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440816AbgJUG1b (ORCPT ); Wed, 21 Oct 2020 02:27:31 -0400 Received: from mx2.suse.de ([195.135.220.15]:44046 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2436568AbgJUG1b (ORCPT ); Wed, 21 Oct 2020 02:27:31 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261650; 
h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=NaisI4tRgIxExBLYX2BkuNqgYAVDdP62/iyDNZC8Z8Y=; b=G6ykUonDHyuc5Kx0IPaDewbiYzAXD6148vejd9Hm/sqJWN3UKauQdnthheacK93yeykSoe bEF7Tk8kmx5gdnjbB4swEXq6pDireEDvyMzYMf5Njn5uLoVVTQ2mX8b2nEM1GQnkRy+/Q6 T/yuSpdqvfQz1ACuaE/lBTMRvJUeMSM= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 088E7AC12 for ; Wed, 21 Oct 2020 06:27:30 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 41/68] btrfs: set btree inode track_uptodate for subpage support Date: Wed, 21 Oct 2020 14:25:27 +0800 Message-Id: <20201021062554.68132-42-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Let the btree io tree track the EXTENT_UPTODATE bit, so that for subpage metadata IO we don't need to track the UPTODATE status manually through the bio submission/endio functions. Currently only subpage metadata will clean up the extra bits utilized (EXTENT_HAS_TREE_BLOCK, EXTENT_UPTODATE, EXTENT_LOCKED), while the regular page size case will only clean up EXTENT_LOCKED. This still allows the regular page size case to avoid the extra delay in extent io tree operations, while allowing the subpage case to be sector size aligned.
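To illustrate what sector-granular uptodate tracking buys us, here is a minimal userspace sketch, assuming a 64K page with 4K sectors. It is not the extent io tree: toy_uptodate_map, toy_set_uptodate() and toy_test_range() are hypothetical names, and a 16-bit bitmap replaces the tree. The 'filled' flag mirrors the two query modes of a range bit test: every sector in the range must have the bit, or any one sector is enough.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_PAGE_SIZE   65536u
#define TOY_SECTORSIZE  4096u

/* One bit per 4K sector of a 64K page; bit i set means sector i is uptodate. */
typedef uint16_t toy_uptodate_map;

/* Build the bitmask covering [offset, offset + len) inside the page. */
static uint16_t toy_range_mask(uint32_t offset, uint32_t len)
{
	assert(len > 0);
	assert(offset % TOY_SECTORSIZE == 0 && len % TOY_SECTORSIZE == 0);
	assert(offset + len <= TOY_PAGE_SIZE);

	uint32_t first = offset / TOY_SECTORSIZE;
	uint32_t count = len / TOY_SECTORSIZE;

	return (uint16_t)(((1u << count) - 1u) << first);
}

/* Mark every sector in the range as uptodate. */
static void toy_set_uptodate(toy_uptodate_map *map, uint32_t offset,
			     uint32_t len)
{
	*map |= toy_range_mask(offset, len);
}

/*
 * With filled == true, return true only if every sector in the range is
 * uptodate. With filled == false, one uptodate sector is enough.
 */
static bool toy_test_range(toy_uptodate_map map, uint32_t offset,
			   uint32_t len, bool filled)
{
	uint16_t mask = toy_range_mask(offset, len);

	if (filled)
		return (map & mask) == mask;
	return (map & mask) != 0;
}
```

A 16K tree block at offset 16K inside the page would then count as uptodate only when toy_test_range(map, 16384, 16384, true) holds, independent of the other tree blocks sharing the page.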
Signed-off-by: Qu Wenruo --- fs/btrfs/disk-io.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index efbe12e4f952..97c44f518a49 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -2244,7 +2244,14 @@ static void btrfs_init_btree_inode(struct btrfs_fs_info *fs_info) RB_CLEAR_NODE(&BTRFS_I(inode)->rb_node); extent_io_tree_init(fs_info, &BTRFS_I(inode)->io_tree, IO_TREE_BTREE_INODE_IO, inode); - BTRFS_I(inode)->io_tree.track_uptodate = false; + /* + * For subpage size support, btree inode tracks EXTENT_UPTODATE for + * its IO. + */ + if (btrfs_is_subpage(fs_info)) + BTRFS_I(inode)->io_tree.track_uptodate = true; + else + BTRFS_I(inode)->io_tree.track_uptodate = false; extent_map_tree_init(&BTRFS_I(inode)->extent_tree); BTRFS_I(inode)->io_tree.ops = &btree_extent_io_ops; From patchwork Wed Oct 21 06:25:28 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848397 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1F1E1C561F8 for ; Wed, 21 Oct 2020 06:27:35 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id BEAF321D43 for ; Wed, 21 Oct 2020 06:27:34 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="fhtwk9D/" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440819AbgJUG1e 
(ORCPT ); Wed, 21 Oct 2020 02:27:34 -0400 Received: from mx2.suse.de ([195.135.220.15]:44064 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2436568AbgJUG1d (ORCPT ); Wed, 21 Oct 2020 02:27:33 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261652; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=NiJeh8dbrKT2BY0wSXL7NHSBFI06q2uR9GdxgHUXOWA=; b=fhtwk9D/chLPm5QQq4Adb7t/UpZ94Mq0uL4jb6IOgNlaBkJcDjgfHLY1dY/iCLdSLuRC5u NziEYWGUox5yHPSXDwweL2cToQkQaTPEo6unn0sdEk3WBXfAGoUmnTd4tBV0KiAc2YYqIn JZWfYV28wZooBhyljB7YHkBxFtsbmNo= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 8A9A9AC1D for ; Wed, 21 Oct 2020 06:27:32 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 42/68] btrfs: allow RO mount of 4K sector size fs on 64K page system Date: Wed, 21 Oct 2020 14:25:28 +0800 Message-Id: <20201021062554.68132-43-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org This adds the basic RO mount ability for 4K sector size on 64K page system. Currently we only plan to support 4K and 64K page system. 
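The resulting mount policy can be summarized with a small userspace sketch. It is not the kernel check itself: toy_sectorsize_supported() and toy_mount_allowed() are hypothetical names, and the has_log_root parameter stands for a non-zero log root in the super block. Rejecting page sizes other than 4K and 64K is an assumption of this sketch, made only because those are the two page sizes we plan to support.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_SZ_4K   4096u
#define TOY_SZ_64K  65536u

/* Which sector sizes are accepted for a given page size. */
static bool toy_sectorsize_supported(uint32_t page_size, uint32_t sectorsize)
{
	assert(page_size != 0 && sectorsize != 0);

	if (page_size == TOY_SZ_4K)
		return sectorsize == TOY_SZ_4K;
	if (page_size == TOY_SZ_64K)
		return sectorsize == TOY_SZ_4K || sectorsize == TOY_SZ_64K;
	/* Assumption of this sketch: other page sizes are out of scope. */
	return false;
}

/*
 * Subpage (4K sectors on 64K pages) is read-only for now, and a pending
 * log tree would need a write to replay, so it is refused as well.
 */
static bool toy_mount_allowed(uint32_t page_size, uint32_t sectorsize,
			      bool read_only, bool has_log_root)
{
	if (!toy_sectorsize_supported(page_size, sectorsize))
		return false;
	if (page_size == TOY_SZ_64K && sectorsize == TOY_SZ_4K)
		return read_only && !has_log_root;
	return true;
}
```

For example, toy_mount_allowed(TOY_SZ_64K, TOY_SZ_4K, true, false) is accepted, while the same combination mounted read-write is refused.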
Signed-off-by: Qu Wenruo --- fs/btrfs/disk-io.c | 24 +++++++++++++++++++++--- fs/btrfs/super.c | 7 +++++++ 2 files changed, 28 insertions(+), 3 deletions(-) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 97c44f518a49..e0dc7b92411e 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -2565,13 +2565,21 @@ static int validate_super(struct btrfs_fs_info *fs_info, btrfs_err(fs_info, "invalid sectorsize %llu", sectorsize); ret = -EINVAL; } - /* Only PAGE SIZE is supported yet */ - if (sectorsize != PAGE_SIZE) { + + /* + * For 4K page size, we only support 4K sector size. + * For 64K page size, we support RW for 64K sector size, and RO for + * 4K sector size. + */ + if ((PAGE_SIZE == SZ_4K && sectorsize != PAGE_SIZE) || + (PAGE_SIZE == SZ_64K && (sectorsize != SZ_4K && + sectorsize != SZ_64K))) { btrfs_err(fs_info, - "sectorsize %llu not supported yet, only support %lu", + "sectorsize %llu not supported yet for page size %lu", sectorsize, PAGE_SIZE); ret = -EINVAL; } + if (!is_power_of_2(nodesize) || nodesize < sectorsize || nodesize > BTRFS_MAX_METADATA_BLOCKSIZE) { btrfs_err(fs_info, "invalid nodesize %llu", nodesize); @@ -3219,6 +3227,16 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device goto fail_alloc; } + /* For 4K sector size support, it's only read-only yet */ + if (PAGE_SIZE == SZ_64K && sectorsize == SZ_4K) { + if (!sb_rdonly(sb) || btrfs_super_log_root(disk_super)) { + btrfs_err(fs_info, + "subpage sector size only support RO yet"); + err = -EINVAL; + goto fail_alloc; + } + } + ret = btrfs_init_workqueues(fs_info, fs_devices); if (ret) { err = ret; diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c index 25967ecaaf0a..743a2fadf4ee 100644 --- a/fs/btrfs/super.c +++ b/fs/btrfs/super.c @@ -1922,6 +1922,13 @@ static int btrfs_remount(struct super_block *sb, int *flags, char *data) ret = -EINVAL; goto restore; } + if (btrfs_is_subpage(fs_info)) { + btrfs_warn(fs_info, + "read-write mount is not yet allowed for sector 
size %u page size %lu", + fs_info->sectorsize, PAGE_SIZE); + ret = -EINVAL; + goto restore; + } ret = btrfs_cleanup_fs_roots(fs_info); if (ret) From patchwork Wed Oct 21 06:25:29 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848389 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id AFA12C561F8 for ; Wed, 21 Oct 2020 06:27:37 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 578AD21D43 for ; Wed, 21 Oct 2020 06:27:37 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="shlu450y" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440822AbgJUG1g (ORCPT ); Wed, 21 Oct 2020 02:27:36 -0400 Received: from mx2.suse.de ([195.135.220.15]:44106 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2436568AbgJUG1g (ORCPT ); Wed, 21 Oct 2020 02:27:36 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261654; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=IuY4S+2EtACZ9worVlRYr7BkM7HczHMSuuFSkUtGpco=; b=shlu450yWXVMiWGvwvbEgxC+gj2qJkUU/fT5ARkJpX1fHX+6VQeUIga/esvLCIUnQ6nE4G 
gP1VUUyVA+rmbCyFVa3wqWDSfCCMg68y8Qeo1u5oQNhJ3ibhy0m2gFC3TBIKzVcbvR7r2n eiZ3B4333d1rA8DzPzy3gIg6k1tLY4c= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 4AD87AC12 for ; Wed, 21 Oct 2020 06:27:34 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 43/68] btrfs: disk-io: allow btree_set_page_dirty() to do more sanity check on subpage metadata Date: Wed, 21 Oct 2020 14:25:29 +0800 Message-Id: <20201021062554.68132-44-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org For btree_set_page_dirty(), we should also check the extent buffer sanity for subpage support. Unlike the regular sector size case, one page can contain multiple extent buffers, and page::private no longer contains the pointer to the extent buffer. So this patch iterates through the extent_io_tree to find any range with the EXTENT_HAS_TREE_BLOCK bit set, and checks that at least one extent buffer in the page range has EXTENT_BUFFER_DIRTY and proper refs. Also, since we need to find subpage extent buffers outside of extent_io.c, export find_first_subpage_eb() as btrfs_find_first_subpage_eb().
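The subpage branch of this check can be modelled in userspace roughly as follows. This is only a sketch under stated assumptions: `struct model_eb`, `page_has_dirty_eb()` and the sorted array standing in for the btree io tree are hypothetical names for illustration, not kernel APIs. The real code walks the io tree with btrfs_find_first_subpage_eb() and asserts on refs instead of folding them into the return value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct extent_buffer; not the kernel type. */
struct model_eb {
	uint64_t start;	/* logical bytenr of the tree block */
	uint32_t len;	/* nodesize */
	bool dirty;	/* models EXTENT_BUFFER_DIRTY */
	int refs;	/* models eb->refs */
};

/*
 * Walk every extent buffer overlapping [page_start, page_start + page_size)
 * and report whether at least one of them is dirty and still referenced.
 * @ebs must be sorted by start and must not overlap.
 */
static bool page_has_dirty_eb(const struct model_eb *ebs, size_t nr,
			      uint64_t page_start, uint32_t page_size)
{
	uint64_t page_end;
	uint64_t cur = page_start;
	size_t i;

	assert(page_size > 0);
	page_end = page_start + page_size - 1;

	for (i = 0; i < nr && cur <= page_end; i++) {
		const struct model_eb *eb = &ebs[i];

		/* Extent buffer ends before the cursor, skip it. */
		if (eb->start + eb->len <= cur)
			continue;
		/* Extent buffer starts after the page, nothing left. */
		if (eb->start > page_end)
			break;
		cur = eb->start + eb->len;
		if (eb->dirty && eb->refs > 0)
			return true;
	}
	return false;
}
```

With a 64K page and 16K nodesize, up to four extent buffers share one page, and a single dirty one is enough for the page to be legitimately dirty.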
Signed-off-by: Qu Wenruo --- fs/btrfs/disk-io.c | 36 ++++++++++++++++++++++++++++++------ fs/btrfs/extent_io.c | 8 ++++---- fs/btrfs/extent_io.h | 4 ++++ 3 files changed, 38 insertions(+), 10 deletions(-) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index e0dc7b92411e..d31999978821 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -1110,14 +1110,38 @@ static void btree_invalidatepage(struct page *page, unsigned int offset, static int btree_set_page_dirty(struct page *page) { #ifdef DEBUG + struct btrfs_fs_info *fs_info = page_to_fs_info(page); struct extent_buffer *eb; - BUG_ON(!PagePrivate(page)); - eb = (struct extent_buffer *)page->private; - BUG_ON(!eb); - BUG_ON(!test_bit(EXTENT_BUFFER_DIRTY, &eb->bflags)); - BUG_ON(!atomic_read(&eb->refs)); - btrfs_assert_tree_locked(eb); + if (fs_info->sectorsize == PAGE_SIZE) { + BUG_ON(!PagePrivate(page)); + eb = (struct extent_buffer *)page->private; + BUG_ON(!eb); + BUG_ON(!test_bit(EXTENT_BUFFER_DIRTY, &eb->bflags)); + BUG_ON(!atomic_read(&eb->refs)); + btrfs_assert_tree_locked(eb); + } else { + u64 page_start = page_offset(page); + u64 page_end = page_start + PAGE_SIZE - 1; + u64 cur = page_start; + bool found_dirty_eb = false; + int ret; + + ASSERT(btrfs_is_subpage(fs_info)); + while (cur <= page_end) { + ret = btrfs_find_first_subpage_eb(fs_info, &eb, cur, + page_end, 0); + if (ret > 0) + break; + cur = eb->start + eb->len; + if (test_bit(EXTENT_BUFFER_DIRTY, &eb->bflags)) { + found_dirty_eb = true; + ASSERT(atomic_read(&eb->refs)); + btrfs_assert_tree_locked(eb); + } + } + BUG_ON(!found_dirty_eb); + } #endif return __set_page_dirty_nobuffers(page); } diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 5254a4ce2598..278154d405ea 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -2809,9 +2809,9 @@ blk_status_t btrfs_submit_read_repair(struct inode *inode, * Return 0 if we found one extent buffer and record it in @eb_ret. * Return 1 if there is no extent buffer in the range. 
*/ -static int find_first_subpage_eb(struct btrfs_fs_info *fs_info, - struct extent_buffer **eb_ret, u64 start, - u64 end, u32 extra_bits) +int btrfs_find_first_subpage_eb(struct btrfs_fs_info *fs_info, + struct extent_buffer **eb_ret, u64 start, + u64 end, u32 extra_bits) { struct extent_io_tree *io_tree = info_to_btree_io_tree(fs_info); u64 found_start; @@ -6553,7 +6553,7 @@ static int try_release_subpage_eb(struct page *page) while (cur <= end) { struct extent_buffer *eb; - ret = find_first_subpage_eb(fs_info, &eb, cur, end, 0); + ret = btrfs_find_first_subpage_eb(fs_info, &eb, cur, end, 0); if (ret > 0) break; diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h index 602d6568c8ea..f527b6fa258d 100644 --- a/fs/btrfs/extent_io.h +++ b/fs/btrfs/extent_io.h @@ -298,6 +298,10 @@ struct bio *btrfs_bio_clone_partial(struct bio *orig, int offset, int size); struct btrfs_fs_info; struct btrfs_inode; +int btrfs_find_first_subpage_eb(struct btrfs_fs_info *fs_info, + struct extent_buffer **eb_ret, u64 start, + u64 end, unsigned int extra_bits); + int repair_io_failure(struct btrfs_fs_info *fs_info, u64 ino, u64 start, u64 length, u64 logical, struct page *page, unsigned int pg_offset, int mirror_num); From patchwork Wed Oct 21 06:25:30 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848395 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 04D3BC4363A for ; Wed, 21 Oct 2020 06:27:40 +0000 (UTC) Received: from 
vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9A69B21D43 for ; Wed, 21 Oct 2020 06:27:39 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="JQhhv8cO" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440824AbgJUG1j (ORCPT ); Wed, 21 Oct 2020 02:27:39 -0400 Received: from mx2.suse.de ([195.135.220.15]:44126 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2436568AbgJUG1i (ORCPT ); Wed, 21 Oct 2020 02:27:38 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261657; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=IQuXTQ8wGZMeGkbOyMaJVaGYtD6UrQBAoNom1MdB1HQ=; b=JQhhv8cOZ696vu3t6IReKZkBh+w/wAl2h3sXgepcdL17sQYminBSawgP0QHOQ2ZCyo5xr8 VpxyrvzTAY6QTP2RgPamZeLWy42MJcQwDG15NOt1VPsObQbyzAoyfsL19W0FFJnUgYMcQk bi8TIsJwNpK5vr5EsYJHgWb890qpCXo= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 0BA4DAC12 for ; Wed, 21 Oct 2020 06:27:37 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 44/68] btrfs: disk-io: support subpage metadata csum calculation at write time Date: Wed, 21 Oct 2020 14:25:30 +0800 Message-Id: <20201021062554.68132-45-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Add a new helper, csum_dirty_subpage_buffers(), to iterate through all possible extent buffers in one bvec. 
Also extract the code to calculate csum for one extent buffer into csum_one_extent_buffer(), so that both the existing csum_dirty_buffer() and the new csum_dirty_subpage_buffers() can reuse the same routine. Signed-off-by: Qu Wenruo --- fs/btrfs/disk-io.c | 103 ++++++++++++++++++++++++++++++++++----------- 1 file changed, 79 insertions(+), 24 deletions(-) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index d31999978821..9aa68e2344e1 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -490,35 +490,13 @@ static int btree_read_extent_buffer_pages(struct extent_buffer *eb, return ret; } -/* - * checksum a dirty tree block before IO. This has extra checks to make sure - * we only fill in the checksum field in the first page of a multi-page block - */ - -static int csum_dirty_buffer(struct btrfs_fs_info *fs_info, struct bio_vec *bvec) +static int csum_one_extent_buffer(struct extent_buffer *eb) { - struct extent_buffer *eb; - struct page *page = bvec->bv_page; - u64 start = page_offset(page); - u64 found_start; + struct btrfs_fs_info *fs_info = eb->fs_info; u8 result[BTRFS_CSUM_SIZE]; u16 csum_size = btrfs_super_csum_size(fs_info->super_copy); int ret; - eb = (struct extent_buffer *)page->private; - if (page != eb->pages[0]) - return 0; - - found_start = btrfs_header_bytenr(eb); - /* - * Please do not consolidate these warnings into a single if. - * It is useful to know what went wrong. - */ - if (WARN_ON(found_start != start)) - return -EUCLEAN; - if (WARN_ON(!PageUptodate(page))) - return -EUCLEAN; - ASSERT(memcmp_extent_buffer(eb, fs_info->fs_devices->metadata_uuid, offsetof(struct btrfs_header, fsid), BTRFS_FSID_SIZE) == 0); @@ -543,6 +521,83 @@ static int csum_dirty_buffer(struct btrfs_fs_info *fs_info, struct bio_vec *bvec return 0; } +/* + * Do all the csum calculation and extra sanity checks on all extent + * buffers in the bvec. 
+ */ +static int csum_dirty_subpage_buffers(struct btrfs_fs_info *fs_info, + struct bio_vec *bvec) +{ + struct page *page = bvec->bv_page; + u64 page_start = page_offset(page); + u64 start = page_start + bvec->bv_offset; + u64 end = start + bvec->bv_len - 1; + u64 cur = start; + int ret = 0; + + while (cur <= end) { + struct extent_io_tree *io_tree = info_to_btree_io_tree(fs_info); + struct extent_buffer *eb; + + ret = btrfs_find_first_subpage_eb(fs_info, &eb, cur, end, 0); + if (ret > 0) { + ret = 0; + break; + } + + /* + * Here we can't use PageUptodate() to check the status. + * As one page is uptodate only when all its extent buffers + * are uptodate, and no holes between them. + * So here we use EXTENT_UPTODATE bit to make sure the extent + * buffer is uptodate. + */ + if (WARN_ON(test_range_bit(io_tree, eb->start, + eb->start + eb->len - 1, EXTENT_UPTODATE, 1, + NULL) == 0)) + return -EUCLEAN; + if (WARN_ON(cur != btrfs_header_bytenr(eb))) + return -EUCLEAN; + + ret = csum_one_extent_buffer(eb); + if (ret < 0) + return ret; + cur = eb->start + eb->len; + } + return ret; +} + +/* + * checksum a dirty tree block before IO. This has extra checks to make sure + * we only fill in the checksum field in the first page of a multi-page block + */ +static int csum_dirty_buffer(struct btrfs_fs_info *fs_info, struct bio_vec *bvec) +{ + struct extent_buffer *eb; + struct page *page = bvec->bv_page; + u64 start = page_offset(page) + bvec->bv_offset; + u64 found_start; + + if (btrfs_is_subpage(fs_info)) + return csum_dirty_subpage_buffers(fs_info, bvec); + + eb = (struct extent_buffer *)page->private; + if (page != eb->pages[0]) + return 0; + + found_start = btrfs_header_bytenr(eb); + /* + * Please do not consolidate these warnings into a single if. + * It is useful to know what went wrong.
+ */ + if (WARN_ON(found_start != start)) + return -EUCLEAN; + if (WARN_ON(!PageUptodate(page))) + return -EUCLEAN; + + return csum_one_extent_buffer(eb); +} + static int check_tree_block_fsid(struct extent_buffer *eb) { struct btrfs_fs_info *fs_info = eb->fs_info; From patchwork Wed Oct 21 06:25:31 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848409 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CAE12C561F8 for ; Wed, 21 Oct 2020 06:27:43 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 74A6421D43 for ; Wed, 21 Oct 2020 06:27:43 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="W4DbKG7I" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440827AbgJUG1m (ORCPT ); Wed, 21 Oct 2020 02:27:42 -0400 Received: from mx2.suse.de ([195.135.220.15]:44212 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2436568AbgJUG1m (ORCPT ); Wed, 21 Oct 2020 02:27:42 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261660; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; 
bh=HQlmND4SXpiRAAqgKKdSzrzziEt9mtAut8198udfZg0=; b=W4DbKG7I6UO5WKpJo5bKv8xQVDnTsdstqvewORoyTYu5nJgYgRw1VcjjJoNKMzoyKw9v4i vqCiXCEp6/Vcmo9jsyGKmIsBcrXCUNj48GPQXA7TK+heMNICn0dYRM5ocKO0mbY/0GvYP4 Ffww6CRf/bDB6yOgWjK8+P06vMTjRZc= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id D4FCFAC12 for ; Wed, 21 Oct 2020 06:27:40 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 45/68] btrfs: extent_io: prevent extent_state from being merged for btree io tree Date: Wed, 21 Oct 2020 14:25:31 +0800 Message-Id: <20201021062554.68132-46-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org For the incoming subpage metadata RW support, prevent extent_state from being merged for the btree io tree. The main cause is set_extent_buffer_dirty(). In the following call chain, we can end up having to call set_extent_dirty() in atomic context:

alloc_reserved_tree_block()
|- path->leave_spinning = 1;
|- btrfs_insert_empty_item()
|- btrfs_search_slot()
|  Now the path has all its tree block spinning locked
|- setup_items_for_insert();
|- btrfs_unlock_up_safe(path, 1);
|  Now path->nodes[0] still spin locked
|- btrfs_mark_buffer_dirty(leaf);
|- set_extent_buffer_dirty()

Since set_extent_buffer_dirty() is in fact a pretty common call, just falling back to the GFP_ATOMIC allocation used in __set_extent_bit() may exhaust the pool sooner than we expect. So this patch goes in another direction, by never merging extent_state for the subpage btree io tree.
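The effect of the new flag on the merge decision can be sketched with a small userspace model. Everything here is an assumption for illustration: `struct model_state`, `struct model_tree`, `can_merge()` and the `MODEL_*` bit values are hypothetical and only mirror the shape of the check in merge_state(), not the kernel's actual definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values for illustration; not the kernel's definitions. */
#define MODEL_EXTENT_LOCKED	(1U << 0)
#define MODEL_EXTENT_BOUNDARY	(1U << 1)
#define MODEL_EXTENT_DIRTY	(1U << 2)

/* Hypothetical stand-in for struct extent_state. */
struct model_state {
	uint64_t start;
	uint64_t end;		/* inclusive */
	unsigned int bits;
};

/* Hypothetical stand-in for struct extent_io_tree. */
struct model_tree {
	bool never_merge;
};

/* Return true if @next may be folded into its left neighbour @prev. */
static bool can_merge(const struct model_tree *tree,
		      const struct model_state *prev,
		      const struct model_state *next)
{
	assert(prev->start <= prev->end && next->start <= next->end);

	/* A never_merge tree keeps one state per range, always. */
	if (tree->never_merge)
		return false;
	/* Locked or boundary states are never merge candidates. */
	if ((prev->bits | next->bits) &
	    (MODEL_EXTENT_LOCKED | MODEL_EXTENT_BOUNDARY))
		return false;
	/* Only adjacent ranges with identical bits can be merged. */
	return prev->end + 1 == next->start && prev->bits == next->bits;
}
```

When states are never merged, a later set or clear on one extent buffer's range finds a state that already covers exactly that range, so no split and no new allocation is needed in atomic context.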
For the subpage btree io tree, all in-tree extent buffers have the EXTENT_HAS_TREE_BLOCK bit set during their lifespan. As long as extent_state is not merged, each extent buffer has its own extent_state, so set/clear_extent_bit() can reuse the existing extent_state of the extent buffer without allocating new memory. The cost is obvious: around 150 bytes per subpage extent buffer. But considering that a subpage extent buffer saves 15 page pointers (8 bytes each on 64-bit), which is 120 bytes, the net cost is just 30 bytes per subpage extent buffer, which should be acceptable. Signed-off-by: Qu Wenruo --- fs/btrfs/disk-io.c | 14 ++++++++++++-- fs/btrfs/extent-io-tree.h | 14 ++++++++++++++ fs/btrfs/extent_io.c | 19 ++++++++++++++----- 3 files changed, 40 insertions(+), 7 deletions(-) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 9aa68e2344e1..e466c30b52c8 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -2326,11 +2326,21 @@ static void btrfs_init_btree_inode(struct btrfs_fs_info *fs_info) /* * For subpage size support, btree inode tracks EXTENT_UPTODATE for * its IO. + * + * And never merge extent states to make all set/clear operation never + * to allocate memory, except the initial EXTENT_HAS_TREE_BLOCK bit. + * This adds extra ~150 bytes for each extent buffer. + * + * TODO: Josef's rwsem rework on tree lock would kill the leave_spinning + * case, and then we can revert this behavior.
*/ - if (btrfs_is_subpage(fs_info)) + if (btrfs_is_subpage(fs_info)) { BTRFS_I(inode)->io_tree.track_uptodate = true; - else + BTRFS_I(inode)->io_tree.never_merge = true; + } else { BTRFS_I(inode)->io_tree.track_uptodate = false; + BTRFS_I(inode)->io_tree.never_merge = false; + } extent_map_tree_init(&BTRFS_I(inode)->extent_tree); BTRFS_I(inode)->io_tree.ops = &btree_extent_io_ops; diff --git a/fs/btrfs/extent-io-tree.h b/fs/btrfs/extent-io-tree.h index d3b21c732634..bb95c6b9ad82 100644 --- a/fs/btrfs/extent-io-tree.h +++ b/fs/btrfs/extent-io-tree.h @@ -71,6 +71,20 @@ struct extent_io_tree { u64 dirty_bytes; bool track_uptodate; + /* + * Never to merge extent_state. + * + * This allows any set/clear function to be execute in atomic context + * without allocating extra memory. + * The cost is extra memory usage. + * + * Should only be used for subpage btree io tree, which mostly adds per + * extent buffer memory usage. + * + * Default: false. + */ + bool never_merge; + /* Who owns this io tree, should be one of IO_TREE_* */ u8 owner; diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 278154d405ea..f67d88586d05 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -286,6 +286,7 @@ void extent_io_tree_init(struct btrfs_fs_info *fs_info, spin_lock_init(&tree->lock); tree->private_data = private_data; tree->owner = owner; + tree->never_merge = false; if (owner == IO_TREE_INODE_FILE_EXTENT) lockdep_set_class(&tree->lock, &file_extent_tree_class); } @@ -481,11 +482,18 @@ static inline struct rb_node *tree_search(struct extent_io_tree *tree, } /* - * utility function to look for merge candidates inside a given range. + * Utility function to look for merge candidates inside a given range. * Any extents with matching state are merged together into a single - * extent in the tree. 
Extents with EXTENT_IO in their state field - * are not merged because the end_io handlers need to be able to do - * operations on them without sleeping (or doing allocations/splits). + * extent in the tree. + * + * Except the following cases: + * - extent_state with EXTENT_LOCK or EXTENT_BOUNDARY bit set + * Those extents are not merged because end_io handlers need to be able + * to do operations on them without sleeping (or doing allocations/splits) + * + * - extent_io_tree with never_merge bit set + * Same reason as above, but extra call sites may have spinlock/rwlock hold, + * and we don't want to abuse GFP_ATOMIC. * * This should be called with the tree lock held. */ @@ -495,7 +503,8 @@ static void merge_state(struct extent_io_tree *tree, struct extent_state *other; struct rb_node *other_node; - if (state->state & (EXTENT_LOCKED | EXTENT_BOUNDARY)) + if (state->state & (EXTENT_LOCKED | EXTENT_BOUNDARY) || + tree->never_merge) return; other_node = rb_prev(&state->rb_node); From patchwork Wed Oct 21 06:25:32 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848403 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0FB60C561F8 for ; Wed, 21 Oct 2020 06:27:46 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id AB26F21D43 for ; Wed, 21 Oct 2020 06:27:45 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) 
header.d=suse.com header.i=@suse.com header.b="RH2sRL+h" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440830AbgJUG1o (ORCPT ); Wed, 21 Oct 2020 02:27:44 -0400 Received: from mx2.suse.de ([195.135.220.15]:44246 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2436568AbgJUG1o (ORCPT ); Wed, 21 Oct 2020 02:27:44 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261662; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=XgjENHHmLth06O1Zk0SfixXcVngmFirYjT/+wPiPKZc=; b=RH2sRL+h32TbblP7cFK+qof5j5LL9pBynOgWoJswHqYHNltT5z6hAovSq67RtJU80v6yqp Yk06TR70pERAr2MuFjSrduVVxJgU6NmPVXr5tLhvLjF/2Oba+Kdma7yabrcbqc/ZM2jKR8 DKw3yu4EteQHpZ3u9fmYlAbr9/BbfeM= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id AF022AC12 for ; Wed, 21 Oct 2020 06:27:42 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 46/68] btrfs: extent_io: make set_extent_buffer_dirty() to support subpage sized metadata Date: Wed, 21 Oct 2020 14:25:32 +0800 Message-Id: <20201021062554.68132-47-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org For set_extent_buffer_dirty() to support subpage sized metadata, we only need to call set_extent_dirty(). As any dirty extent buffer in the page makes the whole page dirty, we can reuse the existing routine without problems; we just need to add the above call to set_extent_dirty().
Now that a page is dirty if any extent buffer in it is dirty, the WARN_ON() in alloc_extent_buffer() can be falsely triggered. So also convert the WARN_ON(PageDirty()) check into assert_eb_range_not_dirty() to support the subpage case. Signed-off-by: Qu Wenruo --- fs/btrfs/extent_io.c | 37 ++++++++++++++++++++++++++++++++++++- 1 file changed, 36 insertions(+), 1 deletion(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index f67d88586d05..2cb9abdb0d60 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -5494,6 +5494,22 @@ struct extent_buffer *alloc_test_extent_buffer(struct btrfs_fs_info *fs_info, } #endif +static void assert_eb_range_not_dirty(struct extent_buffer *eb, + struct page *page) +{ + struct btrfs_fs_info *fs_info = eb->fs_info; + + if (btrfs_is_subpage(fs_info) && page->mapping) { + struct extent_io_tree *io_tree = info_to_btree_io_tree(fs_info); + + WARN_ON(test_range_bit(io_tree, eb->start, + eb->start + eb->len - 1, EXTENT_DIRTY, 0, + NULL)); + } else { + WARN_ON(PageDirty(page)); + } +} + struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info, u64 start) { @@ -5566,12 +5582,13 @@ struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info, * drop the ref the old guy had.
*/ ClearPagePrivate(p); + assert_eb_range_not_dirty(eb, p); WARN_ON(PageDirty(p)); put_page(p); } attach_extent_buffer_page(eb, p); spin_unlock(&mapping->private_lock); - WARN_ON(PageDirty(p)); + assert_eb_range_not_dirty(eb, p); eb->pages[i] = p; if (!PageUptodate(p)) uptodate = 0; @@ -5791,6 +5808,24 @@ bool set_extent_buffer_dirty(struct extent_buffer *eb) for (i = 0; i < num_pages; i++) set_page_dirty(eb->pages[i]); + /* + * For subpage size, also set the sector aligned EXTENT_DIRTY range for + * btree io tree + */ + if (btrfs_is_subpage(eb->fs_info)) { + struct extent_io_tree *io_tree = + info_to_btree_io_tree(eb->fs_info); + + /* + * set_extent_buffer_dirty() can be called with + * path->leave_spinning == 1, in that case we can't sleep. + */ + set_extent_dirty(io_tree, eb->start, eb->start + eb->len - 1, + GFP_ATOMIC); + set_page_dirty(eb->pages[0]); + return was_dirty; + } + #ifdef CONFIG_BTRFS_DEBUG for (i = 0; i < num_pages; i++) ASSERT(PageDirty(eb->pages[i])); From patchwork Wed Oct 21 06:25:33 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848401 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3768BC561F8 for ; Wed, 21 Oct 2020 06:27:48 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id D5D7F21D43 for ; Wed, 21 Oct 2020 06:27:47 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com 
header.i=@suse.com header.b="sp1w9n8l" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440833AbgJUG1q (ORCPT ); Wed, 21 Oct 2020 02:27:46 -0400 Received: from mx2.suse.de ([195.135.220.15]:44308 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2436568AbgJUG1q (ORCPT ); Wed, 21 Oct 2020 02:27:46 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261664; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=MLH/ETAZLXoXbqKFrQv8hJtA0ThpJ2qJcyW+qv6f3Zk=; b=sp1w9n8lgXoU5DY6PtSr9XIvUSPK9SBtQR7yxGh0vq6CE5vl+rnrTyRv9zMCME4JduBpGM tf9z4zZNv5JbyGs1DbsNmFYA/jWxle4XFqZrcMdgArCbgvmlSzu0xKGq+VZgVb/2wdCan9 A3w22KjtWCz/74FCQ+9gS/qWbuDeHqA= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 9DFA3AC8C for ; Wed, 21 Oct 2020 06:27:44 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 47/68] btrfs: extent_io: add subpage support for clear_extent_buffer_dirty() Date: Wed, 21 Oct 2020 14:25:33 +0800 Message-Id: <20201021062554.68132-48-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org To support subpage metadata, clear_extent_buffer_dirty() needs to clear the page dirty flag if and only if all extent buffers in the page range are no longer dirty. This is quite different from the existing clear_extent_buffer_dirty() routine, so add a new helper function, clear_subpage_extent_buffer_dirty(), to do this for subpage metadata.
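The "if and only if" rule above can be illustrated with a userspace model of one 64K page holding 4K sectors. This is a sketch under stated assumptions: `struct model_page`, `range_mask()`, `model_set_eb_dirty()` and `model_clear_eb_dirty()` are hypothetical names, and a 16-bit bitmap stands in for the EXTENT_DIRTY ranges that the real code tracks in the btree io tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE		65536u
#define MODEL_SECTORSIZE	4096u

/* Hypothetical per-page model: one dirty bit per 4K sector of a 64K page. */
struct model_page {
	uint64_t start;		/* byte offset of the page */
	uint16_t dirty_sectors;	/* models EXTENT_DIRTY in the io tree */
	bool page_dirty;	/* models PageDirty() */
};

/* Build the sector bitmask covered by the extent buffer [start, start + len). */
static uint16_t range_mask(const struct model_page *page, uint64_t start,
			   uint32_t len)
{
	uint32_t first;
	uint32_t nr;

	assert(start >= page->start);
	assert(len > 0 && len % MODEL_SECTORSIZE == 0);
	first = (uint32_t)((start - page->start) / MODEL_SECTORSIZE);
	nr = len / MODEL_SECTORSIZE;
	assert(first + nr <= MODEL_PAGE_SIZE / MODEL_SECTORSIZE);
	return (uint16_t)(((1u << nr) - 1) << first);
}

/* Any dirty extent buffer makes the whole page dirty. */
static void model_set_eb_dirty(struct model_page *page, uint64_t start,
			       uint32_t len)
{
	page->dirty_sectors |= range_mask(page, start, len);
	page->page_dirty = true;
}

/* The page becomes clean only once no extent buffer in it is dirty. */
static void model_clear_eb_dirty(struct model_page *page, uint64_t start,
				 uint32_t len)
{
	page->dirty_sectors &= (uint16_t)~range_mask(page, start, len);
	if (page->dirty_sectors == 0)
		page->page_dirty = false;
}
```

Clearing one of two dirty 16K extent buffers leaves the page dirty; only clearing the last one drops the page dirty state.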
Also since the main part of clearing page dirty code is still the same, extract that into btree_clear_page_dirty() so that it can be utilized for both cases. Signed-off-by: Qu Wenruo --- fs/btrfs/extent_io.c | 47 +++++++++++++++++++++++++++++++++----------- 1 file changed, 35 insertions(+), 12 deletions(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 2cb9abdb0d60..76123d0f416a 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -5762,30 +5762,53 @@ void free_extent_buffer_stale(struct extent_buffer *eb) release_extent_buffer(eb); } +static void btree_clear_page_dirty(struct page *page) +{ + ASSERT(PageDirty(page)); + + lock_page(page); + clear_page_dirty_for_io(page); + xa_lock_irq(&page->mapping->i_pages); + if (!PageDirty(page)) + __xa_clear_mark(&page->mapping->i_pages, + page_index(page), PAGECACHE_TAG_DIRTY); + xa_unlock_irq(&page->mapping->i_pages); + ClearPageError(page); + unlock_page(page); +} + +static void clear_subpage_extent_buffer_dirty(const struct extent_buffer *eb) +{ + struct btrfs_fs_info *fs_info = eb->fs_info; + struct extent_io_tree *io_tree = info_to_btree_io_tree(fs_info); + struct page *page = eb->pages[0]; + u64 page_start = page_offset(page); + u64 page_end = page_start + PAGE_SIZE - 1; + int ret; + + clear_extent_dirty(io_tree, eb->start, eb->start + eb->len - 1, NULL); + ret = test_range_bit(io_tree, page_start, page_end, EXTENT_DIRTY, 0, NULL); + /* All extent buffers in the page range is cleared now */ + if (ret == 0 && PageDirty(page)) + btree_clear_page_dirty(page); + WARN_ON(atomic_read(&eb->refs) == 0); +} + void clear_extent_buffer_dirty(const struct extent_buffer *eb) { int i; int num_pages; struct page *page; + if (btrfs_is_subpage(eb->fs_info)) + return clear_subpage_extent_buffer_dirty(eb); num_pages = num_extent_pages(eb); for (i = 0; i < num_pages; i++) { page = eb->pages[i]; if (!PageDirty(page)) continue; - - lock_page(page); - WARN_ON(!PagePrivate(page)); - - clear_page_dirty_for_io(page); - 
xa_lock_irq(&page->mapping->i_pages); - if (!PageDirty(page)) - __xa_clear_mark(&page->mapping->i_pages, - page_index(page), PAGECACHE_TAG_DIRTY); - xa_unlock_irq(&page->mapping->i_pages); - ClearPageError(page); - unlock_page(page); + btree_clear_page_dirty(page); } WARN_ON(atomic_read(&eb->refs) == 0); } From patchwork Wed Oct 21 06:25:34 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848405 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2DDE0C561F8 for ; Wed, 21 Oct 2020 06:27:50 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id D08B521D43 for ; Wed, 21 Oct 2020 06:27:49 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="HZlL99z6" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440835AbgJUG1t (ORCPT ); Wed, 21 Oct 2020 02:27:49 -0400 Received: from mx2.suse.de ([195.135.220.15]:44348 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2436568AbgJUG1s (ORCPT ); Wed, 21 Oct 2020 02:27:48 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261666; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: 
in-reply-to:in-reply-to:references:references; bh=IxhoKgeEr3UTnU1OJxt8dgSF5UFeU09NqjuBNYOI1Yg=; b=HZlL99z6qs13cz6qyyePZPKi+A5S1RtyovIoiCKGs3mW1ZmrH0okV4dzhknbw3O8dZeDoR FTNSWe5yT6/J9OYPl2QJOarzCwfgGXqB5cI0WtBA7MuGXKqFBaakeWdZ8aDNXg12pX38RR l/6TaKWN7sLZTUatIZA8QGhbho5OoqA= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id C2AD8AC12 for ; Wed, 21 Oct 2020 06:27:46 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 48/68] btrfs: extent_io: make set_btree_ioerr() accept extent buffer Date: Wed, 21 Oct 2020 14:25:34 +0800 Message-Id: <20201021062554.68132-49-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org

The current set_btree_ioerr() only accepts a @page parameter and grabs the extent buffer from page::private.
This works fine for the sector size == PAGE_SIZE case, but not for the subpage case.

Add an extra parameter, @eb, so that callers pass the extent buffer to this function and the subpage code can reuse it.

Also, since we are here, grab "fs_info->flags" through the local fs_info directly.
Signed-off-by: Qu Wenruo --- fs/btrfs/extent_io.c | 16 +++++++--------- 1 file changed, 7 insertions(+), 9 deletions(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 76123d0f416a..1e182dfbb499 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -4047,10 +4047,9 @@ static noinline_for_stack int lock_extent_buffer_for_io(struct extent_buffer *eb return ret; } -static void set_btree_ioerr(struct page *page) +static void set_btree_ioerr(struct page *page, struct extent_buffer *eb) { - struct extent_buffer *eb = (struct extent_buffer *)page->private; - struct btrfs_fs_info *fs_info; + struct btrfs_fs_info *fs_info = eb->fs_info; SetPageError(page); if (test_and_set_bit(EXTENT_BUFFER_WRITE_ERR, &eb->bflags)) @@ -4060,7 +4059,6 @@ static void set_btree_ioerr(struct page *page) * If we error out, we should add back the dirty_metadata_bytes * to make it consistent. */ - fs_info = eb->fs_info; percpu_counter_add_batch(&fs_info->dirty_metadata_bytes, eb->len, fs_info->dirty_metadata_batch); @@ -4104,13 +4102,13 @@ static void set_btree_ioerr(struct page *page) */ switch (eb->log_index) { case -1: - set_bit(BTRFS_FS_BTREE_ERR, &eb->fs_info->flags); + set_bit(BTRFS_FS_BTREE_ERR, &fs_info->flags); break; case 0: - set_bit(BTRFS_FS_LOG1_ERR, &eb->fs_info->flags); + set_bit(BTRFS_FS_LOG1_ERR, &fs_info->flags); break; case 1: - set_bit(BTRFS_FS_LOG2_ERR, &eb->fs_info->flags); + set_bit(BTRFS_FS_LOG2_ERR, &fs_info->flags); break; default: BUG(); /* unexpected, logic error */ @@ -4135,7 +4133,7 @@ static void end_bio_extent_buffer_writepage(struct bio *bio) if (bio->bi_status || test_bit(EXTENT_BUFFER_WRITE_ERR, &eb->bflags)) { ClearPageUptodate(page); - set_btree_ioerr(page); + set_btree_ioerr(page, eb); } end_page_writeback(page); @@ -4191,7 +4189,7 @@ static noinline_for_stack int write_one_eb(struct extent_buffer *eb, end_bio_extent_buffer_writepage, 0, 0, 0, false); if (ret) { - set_btree_ioerr(p); + set_btree_ioerr(p, eb); if (PageWriteback(p)) 
end_page_writeback(p); if (atomic_sub_and_test(num_pages - i, &eb->io_pages)) From patchwork Wed Oct 21 06:25:35 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848411 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id AEF6FC4363A for ; Wed, 21 Oct 2020 06:27:51 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4F59621D43 for ; Wed, 21 Oct 2020 06:27:51 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="pniMUdzv" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440838AbgJUG1u (ORCPT ); Wed, 21 Oct 2020 02:27:50 -0400 Received: from mx2.suse.de ([195.135.220.15]:44368 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2436568AbgJUG1u (ORCPT ); Wed, 21 Oct 2020 02:27:50 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261668; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=WQAI4oB7rxe6+gCGzPioqPaOAB3p6pckvyThWz2PNz4=; b=pniMUdzvWR+Pod6BnFnXxofUCqEHrq0r8H3AimKdRDfPhbTp8B7NdWH4Ps9VMSbxOHw5lb agPFF6+uEchOeio7pggfXhy4Ry91jtlp74unqe6ItTp8Os4l3wzVDAH54visXljTBEFYXf 
t72cVWMWdHgb527/MmQb9hKTL4TPug8= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 9866CAC1D for ; Wed, 21 Oct 2020 06:27:48 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 49/68] btrfs: extent_io: introduce write_one_subpage_eb() function Date: Wed, 21 Oct 2020 14:25:35 +0800 Message-Id: <20201021062554.68132-50-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org

The new function, write_one_subpage_eb(), is a subroutine for subpage metadata write and handles the extent buffer bio submission.

The main differences between the new write_one_subpage_eb() and write_one_eb() are:

- No page locking
  When entering write_one_subpage_eb() the page is no longer locked.
  We only lock the page for its status update, and unlock it immediately.
  Now we completely rely on extent io tree locking.

- Extra EXTENT_* bits along with page status update
  A new EXTENT_WRITEBACK bit is introduced to track extent buffer writeback.

  The page dirty bit is only cleared once all dirty extent buffers in the page range have been cleaned.
  The page writeback bit is always set, and is cleared in the error path if no other extent buffers are under writeback.
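
The page status rules above can be sketched in userspace C. This is only an illustration, not the kernel code: the struct, the bitmasks standing in for the EXTENT_DIRTY and EXTENT_WRITEBACK ranges, and every model_* name are hypothetical, and a 64K page with 16K nodesize is assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed geometry: one 64K page holding up to four 16K extent buffers. */
#define MODEL_PAGE_SIZE    65536u
#define MODEL_NODESIZE     16384u
#define MODEL_EBS_PER_PAGE (MODEL_PAGE_SIZE / MODEL_NODESIZE)

/* Hypothetical stand-in for one page plus its io tree bits. */
struct model_page {
	uint32_t eb_dirty;     /* bit i set: eb i is dirty (EXTENT_DIRTY) */
	uint32_t eb_writeback; /* bit i set: eb i is under writeback */
	bool page_dirty;
	bool page_writeback;
};

/*
 * Start writeback for extent buffer `idx`.
 * Returns true if this call cleared the page dirty bit, i.e. no dirty
 * extent buffer is left in the page.
 */
static bool model_start_eb_writeback(struct model_page *p, unsigned int idx)
{
	uint32_t bit;

	assert(idx < MODEL_EBS_PER_PAGE);
	bit = 1u << idx;

	/* Convert dirty to writeback for this extent buffer only. */
	p->eb_dirty &= ~bit;
	p->eb_writeback |= bit;

	/* Any extent buffer writeback marks the full page writeback. */
	p->page_writeback = true;

	/* Only clear page dirty when no dirty extent buffer remains. */
	if (p->eb_dirty == 0) {
		p->page_dirty = false;
		return true;
	}
	return false;
}
```

With two dirty extent buffers in the page, the first call leaves the page dirty and the second one clears it, which is the point where the patch bumps nr_written.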
Signed-off-by: Qu Wenruo --- fs/btrfs/extent-io-tree.h | 3 ++ fs/btrfs/extent_io.c | 79 +++++++++++++++++++++++++++++++++++++++ 2 files changed, 82 insertions(+) diff --git a/fs/btrfs/extent-io-tree.h b/fs/btrfs/extent-io-tree.h index bb95c6b9ad82..1658854efd70 100644 --- a/fs/btrfs/extent-io-tree.h +++ b/fs/btrfs/extent-io-tree.h @@ -35,6 +35,9 @@ struct io_failure_record; */ #define EXTENT_READ_SUBMITTED (1U << 16) +/* For subpage btree io tree, indicates the range is under writeback */ +#define EXTENT_WRITEBACK (1U << 17) + #define EXTENT_DO_ACCOUNTING (EXTENT_CLEAR_META_RESV | \ EXTENT_CLEAR_DATA_RESV) #define EXTENT_CTLBITS (EXTENT_DO_ACCOUNTING) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 1e182dfbb499..a1e039848539 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -3243,6 +3243,7 @@ static int submit_extent_page(unsigned int opf, ASSERT(bio_ret); if (*bio_ret) { + bool force_merge = false; bool contig; bool can_merge = true; @@ -3268,6 +3269,7 @@ static int submit_extent_page(unsigned int opf, if (prev_bio_flags != bio_flags || !contig || !can_merge || force_bio_submit || bio_add_page(bio, page, io_size, pg_offset) < io_size) { + ASSERT(!force_merge); ret = submit_one_bio(bio, mirror_num, prev_bio_flags); if (ret < 0) { *bio_ret = NULL; @@ -4147,6 +4149,80 @@ static void end_bio_extent_buffer_writepage(struct bio *bio) bio_put(bio); } +/* + * Unlike the work in write_one_eb(), we won't unlock the page even if we + * succeeded submitting the extent buffer. + * It's the caller's responsibility to unlock the page after all extent + * + * Caller should still call write_one_eb() rather than this function directly. + * As write_one_eb() has extra preparation before submitting the extent buffer. 
+ */ +static int write_one_subpage_eb(struct extent_buffer *eb, + struct writeback_control *wbc, + struct extent_page_data *epd) +{ + struct btrfs_fs_info *fs_info = eb->fs_info; + struct extent_state *cached = NULL; + struct extent_io_tree *io_tree = info_to_btree_io_tree(fs_info); + struct page *page = eb->pages[0]; + u64 page_start = page_offset(page); + u64 page_end = page_start + PAGE_SIZE - 1; + unsigned int write_flags = wbc_to_write_flags(wbc) | REQ_META; + bool no_dirty_ebs = false; + int ret; + + /* Convert the EXTENT_DIRTY to EXTENT_WRITEBACK for this eb */ + ret = convert_extent_bit(io_tree, eb->start, eb->start + eb->len - 1, + EXTENT_WRITEBACK, EXTENT_DIRTY, &cached); + if (ret < 0) + return ret; + /* + * Only clear page dirty if there is no dirty extent buffer in the + * page range + * + * Also since clear_page_dirty_for_io() needs page locked, here we lock + * the page just to shut up the MM code. + */ + lock_page(page); + if (!test_range_bit(io_tree, page_start, page_end, EXTENT_DIRTY, 0, + cached)) { + clear_page_dirty_for_io(page); + no_dirty_ebs = true; + } + /* Any extent buffer writeback will mark the full page writeback */ + set_page_writeback(page); + + ret = submit_extent_page(REQ_OP_WRITE | write_flags, wbc, page, + eb->start, eb->len, eb->start - page_offset(page), + &epd->bio, end_bio_extent_buffer_writepage, 0, 0, 0, + false); + if (ret) { + clear_extent_bit(io_tree, eb->start, eb->start + eb->len - 1, + EXTENT_WRITEBACK, 0, 0, &cached); + set_btree_ioerr(page, eb); + if (PageWriteback(page) && + !test_range_bit(io_tree, page_start, page_end, + EXTENT_WRITEBACK, 0, cached)) + end_page_writeback(page); + unlock_page(page); + + if (atomic_dec_and_test(&eb->io_pages)) + end_extent_buffer_writeback(eb); + free_extent_state(cached); + return -EIO; + } + unlock_page(page); + free_extent_state(cached); + /* + * Submission finishes without problem, if no eb is dirty anymore, we + * have submitted a page. + * Update the nr_written in wbc. 
+ */ + if (no_dirty_ebs) + update_nr_written(wbc, 1); + return ret; +} + static noinline_for_stack int write_one_eb(struct extent_buffer *eb, struct writeback_control *wbc, struct extent_page_data *epd) @@ -4178,6 +4254,9 @@ static noinline_for_stack int write_one_eb(struct extent_buffer *eb, memzero_extent_buffer(eb, start, end - start); } + if (btrfs_is_subpage(eb->fs_info)) + return write_one_subpage_eb(eb, wbc, epd); + for (i = 0; i < num_pages; i++) { struct page *p = eb->pages[i]; From patchwork Wed Oct 21 06:25:36 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848415 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id AD81AC561F8 for ; Wed, 21 Oct 2020 06:27:53 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 482AD21D43 for ; Wed, 21 Oct 2020 06:27:53 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="ioV7EDrU" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440839AbgJUG1w (ORCPT ); Wed, 21 Oct 2020 02:27:52 -0400 Received: from mx2.suse.de ([195.135.220.15]:44422 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2436568AbgJUG1w (ORCPT ); Wed, 21 Oct 2020 02:27:52 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; 
t=1603261670; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=RLtAXF9WHSiEw0hv457BhohOUxodBdrQq/83YGtEv2w=; b=ioV7EDrUBGlMcHP1vpzSFgIWxlnCSsj8i9EdN695JK1zW8p5PlkQPfE3mJILqz0Wy+BnRP 5lyJN5Ot36DAih2i7JrJ01MzTR92WbVRUiwBuFI/uBF4wHchxAz0/G009886/bNKYygLew gPkssw3H6G4+g0O6UGOD398+qFYIDuY= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 576F2AC1D for ; Wed, 21 Oct 2020 06:27:50 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 50/68] btrfs: extent_io: make lock_extent_buffer_for_io() subpage compatible Date: Wed, 21 Oct 2020 14:25:36 +0800 Message-Id: <20201021062554.68132-51-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org

To support subpage metadata locking, the following aspects are modified:

- Locking sequence
  For regular sectorsize, we lock the extent buffer first, then lock each page.
  For subpage sectorsize, we only lock the extent buffer and do not lock the page, as one page can contain multiple extent buffers.

- Extent io tree locking
  For subpage metadata, we also lock the range in the btree io tree.
  This allows the endio function to get an unmerged extent_state, so that the endio function doesn't need to allocate memory in atomic context.
  This also follows the behavior of the metadata read path.
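
A minimal userspace sketch of the exit-path rule behind the subpage locking follows. It is not the kernel function: struct lk_eb, its fields and lk_lock_eb_for_io() are made-up names, and the io tree range lock is reduced to a boolean. The only thing it illustrates is that the range lock is kept when the extent buffer is going to be submitted and dropped on every other exit.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one extent buffer; not the kernel structure. */
struct lk_eb {
	bool dirty;
	bool writeback;
	bool range_locked; /* stands in for the io tree range lock */
};

/*
 * Returns 1 if the extent buffer should be submitted, 0 otherwise.
 * In subpage mode the range lock stays held on return 1 (the patch
 * series releases it at endio time) and is dropped on the other exit.
 */
static int lk_lock_eb_for_io(struct lk_eb *eb, bool subpage)
{
	bool range_locked = false;

	if (subpage) {
		/* Lock the range first, before touching the eb state. */
		eb->range_locked = true;
		range_locked = true;
	}

	/* Nothing to write back: fall through to the common unlock. */
	if (!eb->dirty)
		goto out;

	eb->dirty = false;
	eb->writeback = true;
	return 1;

out:
	if (range_locked)
		eb->range_locked = false;
	return 0;
}
```

Funnelling the early exits through one label keeps the unlock in a single place, which is the same reason the patch converts the bare returns into "goto out".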
Signed-off-by: Qu Wenruo --- fs/btrfs/extent_io.c | 44 ++++++++++++++++++++++++++++++++++++++------ 1 file changed, 38 insertions(+), 6 deletions(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index a1e039848539..d07972f94c40 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -3943,6 +3943,9 @@ static void end_extent_buffer_writeback(struct extent_buffer *eb) * Lock extent buffer status and pages for write back. * * May try to flush write bio if we can't get the lock. + * For subpage extent buffer, caller is responsible to lock the page, we won't + * flush write bio, which can cause extent buffers in the same page submitted + * to different bios. * * Return 0 if the extent buffer doesn't need to be submitted. * (E.g. the extent buffer is not dirty) @@ -3953,26 +3956,41 @@ static noinline_for_stack int lock_extent_buffer_for_io(struct extent_buffer *eb struct extent_page_data *epd) { struct btrfs_fs_info *fs_info = eb->fs_info; + struct extent_io_tree *io_tree = info_to_btree_io_tree(fs_info); int i, num_pages, failed_page_nr; + bool extent_locked = false; int flush = 0; int ret = 0; + if (btrfs_is_subpage(fs_info)) { + /* + * Also lock the range so that endio can always get unmerged + * extent_state. 
+ */ + ret = lock_extent(io_tree, eb->start, eb->start + eb->len - 1); + if (ret < 0) + goto out; + extent_locked = true; + } + if (!btrfs_try_tree_write_lock(eb)) { ret = flush_write_bio(epd); if (ret < 0) - return ret; + goto out; flush = 1; btrfs_tree_lock(eb); } if (test_bit(EXTENT_BUFFER_WRITEBACK, &eb->bflags)) { btrfs_tree_unlock(eb); - if (!epd->sync_io) - return 0; + if (!epd->sync_io) { + ret = 0; + goto out; + } if (!flush) { ret = flush_write_bio(epd); if (ret < 0) - return ret; + goto out; flush = 1; } while (1) { @@ -3998,13 +4016,22 @@ static noinline_for_stack int lock_extent_buffer_for_io(struct extent_buffer *eb -eb->len, fs_info->dirty_metadata_batch); ret = 1; + btrfs_tree_unlock(eb); } else { spin_unlock(&eb->refs_lock); + btrfs_tree_unlock(eb); + if (extent_locked) + unlock_extent(io_tree, eb->start, + eb->start + eb->len - 1); } - btrfs_tree_unlock(eb); - if (!ret) + /* + * Either the tree does not need to be submitted, or we're + * submitting subpage extent buffer. + * Either way we don't need to lock the page(s). 
+ */ + if (!ret || btrfs_is_subpage(fs_info)) return ret; num_pages = num_extent_pages(eb); @@ -4046,6 +4073,11 @@ static noinline_for_stack int lock_extent_buffer_for_io(struct extent_buffer *eb fs_info->dirty_metadata_batch); btrfs_clear_header_flag(eb, BTRFS_HEADER_FLAG_WRITTEN); btrfs_tree_unlock(eb); + /* Subpage should never reach this routine */ + ASSERT(!btrfs_is_subpage(fs_info)); +out: + if (extent_locked) + unlock_extent(io_tree, eb->start, eb->start + eb->len - 1); return ret; } From patchwork Wed Oct 21 06:25:37 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848407 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A3479C561F8 for ; Wed, 21 Oct 2020 06:27:56 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 5D7E621D43 for ; Wed, 21 Oct 2020 06:27:56 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="IK3OzQME" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440841AbgJUG1z (ORCPT ); Wed, 21 Oct 2020 02:27:55 -0400 Received: from mx2.suse.de ([195.135.220.15]:44512 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2436568AbgJUG1z (ORCPT ); Wed, 21 Oct 2020 02:27:55 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; 
t=1603261673; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=wvPHj1HR6QDL6G35iS2CD16teJ9ajGyv+jbBB1iGOKA=; b=IK3OzQMELEzCkRrK1iTdnMV+wyDkODwedjVYbULMr/ogCcQ9FwwsO7YNp2ry3B9g2gkly+ 4YS/vAXuxcuWSyh6WSRqptW9Nfe/OduRjrpE/UX7UjL3HbxMfGxdFLmSfcMc6p3iNcfjEb lpceQKCU/1pABnx7w0OJ744Fvop0duY= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 519A9AC12 for ; Wed, 21 Oct 2020 06:27:53 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 51/68] btrfs: extent_io: introduce submit_btree_subpage() to submit a page for subpage metadata write Date: Wed, 21 Oct 2020 14:25:37 +0800 Message-Id: <20201021062554.68132-52-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org

The new function, submit_btree_subpage(), submits all the dirty extent buffers in the page.

The major differences from submit_btree_page() are:

- Page locking sequence
  Now we lock the page first and then lock the extent buffers, thus we don't need to unlock the page just after writing one extent buffer.
  The page gets unlocked after we have submitted all extent buffers.

- Bio submission
  Since one extent buffer is guaranteed to be contained in one page, we call submit_extent_page() directly.
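
The walk over the dirty extent buffers of one page can be sketched as below. This is a simplified userspace model under assumptions: four extent buffer slots per page, a bitmask in place of the EXTENT_DIRTY lookup, and hypothetical sp_* names. Note that this sketch advances the cursor before doing anything else, so the loop always makes progress.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed geometry: four extent buffer slots per page. */
#define SP_EBS_PER_PAGE 4u

/* Hypothetical lookup: lowest dirty slot at or after `start`, or -1. */
static int sp_find_first_dirty(uint32_t dirty, unsigned int start)
{
	for (unsigned int i = start; i < SP_EBS_PER_PAGE; i++)
		if (dirty & (1u << i))
			return (int)i;
	return -1;
}

/*
 * Submit every dirty slot of one page, lowest slot first.
 * Clears the slot in *dirty, records it in *submitted_mask and
 * returns the number of slots submitted.
 */
static int sp_submit_page(uint32_t *dirty, uint32_t *submitted_mask)
{
	unsigned int cur = 0;
	int submitted = 0;

	while (cur < SP_EBS_PER_PAGE) {
		int slot = sp_find_first_dirty(*dirty, cur);

		if (slot < 0)
			break;
		/* Move past this slot before anything can skip it. */
		cur = (unsigned int)slot + 1;

		*dirty &= ~(1u << (unsigned int)slot);
		*submitted_mask |= 1u << (unsigned int)slot;
		submitted++;
	}
	return submitted;
}
```

As in the patch, the return value is the number of submitted extent buffers; error handling and reference counting are left out of the sketch.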
Signed-off-by: Qu Wenruo --- fs/btrfs/extent_io.c | 64 ++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 64 insertions(+) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index d07972f94c40..3a2bb2656067 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -4324,6 +4324,67 @@ static noinline_for_stack int write_one_eb(struct extent_buffer *eb, return ret; } +/* + * A helper to submit one subpage btree page. + * + * The main difference between submit_btree_page() is: + * - Page locking sequence + * Page are locked first, then lock extent buffers + * + * - Flush write bio + * We only flush bio if we may be unable to fit current extent buffers into + * current bio. + * + * Return >=0 for the number of submitted extent buffers. + * Return <0 for fatal error. + */ +static int submit_btree_subpage(struct page *page, + struct writeback_control *wbc, + struct extent_page_data *epd) +{ + struct btrfs_fs_info *fs_info = page_to_fs_info(page); + int submitted = 0; + u64 page_start = page_offset(page); + u64 page_end = page_start + PAGE_SIZE - 1; + u64 cur = page_start; + int ret; + + /* Lock and write each extent buffers in the range */ + while (cur <= page_end) { + struct extent_buffer *eb; + + ret = btrfs_find_first_subpage_eb(fs_info, &eb, cur, page_end, + EXTENT_DIRTY); + if (ret > 0) + break; + ret = atomic_inc_not_zero(&eb->refs); + if (!ret) + continue; + + cur = eb->start + eb->len; + ret = lock_extent_buffer_for_io(eb, epd); + if (ret == 0) { + free_extent_buffer(eb); + continue; + } + if (ret < 0) { + free_extent_buffer(eb); + goto cleanup; + } + ret = write_one_eb(eb, wbc, epd); + free_extent_buffer(eb); + if (ret < 0) + goto cleanup; + submitted++; + } + return submitted; + +cleanup: + /* We hit error, end bio for the submitted extent buffers */ + end_write_bio(epd, ret); + return ret; +} + /* * A helper to submit a btree page. 
* @@ -4349,6 +4410,9 @@ static int submit_btree_page(struct page *page, struct writeback_control *wbc, if (!PagePrivate(page)) return 0; + if (btrfs_is_subpage(page_to_fs_info(page))) + return submit_btree_subpage(page, wbc, epd); + spin_lock(&mapping->private_lock); if (!PagePrivate(page)) { spin_unlock(&mapping->private_lock); From patchwork Wed Oct 21 06:25:38 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848399 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E59BEC4363A for ; Wed, 21 Oct 2020 06:27:59 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 8DDD921D43 for ; Wed, 21 Oct 2020 06:27:59 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="RtGCL3zA" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440845AbgJUG16 (ORCPT ); Wed, 21 Oct 2020 02:27:58 -0400 Received: from mx2.suse.de ([195.135.220.15]:44552 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2436568AbgJUG16 (ORCPT ); Wed, 21 Oct 2020 02:27:58 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261676; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: 
content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=idpZRZ1OBipN/QCRLaxG4YKvRobevBpJzIzLrSQyf+4=; b=RtGCL3zAIYg6uhAWjb0LAkWfiABIIM0aNnnkfBx7KYVHjTg2j7EPUrwbc4lWylMf9ZUl7r 47paCXhA/IePzurfS3yJi+sa/n8HFZdIspfG7Ea05Y+eN25tNDE3y9vTPhRpQRLRzPBoBU cBjfNccnJQFl/tfrmRPpZztW7FoPXR0= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id C8BA7AC12 for ; Wed, 21 Oct 2020 06:27:56 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 52/68] btrfs: extent_io: introduce end_bio_subpage_eb_writepage() function Date: Wed, 21 Oct 2020 14:25:38 +0800 Message-Id: <20201021062554.68132-53-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org

The new function, end_bio_subpage_eb_writepage(), handles the metadata writeback endio.

The major differences involved are:

- Page Writeback clear
  We only clear the page writeback bit after all extent buffers in the same page have finished their writeback.
  This means we need to check the EXTENT_WRITEBACK bit for the page range.

- Clear EXTENT_WRITEBACK bit for btree inode
  This is the new bit for the btree inode io tree. It emulates the same page status, but in a sector size aligned range.
  The new bit is remapped from EXTENT_DEFRAG. As defrag is impossible for the btree inode, it should be pretty safe to use.

Also, since the new endio function needs quite a few extent io tree operations, change btree_submit_bio_hook() to queue the endio work into the metadata endio workqueue.
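
The "only end page writeback after the last extent buffer" rule can be illustrated with a small userspace model. The struct and function names here are hypothetical, and a bitmask stands in for the EXTENT_WRITEBACK range check on the io tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-page writeback state; not a kernel structure. */
struct wb_page {
	uint32_t eb_writeback; /* bit i set: eb i still under writeback */
	bool page_writeback;
};

/*
 * Called when writeback of extent buffer `idx` completes.
 * Returns true if this completion also ended the page writeback,
 * i.e. no other extent buffer in the page is under writeback.
 */
static bool wb_end_eb(struct wb_page *p, unsigned int idx)
{
	assert(idx < 32);

	/* Clear the writeback state of this extent buffer only. */
	p->eb_writeback &= ~(1u << idx);

	/* End page writeback once the whole page range is clean. */
	if (p->eb_writeback == 0 && p->page_writeback) {
		p->page_writeback = false;
		return true;
	}
	return false;
}
```

With two extent buffers in flight, the first completion leaves the page under writeback and the second one ends it.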
Signed-off-by: Qu Wenruo --- fs/btrfs/disk-io.c | 21 ++++++++++++- fs/btrfs/extent_io.c | 73 ++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 93 insertions(+), 1 deletion(-) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index e466c30b52c8..2ac980f739dc 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -961,6 +961,7 @@ blk_status_t btrfs_wq_submit_bio(struct inode *inode, struct bio *bio, async->mirror_num = mirror_num; async->submit_bio_start = submit_bio_start; + btrfs_init_work(&async->work, run_one_async_start, run_one_async_done, run_one_async_free); @@ -1031,7 +1032,25 @@ static blk_status_t btree_submit_bio_hook(struct inode *inode, struct bio *bio, if (ret) goto out_w_error; ret = btrfs_map_bio(fs_info, bio, mirror_num); - } else if (!async) { + if (ret < 0) + goto out_w_error; + return ret; + } + + /* + * For subpage metadata write, the endio involves several + * extent_io_tree operations, which are not suitable for endio + * context. + * Thus we need to queue them into endio workqueue. + */ + if (btrfs_is_subpage(fs_info)) { + ret = btrfs_bio_wq_end_io(fs_info, bio, + BTRFS_WQ_ENDIO_METADATA); + if (ret) + goto out_w_error; + } + + if (!async) { ret = btree_csum_one_bio(bio); if (ret) goto out_w_error; diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 3a2bb2656067..2a66bfae3414 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -4149,6 +4149,76 @@ static void set_btree_ioerr(struct page *page, struct extent_buffer *eb) } } +/* + * The endio function for subpage extent buffer write. + * + * Unlike end_bio_extent_buffer_writepage(), we only call end_page_writeback() + * after all extent buffers in the page have finished their writeback. 
+ */ +static void end_bio_subpage_eb_writepage(struct bio *bio) +{ + struct bio_vec *bvec; + struct bvec_iter_all iter_all; + + ASSERT(!bio_flagged(bio, BIO_CLONED)); + bio_for_each_segment_all(bvec, bio, iter_all) { + struct page *page = bvec->bv_page; + struct btrfs_fs_info *fs_info = page_to_fs_info(page); + struct extent_buffer *eb; + u64 page_start = page_offset(page); + u64 page_end = page_start + PAGE_SIZE - 1; + u64 bvec_start = page_offset(page) + bvec->bv_offset; + u64 bvec_end = bvec_start + bvec->bv_len - 1; + u64 cur_bytenr = bvec_start; + + ASSERT(IS_ALIGNED(bvec->bv_len, fs_info->nodesize)); + + /* Iterate through all extent buffers in the range */ + while (cur_bytenr <= bvec_end) { + struct extent_state *cached = NULL; + struct extent_io_tree *io_tree = + info_to_btree_io_tree(fs_info); + int done; + int ret; + + ret = btrfs_find_first_subpage_eb(fs_info, &eb, + cur_bytenr, bvec_end, 0); + if (ret > 0) + break; + + cur_bytenr = eb->start + eb->len; + + ASSERT(test_bit(EXTENT_BUFFER_WRITEBACK, &eb->bflags)); + done = atomic_dec_and_test(&eb->io_pages); + ASSERT(done); + + if (bio->bi_status || + test_bit(EXTENT_BUFFER_WRITE_ERR, &eb->bflags)) { + ClearPageUptodate(page); + set_btree_ioerr(page, eb); + } + + clear_extent_bit(io_tree, eb->start, + eb->start + eb->len - 1, + EXTENT_WRITEBACK | EXTENT_LOCKED, 1, 0, + &cached); + lock_page(page); + /* + * Only end the page writeback if there is no extent + * buffer under writeback in the page anymore + */ + if (!test_range_bit(io_tree, page_start, page_end, + EXTENT_WRITEBACK, 0, cached) && + PageWriteback(page)) + end_page_writeback(page); + unlock_page(page); + free_extent_state(cached); + end_extent_buffer_writeback(eb); + } + } + bio_put(bio); +} + static void end_bio_extent_buffer_writepage(struct bio *bio) { struct bio_vec *bvec; @@ -4156,6 +4226,9 @@ static void end_bio_extent_buffer_writepage(struct bio *bio) int done; struct bvec_iter_all iter_all; + if 
(btrfs_is_subpage(page_to_fs_info(bio_first_page_all(bio)))) + return end_bio_subpage_eb_writepage(bio); + ASSERT(!bio_flagged(bio, BIO_CLONED)); bio_for_each_segment_all(bvec, bio, iter_all) { struct page *page = bvec->bv_page; From patchwork Wed Oct 21 06:25:39 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848413 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 351FAC4363A for ; Wed, 21 Oct 2020 06:28:03 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id CC45B21D43 for ; Wed, 21 Oct 2020 06:28:02 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="WNqxVjlQ" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440846AbgJUG2C (ORCPT ); Wed, 21 Oct 2020 02:28:02 -0400 Received: from mx2.suse.de ([195.135.220.15]:44598 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2408802AbgJUG2B (ORCPT ); Wed, 21 Oct 2020 02:28:01 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261680; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=fUouxSvb0OqwwI1yfTd5VmIiA/R20YUcjnuNaFs3tHc=; 
b=WNqxVjlQQa74OUIR+9YwNEVS+WlXiNgxdwrWu0QM7iyKMlzPQQmtbUDP7/FaWWU8ey8rZ6 7rdnhZ6MuQEa7oK9DtxVXoArmvH/h4EdBeuUm/0htnS0dXg90JUQjIPN7hjdRy+oCcNElq T8RqfDO6FK+Towgek6lab8Ph3uyznTA= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 07E82AC12 for ; Wed, 21 Oct 2020 06:28:00 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 53/68] btrfs: inode: make can_nocow_extent() check only return 1 if the range is no smaller than PAGE_SIZE Date: Wed, 21 Oct 2020 14:25:39 +0800 Message-Id: <20201021062554.68132-54-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org

For subpage, we can still get sector-aligned extent mappings, which can lead to the following case:

	0       16K     32K     48K     64K
	|///////|                       |
	    |           \- Hole
	    \- NODATACOW extent

If we want to dirty page range [0, 64K) for a new write and need to check the nocow status, can_nocow_extent() would return 1, with length 16K.

But the current subpage data write support can only write a full page, and the range [16K, 64K) is a hole where writes must be COWed.

To solve the problem, make can_nocow_extent() do an extra check on the returned length.
If the result is smaller than one page, we return 0.

This behavior change won't affect regular sector size support, since in that case num_bytes should already be page aligned.

Also modify the callers to always pass a page aligned offset for subpage support.
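
The extra length check can be shown with the numbers from the example above. This is a standalone sketch, not can_nocow_extent() itself: nc_can_nocow() and NC_PAGE_SIZE are hypothetical names, a 64K page size is assumed, and all the other checks of the real function are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed page size for the example: 64K. */
#define NC_PAGE_SIZE 65536ull

/*
 * Returns 1 if a NOCOW write is acceptable for a range that starts at
 * `offset` and has `nocow_len` bytes of NODATACOW extent, 0 if the
 * write has to fall back to COW.
 */
static int nc_can_nocow(uint64_t offset, uint64_t nocow_len)
{
	/* Callers are expected to pass a page aligned offset. */
	assert(offset % NC_PAGE_SIZE == 0);

	/* A range shorter than one page cannot cover a full page write. */
	if (nocow_len < NC_PAGE_SIZE)
		return 0;
	return 1;
}
```

For the layout in the diagram, nc_can_nocow(0, 16384) rejects the NOCOW write because only the first 16K of the page is backed by the NODATACOW extent.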
Signed-off-by: Qu Wenruo --- fs/btrfs/file.c | 7 +++---- fs/btrfs/inode.c | 15 +++++++++++++++ 2 files changed, 18 insertions(+), 4 deletions(-) diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index d3766d2bb8d6..a2009127ef96 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -1535,8 +1535,8 @@ lock_and_cleanup_extent_if_need(struct btrfs_inode *inode, struct page **pages, static int check_can_nocow(struct btrfs_inode *inode, loff_t pos, size_t *write_bytes, bool nowait) { - struct btrfs_fs_info *fs_info = inode->root->fs_info; struct btrfs_root *root = inode->root; + u32 blocksize = PAGE_SIZE; u64 lockstart, lockend; u64 num_bytes; int ret; @@ -1547,9 +1547,8 @@ static int check_can_nocow(struct btrfs_inode *inode, loff_t pos, if (!nowait && !btrfs_drew_try_write_lock(&root->snapshot_lock)) return -EAGAIN; - lockstart = round_down(pos, fs_info->sectorsize); - lockend = round_up(pos + *write_bytes, - fs_info->sectorsize) - 1; + lockstart = round_down(pos, blocksize); + lockend = round_up(pos + *write_bytes, blocksize) - 1; num_bytes = lockend - lockstart + 1; if (nowait) { diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index f22ee5d3c105..8551815c4d65 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -7006,6 +7006,11 @@ noinline int can_nocow_extent(struct inode *inode, u64 offset, u64 *len, int found_type; bool nocow = (BTRFS_I(inode)->flags & BTRFS_INODE_NODATACOW); + /* + * We should only do full page write even for subpage. Thus the offset + * should always be page aligned. + */ + ASSERT(IS_ALIGNED(offset, PAGE_SIZE)); path = btrfs_alloc_path(); if (!path) return -ENOMEM; @@ -7121,6 +7126,16 @@ noinline int can_nocow_extent(struct inode *inode, u64 offset, u64 *len, disk_bytenr += offset - key.offset; if (csum_exist_in_range(fs_info, disk_bytenr, num_bytes)) goto out; + + /* + * If the nocow range is smaller than one page, it doesn't make any + * sense for subpage case, as we can only submit full page write yet. 
+ */ + if (num_bytes < PAGE_SIZE) { + ret = 0; + goto out; + } + /* * all of the above have passed, it is safe to overwrite this extent * without cow From patchwork Wed Oct 21 06:25:40 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848417 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0CF43C4363A for ; Wed, 21 Oct 2020 06:28:06 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A85ED21D43 for ; Wed, 21 Oct 2020 06:28:05 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="GfzUBEAV" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2501868AbgJUG2F (ORCPT ); Wed, 21 Oct 2020 02:28:05 -0400 Received: from mx2.suse.de ([195.135.220.15]:44634 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2408802AbgJUG2E (ORCPT ); Wed, 21 Oct 2020 02:28:04 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261683; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=jVZud4cj5zrtV0B8485SsIbYPaDs9+bZZ+IwDCUFdMY=; b=GfzUBEAV948niymGwjycOkLqayShf/FCI1DzIA0bJBZ4K6/qsumwu++IWsxfZSj/59/BRu 
6hXqUlvxeQ5jVxOAYbe4O/hCSyjihGWsZgkieZKO3XkKk1grjuB7ohU+J5/ZfS8kegN8Sx T+3Dby7NYzFHrtUbq6Fnw4oQvEQEO2Q= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 319B7AC12 for ; Wed, 21 Oct 2020 06:28:03 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 54/68] btrfs: file: calculate reserve space based on PAGE_SIZE for buffered write Date: Wed, 21 Oct 2020 14:25:40 +0800 Message-Id: <20201021062554.68132-55-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org In theory btrfs_buffered_write() should reserve space using sector size. But for now let's base all reserve on PAGE_SIZE, this would make later subpage support to always submit full page write. This would cause more data space usage, but greatly simplify the subpage data write support. Signed-off-by: Qu Wenruo --- fs/btrfs/file.c | 38 +++++++++++--------------------------- 1 file changed, 11 insertions(+), 27 deletions(-) diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index a2009127ef96..564784a5c0c0 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -1650,7 +1650,6 @@ static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb, while (iov_iter_count(i) > 0) { struct extent_state *cached_state = NULL; size_t offset = offset_in_page(pos); - size_t sector_offset; size_t write_bytes = min(iov_iter_count(i), nrptrs * (size_t)PAGE_SIZE - offset); @@ -1659,7 +1658,6 @@ static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb, size_t reserve_bytes; size_t dirty_pages; size_t copied; - size_t dirty_sectors; size_t num_sectors; int extents_locked; @@ -1675,9 +1673,7 @@ static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb, } only_release_metadata = false; - sector_offset = pos & (fs_info->sectorsize - 1); - reserve_bytes = round_up(write_bytes + 
sector_offset, - fs_info->sectorsize); + reserve_bytes = round_up(write_bytes + offset, PAGE_SIZE); extent_changeset_release(data_reserved); ret = btrfs_check_data_free_space(BTRFS_I(inode), @@ -1697,9 +1693,8 @@ static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb, */ num_pages = DIV_ROUND_UP(write_bytes + offset, PAGE_SIZE); - reserve_bytes = round_up(write_bytes + - sector_offset, - fs_info->sectorsize); + reserve_bytes = round_up(write_bytes + offset, + PAGE_SIZE); } else { break; } @@ -1750,9 +1745,6 @@ static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb, copied = btrfs_copy_from_user(pos, write_bytes, pages, i); num_sectors = BTRFS_BYTES_TO_BLKS(fs_info, reserve_bytes); - dirty_sectors = round_up(copied + sector_offset, - fs_info->sectorsize); - dirty_sectors = BTRFS_BYTES_TO_BLKS(fs_info, dirty_sectors); /* * if we have trouble faulting in the pages, fall @@ -1763,35 +1755,29 @@ static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb, if (copied == 0) { force_page_uptodate = true; - dirty_sectors = 0; dirty_pages = 0; } else { force_page_uptodate = false; - dirty_pages = DIV_ROUND_UP(copied + offset, - PAGE_SIZE); + dirty_pages = DIV_ROUND_UP(copied + offset, PAGE_SIZE); } - if (num_sectors > dirty_sectors) { + if (num_pages > dirty_pages) { /* release everything except the sectors we dirtied */ - release_bytes -= dirty_sectors << - fs_info->sb->s_blocksize_bits; + release_bytes -= dirty_pages << PAGE_SHIFT; if (only_release_metadata) { btrfs_delalloc_release_metadata(BTRFS_I(inode), release_bytes, true); } else { u64 __pos; - __pos = round_down(pos, - fs_info->sectorsize) + + __pos = round_down(pos, PAGE_SIZE) + (dirty_pages << PAGE_SHIFT); btrfs_delalloc_release_space(BTRFS_I(inode), data_reserved, __pos, release_bytes, true); } } - - release_bytes = round_up(copied + sector_offset, - fs_info->sectorsize); + release_bytes = round_up(copied + offset, PAGE_SIZE); if (copied > 0) ret = btrfs_dirty_pages(BTRFS_I(inode), pages, @@ 
-1822,10 +1808,8 @@ static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb, btrfs_check_nocow_unlock(BTRFS_I(inode)); if (only_release_metadata && copied > 0) { - lockstart = round_down(pos, - fs_info->sectorsize); - lockend = round_up(pos + copied, - fs_info->sectorsize) - 1; + lockstart = round_down(pos, PAGE_SIZE); + lockend = round_up(pos + copied, PAGE_SIZE) - 1; set_extent_bit(&BTRFS_I(inode)->io_tree, lockstart, lockend, EXTENT_NORESERVE, NULL, @@ -1852,7 +1836,7 @@ static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb, } else { btrfs_delalloc_release_space(BTRFS_I(inode), data_reserved, - round_down(pos, fs_info->sectorsize), + round_down(pos, PAGE_SIZE), release_bytes, true); } } From patchwork Wed Oct 21 06:25:41 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848423 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 931D2C561F8 for ; Wed, 21 Oct 2020 06:28:08 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 3BBA421D43 for ; Wed, 21 Oct 2020 06:28:08 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="ax5G4nk0" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440850AbgJUG2H (ORCPT ); Wed, 21 Oct 2020 02:28:07 -0400 Received: from mx2.suse.de ([195.135.220.15]:44668 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by 
vger.kernel.org with ESMTP id S2408802AbgJUG2G (ORCPT ); Wed, 21 Oct 2020 02:28:06 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261685; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=BFzCJG/dqdXVRC4Qbca7pddtH8GvWutEoImar3oxE1E=; b=ax5G4nk0p8+fjbcr837n2QgA8K3rTsqVtoYMZn2L89opp79aVkt8ViLhW98iVypkcu7L+1 5xw5ljo+viVNzsK2vSbs0kcdnuCcfQlonQQUWvo/Q+0EkUBhqzP0kk3kVps3LLnZIMAxVg PHG8EUQNOY/K0KqyfwpvHYAyuWwEEuk= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 0A653AC12 for ; Wed, 21 Oct 2020 06:28:05 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 55/68] btrfs: file: make hole punching page aligned for subpage Date: Wed, 21 Oct 2020 14:25:41 +0800 Message-Id: <20201021062554.68132-56-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Since the current subpage data write path only supports full page writes, make hole punching follow page size instead of sector size. There is also an optimization branch which skips any existing holes, but since we can still have subpage holes in the hole punching range, that optimization needs to be disabled in the subpage case. Update the related comment for subpage support, explaining why we don't want that optimization. 
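As a rough userspace sketch (not part of this patch), the effect of switching the block size from sector size to page size on the locked range can be modelled like this. The struct and function names are hypothetical; only the arithmetic follows the round_up()/round_down() usage in the diff below:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the values btrfs_punch_hole() derives. */
struct sketch_punch_range {
	uint64_t lockstart;	/* first fully covered block */
	uint64_t lockend;	/* last byte of the last fully covered block */
	bool same_block;	/* does the whole range sit inside one block? */
};

static struct sketch_punch_range
sketch_punch_align(uint64_t offset, uint64_t len, uint64_t blocksize)
{
	struct sketch_punch_range r;
	uint64_t mask = blocksize - 1;

	/* blocksize must be a non-zero power of two, len must be non-zero. */
	assert(len > 0 && blocksize > 0 && (blocksize & mask) == 0);

	r.lockstart = (offset + mask) & ~mask;		/* round_up(offset) */
	r.lockend = ((offset + len) & ~mask) - 1;	/* round_down(offset + len) - 1 */
	r.same_block = (offset & ~mask) == ((offset + len - 1) & ~mask);
	return r;
}
```

Punching [4K, 12K) with a 4K block size covers two whole blocks, so the range [4096, 12287] is locked. With a 64K block size the same request falls entirely inside one block, so same_block is true and only the partial block truncation path applies.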
Signed-off-by: Qu Wenruo --- fs/btrfs/file.c | 28 ++++++++++++++++------------ 1 file changed, 16 insertions(+), 12 deletions(-) diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index 564784a5c0c0..cb8f2b04ccd8 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -2802,6 +2802,7 @@ static int btrfs_punch_hole(struct inode *inode, loff_t offset, loff_t len) u64 tail_start; u64 tail_len; u64 orig_start = offset; + u32 blocksize = PAGE_SIZE; int ret = 0; bool same_block; u64 ino_size; @@ -2813,7 +2814,7 @@ static int btrfs_punch_hole(struct inode *inode, loff_t offset, loff_t len) return ret; inode_lock(inode); - ino_size = round_up(inode->i_size, fs_info->sectorsize); + ino_size = round_up(inode->i_size, blocksize); ret = find_first_non_hole(inode, &offset, &len); if (ret < 0) goto out_only_mutex; @@ -2823,11 +2824,10 @@ static int btrfs_punch_hole(struct inode *inode, loff_t offset, loff_t len) goto out_only_mutex; } - lockstart = round_up(offset, btrfs_inode_sectorsize(inode)); - lockend = round_down(offset + len, - btrfs_inode_sectorsize(inode)) - 1; - same_block = (BTRFS_BYTES_TO_BLKS(fs_info, offset)) - == (BTRFS_BYTES_TO_BLKS(fs_info, offset + len - 1)); + lockstart = round_up(offset, blocksize); + lockend = round_down(offset + len, blocksize) - 1; + same_block = round_down(offset, blocksize) == + round_down(offset + len - 1, blocksize); /* * We needn't truncate any block which is beyond the end of the file * because we are sure there is no data there. @@ -2836,7 +2836,7 @@ static int btrfs_punch_hole(struct inode *inode, loff_t offset, loff_t len) * Only do this if we are in the same block and we aren't doing the * entire block. 
*/ - if (same_block && len < fs_info->sectorsize) { + if (same_block && len < blocksize) { if (offset < ino_size) { truncated_block = true; ret = btrfs_truncate_block(inode, offset, len, 0); @@ -2856,10 +2856,13 @@ static int btrfs_punch_hole(struct inode *inode, loff_t offset, loff_t len) } } - /* Check the aligned pages after the first unaligned page, - * if offset != orig_start, which means the first unaligned page - * including several following pages are already in holes, - * the extra check can be skipped */ + /* + * Optimization to check if we can skip any already existing holes. + * + * If offset != orig_start, which means the first unaligned page + * and several following pages are already holes, thus can skip the + * check. + */ if (offset == orig_start) { /* after truncate page, check hole again */ len = offset + len - lockstart; @@ -2871,7 +2874,8 @@ static int btrfs_punch_hole(struct inode *inode, loff_t offset, loff_t len) ret = 0; goto out_only_mutex; } - lockstart = offset; + lockstart = max_t(u64, lockstart, + round_down(offset, blocksize)); } /* Check the tail unaligned part is in a hole */ From patchwork Wed Oct 21 06:25:42 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848421 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id AD44CC4363A for ; Wed, 21 Oct 2020 06:28:09 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 
4974421D43 for ; Wed, 21 Oct 2020 06:28:09 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="VzI5KJzo" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440852AbgJUG2I (ORCPT ); Wed, 21 Oct 2020 02:28:08 -0400 Received: from mx2.suse.de ([195.135.220.15]:44718 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2440851AbgJUG2I (ORCPT ); Wed, 21 Oct 2020 02:28:08 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261686; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=FOZr88Po7inmI4ttt7ESAbG8o9zUqzpmk8rxtRYGKzY=; b=VzI5KJzoD3gT+2+AQRG6B5+vhFSnKx1vm+aAEG3r9HH5x3DTNkXX8z+87scTyiyOOUA6KT 9D+sdhz35glXugFy/DgQh2d1n0C8fkctcesX/pQ5PvZ/eVYKUAa2//8LhH6ayNdJghkVu5 mItKjCL9D3uMZ6OVi7D4eLWuGlF5fm0= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id D0EE5AC1D for ; Wed, 21 Oct 2020 06:28:06 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 56/68] btrfs: file: make btrfs_dirty_pages() follow page size to mark extent io tree Date: Wed, 21 Oct 2020 14:25:42 +0800 Message-Id: <20201021062554.68132-57-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Currently btrfs_dirty_pages() uses the sector size to mark the extent io tree, but since data writeback does not yet work at subpage granularity, this could cause extra problems for subpage support. Change it to use page alignment. 
Signed-off-by: Qu Wenruo --- fs/btrfs/file.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index cb8f2b04ccd8..30b22303ad2c 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -504,9 +504,9 @@ int btrfs_dirty_pages(struct btrfs_inode *inode, struct page **pages, size_t num_pages, loff_t pos, size_t write_bytes, struct extent_state **cached) { - struct btrfs_fs_info *fs_info = inode->root->fs_info; int err = 0; int i; + u32 blocksize = PAGE_SIZE; u64 num_bytes; u64 start_pos; u64 end_of_last_block; @@ -514,9 +514,8 @@ int btrfs_dirty_pages(struct btrfs_inode *inode, struct page **pages, loff_t isize = i_size_read(&inode->vfs_inode); unsigned int extra_bits = 0; - start_pos = pos & ~((u64) fs_info->sectorsize - 1); - num_bytes = round_up(write_bytes + pos - start_pos, - fs_info->sectorsize); + start_pos = round_down(pos, blocksize); + num_bytes = round_up(write_bytes + pos - start_pos, blocksize); end_of_last_block = start_pos + num_bytes - 1; From patchwork Wed Oct 21 06:25:43 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848419 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A021BC561F8 for ; Wed, 21 Oct 2020 06:28:11 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4723E21D43 for ; Wed, 21 Oct 2020 06:28:11 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass 
(1024-bit key) header.d=suse.com header.i=@suse.com header.b="hEuM7hz+" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440854AbgJUG2K (ORCPT ); Wed, 21 Oct 2020 02:28:10 -0400 Received: from mx2.suse.de ([195.135.220.15]:44768 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2440851AbgJUG2J (ORCPT ); Wed, 21 Oct 2020 02:28:09 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261688; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=uhw1e4cdmOvNkm2CkvHJAuB4TB7nxWbxVf9YWD+poho=; b=hEuM7hz+/UsHBva+HodhhUTPZBauBc+f9SG/6U1vJvEQ2tt/nwDL9dwFls+OlvCbiSjiA7 fvsndQddCqPUbEDwShjbi8/GLZ4E/5ZIVW0tM2WH7fdJFtD35SDZqNUlrBxCxo3h7H9Rne mgMrUGO2uPLgNKta2m8+iw3g7odomq8= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 85848AC1D for ; Wed, 21 Oct 2020 06:28:08 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 57/68] btrfs: file: make btrfs_file_write_iter() to be page aligned Date: Wed, 21 Oct 2020 14:25:43 +0800 Message-Id: <20201021062554.68132-58-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org This is mostly for subpage write support: since we can't submit subpage sized writes yet, we have to submit full page writes. 
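A small userspace model (not part of this patch) of how the alignment changes the "write starts beyond EOF" decision is sketched below. The struct and function names are hypothetical; the arithmetic follows the round_down()/round_up() calls in the diff, with "expand" standing in for the btrfs_cont_expand() call:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical summary of the decisions taken for a write at pos. */
struct sketch_expand {
	bool expand;		/* would the hole be expanded up to end_pos? */
	uint64_t end_pos;	/* only meaningful when expand is true */
	bool clean_page;	/* is there a gap of whole blocks after oldsize? */
};

static struct sketch_expand
sketch_write_prepare(uint64_t pos, uint64_t count, uint64_t oldsize,
		     uint64_t blocksize)
{
	struct sketch_expand r = { false, 0, false };
	uint64_t mask = blocksize - 1;
	uint64_t start_pos = pos & ~mask;	/* round_down(pos, blocksize) */

	/* blocksize must be a non-zero power of two. */
	assert(blocksize > 0 && (blocksize & mask) == 0);

	if (start_pos > oldsize) {
		r.expand = true;
		r.end_pos = (pos + count + mask) & ~mask;
		if (start_pos > ((oldsize + mask) & ~mask))
			r.clean_page = true;
	}
	return r;
}
```

For a 100 byte write at offset 70000 into a 10000 byte file, a 4K block size leaves a gap of whole blocks (clean_page is set), while a 64K block size rounds both sides to 65536 and no gap remains.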
Signed-off-by: Qu Wenruo --- fs/btrfs/file.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index 30b22303ad2c..8f44bde1d04e 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -1909,6 +1909,7 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb, struct inode *inode = file_inode(file); struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); struct btrfs_root *root = BTRFS_I(inode)->root; + u32 blocksize = PAGE_SIZE; u64 start_pos; u64 end_pos; ssize_t num_written = 0; @@ -1988,18 +1989,17 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb, */ update_time_for_write(inode); - start_pos = round_down(pos, fs_info->sectorsize); + start_pos = round_down(pos, blocksize); oldsize = i_size_read(inode); if (start_pos > oldsize) { /* Expand hole size to cover write data, preventing empty gap */ - end_pos = round_up(pos + count, - fs_info->sectorsize); + end_pos = round_up(pos + count, blocksize); err = btrfs_cont_expand(inode, oldsize, end_pos); if (err) { inode_unlock(inode); goto out; } - if (start_pos > round_up(oldsize, fs_info->sectorsize)) + if (start_pos > round_up(oldsize, blocksize)) clean_page = 1; } From patchwork Wed Oct 21 06:25:44 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848429 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3DA6FC4363A for ; Wed, 21 Oct 2020 06:28:13 +0000 (UTC) Received: from vger.kernel.org 
(vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id DBF4821D43 for ; Wed, 21 Oct 2020 06:28:12 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="SiwvaJcg" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440856AbgJUG2M (ORCPT ); Wed, 21 Oct 2020 02:28:12 -0400 Received: from mx2.suse.de ([195.135.220.15]:44794 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2440851AbgJUG2L (ORCPT ); Wed, 21 Oct 2020 02:28:11 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261690; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ulyjZ/QrDUjz2gNm2G7sW6yN4nI979QJhUmOmX/VziY=; b=SiwvaJcgYHL4lT0GBU4GCSQTOJgpuHcapBzL6qQKywTDDWMR5dcy+bkyFsBcdwayvgMx/H kSFrujX9scRz92lHQ7MdpAidQsjMFVTNKvwGZJrh/kS+GKrQOfGJmyPL3VmxzaHansaFsY CaQjfUaNMeAAJ9ODDsVpjtlSSp+vGcM= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 596CAAC1D for ; Wed, 21 Oct 2020 06:28:10 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 58/68] btrfs: output extra info for space info update underflow Date: Wed, 21 Oct 2020 14:25:44 +0800 Message-Id: <20201021062554.68132-59-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Signed-off-by: Qu Wenruo --- fs/btrfs/space-info.h | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/fs/btrfs/space-info.h b/fs/btrfs/space-info.h index c3c64019950a..f7c3fc3a8541 100644 --- a/fs/btrfs/space-info.h +++ 
b/fs/btrfs/space-info.h @@ -106,6 +106,8 @@ btrfs_space_info_update_##name(struct btrfs_fs_info *fs_info, \ sinfo->flags, abs_bytes, \ bytes > 0); \ if (bytes < 0 && sinfo->name < -bytes) { \ + btrfs_warn(fs_info, "bytes_%s have %llu diff %lld\n", \ + trace_name, sinfo->name, bytes); \ WARN_ON(1); \ sinfo->name = 0; \ return; \ @@ -113,7 +115,7 @@ btrfs_space_info_update_##name(struct btrfs_fs_info *fs_info, \ sinfo->name += bytes; \ } -DECLARE_SPACE_INFO_UPDATE(bytes_may_use, "space_info"); +DECLARE_SPACE_INFO_UPDATE(bytes_may_use, "may_use"); DECLARE_SPACE_INFO_UPDATE(bytes_pinned, "pinned"); int btrfs_init_space_info(struct btrfs_fs_info *fs_info); From patchwork Wed Oct 21 06:25:45 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848425 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6F03FC561F8 for ; Wed, 21 Oct 2020 06:28:15 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 1F67021D43 for ; Wed, 21 Oct 2020 06:28:15 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="cvzhwFP0" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440857AbgJUG2O (ORCPT ); Wed, 21 Oct 2020 02:28:14 -0400 Received: from mx2.suse.de ([195.135.220.15]:44814 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2440851AbgJUG2N (ORCPT ); 
Wed, 21 Oct 2020 02:28:13 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261692; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=d11CJsT4jQd84lTlqKCsSk+MdjWRruj4TBWwolM+36o=; b=cvzhwFP0jHVavxImGub9QoTIQktOa6rCrt6EJjwYmFwO5HlkRKSKdrXzhgg43ERzlZKTyl 4XBYBSsiUWy6KYbDNa/qCx5ylN5EL78V2WYknFYuulXVipLyzOFnbPFeQlCVZaR5U1snHa zexRzDhU9wVm3yDpHX680b/kuuPQ7N8= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 3B657AC1D for ; Wed, 21 Oct 2020 06:28:12 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 59/68] btrfs: delalloc-space: make data space reservation to be page aligned Date: Wed, 21 Oct 2020 14:25:45 +0800 Message-Id: <20201021062554.68132-60-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org This is for initial subpage data write support. We don't yet support true subpage data writes and still do full page data writeback. Thus change the data reserve and release code to be page aligned. 
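To make the cost of this visible, here is a minimal userspace sketch (not part of this patch) of the range alignment done before reserving or releasing data space. The struct and function names are hypothetical; the two assignments mirror the "align the range" code in the diff below:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical (start, len) pair describing a reserved byte range. */
struct sketch_range {
	uint64_t start;
	uint64_t len;
};

static struct sketch_range
sketch_align_range(uint64_t start, uint64_t len, uint64_t blocksize)
{
	struct sketch_range r;
	uint64_t mask = blocksize - 1;

	/* blocksize must be a non-zero power of two. */
	assert(blocksize > 0 && (blocksize & mask) == 0);

	/* round_up(start + len) - round_down(start), then round_down(start) */
	r.len = ((start + len + mask) & ~mask) - (start & ~mask);
	r.start = start & ~mask;
	return r;
}
```

A 4K write at offset 4K reserves exactly 4K when aligned to a 4K sector, but a full 64K when aligned to a 64K page. That extra reservation is the price of only doing full page writeback for now.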
Signed-off-by: Qu Wenruo --- fs/btrfs/delalloc-space.c | 19 ++++++++++--------- 1 file changed, 10 insertions(+), 9 deletions(-) diff --git a/fs/btrfs/delalloc-space.c b/fs/btrfs/delalloc-space.c index 0e354e9e57d0..1f2b324485f5 100644 --- a/fs/btrfs/delalloc-space.c +++ b/fs/btrfs/delalloc-space.c @@ -116,13 +116,14 @@ int btrfs_alloc_data_chunk_ondemand(struct btrfs_inode *inode, u64 bytes) struct btrfs_root *root = inode->root; struct btrfs_fs_info *fs_info = root->fs_info; struct btrfs_space_info *data_sinfo = fs_info->data_sinfo; + u32 blocksize = PAGE_SIZE; u64 used; int ret = 0; int need_commit = 2; int have_pinned_space; - /* Make sure bytes are sectorsize aligned */ - bytes = ALIGN(bytes, fs_info->sectorsize); + /* Make sure bytes are aligned */ + bytes = round_up(bytes, blocksize); if (btrfs_is_free_space_inode(inode)) { need_commit = 0; @@ -241,12 +242,12 @@ int btrfs_check_data_free_space(struct btrfs_inode *inode, struct extent_changeset **reserved, u64 start, u64 len) { struct btrfs_fs_info *fs_info = inode->root->fs_info; + u32 blocksize = PAGE_SIZE; int ret; /* align the range */ - len = round_up(start + len, fs_info->sectorsize) - - round_down(start, fs_info->sectorsize); - start = round_down(start, fs_info->sectorsize); + len = round_up(start + len, blocksize) - round_down(start, blocksize); + start = round_down(start, blocksize); ret = btrfs_alloc_data_chunk_ondemand(inode, len); if (ret < 0) @@ -293,11 +294,11 @@ void btrfs_free_reserved_data_space(struct btrfs_inode *inode, struct extent_changeset *reserved, u64 start, u64 len) { struct btrfs_fs_info *fs_info = inode->root->fs_info; + u32 blocksize = PAGE_SIZE; - /* Make sure the range is aligned to sectorsize */ - len = round_up(start + len, fs_info->sectorsize) - - round_down(start, fs_info->sectorsize); - start = round_down(start, fs_info->sectorsize); + /* Make sure the range is aligned */ + len = round_up(start + len, blocksize) - round_down(start, blocksize); + start = round_down(start, 
blocksize); btrfs_free_reserved_data_space_noquota(fs_info, len); btrfs_qgroup_free_data(inode, reserved, start, len); From patchwork Wed Oct 21 06:25:46 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848427 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id DD839C56201 for ; Wed, 21 Oct 2020 06:28:16 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9147A21D43 for ; Wed, 21 Oct 2020 06:28:16 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="Uct/hxKY" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440860AbgJUG2P (ORCPT ); Wed, 21 Oct 2020 02:28:15 -0400 Received: from mx2.suse.de ([195.135.220.15]:44834 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2440851AbgJUG2P (ORCPT ); Wed, 21 Oct 2020 02:28:15 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261694; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=+V7xTM+bicWnG+rnFsGWgbovgTAljsJQMEONgX30nzk=; b=Uct/hxKYzG9e+QpHQyNA61FGuOSvwUb19ZDO7tggcmK7oK35dYUYwMG2MHWR+GSRMzMSBM 
tbYQ+w+5zDmI4Y8BbTSdmiVVm+UDWJxxjzHxZdRgrCA/gj+vklXjt3zGRkLZ2NUwdUowXX ihCoOlXjN1Y/cIf5qPxOjWw6oZnnZpk= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id E4465AC35 for ; Wed, 21 Oct 2020 06:28:13 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 60/68] btrfs: scrub: allow scrub to work with subpage sectorsize Date: Wed, 21 Oct 2020 14:25:46 +0800 Message-Id: <20201021062554.68132-61-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Signed-off-by: Qu Wenruo --- fs/btrfs/scrub.c | 8 -------- 1 file changed, 8 deletions(-) diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c index 354ab9985a34..806523515d2f 100644 --- a/fs/btrfs/scrub.c +++ b/fs/btrfs/scrub.c @@ -3821,14 +3821,6 @@ int btrfs_scrub_dev(struct btrfs_fs_info *fs_info, u64 devid, u64 start, return -EINVAL; } - if (fs_info->sectorsize != PAGE_SIZE) { - /* not supported for data w/o checksums */ - btrfs_err_rl(fs_info, - "scrub: size assumption sectorsize != PAGE_SIZE (%d != %lu) fails", - fs_info->sectorsize, PAGE_SIZE); - return -EINVAL; - } - if (fs_info->nodesize > PAGE_SIZE * SCRUB_MAX_PAGES_PER_BLOCK || fs_info->sectorsize > PAGE_SIZE * SCRUB_MAX_PAGES_PER_BLOCK) { From patchwork Wed Oct 21 06:25:47 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848433 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 348EAC4363A for ; Wed, 21 Oct 2020 06:28:19 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id C26B221D43 for ; Wed, 21 Oct 2020 06:28:18 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="DjU8G9hm" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440863AbgJUG2R (ORCPT ); Wed, 21 Oct 2020 02:28:17 -0400 Received: from mx2.suse.de ([195.135.220.15]:44864 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2440851AbgJUG2R (ORCPT ); Wed, 21 Oct 2020 02:28:17 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261695; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=T/F2x1+by7YMZaOSu1fPNAzmJOL+4oCazbpqrhg3Rag=; b=DjU8G9hmQ+5t3Zp0ZuExdv6PYh4+2hp9g8rNsOFmrMTATzc5U7Nrk3FJsg9ptkCm6wVfGA rohaQgb1gyLjfWLDCuPEhXp1ki0xna/+APnUOVMkNa9KonlnXSeGcRRGtWZwaA76kckdnn wS+f1L1RkhB2o4uMDOwTGuqjw3+7PuY= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id AE5BCAC1D for ; Wed, 21 Oct 2020 06:28:15 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 61/68] btrfs: inode: make btrfs_truncate_block() to do page alignment Date: Wed, 21 Oct 2020 14:25:47 +0800 Message-Id: <20201021062554.68132-62-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org This is mostly for subpage write back, as we still 
can only submit full page write, we can't truncate the subpage sector. Thus here we truncate the whole page other than each sector. Signed-off-by: Qu Wenruo --- fs/btrfs/inode.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 8551815c4d65..f3bc894611e0 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -4529,7 +4529,6 @@ int btrfs_truncate_inode_items(struct btrfs_trans_handle *trans, int btrfs_truncate_block(struct inode *inode, loff_t from, loff_t len, int front) { - struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); struct address_space *mapping = inode->i_mapping; struct extent_io_tree *io_tree = &BTRFS_I(inode)->io_tree; struct btrfs_ordered_extent *ordered; @@ -4537,7 +4536,7 @@ int btrfs_truncate_block(struct inode *inode, loff_t from, loff_t len, struct extent_changeset *data_reserved = NULL; char *kaddr; bool only_release_metadata = false; - u32 blocksize = fs_info->sectorsize; + u32 blocksize = PAGE_SIZE; pgoff_t index = from >> PAGE_SHIFT; unsigned offset = from & (blocksize - 1); struct page *page; From patchwork Wed Oct 21 06:25:48 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848431 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id AE9ABC4363A for ; Wed, 21 Oct 2020 06:28:21 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 58A4321D43 for ; Wed, 21 Oct 
2020 06:28:21 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="pk+tRktb" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440864AbgJUG2U (ORCPT ); Wed, 21 Oct 2020 02:28:20 -0400 Received: from mx2.suse.de ([195.135.220.15]:44898 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2408668AbgJUG2U (ORCPT ); Wed, 21 Oct 2020 02:28:20 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261698; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=KumKuUaRwWqWy2QYBREe/xkuwU22J4PNsV1XesEWk8w=; b=pk+tRktbvDh7NGkF/1Ut5uChd9D4uuvFC0zpKAw8itWxASUFq6Y4DVqdCAUhJ3EqKL63po DNkCFoOuIKrCMBZ9sqiLrdcA17bVlN7hgPGQk1wQ4koTlIW+DHtm85SgFdcMN/ufaEbDg+ 1RsSk3hLh6Czh4MLrRl/eVK5B3hjVjA= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 86636AC1D for ; Wed, 21 Oct 2020 06:28:18 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 62/68] btrfs: file: make hole punch and zero range to be page aligned Date: Wed, 21 Oct 2020 14:25:48 +0800 Message-Id: <20201021062554.68132-63-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org To workaround the fact that we can't yet submit subpage write bio. 
Signed-off-by: Qu Wenruo --- fs/btrfs/file.c | 42 +++++++++++++++++++++--------------------- 1 file changed, 21 insertions(+), 21 deletions(-) diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index 8f44bde1d04e..6e342c466fdf 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -2455,6 +2455,8 @@ static int btrfs_punch_hole_lock_range(struct inode *inode, const u64 lockend, struct extent_state **cached_state) { + ASSERT(IS_ALIGNED(lockstart, PAGE_SIZE) && + IS_ALIGNED(lockend + 1, PAGE_SIZE)); while (1) { struct btrfs_ordered_extent *ordered; int ret; @@ -3033,12 +3035,12 @@ enum { static int btrfs_zero_range_check_range_boundary(struct inode *inode, u64 offset) { - const u64 sectorsize = btrfs_inode_sectorsize(inode); + const u32 blocksize = PAGE_SIZE; struct extent_map *em; int ret; - offset = round_down(offset, sectorsize); - em = btrfs_get_extent(BTRFS_I(inode), NULL, 0, offset, sectorsize); + offset = round_down(offset, blocksize); + em = btrfs_get_extent(BTRFS_I(inode), NULL, 0, offset, blocksize); if (IS_ERR(em)) return PTR_ERR(em); @@ -3058,14 +3060,13 @@ static int btrfs_zero_range(struct inode *inode, loff_t len, const int mode) { - struct btrfs_fs_info *fs_info = BTRFS_I(inode)->root->fs_info; struct extent_map *em; struct extent_changeset *data_reserved = NULL; int ret; + const u32 blocksize = PAGE_SIZE; u64 alloc_hint = 0; - const u64 sectorsize = btrfs_inode_sectorsize(inode); - u64 alloc_start = round_down(offset, sectorsize); - u64 alloc_end = round_up(offset + len, sectorsize); + u64 alloc_start = round_down(offset, blocksize); + u64 alloc_end = round_up(offset + len, blocksize); u64 bytes_to_reserve = 0; bool space_reserved = false; @@ -3105,18 +3106,17 @@ static int btrfs_zero_range(struct inode *inode, * Part of the range is already a prealloc extent, so operate * only on the remaining part of the range. 
*/ - alloc_start = em_end; - ASSERT(IS_ALIGNED(alloc_start, sectorsize)); + alloc_start = round_down(em_end, blocksize); len = offset + len - alloc_start; offset = alloc_start; alloc_hint = em->block_start + em->len; } free_extent_map(em); - if (BTRFS_BYTES_TO_BLKS(fs_info, offset) == - BTRFS_BYTES_TO_BLKS(fs_info, offset + len - 1)) { + if (round_down(offset, blocksize) == + round_down(offset + len - 1, blocksize)) { em = btrfs_get_extent(BTRFS_I(inode), NULL, 0, alloc_start, - sectorsize); + blocksize); if (IS_ERR(em)) { ret = PTR_ERR(em); goto out; @@ -3128,7 +3128,7 @@ static int btrfs_zero_range(struct inode *inode, mode); goto out; } - if (len < sectorsize && em->block_start != EXTENT_MAP_HOLE) { + if (len < blocksize && em->block_start != EXTENT_MAP_HOLE) { free_extent_map(em); ret = btrfs_truncate_block(inode, offset, len, 0); if (!ret) @@ -3138,13 +3138,13 @@ static int btrfs_zero_range(struct inode *inode, return ret; } free_extent_map(em); - alloc_start = round_down(offset, sectorsize); - alloc_end = alloc_start + sectorsize; + alloc_start = round_down(offset, blocksize); + alloc_end = alloc_start + blocksize; goto reserve_space; } - alloc_start = round_up(offset, sectorsize); - alloc_end = round_down(offset + len, sectorsize); + alloc_start = round_up(offset, blocksize); + alloc_end = round_down(offset + len, blocksize); /* * For unaligned ranges, check the pages at the boundaries, they might @@ -3152,12 +3152,12 @@ static int btrfs_zero_range(struct inode *inode, * they might map to a hole, in which case we need our allocation range * to cover them. 
*/ - if (!IS_ALIGNED(offset, sectorsize)) { + if (!IS_ALIGNED(offset, blocksize)) { ret = btrfs_zero_range_check_range_boundary(inode, offset); if (ret < 0) goto out; if (ret == RANGE_BOUNDARY_HOLE) { - alloc_start = round_down(offset, sectorsize); + alloc_start = round_down(offset, blocksize); ret = 0; } else if (ret == RANGE_BOUNDARY_WRITTEN_EXTENT) { ret = btrfs_truncate_block(inode, offset, 0, 0); @@ -3168,13 +3168,13 @@ static int btrfs_zero_range(struct inode *inode, } } - if (!IS_ALIGNED(offset + len, sectorsize)) { + if (!IS_ALIGNED(offset + len, blocksize)) { ret = btrfs_zero_range_check_range_boundary(inode, offset + len); if (ret < 0) goto out; if (ret == RANGE_BOUNDARY_HOLE) { - alloc_end = round_up(offset + len, sectorsize); + alloc_end = round_up(offset + len, blocksize); ret = 0; } else if (ret == RANGE_BOUNDARY_WRITTEN_EXTENT) { ret = btrfs_truncate_block(inode, offset + len, 0, 1); From patchwork Wed Oct 21 06:25:49 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848443 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3058EC4363A for ; Wed, 21 Oct 2020 06:28:24 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id BCF7421D43 for ; Wed, 21 Oct 2020 06:28:23 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="KVwr1NAo" Received: 
(majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2436731AbgJUG2W (ORCPT ); Wed, 21 Oct 2020 02:28:22 -0400 Received: from mx2.suse.de ([195.135.220.15]:44956 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2440865AbgJUG2W (ORCPT ); Wed, 21 Oct 2020 02:28:22 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261701; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=OItWooDGL4k0RSoRQgCHLgFdXZYa1ij6bZkR/Pdmuyg=; b=KVwr1NAowp4osaQJ3JulkYW9sNktGJ4yUtU8oluR70usSdz9r634TJC+d+lREnoGVsEGvP DKHBdN0KOm17pBFrmpw8r8/h1ow7lRtN2OWPUVCJzu2mtgOA2jFYi5cYWX5WrfVuDu5uwD b+9axRIYT8coY0Ajag1PgHY//xKstJM= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 1DA70ACC5 for ; Wed, 21 Oct 2020 06:28:21 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 63/68] btrfs: file: make btrfs_fallocate() to use PAGE_SIZE as blocksize Date: Wed, 21 Oct 2020 14:25:49 +0800 Message-Id: <20201021062554.68132-64-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org In theory, we can still allow subpage sector size to be utilized in such case, but since btrfs_truncate_block() now operates in page unit, we should also change btrfs_fallocate() to honor PAGE_SIZE as blocksize. 
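For illustration only (not part of the patch): a small userspace sketch of how the allocation range computed at the top of btrfs_fallocate() widens once blocksize is PAGE_SIZE instead of the sector size. The helper names, the struct, and the 4K sector / 64K page values are assumptions made for this example; the kernel uses its own round_down()/round_up() macros.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's round_down()/round_up().
 * Hypothetical helpers; align must be a power of two. */
static uint64_t round_down_u64(uint64_t x, uint64_t align)
{
	return x & ~(align - 1);
}

static uint64_t round_up_u64(uint64_t x, uint64_t align)
{
	return (x + align - 1) & ~(align - 1);
}

/* Hypothetical container for the [alloc_start, alloc_end) pair. */
struct alloc_range {
	uint64_t start;
	uint64_t end;
};

/* Mirrors the two rounding statements that open btrfs_fallocate():
 * alloc_start is rounded down, alloc_end is rounded up, both to blocksize. */
static struct alloc_range fallocate_range(uint64_t offset, uint64_t len,
					  uint64_t blocksize)
{
	struct alloc_range r;

	assert(blocksize != 0 && (blocksize & (blocksize - 1)) == 0);
	r.start = round_down_u64(offset, blocksize);
	r.end = round_up_u64(offset + len, blocksize);
	return r;
}
```

With offset = 4096 and len = 4096, a 4K blocksize keeps the range at [4096, 8192), while a 64K blocksize expands it to [0, 65536), i.e. the whole page that btrfs_truncate_block() would operate on.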
Signed-off-by: Qu Wenruo --- fs/btrfs/file.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index 6e342c466fdf..f7122f71b791 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -3244,7 +3244,7 @@ static long btrfs_fallocate(struct file *file, int mode, u64 locked_end; u64 actual_end = 0; struct extent_map *em; - int blocksize = btrfs_inode_sectorsize(inode); + int blocksize = PAGE_SIZE; int ret; alloc_start = round_down(offset, blocksize); @@ -3401,7 +3401,7 @@ static long btrfs_fallocate(struct file *file, int mode, if (!ret) ret = btrfs_prealloc_file_range(inode, mode, range->start, - range->len, i_blocksize(inode), + range->len, blocksize, offset + len, &alloc_hint); else btrfs_free_reserved_data_space(BTRFS_I(inode), From patchwork Wed Oct 21 06:25:50 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848445 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 04FE7C561F8 for ; Wed, 21 Oct 2020 06:28:27 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id ACFB321D43 for ; Wed, 21 Oct 2020 06:28:26 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="lPUmP/nK" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440868AbgJUG2Y (ORCPT ); Wed, 21 Oct 2020 02:28:24 -0400 Received: from 
mx2.suse.de ([195.135.220.15]:45010 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2440865AbgJUG2Y (ORCPT ); Wed, 21 Oct 2020 02:28:24 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261703; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=0MUiLHBYuqzeygIT+LCuQeHxWkfmHFqeFvMamaqBBa4=; b=lPUmP/nK3XtFJHxF5MungpAjnPlLUBeci//2Z6r4vSI0btwlTRj1kZgqJgS0BfN2QsC4Li fxvr8CYIfudeXQlvjnCAhSaMT/0dzd7wlsyv5Bi9JedIoZy0tHOCiOW1UN4kkIa8CImsZX OjvFLjuFAk8oQr3zI1mfralWFeA/WYI= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 05703AC35 for ; Wed, 21 Oct 2020 06:28:23 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 64/68] btrfs: inode: always mark the full page range delalloc for btrfs_page_mkwrite() Date: Wed, 21 Oct 2020 14:25:50 +0800 Message-Id: <20201021062554.68132-65-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org So that we won't get subpage sized EXTENT_DELALLOC, which could easily screwup the PAGE aligned write space reservation for subpage support. 
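For illustration only (not part of the patch): a userspace sketch of the reserved_space calculation for the last page of a file in btrfs_page_mkwrite(), comparing sector rounding with page rounding. The function names and the 4K sector / 64K page values are assumptions for this example, and it assumes isize > page_start, as holds for the page containing the last byte of the file.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's round_up().
 * Hypothetical helper; align must be a power of two. */
static uint64_t round_up_u64(uint64_t x, uint64_t align)
{
	return (x + align - 1) & ~(align - 1);
}

/* Space reserved for the page that holds the last byte of the file.
 * align is the sector size before this patch and PAGE_SIZE after it. */
static uint64_t mkwrite_reserved_space(uint64_t isize, uint64_t page_start,
				       uint64_t align)
{
	assert(isize > page_start);
	return round_up_u64(isize - page_start, align);
}
```

For a 5000-byte file on its first page, sector rounding gives 8192 bytes, which is smaller than a 64K page and would leave a subpage sized delalloc range. Page rounding gives 65536, so the whole page is marked delalloc and the "reserved_space < PAGE_SIZE" branch is no longer taken.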
Signed-off-by: Qu Wenruo --- fs/btrfs/inode.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index f3bc894611e0..0da6c91db0bc 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -8323,8 +8323,7 @@ vm_fault_t btrfs_page_mkwrite(struct vm_fault *vmf) } if (page->index == ((size - 1) >> PAGE_SHIFT)) { - reserved_space = round_up(size - page_start, - fs_info->sectorsize); + reserved_space = round_up(size - page_start, PAGE_SIZE); if (reserved_space < PAGE_SIZE) { end = page_start + reserved_space - 1; btrfs_delalloc_release_space(BTRFS_I(inode), From patchwork Wed Oct 21 06:25:51 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848441 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 233D9C561F8 for ; Wed, 21 Oct 2020 06:28:29 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id BAC2421D43 for ; Wed, 21 Oct 2020 06:28:28 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="bSIrpmzO" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2436577AbgJUG22 (ORCPT ); Wed, 21 Oct 2020 02:28:28 -0400 Received: from mx2.suse.de ([195.135.220.15]:45042 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2440865AbgJUG20 (ORCPT ); Wed, 21 Oct 2020 02:28:26 -0400 X-Virus-Scanned: 
by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261704; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=HxipyaoC8soMha9pnnrNqX9B72Ucj/YKwiAwl9hQfcM=; b=bSIrpmzOGLUBJ3DXd7C9bJpKCr+gAehGiZd4Z8VwBl9JAkoFiC/23roACBYCnEf0wuyF0z m/9H7AI8HJPIyfoOulRc3JbV4i0Z2r6AO+Lejxio+F6yWJihx2kBua9L5zfTcYXLPeCBR1 4VPliAmVVH1x+BM/QAyKg6fFKPAKICQ= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id C213BAC1D for ; Wed, 21 Oct 2020 06:28:24 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 65/68] btrfs: inode: require page alignement for direct io Date: Wed, 21 Oct 2020 14:25:51 +0800 Message-Id: <20201021062554.68132-66-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org For incoming subpage support, we still can only submit full page write, thus the requirement for direct IO alignment should still be page size, not sector size. 
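For illustration only (not part of the patch): a userspace sketch of the offset test in check_direct_IO(), showing which file offsets pass with a sector sized mask versus a page sized mask. The function name and the 4K sector / 64K page values are assumptions for this example; only the offset check visible in the diff context is modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true when offset passes the "offset & blocksize_mask" test.
 * Hypothetical helper; blocksize must be a power of two so that
 * blocksize - 1 is a valid low-bits mask. */
static bool dio_offset_aligned(uint64_t offset, uint32_t blocksize)
{
	uint32_t blocksize_mask = blocksize - 1;

	assert(blocksize != 0 && (blocksize & blocksize_mask) == 0);
	return (offset & blocksize_mask) == 0;
}
```

An offset of 4096 passes with a 4K mask but is rejected with a 64K mask, whereas 131072 passes with both. With this patch applied, a direct IO request at a sector aligned but not page aligned offset therefore fails the alignment check.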
Signed-off-by: Qu Wenruo --- fs/btrfs/inode.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 0da6c91db0bc..625950258c87 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -7894,7 +7894,7 @@ static ssize_t check_direct_IO(struct btrfs_fs_info *fs_info, { int seg; int i; - unsigned int blocksize_mask = fs_info->sectorsize - 1; + unsigned int blocksize_mask = PAGE_SIZE - 1; ssize_t retval = -EINVAL; if (offset & blocksize_mask) From patchwork Wed Oct 21 06:25:52 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848439 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C2F7CC4363A for ; Wed, 21 Oct 2020 06:28:30 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 6CAAE21D43 for ; Wed, 21 Oct 2020 06:28:30 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="Yhbhhq/r" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440871AbgJUG23 (ORCPT ); Wed, 21 Oct 2020 02:28:29 -0400 Received: from mx2.suse.de ([195.135.220.15]:45128 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2440869AbgJUG22 (ORCPT ); Wed, 21 Oct 2020 02:28:28 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; 
t=1603261707; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=hgqFXSV+EQG9z7Dz/N0MFyo/ikf3WEwRtGSfAt4G7i8=; b=Yhbhhq/rueRU/OkVcO80MvCMlHuV953eHjeI6Vjbet33J8bYznJMIFYhNBxtSL1nioWfPW Po/JlgnURTlnSktzC2f/PKveDvsEYrc64QpP4xLQp8OYnyZnVHGCvtq9ZxxncdlKYtq6Ge D9nN8Ct5jJbYqi4ae/ArOe18ULcFsCU= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 0C3EFAC8C for ; Wed, 21 Oct 2020 06:28:27 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v4 66/68] btrfs: inode: only do NOCOW write for page aligned extent Date: Wed, 21 Oct 2020 14:25:52 +0800 Message-Id: <20201021062554.68132-67-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20201021062554.68132-1-wqu@suse.com> References: <20201021062554.68132-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Another workaround for the inability to submit real subpage sized write bio. For NOCOW, if a range ends at sector boundary but no page boundary, we can't submit a subpage NOCOW write bio. To workaround this, we skip any extent which is not page aligned, and fall back to COW. Signed-off-by: Qu Wenruo --- fs/btrfs/inode.c | 30 +++++++++++++++++++++++++++--- 1 file changed, 27 insertions(+), 3 deletions(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 625950258c87..c3d32f4858d5 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -1451,6 +1451,12 @@ static int fallback_to_cow(struct btrfs_inode *inode, struct page *locked_page, * * If no cow copies or snapshots exist, we write directly to the existing * blocks on disk + * the full page. Or we fall back to COW, as we don't yet support subpage + * write. 
+ * + * For subpage case, since we can't submit subpage data write yet, we have + * more restrict condition for NOCOW (the extent must contain the full page). + * Or we fall back to COW the full page. */ static noinline int run_delalloc_nocow(struct btrfs_inode *inode, struct page *locked_page, @@ -1592,6 +1598,20 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode, btrfs_file_extent_encryption(leaf, fi) || btrfs_file_extent_other_encoding(leaf, fi)) goto out_check; + /* + * If the file offset/extent offset/extent end is not + * page aligned, we skip it and fallback to COW. + * This is mostly overkilled, but to make subpage NOCOW + * write easier, we only allow write into page aligned + * extent. + * + * TODO: Remove this when full subpage write is + * supported. + */ + if (!IS_ALIGNED(found_key.offset, PAGE_SIZE) || + !IS_ALIGNED(extent_end, PAGE_SIZE) || + !IS_ALIGNED(extent_offset, PAGE_SIZE)) + goto out_check; /* * If extent is created before the last volume's snapshot * this implies the extent is shared, hence we can't do @@ -1676,8 +1696,8 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode, */ if (!nocow) { if (cow_start == (u64)-1) - cow_start = cur_offset; - cur_offset = extent_end; + cow_start = round_down(cur_offset, PAGE_SIZE); + cur_offset = round_up(extent_end, PAGE_SIZE); if (cur_offset > end) break; path->slots[0]++; @@ -1692,6 +1712,7 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode, * NOCOW, following one which needs to be COW'ed */ if (cow_start != (u64)-1) { + ASSERT(IS_ALIGNED(cow_start, PAGE_SIZE)); ret = fallback_to_cow(inode, locked_page, cow_start, found_key.offset - 1, page_started, nr_written); @@ -1700,6 +1721,9 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode, cow_start = (u64)-1; } + ASSERT(IS_ALIGNED(cur_offset, PAGE_SIZE) && + IS_ALIGNED(num_bytes, PAGE_SIZE) && + IS_ALIGNED(found_key.offset, PAGE_SIZE)); if (extent_type == BTRFS_FILE_EXTENT_PREALLOC) { u64 
orig_start = found_key.offset - extent_offset; struct extent_map *em; @@ -1774,7 +1798,7 @@ static noinline int run_delalloc_nocow(struct btrfs_inode *inode, cow_start = cur_offset; if (cow_start != (u64)-1) { - cur_offset = end; + cur_offset = round_up(end, PAGE_SIZE) - 1; ret = fallback_to_cow(inode, locked_page, cow_start, end, page_started, nr_written); if (ret) From patchwork Wed Oct 21 06:25:53 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11848437 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 18B4EC4363A for ; Wed, 21 Oct 2020 06:28:33 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 8D0D521D43 for ; Wed, 21 Oct 2020 06:28:32 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="hA1KDPxO" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2440874AbgJUG2b (ORCPT ); Wed, 21 Oct 2020 02:28:31 -0400 Received: from mx2.suse.de ([195.135.220.15]:45196 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2440872AbgJUG2a (ORCPT ); Wed, 21 Oct 2020 02:28:30 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1603261709; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: 
         content-transfer-encoding:content-transfer-encoding:
         in-reply-to:in-reply-to:references:references;
        bh=WwtEX9GctKS04n+aWQNMJNQy7nnQbIwcQUC6nbpgzR8=;
        b=hA1KDPxObv8cefecz5Nks60lBgQvRCqbhh2QCUQRdwPYg2GSmCtSaJ1fcoQzbj7Qygx3BJ
        CYj84ORr6aBLvgZLC+0+Xu0d7QOqTHQiBl5mihHHWFrBN1lLpE5wvg3BFYeGCoH5mq5QYz
        ZyStmyl3eYjJ5PXIxXYkNS7bunUtx4w=
Received: from relay2.suse.de (unknown [195.135.221.27])
        by mx2.suse.de (Postfix) with ESMTP id 0A53EAC1D
        for ; Wed, 21 Oct 2020 06:28:29 +0000 (UTC)
From: Qu Wenruo
To: linux-btrfs@vger.kernel.org
Subject: [PATCH v4 67/68] btrfs: reflink: do full page writeback for reflink prepare
Date: Wed, 21 Oct 2020 14:25:53 +0800
Message-Id: <20201021062554.68132-68-wqu@suse.com>
X-Mailer: git-send-email 2.28.0
In-Reply-To: <20201021062554.68132-1-wqu@suse.com>
References: <20201021062554.68132-1-wqu@suse.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org

Since we don't support subpage writeback yet, let
btrfs_remap_file_range_prep() do full page writeback.

This only affects subpage support, as the regular sectorsize support
already has its sectorsize == PAGE_SIZE.

Signed-off-by: Qu Wenruo
---
 fs/btrfs/reflink.c | 36 ++++++++++++++++++++++++++----------
 1 file changed, 26 insertions(+), 10 deletions(-)

diff --git a/fs/btrfs/reflink.c b/fs/btrfs/reflink.c
index 5cd02514cf4d..e8023c1dcb5d 100644
--- a/fs/btrfs/reflink.c
+++ b/fs/btrfs/reflink.c
@@ -700,9 +700,15 @@ static int btrfs_remap_file_range_prep(struct file *file_in, loff_t pos_in,
 {
 	struct inode *inode_in = file_inode(file_in);
 	struct inode *inode_out = file_inode(file_out);
-	u64 bs = BTRFS_I(inode_out)->root->fs_info->sb->s_blocksize;
+	/*
+	 * We don't support subpage write yet, thus for data writeback we
+	 * must use PAGE_SIZE here. But for reflink we still support proper
+	 * sector alignment.
+	 */
+	u32 wb_bs = PAGE_SIZE;
 	bool same_inode = inode_out == inode_in;
-	u64 wb_len;
+	u64 in_wb_len;
+	u64 out_wb_len;
 	int ret;
 
 	if (!(remap_flags & REMAP_FILE_DEDUP)) {
@@ -735,11 +741,21 @@ static int btrfs_remap_file_range_prep(struct file *file_in, loff_t pos_in,
 	 * waits for the writeback to complete, i.e. for IO to be done, and
 	 * not for the ordered extents to complete. We need to wait for them
 	 * to complete so that new file extent items are in the fs tree.
+	 *
+	 * Also for subpage case, since at different offset the same length can
+	 * cover different number of pages, we have to calculate the wb_len for
+	 * each file.
 	 */
-	if (*len == 0 && !(remap_flags & REMAP_FILE_DEDUP))
-		wb_len = ALIGN(inode_in->i_size, bs) - ALIGN_DOWN(pos_in, bs);
-	else
-		wb_len = ALIGN(*len, bs);
+	if (*len == 0 && !(remap_flags & REMAP_FILE_DEDUP)) {
+		in_wb_len = round_up(inode_in->i_size, wb_bs) -
+			    round_down(pos_in, wb_bs);
+		out_wb_len = in_wb_len;
+	} else {
+		in_wb_len = round_up(pos_in + *len, wb_bs) -
+			    round_down(pos_in, wb_bs);
+		out_wb_len = round_up(pos_out + *len, wb_bs) -
+			     round_down(pos_out, wb_bs);
+	}
 
 	/*
 	 * Since we don't lock ranges, wait for ongoing lockless dio writes (as
@@ -771,12 +787,12 @@ static int btrfs_remap_file_range_prep(struct file *file_in, loff_t pos_in,
 	if (ret < 0)
 		return ret;
 
-	ret = btrfs_wait_ordered_range(inode_in, ALIGN_DOWN(pos_in, bs),
-				       wb_len);
+	ret = btrfs_wait_ordered_range(inode_in, round_down(pos_in, wb_bs),
+				       in_wb_len);
 	if (ret < 0)
 		return ret;
-	ret = btrfs_wait_ordered_range(inode_out, ALIGN_DOWN(pos_out, bs),
-				       wb_len);
+	ret = btrfs_wait_ordered_range(inode_out, round_down(pos_out, wb_bs),
+				       out_wb_len);
 	if (ret < 0)
 		return ret;
 

From patchwork Wed Oct 21 06:25:54 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Qu Wenruo
X-Patchwork-Id: 11848435
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
        aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED,
        DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,
        MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,
        USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
        by smtp.lore.kernel.org (Postfix) with ESMTP id 5C187C56201
        for ; Wed, 21 Oct 2020 06:28:34 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
        by mail.kernel.org (Postfix) with ESMTP id E30EF21D43
        for ; Wed, 21 Oct 2020 06:28:33 +0000 (UTC)
Authentication-Results: mail.kernel.org;
        dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com
        header.b="NLJfVCe9"
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S2440877AbgJUG2d (ORCPT );
        Wed, 21 Oct 2020 02:28:33 -0400
Received: from mx2.suse.de ([195.135.220.15]:45216 "EHLO mx2.suse.de"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S2440875AbgJUG2c (ORCPT );
        Wed, 21 Oct 2020 02:28:32 -0400
X-Virus-Scanned: by amavisd-new at test-mx.suse.de
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1;
        t=1603261711;
        h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
         to:to:cc:mime-version:mime-version:
         content-transfer-encoding:content-transfer-encoding:
         in-reply-to:in-reply-to:references:references;
        bh=Z2q8lfHQMsSCrANMA4TRMEc6S6kXPwGn1Fi0nngPG7M=;
        b=NLJfVCe9XXtQ1cvjpO4d0ZeOyMB83s05t2fckfBlesEO/BcKccL6lmGl3c6+qjnrd5Lsit
        l7hicidjws25KYFUZv6//kVlyfMjuoIbk7OybKLqYjDSS42QZQNnoL5OgcECt99kjgdm/4
        yyL22F+NcxCawiE44Gw8D0WJvHYvxNc=
Received: from relay2.suse.de (unknown [195.135.221.27])
        by mx2.suse.de (Postfix) with ESMTP id 0FA0AAC35
        for ; Wed, 21 Oct 2020 06:28:31 +0000 (UTC)
From: Qu Wenruo
To: linux-btrfs@vger.kernel.org
Subject: [PATCH v4 68/68] btrfs: support subpage read write for test
Date: Wed, 21 Oct 2020 14:25:54 +0800
Message-Id: <20201021062554.68132-69-wqu@suse.com>
X-Mailer: git-send-email 2.28.0
In-Reply-To: <20201021062554.68132-1-wqu@suse.com>
References: <20201021062554.68132-1-wqu@suse.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org

Signed-off-by: Qu Wenruo
---
 fs/btrfs/disk-io.c | 10 ----------
 fs/btrfs/super.c   |  7 -------
 2 files changed, 17 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 2ac980f739dc..8b5f65e6c5fa 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -3335,16 +3335,6 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
 		goto fail_alloc;
 	}
 
-	/* For 4K sector size support, it's only read-only yet */
-	if (PAGE_SIZE == SZ_64K && sectorsize == SZ_4K) {
-		if (!sb_rdonly(sb) || btrfs_super_log_root(disk_super)) {
-			btrfs_err(fs_info,
-				"subpage sector size only support RO yet");
-			err = -EINVAL;
-			goto fail_alloc;
-		}
-	}
-
 	ret = btrfs_init_workqueues(fs_info, fs_devices);
 	if (ret) {
 		err = ret;
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 743a2fadf4ee..25967ecaaf0a 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1922,13 +1922,6 @@ static int btrfs_remount(struct super_block *sb, int *flags, char *data)
 			ret = -EINVAL;
 			goto restore;
 		}
-		if (btrfs_is_subpage(fs_info)) {
-			btrfs_warn(fs_info,
-	"read-write mount is not yet allowed for sector size %u page size %lu",
-				   fs_info->sectorsize, PAGE_SIZE);
-			ret = -EINVAL;
-			goto restore;
-		}
 
 		ret = btrfs_cleanup_fs_roots(fs_info);
 		if (ret)
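
Note on the writeback length calculation in patch 67/68: the new comment there says that, at different offsets, the same length can cover a different number of pages. A minimal userspace sketch of that arithmetic follows. It is not kernel code: the demo_* helpers and DEMO_PAGE_SIZE are hypothetical stand-ins for round_down(), round_up() and PAGE_SIZE, and a 64K page size is assumed purely for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: stand-in for PAGE_SIZE on a 64K-page machine. */
#define DEMO_PAGE_SIZE 65536ULL

/* Userspace equivalent of round_down() for a power-of-two alignment. */
static inline uint64_t demo_round_down(uint64_t x, uint64_t align)
{
	return x & ~(align - 1);
}

/* Userspace equivalent of round_up() for a power-of-two alignment. */
static inline uint64_t demo_round_up(uint64_t x, uint64_t align)
{
	return (x + align - 1) & ~(align - 1);
}

/*
 * Length of the page-aligned range that covers [pos, pos + len).
 * This mirrors the shape of the in_wb_len/out_wb_len expressions in the
 * patch for the *len != 0 case; it is a sketch, not the kernel function.
 */
static inline uint64_t demo_wb_len(uint64_t pos, uint64_t len)
{
	return demo_round_up(pos + len, DEMO_PAGE_SIZE) -
	       demo_round_down(pos, DEMO_PAGE_SIZE);
}
```

With these helpers, an 8K range starting at offset 0 stays inside one page, so demo_wb_len(0, 8192) is 65536. The same 8K range starting at offset 60K crosses a page boundary, so demo_wb_len(61440, 8192) is 131072. A single length shared by both files would be wrong whenever pos_in and pos_out sit at different offsets within a page, which is why the patch computes the two lengths separately.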