From patchwork Wed Sep 30 01:54:51 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11807563 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 864FB6CB for ; Wed, 30 Sep 2020 01:55:48 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 5C7192145D for ; Wed, 30 Sep 2020 01:55:48 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="pE83HhHs" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729716AbgI3Bzr (ORCPT ); Tue, 29 Sep 2020 21:55:47 -0400 Received: from mx2.suse.de ([195.135.220.15]:49510 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729322AbgI3Bzr (ORCPT ); Tue, 29 Sep 2020 21:55:47 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1601430945; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=DrRbpVF/5vthz1rTFsGhZ0kkdAn5jEC49CjFpBGtrEo=; b=pE83HhHsme0QIrRqDjkwCRawbKtiy8jRj7Xa2o9Un/IZe9y/vF7+kfB5m1ROoyvWZuSXdd NFbWCbIPA1ncsS5qGYQLs/6Domg2n9EnGM63rK2rDbjeXyvLgjyFimWJd6CjQ0omLeKQBm 40rqO9qatWqqW+wlNO2cHABEHivyJLY= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 93830AF95 for ; Wed, 30 Sep 2020 01:55:45 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v3 01/49] btrfs: extent-io-tests: remove invalid tests Date: Wed, 30 Sep 2020 09:54:51 +0800 Message-Id: <20200930015539.48867-2-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: 
<20200930015539.48867-1-wqu@suse.com> References: <20200930015539.48867-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org In extent-io-test, there are two invalid tests: - Invalid nodesize for test_eb_bitmaps() Instead of the sectorsize and nodesize combination passed in, we're always using a hand-crafted nodesize. Although there is an extra check for 64K page size, we can still hit a case where PAGE_SIZE == 32K; then we would get a 128K nodesize, which is larger than the max valid nodesize. Thankfully most machines have either 4K or 64K page size, thus we haven't yet hit such a case. - Invalid extent buffer bytenr For 64K page size, the only combination we're going to test is sectorsize = nodesize = 64K. In that case, we'll try to create an extent buffer with a 32K bytenr, which is not aligned to sectorsize and thus invalid. This patch will fix both problems by: - Honoring the sectorsize/nodesize combination Now we won't bother to hand-craft a strange length and use it as nodesize. - Using sectorsize as the 2nd run extent buffer start This tests the case where the extent buffer is aligned to sectorsize but not always aligned to nodesize. Signed-off-by: Qu Wenruo --- fs/btrfs/tests/extent-io-tests.c | 26 +++++++++++--------------- 1 file changed, 11 insertions(+), 15 deletions(-) diff --git a/fs/btrfs/tests/extent-io-tests.c b/fs/btrfs/tests/extent-io-tests.c index df7ce874a74b..73e96d505f4f 100644 --- a/fs/btrfs/tests/extent-io-tests.c +++ b/fs/btrfs/tests/extent-io-tests.c @@ -379,54 +379,50 @@ static int __test_eb_bitmaps(unsigned long *bitmap, struct extent_buffer *eb, static int test_eb_bitmaps(u32 sectorsize, u32 nodesize) { struct btrfs_fs_info *fs_info; - unsigned long len; unsigned long *bitmap = NULL; struct extent_buffer *eb = NULL; int ret; test_msg("running extent buffer bitmap tests"); - /* - * In ppc64, sectorsize can be 64K, thus 4 * 64K will be larger than - * BTRFS_MAX_METADATA_BLOCKSIZE.
- */ - len = (sectorsize < BTRFS_MAX_METADATA_BLOCKSIZE) - ? sectorsize * 4 : sectorsize; - - fs_info = btrfs_alloc_dummy_fs_info(len, len); + fs_info = btrfs_alloc_dummy_fs_info(nodesize, sectorsize); if (!fs_info) { test_std_err(TEST_ALLOC_FS_INFO); return -ENOMEM; } - bitmap = kmalloc(len, GFP_KERNEL); + bitmap = kmalloc(nodesize, GFP_KERNEL); if (!bitmap) { test_err("couldn't allocate test bitmap"); ret = -ENOMEM; goto out; } - eb = __alloc_dummy_extent_buffer(fs_info, 0, len); + eb = __alloc_dummy_extent_buffer(fs_info, 0, nodesize); if (!eb) { test_std_err(TEST_ALLOC_ROOT); ret = -ENOMEM; goto out; } - ret = __test_eb_bitmaps(bitmap, eb, len); + ret = __test_eb_bitmaps(bitmap, eb, nodesize); if (ret) goto out; - /* Do it over again with an extent buffer which isn't page-aligned. */ free_extent_buffer(eb); - eb = __alloc_dummy_extent_buffer(fs_info, nodesize / 2, len); + + /* + * Test again for case where the tree block is sectorsize aligned but + * not nodesize aligned. + */ + eb = __alloc_dummy_extent_buffer(fs_info, sectorsize, nodesize); if (!eb) { test_std_err(TEST_ALLOC_ROOT); ret = -ENOMEM; goto out; } - ret = __test_eb_bitmaps(bitmap, eb, len); + ret = __test_eb_bitmaps(bitmap, eb, nodesize); out: free_extent_buffer(eb); kfree(bitmap); From patchwork Wed Sep 30 01:54:52 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11807565 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 5A699618 for ; Wed, 30 Sep 2020 01:55:51 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 2E67E2074B for ; Wed, 30 Sep 2020 01:55:51 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="Wm+6sgTU" Received: (majordomo@vger.kernel.org) 
by vger.kernel.org via listexpand id S1729722AbgI3Bzu (ORCPT ); Tue, 29 Sep 2020 21:55:50 -0400 From: Qu Wenruo To: linux-btrfs@vger.kernel.org Cc: Goldwyn Rodrigues , Goldwyn Rodrigues Subject: [PATCH v3 02/49] btrfs: use iosize while reading compressed pages Date: Wed, 30 Sep 2020 09:54:52 +0800 Message-Id: <20200930015539.48867-3-wqu@suse.com> From: Goldwyn Rodrigues While using compression, a submitted bio is mapped to a compressed bio which performs the read from disk, decompresses the data, and returns it to the original bio. The original bio must reflect the uncompressed size (iosize) of the I/O to be performed, or else the page only gets the on-disk (compressed) I/O length of data (disk_io_size). The compressed bio checks the extent map and gets the correct length while performing the I/O from disk.
This came up in subpage work when only compressed length of the original bio was filled in the page. This worked correctly for pagesize == sectorsize because both compressed and uncompressed data are at pagesize boundaries, and would end up filling the requested page. Signed-off-by: Goldwyn Rodrigues --- fs/btrfs/extent_io.c | 10 +++------- 1 file changed, 3 insertions(+), 7 deletions(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index a940edb1e64f..64f7f61ce718 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -3162,7 +3162,6 @@ static int __do_readpage(struct page *page, int nr = 0; size_t pg_offset = 0; size_t iosize; - size_t disk_io_size; size_t blocksize = inode->i_sb->s_blocksize; unsigned long this_bio_flag = 0; struct extent_io_tree *tree = &BTRFS_I(inode)->io_tree; @@ -3228,13 +3227,10 @@ static int __do_readpage(struct page *page, iosize = min(extent_map_end(em) - cur, end - cur + 1); cur_end = min(extent_map_end(em) - 1, end); iosize = ALIGN(iosize, blocksize); - if (this_bio_flag & EXTENT_BIO_COMPRESSED) { - disk_io_size = em->block_len; + if (this_bio_flag & EXTENT_BIO_COMPRESSED) offset = em->block_start; - } else { + else offset = em->block_start + extent_offset; - disk_io_size = iosize; - } block_start = em->block_start; if (test_bit(EXTENT_FLAG_PREALLOC, &em->flags)) block_start = EXTENT_MAP_HOLE; @@ -3323,7 +3319,7 @@ static int __do_readpage(struct page *page, } ret = submit_extent_page(REQ_OP_READ | read_flags, NULL, - page, offset, disk_io_size, + page, offset, iosize, pg_offset, bio, end_bio_extent_readpage, mirror_num, *bio_flags, From patchwork Wed Sep 30 01:54:53 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11807567 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B81E3618 for ; Wed, 30 Sep 2020 
01:55:53 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v3 03/49] btrfs: extent_io: fix the comment on lock_extent_buffer_for_io(). Date: Wed, 30 Sep 2020 09:54:53 +0800 Message-Id: <20200930015539.48867-4-wqu@suse.com> The documented return value of that function is completely wrong. That function only returns 0 if the extent buffer doesn't need to be submitted.
The "ret = 1" and "ret = 0" cases are determined by the return value of test_and_clear_bit(EXTENT_BUFFER_DIRTY, &eb->bflags). If we get ret == 1, it's because the extent buffer is dirty; we set its status to EXTENT_BUFFER_WRITEBACK and continue to page locking. If we get ret == 0, it means the extent buffer was not dirty to begin with, so we don't need to write it back. The caller also follows this: in btree_write_cache_pages(), if lock_extent_buffer_for_io() returns 0, we just skip the extent buffer completely. So the comment is completely wrong. Since we're here, also change the description a little. The write bio flushing won't be visible to the caller, thus it's not a major feature. In the main description, only describe the locking part to make the point clearer. Fixes: 2e3c25136adf ("btrfs: extent_io: add proper error handling to lock_extent_buffer_for_io()") Signed-off-by: Qu Wenruo --- fs/btrfs/extent_io.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 64f7f61ce718..a64d88163f3b 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -3688,11 +3688,14 @@ static void end_extent_buffer_writeback(struct extent_buffer *eb) } /* - * Lock eb pages and flush the bio if we can't the locks + * Lock extent buffer status and pages for write back. * - * Return 0 if nothing went wrong - * Return >0 is same as 0, except bio is not submitted - * Return <0 if something went wrong, no page is locked + * May try to flush write bio if we can't get the lock. + * + * Return 0 if the extent buffer doesn't need to be submitted. + * (E.g. the extent buffer is not dirty) + * Return >0 is the extent buffer is submitted to bio. + * Return <0 if something went wrong, no page is locked.
 */ static noinline_for_stack int lock_extent_buffer_for_io(struct extent_buffer *eb, struct extent_page_data *epd) From patchwork Wed Sep 30 01:54:54 2020 X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11807569 From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v3 04/49] btrfs: extent_io: update the comment for find_first_extent_bit() Date:
Wed, 30 Sep 2020 09:54:54 +0800 Message-Id: <20200930015539.48867-5-wqu@suse.com> The pitfall here is that if the parameter @bits has multiple bits set, we will return the first range that has just one of the specified bits set. This is a little tricky if we want an exact match. Anyway, update the comment to inform the callers. Signed-off-by: Qu Wenruo --- fs/btrfs/extent_io.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index a64d88163f3b..2980e8384e74 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -1554,11 +1554,12 @@ find_first_extent_bit_state(struct extent_io_tree *tree, } /* - * find the first offset in the io tree with 'bits' set. zero is - * returned if we find something, and *start_ret and *end_ret are - * set to reflect the state struct that was found. + * Find the first offset in the io tree with one or more @bits set. * - * If nothing was found, 1 is returned. If found something, return 0. + * NOTE: If @bits are multiple bits, any bit of @bits will meet the match. + * + * Return 0 if we find something, and update @start_ret and @end_ret. + * Return 1 if we found nothing.
 */ int find_first_extent_bit(struct extent_io_tree *tree, u64 start, u64 *start_ret, u64 *end_ret, unsigned bits, From patchwork Wed Sep 30 01:54:55 2020 X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11807571 From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v3 05/49] btrfs: make btree inode io_tree has its special owner Date: Wed, 30 Sep
2020 09:54:55 +0800 Message-Id: <20200930015539.48867-6-wqu@suse.com> The btree inode is pretty special compared to all other inode extent io trees: although it has a btrfs inode, it doesn't have the track_uptodate bit set to true, and it never has ordered extents. Since it's so special, add a new owner value for it to make debugging a little easier. Signed-off-by: Qu Wenruo --- fs/btrfs/disk-io.c | 2 +- fs/btrfs/extent-io-tree.h | 1 + include/trace/events/btrfs.h | 1 + 3 files changed, 3 insertions(+), 1 deletion(-) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index f6bba7eb1fa1..be6edbd34934 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -2116,7 +2116,7 @@ static void btrfs_init_btree_inode(struct btrfs_fs_info *fs_info) RB_CLEAR_NODE(&BTRFS_I(inode)->rb_node); extent_io_tree_init(fs_info, &BTRFS_I(inode)->io_tree, - IO_TREE_INODE_IO, inode); + IO_TREE_BTREE_INODE_IO, inode); BTRFS_I(inode)->io_tree.track_uptodate = false; extent_map_tree_init(&BTRFS_I(inode)->extent_tree); diff --git a/fs/btrfs/extent-io-tree.h b/fs/btrfs/extent-io-tree.h index 219a09a2b734..960d4a24f13e 100644 --- a/fs/btrfs/extent-io-tree.h +++ b/fs/btrfs/extent-io-tree.h @@ -40,6 +40,7 @@ struct io_failure_record; enum { IO_TREE_FS_PINNED_EXTENTS, IO_TREE_FS_EXCLUDED_EXTENTS, + IO_TREE_BTREE_INODE_IO, IO_TREE_INODE_IO, IO_TREE_INODE_IO_FAILURE, IO_TREE_RELOC_BLOCKS, diff --git a/include/trace/events/btrfs.h b/include/trace/events/btrfs.h index 863335ecb7e8..89397605e465 100644 --- a/include/trace/events/btrfs.h +++ b/include/trace/events/btrfs.h @@ -79,6 +79,7 @@ struct btrfs_space_info; #define IO_TREE_OWNER \ EM( IO_TREE_FS_PINNED_EXTENTS, "PINNED_EXTENTS") \ EM( IO_TREE_FS_EXCLUDED_EXTENTS, "EXCLUDED_EXTENTS") \ + EM( IO_TREE_BTREE_INODE_IO, "BTRFS_INODE_IO") \
EM( IO_TREE_INODE_IO, "INODE_IO") \ EM( IO_TREE_INODE_IO_FAILURE, "INODE_IO_FAILURE") \ EM( IO_TREE_RELOC_BLOCKS, "RELOC_BLOCKS") \ From patchwork Wed Sep 30 01:54:56 2020 X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11807573 From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v3 06/49] btrfs: disk-io: replace @fs_info and @private_data with
@inode for btrfs_wq_submit_bio() Date: Wed, 30 Sep 2020 09:54:56 +0800 Message-Id: <20200930015539.48867-7-wqu@suse.com> All callers of btrfs_wq_submit_bio() pass struct inode as @private_data, so there is no need for @private_data to be (void *); just replace it with "struct inode *inode". Since we can extract fs_info from struct inode, also remove the @fs_info parameter. While we're here, also replace all the (void *private_data) parameters with (struct inode *inode). Signed-off-by: Qu Wenruo --- fs/btrfs/disk-io.c | 21 +++++++++++---------- fs/btrfs/disk-io.h | 8 ++++---- fs/btrfs/extent_io.h | 2 +- fs/btrfs/inode.c | 21 +++++++++------------ 4 files changed, 25 insertions(+), 27 deletions(-) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index be6edbd34934..b7436ab7bba9 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -110,7 +110,7 @@ static void btrfs_free_csum_hash(struct btrfs_fs_info *fs_info) * just before they are sent down the IO stack.
*/ struct async_submit_bio { - void *private_data; + struct inode *inode; struct bio *bio; extent_submit_bio_start_t *submit_bio_start; int mirror_num; @@ -746,7 +746,7 @@ static void run_one_async_start(struct btrfs_work *work) blk_status_t ret; async = container_of(work, struct async_submit_bio, work); - ret = async->submit_bio_start(async->private_data, async->bio, + ret = async->submit_bio_start(async->inode, async->bio, async->bio_offset); if (ret) async->status = ret; @@ -767,7 +767,7 @@ static void run_one_async_done(struct btrfs_work *work) blk_status_t ret; async = container_of(work, struct async_submit_bio, work); - inode = async->private_data; + inode = async->inode; /* If an error occurred we just want to clean up the bio and move on */ if (async->status) { @@ -797,18 +797,19 @@ static void run_one_async_free(struct btrfs_work *work) kfree(async); } -blk_status_t btrfs_wq_submit_bio(struct btrfs_fs_info *fs_info, struct bio *bio, +blk_status_t btrfs_wq_submit_bio(struct inode *inode, struct bio *bio, int mirror_num, unsigned long bio_flags, - u64 bio_offset, void *private_data, + u64 bio_offset, extent_submit_bio_start_t *submit_bio_start) { + struct btrfs_fs_info *fs_info = BTRFS_I(inode)->root->fs_info; struct async_submit_bio *async; async = kmalloc(sizeof(*async), GFP_NOFS); if (!async) return BLK_STS_RESOURCE; - async->private_data = private_data; + async->inode = inode; async->bio = bio; async->mirror_num = mirror_num; async->submit_bio_start = submit_bio_start; @@ -845,8 +846,8 @@ static blk_status_t btree_csum_one_bio(struct bio *bio) return errno_to_blk_status(ret); } -static blk_status_t btree_submit_bio_start(void *private_data, struct bio *bio, - u64 bio_offset) +static blk_status_t btree_submit_bio_start(struct inode *inode, struct bio *bio, + u64 bio_offset) { /* * when we're called for a write, we're already in the async @@ -893,8 +894,8 @@ static blk_status_t btree_submit_bio_hook(struct inode *inode, struct bio *bio, * kthread helpers 
are used to submit writes so that * checksumming can happen in parallel across all CPUs */ - ret = btrfs_wq_submit_bio(fs_info, bio, mirror_num, 0, - 0, inode, btree_submit_bio_start); + ret = btrfs_wq_submit_bio(inode, bio, mirror_num, 0, + 0, btree_submit_bio_start); } if (ret) diff --git a/fs/btrfs/disk-io.h b/fs/btrfs/disk-io.h index 00dc39d47ed3..2d564e9223e2 100644 --- a/fs/btrfs/disk-io.h +++ b/fs/btrfs/disk-io.h @@ -105,10 +105,10 @@ int btrfs_read_buffer(struct extent_buffer *buf, u64 parent_transid, int level, struct btrfs_key *first_key); blk_status_t btrfs_bio_wq_end_io(struct btrfs_fs_info *info, struct bio *bio, enum btrfs_wq_endio_type metadata); -blk_status_t btrfs_wq_submit_bio(struct btrfs_fs_info *fs_info, struct bio *bio, - int mirror_num, unsigned long bio_flags, - u64 bio_offset, void *private_data, - extent_submit_bio_start_t *submit_bio_start); +blk_status_t btrfs_wq_submit_bio(struct inode *inode, struct bio *bio, + int mirror_num, unsigned long bio_flags, + u64 bio_offset, + extent_submit_bio_start_t *submit_bio_start); blk_status_t btrfs_submit_bio_done(void *private_data, struct bio *bio, int mirror_num); int btrfs_init_log_root_tree(struct btrfs_trans_handle *trans, diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h index 30794ae58498..3c9252b429e0 100644 --- a/fs/btrfs/extent_io.h +++ b/fs/btrfs/extent_io.h @@ -71,7 +71,7 @@ typedef blk_status_t (submit_bio_hook_t)(struct inode *inode, struct bio *bio, int mirror_num, unsigned long bio_flags); -typedef blk_status_t (extent_submit_bio_start_t)(void *private_data, +typedef blk_status_t (extent_submit_bio_start_t)(struct inode *inode, struct bio *bio, u64 bio_offset); struct extent_io_ops { diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 9570458aa847..e5d558ef4c7f 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -2157,11 +2157,9 @@ int btrfs_bio_fits_in_stripe(struct page *page, size_t size, struct bio *bio, * At IO completion time the cums attached on the ordered 
extent record * are inserted into the btree */ -static blk_status_t btrfs_submit_bio_start(void *private_data, struct bio *bio, - u64 bio_offset) +static blk_status_t btrfs_submit_bio_start(struct inode *inode, struct bio *bio, + u64 bio_offset) { - struct inode *inode = private_data; - return btrfs_csum_one_bio(BTRFS_I(inode), bio, 0, 0); } @@ -2221,8 +2219,8 @@ static blk_status_t btrfs_submit_bio_hook(struct inode *inode, struct bio *bio, if (root->root_key.objectid == BTRFS_DATA_RELOC_TREE_OBJECTID) goto mapit; /* we're doing a write, do the async checksumming */ - ret = btrfs_wq_submit_bio(fs_info, bio, mirror_num, bio_flags, - 0, inode, btrfs_submit_bio_start); + ret = btrfs_wq_submit_bio(inode, bio, mirror_num, bio_flags, + 0, btrfs_submit_bio_start); goto out; } else if (!skip_sum) { ret = btrfs_csum_one_bio(BTRFS_I(inode), bio, 0, 0); @@ -7616,11 +7614,10 @@ static void __endio_write_update_ordered(struct btrfs_inode *inode, } } -static blk_status_t btrfs_submit_bio_start_direct_io(void *private_data, - struct bio *bio, u64 offset) +static blk_status_t btrfs_submit_bio_start_direct_io(struct inode *inode, + struct bio *bio, + u64 offset) { - struct inode *inode = private_data; - return btrfs_csum_one_bio(BTRFS_I(inode), bio, offset, 1); } @@ -7671,8 +7668,8 @@ static inline blk_status_t btrfs_submit_dio_bio(struct bio *bio, goto map; if (write && async_submit) { - ret = btrfs_wq_submit_bio(fs_info, bio, 0, 0, - file_offset, inode, + ret = btrfs_wq_submit_bio(inode, bio, 0, 0, + file_offset, btrfs_submit_bio_start_direct_io); goto err; } else if (write) { From patchwork Wed Sep 30 01:54:57 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11807575 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 44DC7618 for ; Wed, 30 Sep 2020 01:56:04 +0000 (UTC) 
From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v3 07/49] btrfs: inode: sink parameter @start and @len for check_data_csum() Date: Wed, 30 Sep 2020 09:54:57 +0800 Message-Id: <20200930015539.48867-8-wqu@suse.com> For check_data_csum(), the page we're using comes directly from the inode mapping, thus it has a valid page_offset().
We can use (page_offset() + pgoff) to replace the @start parameter completely, while @len should always be sectorsize. While at it, also add some comments, as there is quite some confusion around terms like start and offset, with no indication of whether they mean a file offset or a logical bytenr. This should not affect the existing behavior: for the current sectorsize == PAGE_SIZE case, @pgoff is always 0 and @len is always PAGE_SIZE (or sectorsize from the dio read path). Signed-off-by: Qu Wenruo --- fs/btrfs/inode.c | 27 +++++++++++++++++++-------- 1 file changed, 19 insertions(+), 8 deletions(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index e5d558ef4c7f..10ea6a92685b 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -2791,17 +2791,30 @@ void btrfs_writepage_endio_finish_ordered(struct page *page, u64 start, btrfs_queue_work(wq, &ordered_extent->work); } +/* + * Verify the checksum of one sector of uncompressed data. + * + * @inode: The inode. + * @io_bio: The btrfs_io_bio which contains the csum. + * @icsum: The csum offset (by number of sectors). + * @page: The page where the data to be verified is. + * @pgoff: The offset inside the page. + * + * The length of such check is always one sector size.
+ */ static int check_data_csum(struct inode *inode, struct btrfs_io_bio *io_bio, - int icsum, struct page *page, int pgoff, u64 start, - size_t len) + int icsum, struct page *page, int pgoff) { struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); SHASH_DESC_ON_STACK(shash, fs_info->csum_shash); char *kaddr; + u32 len = fs_info->sectorsize; u16 csum_size = btrfs_super_csum_size(fs_info->super_copy); u8 *csum_expected; u8 csum[BTRFS_CSUM_SIZE]; + ASSERT(pgoff + len <= PAGE_SIZE); + csum_expected = ((u8 *)io_bio->csum) + icsum * csum_size; kaddr = kmap_atomic(page); @@ -2815,8 +2828,8 @@ static int check_data_csum(struct inode *inode, struct btrfs_io_bio *io_bio, kunmap_atomic(kaddr); return 0; zeroit: - btrfs_print_data_csum_error(BTRFS_I(inode), start, csum, csum_expected, - io_bio->mirror_num); + btrfs_print_data_csum_error(BTRFS_I(inode), page_offset(page) + pgoff, + csum, csum_expected, io_bio->mirror_num); if (io_bio->device) btrfs_dev_stat_inc_and_print(io_bio->device, BTRFS_DEV_STAT_CORRUPTION_ERRS); @@ -2855,8 +2868,7 @@ static int btrfs_readpage_end_io_hook(struct btrfs_io_bio *io_bio, } phy_offset >>= inode->i_sb->s_blocksize_bits; - return check_data_csum(inode, io_bio, phy_offset, page, offset, start, - (size_t)(end - start + 1)); + return check_data_csum(inode, io_bio, phy_offset, page, offset); } /* @@ -7543,8 +7555,7 @@ static blk_status_t btrfs_check_read_dio_bio(struct inode *inode, ASSERT(pgoff < PAGE_SIZE); if (uptodate && (!csum || !check_data_csum(inode, io_bio, icsum, - bvec.bv_page, pgoff, - start, sectorsize))) { + bvec.bv_page, pgoff))) { clean_io_failure(fs_info, failure_tree, io_tree, start, bvec.bv_page, btrfs_ino(BTRFS_I(inode)), From patchwork Wed Sep 30 01:54:58 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11807577 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by 
From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v3 08/49] btrfs: extent_io: unexport extent_invalidatepage() Date: Wed, 30 Sep 2020 09:54:58 +0800 Message-Id: <20200930015539.48867-9-wqu@suse.com> In-Reply-To: <20200930015539.48867-1-wqu@suse.com> References: <20200930015539.48867-1-wqu@suse.com> Function extent_invalidatepage() has a single caller, btree_invalidatepage().
Just unexport this function and move it to disk-io.c. Signed-off-by: Qu Wenruo --- fs/btrfs/disk-io.c | 23 +++++++++++++++++++++++ fs/btrfs/extent-io-tree.h | 2 -- fs/btrfs/extent_io.c | 24 ------------------------ 3 files changed, 23 insertions(+), 26 deletions(-) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index b7436ab7bba9..c81b7e53149c 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -966,6 +966,29 @@ static int btree_releasepage(struct page *page, gfp_t gfp_flags) return try_release_extent_buffer(page); } +/* + * basic invalidatepage code, this waits on any locked or writeback + * ranges corresponding to the page, and then deletes any extent state + * records from the tree + */ +static void extent_invalidatepage(struct extent_io_tree *tree, + struct page *page, unsigned long offset) +{ + struct extent_state *cached_state = NULL; + u64 start = page_offset(page); + u64 end = start + PAGE_SIZE - 1; + size_t blocksize = page->mapping->host->i_sb->s_blocksize; + + start += ALIGN(offset, blocksize); + if (start > end) + return; + + lock_extent_bits(tree, start, end, &cached_state); + wait_on_page_writeback(page); + clear_extent_bit(tree, start, end, EXTENT_LOCKED | EXTENT_DELALLOC | + EXTENT_DO_ACCOUNTING, 1, 1, &cached_state); +} + static void btree_invalidatepage(struct page *page, unsigned int offset, unsigned int length) { diff --git a/fs/btrfs/extent-io-tree.h b/fs/btrfs/extent-io-tree.h index 960d4a24f13e..5927338c74a2 100644 --- a/fs/btrfs/extent-io-tree.h +++ b/fs/btrfs/extent-io-tree.h @@ -229,8 +229,6 @@ void find_first_clear_extent_bit(struct extent_io_tree *tree, u64 start, u64 *start_ret, u64 *end_ret, unsigned bits); int find_contiguous_extent_bit(struct extent_io_tree *tree, u64 start, u64 *start_ret, u64 *end_ret, unsigned bits); -int extent_invalidatepage(struct extent_io_tree *tree, - struct page *page, unsigned long offset); bool btrfs_find_delalloc_range(struct extent_io_tree *tree, u64 *start, u64 *end, u64 max_bytes, struct
extent_state **cached_state); diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 2980e8384e74..02c3518afa82 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -4405,30 +4405,6 @@ void extent_readahead(struct readahead_control *rac) } } -/* - * basic invalidatepage code, this waits on any locked or writeback - * ranges corresponding to the page, and then deletes any extent state - * records from the tree - */ -int extent_invalidatepage(struct extent_io_tree *tree, - struct page *page, unsigned long offset) -{ - struct extent_state *cached_state = NULL; - u64 start = page_offset(page); - u64 end = start + PAGE_SIZE - 1; - size_t blocksize = page->mapping->host->i_sb->s_blocksize; - - start += ALIGN(offset, blocksize); - if (start > end) - return 0; - - lock_extent_bits(tree, start, end, &cached_state); - wait_on_page_writeback(page); - clear_extent_bit(tree, start, end, EXTENT_LOCKED | EXTENT_DELALLOC | - EXTENT_DO_ACCOUNTING, 1, 1, &cached_state); - return 0; -} - /* * a helper for releasepage, this tests for areas of the page that * are locked or under IO and drops the related state bits if it is safe From patchwork Wed Sep 30 01:54:59 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11807579 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C2A8D6CB for ; Wed, 30 Sep 2020 01:56:07 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A2AD621531 for ; Wed, 30 Sep 2020 01:56:07 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="fDCNEogy" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729789AbgI3B4G (ORCPT ); Tue, 29 Sep 2020 21:56:06 -0400 Received: from mx2.suse.de 
From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v3 09/49] btrfs: extent_io: remove the forward declaration and rename __process_pages_contig Date: Wed, 30 Sep 2020 09:54:59 +0800 Message-Id: <20200930015539.48867-10-wqu@suse.com> In-Reply-To: <20200930015539.48867-1-wqu@suse.com> References: <20200930015539.48867-1-wqu@suse.com> There is no need for a forward declaration of __process_pages_contig(), so move the function above its first caller. While at it, also remove the "__" prefix, since it carries no special meaning.
Signed-off-by: Qu Wenruo --- fs/btrfs/extent_io.c | 180 +++++++++++++++++++++++-------------------- 1 file changed, 95 insertions(+), 85 deletions(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 02c3518afa82..9f46d7f17a9c 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -1810,10 +1810,98 @@ bool btrfs_find_delalloc_range(struct extent_io_tree *tree, u64 *start, return found; } -static int __process_pages_contig(struct address_space *mapping, - struct page *locked_page, - pgoff_t start_index, pgoff_t end_index, - unsigned long page_ops, pgoff_t *index_ret); +/* + * A helper to update contiguous pages status according to @page_ops. + * + * @mapping: The address space of the pages + * @locked_page: The already locked page. Mostly for inline extent + * handling + * @start_index: The start page index. + * @end_inde: The last page index. + * @pages_opts: The operations to be done + * @index_ret: The last handled page index (for error case) + * + * Return 0 if every page is handled properly. + * Return <0 if something wrong happened, and update @index_ret. + */ +static int process_pages_contig(struct address_space *mapping, + struct page *locked_page, + pgoff_t start_index, pgoff_t end_index, + unsigned long page_ops, pgoff_t *index_ret) +{ + unsigned long nr_pages = end_index - start_index + 1; + unsigned long pages_locked = 0; + pgoff_t index = start_index; + struct page *pages[16]; + unsigned ret; + int err = 0; + int i; + + if (page_ops & PAGE_LOCK) { + ASSERT(page_ops == PAGE_LOCK); + ASSERT(index_ret && *index_ret == start_index); + } + + if ((page_ops & PAGE_SET_ERROR) && nr_pages > 0) + mapping_set_error(mapping, -EIO); + + while (nr_pages > 0) { + ret = find_get_pages_contig(mapping, index, + min_t(unsigned long, + nr_pages, ARRAY_SIZE(pages)), pages); + if (ret == 0) { + /* + * Only if we're going to lock these pages, + * can we find nothing at @index. 
+ */ + ASSERT(page_ops & PAGE_LOCK); + err = -EAGAIN; + goto out; + } + + for (i = 0; i < ret; i++) { + if (page_ops & PAGE_SET_PRIVATE2) + SetPagePrivate2(pages[i]); + + if (locked_page && pages[i] == locked_page) { + put_page(pages[i]); + pages_locked++; + continue; + } + if (page_ops & PAGE_CLEAR_DIRTY) + clear_page_dirty_for_io(pages[i]); + if (page_ops & PAGE_SET_WRITEBACK) + set_page_writeback(pages[i]); + if (page_ops & PAGE_SET_ERROR) + SetPageError(pages[i]); + if (page_ops & PAGE_END_WRITEBACK) + end_page_writeback(pages[i]); + if (page_ops & PAGE_UNLOCK) + unlock_page(pages[i]); + if (page_ops & PAGE_LOCK) { + lock_page(pages[i]); + if (!PageDirty(pages[i]) || + pages[i]->mapping != mapping) { + unlock_page(pages[i]); + for (; i < ret; i++) + put_page(pages[i]); + err = -EAGAIN; + goto out; + } + } + put_page(pages[i]); + pages_locked++; + } + nr_pages -= ret; + index += ret; + cond_resched(); + } +out: + if (err && index_ret) + *index_ret = start_index + pages_locked - 1; + return err; +} + static noinline void __unlock_for_delalloc(struct inode *inode, struct page *locked_page, @@ -1826,7 +1914,7 @@ static noinline void __unlock_for_delalloc(struct inode *inode, if (index == locked_page->index && end_index == index) return; - __process_pages_contig(inode->i_mapping, locked_page, index, end_index, + process_pages_contig(inode->i_mapping, locked_page, index, end_index, PAGE_UNLOCK, NULL); } @@ -1844,7 +1932,7 @@ static noinline int lock_delalloc_pages(struct inode *inode, if (index == locked_page->index && index == end_index) return 0; - ret = __process_pages_contig(inode->i_mapping, locked_page, index, + ret = process_pages_contig(inode->i_mapping, locked_page, index, end_index, PAGE_LOCK, &index_ret); if (ret == -EAGAIN) __unlock_for_delalloc(inode, locked_page, delalloc_start, @@ -1941,84 +2029,6 @@ noinline_for_stack bool find_lock_delalloc_range(struct inode *inode, return found; } -static int __process_pages_contig(struct address_space *mapping, - 
struct page *locked_page, - pgoff_t start_index, pgoff_t end_index, - unsigned long page_ops, pgoff_t *index_ret) -{ - unsigned long nr_pages = end_index - start_index + 1; - unsigned long pages_locked = 0; - pgoff_t index = start_index; - struct page *pages[16]; - unsigned ret; - int err = 0; - int i; - - if (page_ops & PAGE_LOCK) { - ASSERT(page_ops == PAGE_LOCK); - ASSERT(index_ret && *index_ret == start_index); - } - - if ((page_ops & PAGE_SET_ERROR) && nr_pages > 0) - mapping_set_error(mapping, -EIO); - - while (nr_pages > 0) { - ret = find_get_pages_contig(mapping, index, - min_t(unsigned long, - nr_pages, ARRAY_SIZE(pages)), pages); - if (ret == 0) { - /* - * Only if we're going to lock these pages, - * can we find nothing at @index. - */ - ASSERT(page_ops & PAGE_LOCK); - err = -EAGAIN; - goto out; - } - - for (i = 0; i < ret; i++) { - if (page_ops & PAGE_SET_PRIVATE2) - SetPagePrivate2(pages[i]); - - if (locked_page && pages[i] == locked_page) { - put_page(pages[i]); - pages_locked++; - continue; - } - if (page_ops & PAGE_CLEAR_DIRTY) - clear_page_dirty_for_io(pages[i]); - if (page_ops & PAGE_SET_WRITEBACK) - set_page_writeback(pages[i]); - if (page_ops & PAGE_SET_ERROR) - SetPageError(pages[i]); - if (page_ops & PAGE_END_WRITEBACK) - end_page_writeback(pages[i]); - if (page_ops & PAGE_UNLOCK) - unlock_page(pages[i]); - if (page_ops & PAGE_LOCK) { - lock_page(pages[i]); - if (!PageDirty(pages[i]) || - pages[i]->mapping != mapping) { - unlock_page(pages[i]); - for (; i < ret; i++) - put_page(pages[i]); - err = -EAGAIN; - goto out; - } - } - put_page(pages[i]); - pages_locked++; - } - nr_pages -= ret; - index += ret; - cond_resched(); - } -out: - if (err && index_ret) - *index_ret = start_index + pages_locked - 1; - return err; -} - void extent_clear_unlock_delalloc(struct btrfs_inode *inode, u64 start, u64 end, struct page *locked_page, unsigned clear_bits, @@ -2026,7 +2036,7 @@ void extent_clear_unlock_delalloc(struct btrfs_inode *inode, u64 start, u64 end, 
{ clear_extent_bit(&inode->io_tree, start, end, clear_bits, 1, 0, NULL); - __process_pages_contig(inode->vfs_inode.i_mapping, locked_page, + process_pages_contig(inode->vfs_inode.i_mapping, locked_page, start >> PAGE_SHIFT, end >> PAGE_SHIFT, page_ops, NULL); } From patchwork Wed Sep 30 01:55:00 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11807581 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id CE070618 for ; Wed, 30 Sep 2020 01:56:08 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A864C21531 for ; Wed, 30 Sep 2020 01:56:08 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="H32ltjCs" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729799AbgI3B4H (ORCPT ); Tue, 29 Sep 2020 21:56:07 -0400 Received: from mx2.suse.de ([195.135.220.15]:49806 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729322AbgI3B4H (ORCPT ); Tue, 29 Sep 2020 21:56:07 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1601430966; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=k1lefI4FzGWvrTB6CT1o1gko9ovuGLF7cpjECplRzkw=; b=H32ltjCsGuRpmrlOJvjDfyRps+XDxogPHFw7fQCKl8vpx7w8B2fYpYCoXcio5oLjcZJZMI rJCDwQEXSeUrsqqjVGR/DsnHX2b3Aj3vpZzQWXUHUv3hly3sie4BYjqmDDzUhPtek7DRCq khrrEbLwbzbNMzdrKIfagcJAhi6eFmo= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 2F795AF99 for ; Wed, 30 Sep 2020 01:56:06 +0000 (UTC) 
From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v3 10/49] btrfs: extent_io: rename pages_locked in process_pages_contig() Date: Wed, 30 Sep 2020 09:55:00 +0800 Message-Id: <20200930015539.48867-11-wqu@suse.com> In-Reply-To: <20200930015539.48867-1-wqu@suse.com> References: <20200930015539.48867-1-wqu@suse.com> Function process_pages_contig() handles not only page locking but also other operations, so rename the local variable pages_locked to pages_processed to reduce confusion. Signed-off-by: Qu Wenruo --- fs/btrfs/extent_io.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 9f46d7f17a9c..07f8117ddbb4 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -1830,7 +1830,7 @@ static int process_pages_contig(struct address_space *mapping, unsigned long page_ops, pgoff_t *index_ret) { unsigned long nr_pages = end_index - start_index + 1; - unsigned long pages_locked = 0; + unsigned long pages_processed = 0; pgoff_t index = start_index; struct page *pages[16]; unsigned ret; @@ -1865,7 +1865,7 @@ static int process_pages_contig(struct address_space *mapping, if (locked_page && pages[i] == locked_page) { put_page(pages[i]); - pages_locked++; + pages_processed++; continue; } if (page_ops & PAGE_CLEAR_DIRTY) @@ -1890,7 +1890,7 @@ static int process_pages_contig(struct address_space *mapping, } } put_page(pages[i]); - pages_locked++; + pages_processed++; } nr_pages -= ret; index += ret; @@ -1898,7 +1898,7 @@ static int process_pages_contig(struct address_space *mapping, } out: if (err && index_ret) - *index_ret = start_index + pages_locked - 1; + *index_ret = start_index + pages_processed - 1; return err; } From patchwork Wed Sep 30 01:55:01 2020
X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11807583 From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v3 11/49] btrfs: extent_io: make process_pages_contig() to accept bytenr directly Date: Wed, 30 Sep 2020 09:55:01 +0800 Message-Id: <20200930015539.48867-12-wqu@suse.com> In-Reply-To: <20200930015539.48867-1-wqu@suse.com> References: <20200930015539.48867-1-wqu@suse.com> MIME-Version: 1.0
Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org Instead of page index, accept bytenr directly for process_pages_contig(). This allows process_pages_contig() to accept ranges which is not aligned to page size, while still report accurate @end_ret. Currently we still only accept page aligned values, but this provides the basis for later subpage support. Signed-off-by: Qu Wenruo --- fs/btrfs/extent_io.c | 78 ++++++++++++++++++++++++-------------------- 1 file changed, 43 insertions(+), 35 deletions(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 07f8117ddbb4..d35eae29bc80 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -1810,46 +1810,58 @@ bool btrfs_find_delalloc_range(struct extent_io_tree *tree, u64 *start, return found; } +static int calc_bytes_processed(struct page *page, u64 range_start) +{ + u64 page_start = page_offset(page); + u64 real_start = max(range_start, page_start); + + return page_start + PAGE_SIZE - real_start; +} + /* * A helper to update contiguous pages status according to @page_ops. * * @mapping: The address space of the pages * @locked_page: The already locked page. Mostly for inline extent * handling - * @start_index: The start page index. - * @end_inde: The last page index. + * @start: The start file offset + * @end: The end file offset (inclusive) * @pages_opts: The operations to be done - * @index_ret: The last handled page index (for error case) + * @end_ret: The last handled inclusive file offset (for error case) * * Return 0 if every page is handled properly. - * Return <0 if something wrong happened, and update @index_ret. + * Return <0 if something wrong happened, and update @end_ret. 
*/ static int process_pages_contig(struct address_space *mapping, struct page *locked_page, - pgoff_t start_index, pgoff_t end_index, - unsigned long page_ops, pgoff_t *index_ret) + u64 start, u64 end, + unsigned long page_ops, u64 *end_ret) { - unsigned long nr_pages = end_index - start_index + 1; - unsigned long pages_processed = 0; + pgoff_t start_index = start >> PAGE_SHIFT; + pgoff_t end_index = end >> PAGE_SHIFT; pgoff_t index = start_index; + u64 processed_end = start - 1; + unsigned long nr_pages = end_index - start_index + 1; struct page *pages[16]; - unsigned ret; int err = 0; int i; + ASSERT(IS_ALIGNED(start, PAGE_SIZE) && IS_ALIGNED(end + 1, PAGE_SIZE)); if (page_ops & PAGE_LOCK) { ASSERT(page_ops == PAGE_LOCK); - ASSERT(index_ret && *index_ret == start_index); + ASSERT(end_ret && *end_ret == start - 1); } if ((page_ops & PAGE_SET_ERROR) && nr_pages > 0) mapping_set_error(mapping, -EIO); while (nr_pages > 0) { - ret = find_get_pages_contig(mapping, index, + unsigned found_pages; + + found_pages = find_get_pages_contig(mapping, index, min_t(unsigned long, nr_pages, ARRAY_SIZE(pages)), pages); - if (ret == 0) { + if (found_pages == 0) { /* * Only if we're going to lock these pages, * can we find nothing at @index. 
@@ -1859,13 +1871,14 @@ static int process_pages_contig(struct address_space *mapping, goto out; } - for (i = 0; i < ret; i++) { + for (i = 0; i < found_pages; i++) { if (page_ops & PAGE_SET_PRIVATE2) SetPagePrivate2(pages[i]); if (locked_page && pages[i] == locked_page) { put_page(pages[i]); - pages_processed++; + processed_end += + calc_bytes_processed(pages[i], start); continue; } if (page_ops & PAGE_CLEAR_DIRTY) @@ -1883,22 +1896,22 @@ static int process_pages_contig(struct address_space *mapping, if (!PageDirty(pages[i]) || pages[i]->mapping != mapping) { unlock_page(pages[i]); - for (; i < ret; i++) + for (; i < found_pages; i++) put_page(pages[i]); err = -EAGAIN; goto out; } } put_page(pages[i]); - pages_processed++; + processed_end += calc_bytes_processed(pages[i], start); } - nr_pages -= ret; - index += ret; + nr_pages -= found_pages; + index += found_pages; cond_resched(); } out: - if (err && index_ret) - *index_ret = start_index + pages_processed - 1; + if (err && end_ret) + *end_ret = processed_end; return err; } @@ -1907,15 +1920,12 @@ static noinline void __unlock_for_delalloc(struct inode *inode, struct page *locked_page, u64 start, u64 end) { - unsigned long index = start >> PAGE_SHIFT; - unsigned long end_index = end >> PAGE_SHIFT; - ASSERT(locked_page); - if (index == locked_page->index && end_index == index) + if (end < start) return; - process_pages_contig(inode->i_mapping, locked_page, index, end_index, - PAGE_UNLOCK, NULL); + process_pages_contig(inode->i_mapping, locked_page, start, end, + PAGE_UNLOCK, NULL); } static noinline int lock_delalloc_pages(struct inode *inode, @@ -1923,20 +1933,19 @@ static noinline int lock_delalloc_pages(struct inode *inode, u64 delalloc_start, u64 delalloc_end) { - unsigned long index = delalloc_start >> PAGE_SHIFT; - unsigned long index_ret = index; - unsigned long end_index = delalloc_end >> PAGE_SHIFT; + u64 processed_end = delalloc_start - 1; int ret; ASSERT(locked_page); - if (index == locked_page->index && 
index == end_index) + if (delalloc_end < delalloc_start) return 0; - ret = process_pages_contig(inode->i_mapping, locked_page, index, - end_index, PAGE_LOCK, &index_ret); + ret = process_pages_contig(inode->i_mapping, locked_page, + delalloc_start, delalloc_end, PAGE_LOCK, + &processed_end); if (ret == -EAGAIN) __unlock_for_delalloc(inode, locked_page, delalloc_start, - (u64)index_ret << PAGE_SHIFT); + processed_end); return ret; } @@ -2037,8 +2046,7 @@ void extent_clear_unlock_delalloc(struct btrfs_inode *inode, u64 start, u64 end, clear_extent_bit(&inode->io_tree, start, end, clear_bits, 1, 0, NULL); process_pages_contig(inode->vfs_inode.i_mapping, locked_page, - start >> PAGE_SHIFT, end >> PAGE_SHIFT, - page_ops, NULL); + start, end, page_ops, NULL); } /* From patchwork Wed Sep 30 01:55:02 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11807585 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 071F76CB for ; Wed, 30 Sep 2020 01:56:13 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id D78392145D for ; Wed, 30 Sep 2020 01:56:12 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="iUxutkTL" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729817AbgI3B4M (ORCPT ); Tue, 29 Sep 2020 21:56:12 -0400 Received: from mx2.suse.de ([195.135.220.15]:49886 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729322AbgI3B4L (ORCPT ); Tue, 29 Sep 2020 21:56:11 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1601430970; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: 
From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v3 12/49] btrfs: extent_io: only require sector size alignment for page read Date: Wed, 30 Sep 2020 09:55:02 +0800 Message-Id: <20200930015539.48867-13-wqu@suse.com> In-Reply-To: <20200930015539.48867-1-wqu@suse.com> References: <20200930015539.48867-1-wqu@suse.com> If we're reading a partial page, btrfs will warn about it, since our reads and writes are always done in sector size units, and sector size currently equals page size. But for the incoming subpage read-only support, data reads are only aligned to sectorsize, which can be smaller than page size. So change the warning condition to check against sectorsize instead: the behavior is unchanged for the regular sectorsize == PAGE_SIZE case, and no error is reported for subpage reads. Also, pass the proper start/end with bv_offset for check_data_csum() to handle.
Signed-off-by: Qu Wenruo
---
 fs/btrfs/extent_io.c | 34 ++++++++++++++++++----------------
 1 file changed, 18 insertions(+), 16 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index d35eae29bc80..1da7897a799e 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -2838,6 +2838,7 @@ static void end_bio_extent_readpage(struct bio *bio)
 		struct page *page = bvec->bv_page;
 		struct inode *inode = page->mapping->host;
 		struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
+		u32 sectorsize = fs_info->sectorsize;
 		bool data_inode = btrfs_ino(BTRFS_I(inode))
 			!= BTRFS_BTREE_INODE_OBJECTID;
@@ -2848,24 +2849,25 @@ static void end_bio_extent_readpage(struct bio *bio)
 		tree = &BTRFS_I(inode)->io_tree;
 		failure_tree = &BTRFS_I(inode)->io_failure_tree;

-		/* We always issue full-page reads, but if some block
+		/*
+		 * We always issue full-sector reads, but if some block
 		 * in a page fails to read, blk_update_request() will
 		 * advance bv_offset and adjust bv_len to compensate.
-		 * Print a warning for nonzero offsets, and an error
-		 * if they don't add up to a full page.
+		 * Print a warning for unaligned offsets, and an error
+		 * if they don't add up to a full sector.
 		 */
-		if (bvec->bv_offset || bvec->bv_len != PAGE_SIZE) {
-			if (bvec->bv_offset + bvec->bv_len != PAGE_SIZE)
-				btrfs_err(fs_info,
-					"partial page read in btrfs with offset %u and length %u",
-					bvec->bv_offset, bvec->bv_len);
-			else
-				btrfs_info(fs_info,
-					"incomplete page read in btrfs with offset %u and length %u",
-					bvec->bv_offset, bvec->bv_len);
-		}
-
-		start = page_offset(page);
-		end = start + bvec->bv_offset + bvec->bv_len - 1;
+		if (!IS_ALIGNED(bvec->bv_offset, sectorsize))
+			btrfs_err(fs_info,
+				"partial page read in btrfs with offset %u and length %u",
+				bvec->bv_offset, bvec->bv_len);
+		else if (!IS_ALIGNED(bvec->bv_offset + bvec->bv_len,
+				     sectorsize))
+			btrfs_info(fs_info,
+				"incomplete page read in btrfs with offset %u and length %u",
+				bvec->bv_offset, bvec->bv_len);
+
+		start = page_offset(page) + bvec->bv_offset;
+		end = start + bvec->bv_len - 1;
 		len = bvec->bv_len;

 		mirror = io_bio->mirror_num;

From patchwork Wed Sep 30 01:55:03 2020
From: Qu Wenruo
To: linux-btrfs@vger.kernel.org
Subject: [PATCH v3 13/49] btrfs: extent_io: remove the extent_start/extent_len for end_bio_extent_readpage()
Date: Wed, 30 Sep 2020 09:55:03 +0800
Message-Id: <20200930015539.48867-14-wqu@suse.com>

In end_bio_extent_readpage() we had a strange dance around extent_start/extent_len. The truth is, no matter what we do with those two variables, the end result is the same: clear the EXTENT_LOCKED bit and, if needed, set the EXTENT_UPTODATE bit in the io_tree.

This doesn't need the complex dance; we can do it easily by calling endio_readpage_release_extent() once for each bvec, which greatly streamlines the code.
Signed-off-by: Qu Wenruo
---
 fs/btrfs/extent_io.c | 30 ++----------------------------
 1 file changed, 2 insertions(+), 28 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 1da7897a799e..395fa52ed2f9 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -2795,11 +2795,10 @@ static void end_bio_extent_writepage(struct bio *bio)
 }

 static void
-endio_readpage_release_extent(struct extent_io_tree *tree, u64 start, u64 len,
+endio_readpage_release_extent(struct extent_io_tree *tree, u64 start, u64 end,
 			      int uptodate)
 {
 	struct extent_state *cached = NULL;
-	u64 end = start + len - 1;

 	if (uptodate && tree->track_uptodate)
 		set_extent_uptodate(tree, start, end, &cached, GFP_ATOMIC);
@@ -2827,8 +2826,6 @@ static void end_bio_extent_readpage(struct bio *bio)
 	u64 start;
 	u64 end;
 	u64 len;
-	u64 extent_start = 0;
-	u64 extent_len = 0;
 	int mirror;
 	int ret;
 	struct bvec_iter_all iter_all;
@@ -2936,32 +2933,9 @@ static void end_bio_extent_readpage(struct bio *bio)
 		unlock_page(page);
 		offset += len;

-		if (unlikely(!uptodate)) {
-			if (extent_len) {
-				endio_readpage_release_extent(tree,
-							      extent_start,
-							      extent_len, 1);
-				extent_start = 0;
-				extent_len = 0;
-			}
-			endio_readpage_release_extent(tree, start,
-						      end - start + 1, 0);
-		} else if (!extent_len) {
-			extent_start = start;
-			extent_len = end + 1 - start;
-		} else if (extent_start + extent_len == start) {
-			extent_len += end + 1 - start;
-		} else {
-			endio_readpage_release_extent(tree, extent_start,
-						      extent_len, uptodate);
-			extent_start = start;
-			extent_len = end + 1 - start;
-		}
+		endio_readpage_release_extent(tree, start, end, uptodate);
 	}

-	if (extent_len)
-		endio_readpage_release_extent(tree, extent_start, extent_len,
-					      uptodate);
 	btrfs_io_bio_free_csum(io_bio);
 	bio_put(bio);
 }

From patchwork Wed Sep 30 01:55:04 2020
From: Qu Wenruo
To: linux-btrfs@vger.kernel.org
Subject: [PATCH v3 14/49] btrfs: extent_io: integrate page status update into endio_readpage_release_extent()
Date: Wed, 30 Sep 2020 09:55:04 +0800
Message-Id: <20200930015539.48867-15-wqu@suse.com>

In end_bio_extent_readpage(), we set the page uptodate or error according to the bio status. However, that assumes all submitted reads are page-sized.

To support cases like subpage reads, we should only set the whole page uptodate once all data in the page has been read from disk.

This patch integrates the page status update into endio_readpage_release_extent() for end_bio_extent_readpage(). Now endio_readpage_release_extent() sets the page uptodate if either:

- start/end covers the full page
  This is the existing behavior.

- the whole page range is already uptodate
  This adds support for subpage reads.

And on the error path, we always clear the page uptodate flag and set the page error flag.

Signed-off-by: Qu Wenruo
---
 fs/btrfs/extent_io.c | 39 +++++++++++++++++++++++++++++----------
 1 file changed, 29 insertions(+), 10 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 395fa52ed2f9..af86289f465e 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -2795,13 +2795,36 @@ static void end_bio_extent_writepage(struct bio *bio)
 }

 static void
-endio_readpage_release_extent(struct extent_io_tree *tree, u64 start, u64 end,
-			      int uptodate)
+endio_readpage_release_extent(struct extent_io_tree *tree, struct page *page,
+			      u64 start, u64 end, int uptodate)
 {
 	struct extent_state *cached = NULL;

-	if (uptodate && tree->track_uptodate)
-		set_extent_uptodate(tree, start, end, &cached, GFP_ATOMIC);
+	if (uptodate) {
+		u64 page_start = page_offset(page);
+		u64 page_end = page_offset(page) + PAGE_SIZE - 1;
+
+		if (tree->track_uptodate) {
+			/*
+			 * The tree has EXTENT_UPTODATE bit tracking, update
+			 * extent io tree, and use it to update the page if
+			 * needed.
+			 */
+			set_extent_uptodate(tree, start, end, &cached,
+					    GFP_NOFS);
+			check_page_uptodate(tree, page);
+		} else if (start <= page_start && end >= page_end) {
+			/* We have covered the full page, set it uptodate */
+			SetPageUptodate(page);
+		}
+	} else {
+		if (tree->track_uptodate)
+			clear_extent_uptodate(tree, start, end, &cached);
+
+		/* Any error in the page range would invalidate the uptodate bit */
+		ClearPageUptodate(page);
+		SetPageError(page);
+	}
 	unlock_extent_cached_atomic(tree, start, end, &cached);
 }

@@ -2925,15 +2948,11 @@ static void end_bio_extent_readpage(struct bio *bio)
 			off = offset_in_page(i_size);
 			if (page->index == end_index && off)
 				zero_user_segment(page, off, PAGE_SIZE);
-			SetPageUptodate(page);
-		} else {
-			ClearPageUptodate(page);
-			SetPageError(page);
 		}
-		unlock_page(page);
 		offset += len;

-		endio_readpage_release_extent(tree, start, end, uptodate);
+		endio_readpage_release_extent(tree, page, start, end, uptodate);
+		unlock_page(page);
 	}

 	btrfs_io_bio_free_csum(io_bio);

From patchwork Wed Sep 30 01:55:05 2020
From: Qu Wenruo
To: linux-btrfs@vger.kernel.org
Subject: [PATCH v3 15/49] btrfs: extent_io: rename page_size to io_size in submit_extent_page()
Date: Wed, 30 Sep 2020 09:55:05 +0800
Message-Id: <20200930015539.48867-16-wqu@suse.com>

The variable @page_size in submit_extent_page() is not bound to the page size; it can already be smaller than PAGE_SIZE. So rename it to io_size to reduce confusion. This is especially important for the later subpage support.
Signed-off-by: Qu Wenruo
---
 fs/btrfs/extent_io.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index af86289f465e..2edbac6c089e 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -3051,7 +3051,7 @@ static int submit_extent_page(unsigned int opf,
 {
 	int ret = 0;
 	struct bio *bio;
-	size_t page_size = min_t(size_t, size, PAGE_SIZE);
+	size_t io_size = min_t(size_t, size, PAGE_SIZE);
 	sector_t sector = offset >> 9;
 	struct extent_io_tree *tree = &BTRFS_I(page->mapping->host)->io_tree;

@@ -3068,12 +3068,12 @@ static int submit_extent_page(unsigned int opf,
 		contig = bio_end_sector(bio) == sector;

 		ASSERT(tree->ops);
-		if (btrfs_bio_fits_in_stripe(page, page_size, bio, bio_flags))
+		if (btrfs_bio_fits_in_stripe(page, io_size, bio, bio_flags))
 			can_merge = false;

 		if (prev_bio_flags != bio_flags || !contig || !can_merge ||
 		    force_bio_submit ||
-		    bio_add_page(bio, page, page_size, pg_offset) < page_size) {
+		    bio_add_page(bio, page, io_size, pg_offset) < io_size) {
 			ret = submit_one_bio(bio, mirror_num, prev_bio_flags);
 			if (ret < 0) {
 				*bio_ret = NULL;
@@ -3082,13 +3082,13 @@ static int submit_extent_page(unsigned int opf,
 			bio = NULL;
 		} else {
 			if (wbc)
-				wbc_account_cgroup_owner(wbc, page, page_size);
+				wbc_account_cgroup_owner(wbc, page, io_size);
 			return 0;
 		}
 	}

 	bio = btrfs_bio_alloc(offset);
-	bio_add_page(bio, page, page_size, pg_offset);
+	bio_add_page(bio, page, io_size, pg_offset);
 	bio->bi_end_io = end_io_func;
 	bio->bi_private = tree;
 	bio->bi_write_hint = page->mapping->host->i_write_hint;
@@ -3099,7 +3099,7 @@ static int submit_extent_page(unsigned int opf,
 		bdev = BTRFS_I(page->mapping->host)->root->fs_info->fs_devices->latest_bdev;
 		bio_set_dev(bio, bdev);
 		wbc_init_bio(wbc, bio);
-		wbc_account_cgroup_owner(wbc, page, page_size);
+		wbc_account_cgroup_owner(wbc, page, io_size);
 	}

 	*bio_ret = bio;

From patchwork Wed Sep 30 01:55:06 2020
From: Qu Wenruo
To: linux-btrfs@vger.kernel.org
Cc: Nikolay Borisov
Subject: [PATCH v3 16/49] btrfs: extent_io: add assert_spin_locked() for attach_extent_buffer_page()
Date: Wed, 30 Sep 2020 09:55:06 +0800
Message-Id: <20200930015539.48867-17-wqu@suse.com>

When calling attach_extent_buffer_page(), we are either attaching anonymous pages (called from btrfs_clone_extent_buffer()) or btree_inode pages (called from alloc_extent_buffer()).

For the latter case, we should hold page->mapping->private_lock to avoid racing on the modification of page->private.

Add assert_spin_locked() for the case where we're called from alloc_extent_buffer().

Signed-off-by: Qu Wenruo
Reviewed-by: Nikolay Borisov
---
 fs/btrfs/extent_io.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 2edbac6c089e..e282eb63ad1b 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -3110,6 +3110,15 @@ static int submit_extent_page(unsigned int opf,
 static void attach_extent_buffer_page(struct extent_buffer *eb,
 				      struct page *page)
 {
+	/*
+	 * If the page is mapped to btree inode, we should hold the private
+	 * lock to prevent race.
+	 * For cloned or dummy extent buffers, their pages are not mapped and
+	 * will not race with any other ebs.
+	 */
+	if (page->mapping)
+		assert_spin_locked(&page->mapping->private_lock);
+
 	if (!PagePrivate(page))
 		attach_page_private(page, eb);
 	else

From patchwork Wed Sep 30 01:55:07 2020

From: Qu Wenruo
To: linux-btrfs@vger.kernel.org
Subject: [PATCH v3 17/49] btrfs: extent_io: extract the btree page
submission code into its own helper function
Date: Wed, 30 Sep 2020 09:55:07 +0800
Message-Id: <20200930015539.48867-18-wqu@suse.com>

In btree_write_cache_pages() we have the btree page submission routine buried deep in a nested loop.

This patch extracts that part of the code into a helper function, submit_btree_page(), which does the same work.

Also, since submit_btree_page() can now return >0 for successful extent buffer submission, remove the "ASSERT(ret <= 0);" line.

Signed-off-by: Qu Wenruo
---
 fs/btrfs/extent_io.c | 116 +++++++++++++++++++++++++------------------
 1 file changed, 69 insertions(+), 47 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index e282eb63ad1b..6b925094608c 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -3988,10 +3988,75 @@ static noinline_for_stack int write_one_eb(struct extent_buffer *eb,
 	return ret;
 }

+/*
+ * A helper to submit a btree page.
+ *
+ * This function is not always submitting the page, as we only submit the full
+ * extent buffer in a batch.
+ *
+ * @page:	The btree page
+ * @prev_eb:	Previous extent buffer, to determine if we need to submit
+ *		this page.
+ *
+ * Return >0 if we have submitted the extent buffer successfully.
+ * Return 0 if we don't need to do anything for the page.
+ * Return <0 for fatal error.
+ */
+static int submit_btree_page(struct page *page, struct writeback_control *wbc,
+			     struct extent_page_data *epd,
+			     struct extent_buffer **prev_eb)
+{
+	struct address_space *mapping = page->mapping;
+	struct extent_buffer *eb;
+	int ret;
+
+	if (!PagePrivate(page))
+		return 0;
+
+	spin_lock(&mapping->private_lock);
+	if (!PagePrivate(page)) {
+		spin_unlock(&mapping->private_lock);
+		return 0;
+	}
+
+	eb = (struct extent_buffer *)page->private;
+
+	/*
+	 * Shouldn't happen and normally this would be a BUG_ON but no sense
+	 * in crashing the users box for something we can survive anyway.
+	 */
+	if (WARN_ON(!eb)) {
+		spin_unlock(&mapping->private_lock);
+		return 0;
+	}
+
+	if (eb == *prev_eb) {
+		spin_unlock(&mapping->private_lock);
+		return 0;
+	}
+	ret = atomic_inc_not_zero(&eb->refs);
+	spin_unlock(&mapping->private_lock);
+	if (!ret)
+		return 0;
+
+	*prev_eb = eb;
+
+	ret = lock_extent_buffer_for_io(eb, epd);
+	if (ret <= 0) {
+		free_extent_buffer(eb);
+		return ret;
+	}
+	ret = write_one_eb(eb, wbc, epd);
+	free_extent_buffer(eb);
+	if (ret < 0)
+		return ret;
+	return 1;
+}
+
 int btree_write_cache_pages(struct address_space *mapping,
 			    struct writeback_control *wbc)
 {
-	struct extent_buffer *eb, *prev_eb = NULL;
+	struct extent_buffer *prev_eb = NULL;
 	struct extent_page_data epd = {
 		.bio = NULL,
 		.extent_locked = 0,
@@ -4037,55 +4102,13 @@ int btree_write_cache_pages(struct address_space *mapping,
 		for (i = 0; i < nr_pages; i++) {
 			struct page *page = pvec.pages[i];

-			if (!PagePrivate(page))
-				continue;
-
-			spin_lock(&mapping->private_lock);
-			if (!PagePrivate(page)) {
-				spin_unlock(&mapping->private_lock);
-				continue;
-			}
-
-			eb = (struct extent_buffer *)page->private;
-
-			/*
-			 * Shouldn't happen and normally this would be a BUG_ON
-			 * but no sense in crashing the users box for something
-			 * we can survive anyway.
-			 */
-			if (WARN_ON(!eb)) {
-				spin_unlock(&mapping->private_lock);
-				continue;
-			}
-
-			if (eb == prev_eb) {
-				spin_unlock(&mapping->private_lock);
-				continue;
-			}
-
-			ret = atomic_inc_not_zero(&eb->refs);
-			spin_unlock(&mapping->private_lock);
-			if (!ret)
-				continue;
-
-			prev_eb = eb;
-			ret = lock_extent_buffer_for_io(eb, &epd);
-			if (!ret) {
-				free_extent_buffer(eb);
+			ret = submit_btree_page(page, wbc, &epd, &prev_eb);
+			if (ret == 0)
 				continue;
-			} else if (ret < 0) {
-				done = 1;
-				free_extent_buffer(eb);
-				break;
-			}
-
-			ret = write_one_eb(eb, wbc, &epd);
-			if (ret) {
+			if (ret < 0) {
 				done = 1;
-				free_extent_buffer(eb);
 				break;
 			}
-			free_extent_buffer(eb);

 			/*
 			 * the filesystem may choose to bump up nr_to_write.
@@ -4106,7 +4129,6 @@ int btree_write_cache_pages(struct address_space *mapping,
 		index = 0;
 		goto retry;
 	}
-	ASSERT(ret <= 0);
 	if (ret < 0) {
 		end_write_bio(&epd, ret);
 		return ret;

From patchwork Wed Sep 30 01:55:08 2020
From: Qu Wenruo
To: linux-btrfs@vger.kernel.org
Subject: [PATCH v3 18/49] btrfs: extent_io: calculate inline extent buffer page size based on page size
Date: Wed, 30 Sep 2020 09:55:08 +0800
Message-Id: <20200930015539.48867-19-wqu@suse.com>

Btrfs only supports 64K as the maximum node size, so for a 4K page system we have at most 16 pages for one extent buffer. For a system using 64K pages, we really have just one single page.

Since we always use 16 slots for extent_buffer::pages[], systems using 64K pages waste the memory of the 15 slots that will never be utilized.

So this patch changes how the extent_buffer::pages[] array size is calculated: it is now derived from BTRFS_MAX_METADATA_BLOCKSIZE and PAGE_SIZE. For systems using 4K pages, it stays 16 entries. For systems using 64K pages, it becomes just 1.
Signed-off-by: Qu Wenruo
---
 fs/btrfs/extent_io.c | 6 +++---
 fs/btrfs/extent_io.h | 8 +++++---
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 6b925094608c..8662b27e42d6 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -5024,9 +5024,9 @@ __alloc_extent_buffer(struct btrfs_fs_info *fs_info, u64 start,
 	/*
 	 * Sanity checks, currently the maximum is 64k covered by 16x 4k pages
 	 */
-	BUILD_BUG_ON(BTRFS_MAX_METADATA_BLOCKSIZE
-		> MAX_INLINE_EXTENT_BUFFER_SIZE);
-	BUG_ON(len > MAX_INLINE_EXTENT_BUFFER_SIZE);
+	BUILD_BUG_ON(BTRFS_MAX_METADATA_BLOCKSIZE >
+		     INLINE_EXTENT_BUFFER_PAGES * PAGE_SIZE);
+	BUG_ON(len > BTRFS_MAX_METADATA_BLOCKSIZE);

 #ifdef CONFIG_BTRFS_DEBUG
 	eb->spinning_writers = 0;
diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
index 3c9252b429e0..e588b3100ede 100644
--- a/fs/btrfs/extent_io.h
+++ b/fs/btrfs/extent_io.h
@@ -85,9 +85,11 @@ struct extent_io_ops {
 				    int mirror);
 };

-
-#define INLINE_EXTENT_BUFFER_PAGES 16
-#define MAX_INLINE_EXTENT_BUFFER_SIZE (INLINE_EXTENT_BUFFER_PAGES * PAGE_SIZE)
+/*
+ * SZ_64K is BTRFS_MAX_METADATA_BLOCKSIZE; it is used here directly to avoid a
+ * circular include of "ctree.h".
+ */
+#define INLINE_EXTENT_BUFFER_PAGES (SZ_64K / PAGE_SIZE)
 struct extent_buffer {
 	u64 start;
 	unsigned long len;

From patchwork Wed Sep 30 01:55:09 2020

From: Qu Wenruo
To: linux-btrfs@vger.kernel.org
Cc: Nikolay Borisov
Subject: [PATCH v3 19/49] btrfs: extent_io: make btrfs_fs_info::buffer_radix to
take sector size divided values
Date: Wed, 30 Sep 2020 09:55:09 +0800
Message-Id: <20200930015539.48867-20-wqu@suse.com>

For subpage sector size support, one page can contain multiple tree blocks, so we can no longer use (eb->start >> PAGE_SHIFT), or we could easily get an extent buffer that doesn't belong to the bytenr.

This patch uses (extent_buffer::start / sectorsize) as the index into the radix tree, so that we get the correct extent buffer for subpage size support, while keeping the behavior the same for the regular sector size.

Signed-off-by: Qu Wenruo
Reviewed-by: Nikolay Borisov
---
 fs/btrfs/extent_io.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 8662b27e42d6..5d982441bf6e 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -5162,7 +5162,7 @@ struct extent_buffer *find_extent_buffer(struct btrfs_fs_info *fs_info,
 	rcu_read_lock();
 	eb = radix_tree_lookup(&fs_info->buffer_radix,
-			       start >> PAGE_SHIFT);
+			       start / fs_info->sectorsize);
 	if (eb && atomic_inc_not_zero(&eb->refs)) {
 		rcu_read_unlock();
 		/*
@@ -5214,7 +5214,7 @@ struct extent_buffer *alloc_test_extent_buffer(struct btrfs_fs_info *fs_info,
 	}
 	spin_lock(&fs_info->buffer_lock);
 	ret = radix_tree_insert(&fs_info->buffer_radix,
-				start >> PAGE_SHIFT, eb);
+				start / fs_info->sectorsize, eb);
 	spin_unlock(&fs_info->buffer_lock);
 	radix_tree_preload_end();
 	if (ret == -EEXIST) {
@@ -5322,7 +5322,7 @@ struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info,
 	spin_lock(&fs_info->buffer_lock);
 	ret = radix_tree_insert(&fs_info->buffer_radix,
-				start >> PAGE_SHIFT, eb);
+				start / fs_info->sectorsize, eb);
 	spin_unlock(&fs_info->buffer_lock);
 	radix_tree_preload_end();
 	if (ret == -EEXIST) {
@@ -5378,7 +5378,7 @@ static int release_extent_buffer(struct extent_buffer *eb)
 		spin_lock(&fs_info->buffer_lock);
 		radix_tree_delete(&fs_info->buffer_radix,
-				  eb->start >> PAGE_SHIFT);
+				  eb->start / fs_info->sectorsize);
 		spin_unlock(&fs_info->buffer_lock);
 	} else {
 		spin_unlock(&eb->refs_lock);

From patchwork Wed Sep 30 01:55:10 2020
From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v3 20/49] btrfs: disk_io: grab fs_info from extent_buffer::fs_info directly for btrfs_mark_buffer_dirty() Date: Wed, 30 Sep 2020 09:55:10 +0800 Message-Id: <20200930015539.48867-21-wqu@suse.com> In-Reply-To: <20200930015539.48867-1-wqu@suse.com> References: <20200930015539.48867-1-wqu@suse.com> Since commit f28491e0a6c4 ("Btrfs: move the extent buffer radix tree into the fs_info"), fs_info can be grabbed from the extent_buffer directly. So use extent_buffer::fs_info directly in btrfs_mark_buffer_dirty() to make things a little easier. Signed-off-by: Qu Wenruo --- fs/btrfs/disk-io.c | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index c81b7e53149c..58928076d08d 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -4190,8 +4190,7 @@ int btrfs_buffer_uptodate(struct extent_buffer *buf, u64 parent_transid, void btrfs_mark_buffer_dirty(struct extent_buffer *buf) { - struct btrfs_fs_info *fs_info; - struct btrfs_root *root; + struct btrfs_fs_info *fs_info = buf->fs_info; u64 transid = btrfs_header_generation(buf); int was_dirty; @@ -4204,8 +4203,6 @@ void btrfs_mark_buffer_dirty(struct extent_buffer *buf) if (unlikely(test_bit(EXTENT_BUFFER_UNMAPPED, &buf->bflags))) return; #endif - root = BTRFS_I(buf->pages[0]->mapping->host)->root; - fs_info = root->fs_info; btrfs_assert_tree_locked(buf); if (transid != fs_info->generation) WARN(1, KERN_CRIT "btrfs transid mismatch buffer %llu, found %llu running %llu\n", From patchwork Wed Sep 30 01:55:11 2020 X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11807603
From: Qu Wenruo To: linux-btrfs@vger.kernel.org Cc: Goldwyn Rodrigues , Nikolay Borisov Subject: [PATCH v3 21/49] btrfs: disk-io: make csum_tree_block() handle sectorsize smaller than page size Date: Wed, 30 Sep 2020 09:55:11 +0800 Message-Id: <20200930015539.48867-22-wqu@suse.com> In-Reply-To: <20200930015539.48867-1-wqu@suse.com> References: <20200930015539.48867-1-wqu@suse.com> For subpage size support, we only need to handle
the first page. To make the code work for both cases, we modify the following behaviors: - num_pages calculation Instead of "nodesize >> PAGE_SHIFT", use "DIV_ROUND_UP(nodesize, PAGE_SIZE)". This ensures we get at least one page for subpage support, while still getting the same result for regular page sizes. - The length of the first run Instead of PAGE_SIZE - BTRFS_CSUM_SIZE, use min(PAGE_SIZE, nodesize) - BTRFS_CSUM_SIZE, which handles both cases. - The start location of the first run Instead of always using BTRFS_CSUM_SIZE as the csum start position, add offset_in_page(eb->start) to get the proper offset for both cases. Signed-off-by: Goldwyn Rodrigues Signed-off-by: Qu Wenruo Reviewed-by: Nikolay Borisov --- fs/btrfs/disk-io.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 58928076d08d..55bb4f2def3c 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -257,16 +257,16 @@ struct extent_map *btree_get_extent(struct btrfs_inode *inode, static void csum_tree_block(struct extent_buffer *buf, u8 *result) { struct btrfs_fs_info *fs_info = buf->fs_info; - const int num_pages = fs_info->nodesize >> PAGE_SHIFT; + const int num_pages = DIV_ROUND_UP(fs_info->nodesize, PAGE_SIZE); SHASH_DESC_ON_STACK(shash, fs_info->csum_shash); char *kaddr; int i; shash->tfm = fs_info->csum_shash; crypto_shash_init(shash); - kaddr = page_address(buf->pages[0]); + kaddr = page_address(buf->pages[0]) + offset_in_page(buf->start); crypto_shash_update(shash, kaddr + BTRFS_CSUM_SIZE, - PAGE_SIZE - BTRFS_CSUM_SIZE); + min_t(u32, PAGE_SIZE, fs_info->nodesize) - BTRFS_CSUM_SIZE); for (i = 1; i < num_pages; i++) { kaddr = page_address(buf->pages[i]); From patchwork Wed Sep 30 01:55:12 2020 X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11807605
From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v3 22/49] btrfs: disk-io: extract the extent buffer verification from btree_readpage_end_io_hook() Date: Wed, 30 Sep 2020 09:55:12 +0800 Message-Id: <20200930015539.48867-23-wqu@suse.com> In-Reply-To: <20200930015539.48867-1-wqu@suse.com> References: <20200930015539.48867-1-wqu@suse.com> Currently
btree_readpage_end_io_hook() only needs to handle one extent buffer, as currently one page maps to exactly one extent buffer. But for the incoming subpage support, one page can be mapped to multiple extent buffers, so we can no longer use the current code. This refactor allows us to call btrfs_check_extent_buffer() on all involved extent buffers from btree_readpage_end_io_hook() and other locations. Signed-off-by: Qu Wenruo --- fs/btrfs/disk-io.c | 78 ++++++++++++++++++++++++++-------------------- 1 file changed, 44 insertions(+), 34 deletions(-) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 55bb4f2def3c..ee2a6d480a7d 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -574,60 +574,37 @@ static int check_tree_block_fsid(struct extent_buffer *eb) return ret; } -static int btree_readpage_end_io_hook(struct btrfs_io_bio *io_bio, - u64 phy_offset, struct page *page, - u64 start, u64 end, int mirror) +/* Do basic extent buffer check at read time */ +static int btrfs_check_extent_buffer(struct extent_buffer *eb) { - u64 found_start; - int found_level; - struct extent_buffer *eb; - struct btrfs_fs_info *fs_info; + struct btrfs_fs_info *fs_info = eb->fs_info; u16 csum_size; - int ret = 0; + u64 found_start; + u8 found_level; u8 result[BTRFS_CSUM_SIZE]; - int reads_done; - - if (!page->private) - goto out; + int ret = 0; - eb = (struct extent_buffer *)page->private; - fs_info = eb->fs_info; csum_size = btrfs_super_csum_size(fs_info->super_copy); - /* the pending IO might have been the only thing that kept this buffer - * in memory.
Make sure we have a ref for all this other checks - */ - atomic_inc(&eb->refs); - - reads_done = atomic_dec_and_test(&eb->io_pages); - if (!reads_done) - goto err; - - eb->read_mirror = mirror; - if (test_bit(EXTENT_BUFFER_READ_ERR, &eb->bflags)) { - ret = -EIO; - goto err; - } - found_start = btrfs_header_bytenr(eb); if (found_start != eb->start) { btrfs_err_rl(fs_info, "bad tree block start, want %llu have %llu", eb->start, found_start); ret = -EIO; - goto err; + goto out; } if (check_tree_block_fsid(eb)) { btrfs_err_rl(fs_info, "bad fsid on block %llu", eb->start); ret = -EIO; - goto err; + goto out; } found_level = btrfs_header_level(eb); if (found_level >= BTRFS_MAX_LEVEL) { btrfs_err(fs_info, "bad tree block level %d on %llu", (int)btrfs_header_level(eb), eb->start); ret = -EIO; - goto err; + goto out; } btrfs_set_buffer_lockdep_class(btrfs_header_owner(eb), @@ -647,7 +624,7 @@ static int btree_readpage_end_io_hook(struct btrfs_io_bio *io_bio, fs_info->sb->s_id, eb->start, val, found, btrfs_header_level(eb)); ret = -EUCLEAN; - goto err; + goto out; } /* @@ -669,6 +646,40 @@ static int btree_readpage_end_io_hook(struct btrfs_io_bio *io_bio, btrfs_err(fs_info, "block=%llu read time tree block corruption detected", eb->start); +out: + return ret; +} + +static int btree_readpage_end_io_hook(struct btrfs_io_bio *io_bio, + u64 phy_offset, struct page *page, + u64 start, u64 end, int mirror) +{ + struct extent_buffer *eb; + int ret = 0; + bool reads_done; + + /* Metadata pages that goes through IO should all have private set */ + ASSERT(PagePrivate(page) && page->private); + eb = (struct extent_buffer *)page->private; + + /* + * The pending IO might have been the only thing that kept this buffer + * in memory. 
Make sure we have a ref for all this other checks + */ + atomic_inc(&eb->refs); + + reads_done = atomic_dec_and_test(&eb->io_pages); + if (!reads_done) + goto err; + + eb->read_mirror = mirror; + if (test_bit(EXTENT_BUFFER_READ_ERR, &eb->bflags)) { + ret = -EIO; + goto err; + } + + ret = btrfs_check_extent_buffer(eb); + err: if (reads_done && test_and_clear_bit(EXTENT_BUFFER_READAHEAD, &eb->bflags)) @@ -684,7 +695,6 @@ static int btree_readpage_end_io_hook(struct btrfs_io_bio *io_bio, clear_extent_buffer_uptodate(eb); } free_extent_buffer(eb); -out: return ret; } From patchwork Wed Sep 30 01:55:13 2020 X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11807607
From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v3 23/49] btrfs: disk-io: accept bvec directly for csum_dirty_buffer() Date: Wed, 30 Sep 2020 09:55:13 +0800 Message-Id: <20200930015539.48867-24-wqu@suse.com> In-Reply-To: <20200930015539.48867-1-wqu@suse.com> References: <20200930015539.48867-1-wqu@suse.com> Currently csum_dirty_buffer() uses the page to grab the extent buffer, but that only works for the regular sectorsize == PAGE_SIZE case. For subpage we need page + page_offset to grab the extent buffer. This patch changes csum_dirty_buffer() to accept a bvec directly, so that we can extract both the page and the page offset for later subpage support.
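The idea can be illustrated with a minimal user-space sketch (not the kernel code; `demo_bvec` and `demo_tree_block_start` are made-up stand-ins for `struct bio_vec` and `page_offset()`): with subpage sector size a page can hold several tree blocks, so the block's logical start must come from both the page's file offset and the segment offset inside the page.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for struct bio_vec: bv_offset is what
 * disambiguates which tree block inside the page a segment covers.
 */
struct demo_bvec {
	uint64_t page_start; /* like page_offset(bvec->bv_page) */
	uint32_t bv_offset;  /* byte offset of the segment in the page */
};

/* Logical start (bytenr) of the tree block covered by this segment. */
static uint64_t demo_tree_block_start(const struct demo_bvec *bvec)
{
	return bvec->page_start + bvec->bv_offset;
}
```

With a 64K page holding four 16K tree blocks, `page_start` alone would map all four segments to the same block; adding `bv_offset` picks the right one.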
Signed-off-by: Qu Wenruo --- fs/btrfs/disk-io.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index ee2a6d480a7d..b34a3f312e0c 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -495,13 +495,14 @@ static int btree_read_extent_buffer_pages(struct extent_buffer *eb, * we only fill in the checksum field in the first page of a multi-page block */ -static int csum_dirty_buffer(struct btrfs_fs_info *fs_info, struct page *page) +static int csum_dirty_buffer(struct btrfs_fs_info *fs_info, struct bio_vec *bvec) { + struct extent_buffer *eb; + struct page *page = bvec->bv_page; u64 start = page_offset(page); u64 found_start; u8 result[BTRFS_CSUM_SIZE]; u16 csum_size = btrfs_super_csum_size(fs_info->super_copy); - struct extent_buffer *eb; int ret; eb = (struct extent_buffer *)page->private; @@ -848,7 +849,7 @@ static blk_status_t btree_csum_one_bio(struct bio *bio) ASSERT(!bio_flagged(bio, BIO_CLONED)); bio_for_each_segment_all(bvec, bio, iter_all) { root = BTRFS_I(bvec->bv_page->mapping->host)->root; - ret = csum_dirty_buffer(root->fs_info, bvec->bv_page); + ret = csum_dirty_buffer(root->fs_info, bvec); if (ret) break; } From patchwork Wed Sep 30 01:55:14 2020 X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11807609
From: Qu Wenruo To: linux-btrfs@vger.kernel.org Cc: Goldwyn Rodrigues Subject: [PATCH v3 24/49] btrfs: inode: make btrfs_readpage_end_io_hook() follow sector size Date: Wed, 30 Sep 2020 09:55:14 +0800 Message-Id: <20200930015539.48867-25-wqu@suse.com> In-Reply-To: <20200930015539.48867-1-wqu@suse.com> References: <20200930015539.48867-1-wqu@suse.com> Currently btrfs_readpage_end_io_hook() just passes the whole page to check_data_csum(), which is fine since we only support sectorsize == PAGE_SIZE. To support subpage, we need to honor per-sector checksum verification, just like what we did in the dio read path. This patch does the csum verification in a for loop, starting at pg_off == start - page_offset(page) and advancing by sectorsize on each iteration. For the sectorsize == PAGE_SIZE case, pg_off will always be 0 and we finish after a single iteration. For subpage, we do the full loop.
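The loop structure can be sketched in user space like this (a simplified model, not the kernel function: `check_fn` stands in for check_data_csum(), `start`/`end` are inclusive byte bounds as in btrfs extent ranges, and -5 plays the role of -EIO). The point is that every sector is checked and the error is reported only once, after the loop:

```c
#include <stdint.h>

typedef int (*check_fn)(uint32_t pg_off);

static int sectors_checked; /* instrumentation for the demo */

/* Pretend the sector at pg_off 4096 is corrupt. */
static int demo_check_sector(uint32_t pg_off)
{
	sectors_checked++;
	return pg_off == 4096 ? -5 : 0;
}

/*
 * Verify [start, end] (end inclusive, as in btrfs) one sector at a
 * time; remember failures but keep checking the remaining sectors.
 */
static int demo_check_range(uint64_t start, uint64_t end,
			    uint64_t page_start, uint32_t sectorsize,
			    check_fn check)
{
	int found_err = 0;
	uint32_t pg_off;

	for (pg_off = start - page_start; pg_off < end - page_start;
	     pg_off += sectorsize) {
		if (check(pg_off) < 0)
			found_err = 1;
	}
	return found_err ? -5 : 0;
}
```

For a 4K sector on a 4K page the loop body runs once with pg_off 0; for four 4K sectors on a 16K range it runs four times even when an early sector fails.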
Signed-off-by: Goldwyn Rodrigues Signed-off-by: Qu Wenruo --- fs/btrfs/inode.c | 15 ++++++++++++++- 1 file changed, 14 insertions(+), 1 deletion(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 10ea6a92685b..2ee6ff186be4 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -2849,9 +2849,12 @@ static int btrfs_readpage_end_io_hook(struct btrfs_io_bio *io_bio, u64 start, u64 end, int mirror) { size_t offset = start - page_offset(page); + size_t pg_off; struct inode *inode = page->mapping->host; struct extent_io_tree *io_tree = &BTRFS_I(inode)->io_tree; struct btrfs_root *root = BTRFS_I(inode)->root; + u32 sectorsize = root->fs_info->sectorsize; + bool found_err = false; if (PageChecked(page)) { ClearPageChecked(page); @@ -2868,7 +2871,17 @@ static int btrfs_readpage_end_io_hook(struct btrfs_io_bio *io_bio, } phy_offset >>= inode->i_sb->s_blocksize_bits; - return check_data_csum(inode, io_bio, phy_offset, page, offset); + for (pg_off = offset; pg_off < end - page_offset(page); + pg_off += sectorsize, phy_offset++) { + int ret; + + ret = check_data_csum(inode, io_bio, phy_offset, page, pg_off); + if (ret < 0) + found_err = true; + } + if (found_err) + return -EIO; + return 0; } /* From patchwork Wed Sep 30 01:55:15 2020 X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11807611
From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v3 25/49] btrfs: introduce a helper to determine if the sectorsize is smaller than PAGE_SIZE Date: Wed, 30 Sep 2020 09:55:15 +0800 Message-Id: <20200930015539.48867-26-wqu@suse.com> In-Reply-To: <20200930015539.48867-1-wqu@suse.com> References: <20200930015539.48867-1-wqu@suse.com> Just to save us several letters for the incoming patches.
Signed-off-by: Qu Wenruo --- fs/btrfs/ctree.h | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 9a72896bed2e..e3501dad88e2 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -3532,6 +3532,11 @@ static inline int btrfs_defrag_cancelled(struct btrfs_fs_info *fs_info) return signal_pending(current); } +static inline bool btrfs_is_subpage(struct btrfs_fs_info *fs_info) +{ + return (fs_info->sectorsize < PAGE_SIZE); +} + #define in_range(b, first, len) ((b) >= (first) && (b) < (first) + (len)) /* Sanity test specific functions */ From patchwork Wed Sep 30 01:55:16 2020 X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11807613
From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v3 26/49] btrfs: extent_io: allow find_first_extent_bit() to find a range with exact bits match Date: Wed, 30 Sep 2020 09:55:16 +0800 Message-Id: <20200930015539.48867-27-wqu@suse.com> In-Reply-To: <20200930015539.48867-1-wqu@suse.com> References: <20200930015539.48867-1-wqu@suse.com> Currently if we pass multiple @bits to find_first_extent_bit(), it will return the first range with one or more bits matching @bits. This is fine for the current code, since most callers just do their own extra checks, and all existing callers only call it with 1 or 2 bits. But for the incoming subpage support, we want the ability to return a range with an exact match, so that the caller can skip some extra checks. So this patch adds a new bool parameter, @exact_match, to find_first_extent_bit() and its callees. Currently all callers just pass 'false' for the new parameter, thus no functional change is introduced.
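The difference between the old any-bit behavior and the new exact match boils down to one mask comparison. A tiny stand-alone sketch (the bit values below are illustrative, not the kernel's EXTENT_* definitions):

```c
#include <stdbool.h>

#define DEMO_DIRTY    0x1u
#define DEMO_UPTODATE 0x2u

/* Mirrors the match_extent_state() idea from the patch. */
static bool demo_match_bits(unsigned int state, unsigned int bits,
			    bool exact_match)
{
	if (exact_match)
		return (state & bits) == bits; /* all requested bits set */
	return (state & bits) != 0;            /* any overlap is enough */
}
```

So a state with only DEMO_DIRTY set matches a DIRTY|UPTODATE query in any-bit mode but is rejected in exact mode.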
Signed-off-by: Qu Wenruo --- fs/btrfs/block-group.c | 2 +- fs/btrfs/disk-io.c | 4 ++-- fs/btrfs/extent-io-tree.h | 2 +- fs/btrfs/extent-tree.c | 2 +- fs/btrfs/extent_io.c | 42 +++++++++++++++++++++++++------------ fs/btrfs/free-space-cache.c | 2 +- fs/btrfs/relocation.c | 2 +- fs/btrfs/transaction.c | 4 ++-- fs/btrfs/volumes.c | 2 +- 9 files changed, 39 insertions(+), 23 deletions(-) diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c index ea8aaf36647e..7e6ab6b765f6 100644 --- a/fs/btrfs/block-group.c +++ b/fs/btrfs/block-group.c @@ -461,7 +461,7 @@ u64 add_new_free_space(struct btrfs_block_group *block_group, u64 start, u64 end ret = find_first_extent_bit(&info->excluded_extents, start, &extent_start, &extent_end, EXTENT_DIRTY | EXTENT_UPTODATE, - NULL); + false, NULL); if (ret) break; diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index b34a3f312e0c..1ca121ca28aa 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -4516,7 +4516,7 @@ static int btrfs_destroy_marked_extents(struct btrfs_fs_info *fs_info, while (1) { ret = find_first_extent_bit(dirty_pages, start, &start, &end, - mark, NULL); + mark, false, NULL); if (ret) break; @@ -4556,7 +4556,7 @@ static int btrfs_destroy_pinned_extent(struct btrfs_fs_info *fs_info, */ mutex_lock(&fs_info->unused_bg_unpin_mutex); ret = find_first_extent_bit(unpin, 0, &start, &end, - EXTENT_DIRTY, &cached_state); + EXTENT_DIRTY, false, &cached_state); if (ret) { mutex_unlock(&fs_info->unused_bg_unpin_mutex); break; diff --git a/fs/btrfs/extent-io-tree.h b/fs/btrfs/extent-io-tree.h index 5927338c74a2..4d0dbb562a81 100644 --- a/fs/btrfs/extent-io-tree.h +++ b/fs/btrfs/extent-io-tree.h @@ -224,7 +224,7 @@ static inline int set_extent_uptodate(struct extent_io_tree *tree, u64 start, int find_first_extent_bit(struct extent_io_tree *tree, u64 start, u64 *start_ret, u64 *end_ret, unsigned bits, - struct extent_state **cached_state); + bool exact_match, struct extent_state **cached_state); void 
find_first_clear_extent_bit(struct extent_io_tree *tree, u64 start, u64 *start_ret, u64 *end_ret, unsigned bits); int find_contiguous_extent_bit(struct extent_io_tree *tree, u64 start, diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index e9eedc053fc5..406329dabb48 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -2880,7 +2880,7 @@ int btrfs_finish_extent_commit(struct btrfs_trans_handle *trans) mutex_lock(&fs_info->unused_bg_unpin_mutex); ret = find_first_extent_bit(unpin, 0, &start, &end, - EXTENT_DIRTY, &cached_state); + EXTENT_DIRTY, false, &cached_state); if (ret) { mutex_unlock(&fs_info->unused_bg_unpin_mutex); break; diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 5d982441bf6e..50cd5efc79ab 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -1521,13 +1521,27 @@ void extent_range_redirty_for_io(struct inode *inode, u64 start, u64 end) } } -/* find the first state struct with 'bits' set after 'start', and - * return it. tree->lock must be held. NULL will returned if - * nothing was found after 'start' +static bool match_extent_state(struct extent_state *state, unsigned bits, + bool exact_match) +{ + if (exact_match) + return ((state->state & bits) == bits); + return (state->state & bits); +} + +/* + * Find the first state struct with @bits set after @start. + * + * NOTE: tree->lock must be hold. + * + * @exact_match: Do we need to have all @bits set, or just any of + * the @bits. + * + * Return NULL if we can't find a match. 
*/ static struct extent_state * find_first_extent_bit_state(struct extent_io_tree *tree, - u64 start, unsigned bits) + u64 start, unsigned bits, bool exact_match) { struct rb_node *node; struct extent_state *state; @@ -1542,7 +1556,8 @@ find_first_extent_bit_state(struct extent_io_tree *tree, while (1) { state = rb_entry(node, struct extent_state, rb_node); - if (state->end >= start && (state->state & bits)) + if (state->end >= start && + match_extent_state(state, bits, exact_match)) return state; node = rb_next(node); @@ -1563,7 +1578,7 @@ find_first_extent_bit_state(struct extent_io_tree *tree, */ int find_first_extent_bit(struct extent_io_tree *tree, u64 start, u64 *start_ret, u64 *end_ret, unsigned bits, - struct extent_state **cached_state) + bool exact_match, struct extent_state **cached_state) { struct extent_state *state; int ret = 1; @@ -1573,7 +1588,8 @@ int find_first_extent_bit(struct extent_io_tree *tree, u64 start, state = *cached_state; if (state->end == start - 1 && extent_state_in_tree(state)) { while ((state = next_state(state)) != NULL) { - if (state->state & bits) + if (match_extent_state(state, bits, + exact_match)) goto got_it; } free_extent_state(*cached_state); @@ -1584,7 +1600,7 @@ int find_first_extent_bit(struct extent_io_tree *tree, u64 start, *cached_state = NULL; } - state = find_first_extent_bit_state(tree, start, bits); + state = find_first_extent_bit_state(tree, start, bits, exact_match); got_it: if (state) { cache_state_if_flags(state, cached_state, 0); @@ -1619,7 +1635,7 @@ int find_contiguous_extent_bit(struct extent_io_tree *tree, u64 start, int ret = 1; spin_lock(&tree->lock); - state = find_first_extent_bit_state(tree, start, bits); + state = find_first_extent_bit_state(tree, start, bits, false); if (state) { *start_ret = state->start; *end_ret = state->end; @@ -2413,9 +2429,8 @@ int clean_io_failure(struct btrfs_fs_info *fs_info, goto out; spin_lock(&io_tree->lock); - state = find_first_extent_bit_state(io_tree, - 
failrec->start, - EXTENT_LOCKED); + state = find_first_extent_bit_state(io_tree, failrec->start, + EXTENT_LOCKED, false); spin_unlock(&io_tree->lock); if (state && state->start <= failrec->start && @@ -2451,7 +2466,8 @@ void btrfs_free_io_failure_record(struct btrfs_inode *inode, u64 start, u64 end) return; spin_lock(&failure_tree->lock); - state = find_first_extent_bit_state(failure_tree, start, EXTENT_DIRTY); + state = find_first_extent_bit_state(failure_tree, start, EXTENT_DIRTY, + false); while (state) { if (state->start > end) break; diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c index dc82fd0c80cb..1533df86536b 100644 --- a/fs/btrfs/free-space-cache.c +++ b/fs/btrfs/free-space-cache.c @@ -1093,7 +1093,7 @@ static noinline_for_stack int write_pinned_extent_entries( while (start < block_group->start + block_group->length) { ret = find_first_extent_bit(unpin, start, &extent_start, &extent_end, - EXTENT_DIRTY, NULL); + EXTENT_DIRTY, false, NULL); if (ret) return 0; diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c index 4ba1ab9cc76d..77a7e35a500c 100644 --- a/fs/btrfs/relocation.c +++ b/fs/btrfs/relocation.c @@ -3153,7 +3153,7 @@ int find_next_extent(struct reloc_control *rc, struct btrfs_path *path, ret = find_first_extent_bit(&rc->processed_blocks, key.objectid, &start, &end, - EXTENT_DIRTY, NULL); + EXTENT_DIRTY, false, NULL); if (ret == 0 && start <= key.objectid) { btrfs_release_path(path); diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c index 20c6ac1a5de7..5b3444641ea5 100644 --- a/fs/btrfs/transaction.c +++ b/fs/btrfs/transaction.c @@ -974,7 +974,7 @@ int btrfs_write_marked_extents(struct btrfs_fs_info *fs_info, atomic_inc(&BTRFS_I(fs_info->btree_inode)->sync_writers); while (!find_first_extent_bit(dirty_pages, start, &start, &end, - mark, &cached_state)) { + mark, false, &cached_state)) { bool wait_writeback = false; err = convert_extent_bit(dirty_pages, start, end, @@ -1029,7 +1029,7 @@ static int 
__btrfs_wait_marked_extents(struct btrfs_fs_info *fs_info, u64 end; while (!find_first_extent_bit(dirty_pages, start, &start, &end, - EXTENT_NEED_WAIT, &cached_state)) { + EXTENT_NEED_WAIT, false, &cached_state)) { /* * Ignore -ENOMEM errors returned by clear_extent_bit(). * When committing the transaction, we'll remove any entries diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 214856c4ccb1..c54329e92ced 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -1382,7 +1382,7 @@ static bool contains_pending_extent(struct btrfs_device *device, u64 *start, if (!find_first_extent_bit(&device->alloc_state, *start, &physical_start, &physical_end, - CHUNK_ALLOCATED, NULL)) { + CHUNK_ALLOCATED, false, NULL)) { if (in_range(physical_start, *start, len) || in_range(*start, physical_start,

From patchwork Wed Sep 30 01:55:17 2020 X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11807615 From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v3 27/49] btrfs: extent_io: don't allow tree block to cross page boundary for subpage support Date: Wed, 30 Sep 2020 09:55:17 +0800 Message-Id: <20200930015539.48867-28-wqu@suse.com> In-Reply-To: <20200930015539.48867-1-wqu@suse.com> References: <20200930015539.48867-1-wqu@suse.com>

As a preparation for subpage sector size support (allowing a filesystem with a sector size smaller than the page size to be mounted), if the sector size is smaller than the page size we don't allow a tree block to be read if it crosses a 64K boundary. The 64K boundary is selected because: - We are only going to support 64K page size for subpage for now - 64K is also the max node size btrfs supports This ensures tree blocks are always contained in one page on a system with 64K page size, which greatly simplifies the handling; otherwise we would need complex multi-page handling for tree blocks. Currently the only way to create tree blocks crossing a 64K boundary is btrfs-convert, which will be fixed soon and is not widely used.
Signed-off-by: Qu Wenruo --- fs/btrfs/extent_io.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 50cd5efc79ab..28188509a206 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -5268,6 +5268,13 @@ struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info, btrfs_err(fs_info, "bad tree block start %llu", start); return ERR_PTR(-EINVAL); } + if (btrfs_is_subpage(fs_info) && round_down(start, PAGE_SIZE) != + round_down(start + len - 1, PAGE_SIZE)) { + btrfs_err(fs_info, + "tree block crosses page boundary, start %llu nodesize %lu", + start, len); + return ERR_PTR(-EINVAL); + } eb = find_extent_buffer(fs_info, start); if (eb)

From patchwork Wed Sep 30 01:55:18 2020 X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11807617 From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v3 28/49] btrfs: extent_io: update num_extent_pages() to support subpage sized extent buffer Date: Wed, 30 Sep 2020 09:55:18 +0800 Message-Id: <20200930015539.48867-29-wqu@suse.com> In-Reply-To: <20200930015539.48867-1-wqu@suse.com> References: <20200930015539.48867-1-wqu@suse.com>

For subpage sized extent buffers, we have ensured no extent buffer will cross a page boundary, thus we only need one page for any extent buffer. This patch updates num_extent_pages() to handle such a case; it now returns 1 for subpage sized extent buffers.

Signed-off-by: Qu Wenruo --- fs/btrfs/extent_io.h | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h index e588b3100ede..552afc1c0bbc 100644 --- a/fs/btrfs/extent_io.h +++ b/fs/btrfs/extent_io.h @@ -229,8 +229,15 @@ void wait_on_extent_buffer_writeback(struct extent_buffer *eb); static inline int num_extent_pages(const struct extent_buffer *eb) { - return (round_up(eb->start + eb->len, PAGE_SIZE) >> PAGE_SHIFT) - - (eb->start >> PAGE_SHIFT); + /* + * For sectorsize == PAGE_SIZE case, since eb is always aligned to + * sectorsize, it's just eb->len >> PAGE_SHIFT.
+ * + * For sectorsize < PAGE_SIZE case, we only want to support 64K + * PAGE_SIZE, and have ensured all tree blocks won't cross a page boundary. + * So in that case we always get 1 page. + */ + return (round_up(eb->len, PAGE_SIZE) >> PAGE_SHIFT); } static inline int extent_buffer_uptodate(const struct extent_buffer *eb)

From patchwork Wed Sep 30 01:55:19 2020 X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11807619 From: Qu Wenruo To: linux-btrfs@vger.kernel.org Cc: Goldwyn Rodrigues Subject: [PATCH v3 29/49] btrfs: handle sectorsize < PAGE_SIZE case for extent buffer accessors Date: Wed, 30 Sep 2020 09:55:19 +0800 Message-Id: <20200930015539.48867-30-wqu@suse.com> In-Reply-To: <20200930015539.48867-1-wqu@suse.com> References: <20200930015539.48867-1-wqu@suse.com>

To support the sectorsize < PAGE_SIZE case, we need to take extra care with extent buffer accessors. Since sectorsize is smaller than PAGE_SIZE, one page can contain multiple tree blocks, so we must use eb->start to determine the real offset to read/write for extent buffer accessors. This patch introduces two helpers to do this: - get_eb_page_index() This calculates the index to access extent_buffer::pages. It's just a simple wrapper around "start >> PAGE_SHIFT". For the sectorsize == PAGE_SIZE case, nothing is changed. For the sectorsize < PAGE_SIZE case, we always get index 0, and the existing page shift also works fine. - get_eb_page_offset() This calculates the offset to access extent_buffer::pages. It needs to take extent_buffer::start into consideration. For the sectorsize == PAGE_SIZE case, extent_buffer::start is always aligned to PAGE_SIZE, thus adding extent_buffer::start to offset_in_page() won't change the result. For the sectorsize < PAGE_SIZE case, adding extent_buffer::start gives us the correct offset to access.
This patch will touch the following parts to cover all extent buffer accessors: - BTRFS_SETGET_HEADER_FUNCS() - read_extent_buffer() - read_extent_buffer_to_user() - memcmp_extent_buffer() - write_extent_buffer_chunk_tree_uuid() - write_extent_buffer_fsid() - write_extent_buffer() - memzero_extent_buffer() - copy_extent_buffer_full() - copy_extent_buffer() - memcpy_extent_buffer() - memmove_extent_buffer() - btrfs_get_token_##bits() - btrfs_get_##bits() - btrfs_set_token_##bits() - btrfs_set_##bits() - generic_bin_search() Signed-off-by: Goldwyn Rodrigues Signed-off-by: Qu Wenruo --- fs/btrfs/ctree.c | 5 ++-- fs/btrfs/ctree.h | 38 ++++++++++++++++++++++-- fs/btrfs/extent_io.c | 66 ++++++++++++++++++++++++----------------- fs/btrfs/struct-funcs.c | 18 ++++++----- 4 files changed, 88 insertions(+), 39 deletions(-) diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c index cd392da69b81..0f6944a3a836 100644 --- a/fs/btrfs/ctree.c +++ b/fs/btrfs/ctree.c @@ -1712,10 +1712,11 @@ static noinline int generic_bin_search(struct extent_buffer *eb, oip = offset_in_page(offset); if (oip + key_size <= PAGE_SIZE) { - const unsigned long idx = offset >> PAGE_SHIFT; + const unsigned long idx = get_eb_page_index(offset); char *kaddr = page_address(eb->pages[idx]); - tmp = (struct btrfs_disk_key *)(kaddr + oip); + tmp = (struct btrfs_disk_key *)(kaddr + + get_eb_page_offset(eb, offset)); } else { read_extent_buffer(eb, &unaligned, offset, key_size); tmp = &unaligned; diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index e3501dad88e2..0c3ea3599dc7 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -1448,14 +1448,15 @@ static inline void btrfs_set_token_##name(struct btrfs_map_token *token,\ #define BTRFS_SETGET_HEADER_FUNCS(name, type, member, bits) \ static inline u##bits btrfs_##name(const struct extent_buffer *eb) \ { \ - const type *p = page_address(eb->pages[0]); \ + const type *p = page_address(eb->pages[0]) + \ + offset_in_page(eb->start); \ u##bits res = 
le##bits##_to_cpu(p->member); \ return res; \ } \ static inline void btrfs_set_##name(const struct extent_buffer *eb, \ u##bits val) \ { \ - type *p = page_address(eb->pages[0]); \ + type *p = page_address(eb->pages[0]) + offset_in_page(eb->start); \ p->member = cpu_to_le##bits(val); \ } @@ -3241,6 +3242,39 @@ static inline void assertfail(const char *expr, const char* file, int line) { } #define ASSERT(expr) (void)(expr) #endif +/* + * Get the correct offset inside the page of the extent buffer. + * + * Will handle both sectorsize == PAGE_SIZE and sectorsize < PAGE_SIZE cases. + * + * @eb: The target extent buffer + * @offset_in_eb: The offset inside the extent buffer + */ +static inline size_t get_eb_page_offset(const struct extent_buffer *eb, + unsigned long offset_in_eb) +{ + /* + * For sectorsize == PAGE_SIZE case, eb->start will always be aligned + * to PAGE_SIZE, thus adding it won't cause any difference. + * + * For sectorsize < PAGE_SIZE, we must only read the data that belongs + * to the eb, thus we have to take eb->start into consideration. + */ + return offset_in_page(offset_in_eb + eb->start); +} + +static inline unsigned long get_eb_page_index(unsigned long offset_in_eb) +{ + /* + * For sectorsize == PAGE_SIZE case, plain >> PAGE_SHIFT is enough. + * + * For sectorsize < PAGE_SIZE case, we only support 64K PAGE_SIZE, + * and have ensured all tree blocks are contained in one page, thus + * we always get index == 0.
+ */ + return offset_in_eb >> PAGE_SHIFT; +} + /* * Use that for functions that are conditionally exported for sanity tests but * otherwise static diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 28188509a206..e42a17039bf6 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -5673,7 +5673,7 @@ void read_extent_buffer(const struct extent_buffer *eb, void *dstv, struct page *page; char *kaddr; char *dst = (char *)dstv; - unsigned long i = start >> PAGE_SHIFT; + unsigned long i = get_eb_page_index(start); if (start + len > eb->len) { WARN(1, KERN_ERR "btrfs bad mapping eb start %llu len %lu, wanted %lu %lu\n", @@ -5682,7 +5682,7 @@ void read_extent_buffer(const struct extent_buffer *eb, void *dstv, return; } - offset = offset_in_page(start); + offset = get_eb_page_offset(eb, start); while (len > 0) { page = eb->pages[i]; @@ -5707,13 +5707,13 @@ int read_extent_buffer_to_user_nofault(const struct extent_buffer *eb, struct page *page; char *kaddr; char __user *dst = (char __user *)dstv; - unsigned long i = start >> PAGE_SHIFT; + unsigned long i = get_eb_page_index(start); int ret = 0; WARN_ON(start > eb->len); WARN_ON(start + len > eb->start + eb->len); - offset = offset_in_page(start); + offset = get_eb_page_offset(eb, start); while (len > 0) { page = eb->pages[i]; @@ -5742,13 +5742,13 @@ int memcmp_extent_buffer(const struct extent_buffer *eb, const void *ptrv, struct page *page; char *kaddr; char *ptr = (char *)ptrv; - unsigned long i = start >> PAGE_SHIFT; + unsigned long i = get_eb_page_index(start); int ret = 0; WARN_ON(start > eb->len); WARN_ON(start + len > eb->start + eb->len); - offset = offset_in_page(start); + offset = get_eb_page_offset(eb, start); while (len > 0) { page = eb->pages[i]; @@ -5774,7 +5774,7 @@ void write_extent_buffer_chunk_tree_uuid(const struct extent_buffer *eb, char *kaddr; WARN_ON(!PageUptodate(eb->pages[0])); - kaddr = page_address(eb->pages[0]); + kaddr = page_address(eb->pages[0]) + get_eb_page_offset(eb, 0); 
memcpy(kaddr + offsetof(struct btrfs_header, chunk_tree_uuid), srcv, BTRFS_FSID_SIZE); } @@ -5784,7 +5784,7 @@ void write_extent_buffer_fsid(const struct extent_buffer *eb, const void *srcv) char *kaddr; WARN_ON(!PageUptodate(eb->pages[0])); - kaddr = page_address(eb->pages[0]); + kaddr = page_address(eb->pages[0]) + get_eb_page_offset(eb, 0); memcpy(kaddr + offsetof(struct btrfs_header, fsid), srcv, BTRFS_FSID_SIZE); } @@ -5797,12 +5797,12 @@ void write_extent_buffer(const struct extent_buffer *eb, const void *srcv, struct page *page; char *kaddr; char *src = (char *)srcv; - unsigned long i = start >> PAGE_SHIFT; + unsigned long i = get_eb_page_index(start); WARN_ON(start > eb->len); WARN_ON(start + len > eb->start + eb->len); - offset = offset_in_page(start); + offset = get_eb_page_offset(eb, start); while (len > 0) { page = eb->pages[i]; @@ -5826,12 +5826,12 @@ void memzero_extent_buffer(const struct extent_buffer *eb, unsigned long start, size_t offset; struct page *page; char *kaddr; - unsigned long i = start >> PAGE_SHIFT; + unsigned long i = get_eb_page_index(start); WARN_ON(start > eb->len); WARN_ON(start + len > eb->start + eb->len); - offset = offset_in_page(start); + offset = get_eb_page_offset(eb, start); while (len > 0) { page = eb->pages[i]; @@ -5855,10 +5855,22 @@ void copy_extent_buffer_full(const struct extent_buffer *dst, ASSERT(dst->len == src->len); - num_pages = num_extent_pages(dst); - for (i = 0; i < num_pages; i++) - copy_page(page_address(dst->pages[i]), - page_address(src->pages[i])); + if (dst->fs_info->sectorsize == PAGE_SIZE) { + num_pages = num_extent_pages(dst); + for (i = 0; i < num_pages; i++) + copy_page(page_address(dst->pages[i]), + page_address(src->pages[i])); + } else { + unsigned long src_index = get_eb_page_index(0); + unsigned long dst_index = get_eb_page_index(0); + size_t src_offset = get_eb_page_offset(src, 0); + size_t dst_offset = get_eb_page_offset(dst, 0); + + ASSERT(src_index == 0 && dst_index == 0); + 
memcpy(page_address(dst->pages[dst_index]) + dst_offset, + page_address(src->pages[src_index]) + src_offset, + src->len); + } } void copy_extent_buffer(const struct extent_buffer *dst, @@ -5871,11 +5883,11 @@ void copy_extent_buffer(const struct extent_buffer *dst, size_t offset; struct page *page; char *kaddr; - unsigned long i = dst_offset >> PAGE_SHIFT; + unsigned long i = get_eb_page_index(dst_offset); WARN_ON(src->len != dst_len); - offset = offset_in_page(dst_offset); + offset = get_eb_page_offset(dst, dst_offset); while (len > 0) { page = dst->pages[i]; @@ -5919,7 +5931,7 @@ static inline void eb_bitmap_offset(const struct extent_buffer *eb, * the bitmap item in the extent buffer + the offset of the byte in the * bitmap item. */ - offset = start + byte_offset; + offset = start + offset_in_page(eb->start) + byte_offset; *page_index = offset >> PAGE_SHIFT; *page_offset = offset_in_page(offset); @@ -6083,11 +6095,11 @@ void memcpy_extent_buffer(const struct extent_buffer *dst, } while (len > 0) { - dst_off_in_page = offset_in_page(dst_offset); - src_off_in_page = offset_in_page(src_offset); + dst_off_in_page = get_eb_page_offset(dst, dst_offset); + src_off_in_page = get_eb_page_offset(dst, src_offset); - dst_i = dst_offset >> PAGE_SHIFT; - src_i = src_offset >> PAGE_SHIFT; + dst_i = get_eb_page_index(dst_offset); + src_i = get_eb_page_index(src_offset); cur = min(len, (unsigned long)(PAGE_SIZE - src_off_in_page)); @@ -6133,11 +6145,11 @@ void memmove_extent_buffer(const struct extent_buffer *dst, return; } while (len > 0) { - dst_i = dst_end >> PAGE_SHIFT; - src_i = src_end >> PAGE_SHIFT; + dst_i = get_eb_page_index(dst_end); + src_i = get_eb_page_index(src_end); - dst_off_in_page = offset_in_page(dst_end); - src_off_in_page = offset_in_page(src_end); + dst_off_in_page = get_eb_page_offset(dst, dst_end); + src_off_in_page = get_eb_page_offset(dst, src_end); cur = min_t(unsigned long, len, src_off_in_page + 1); cur = min(cur, dst_off_in_page + 1); diff --git 
a/fs/btrfs/struct-funcs.c b/fs/btrfs/struct-funcs.c index 079b059818e9..769901c2b3c9 100644 --- a/fs/btrfs/struct-funcs.c +++ b/fs/btrfs/struct-funcs.c @@ -67,8 +67,9 @@ u##bits btrfs_get_token_##bits(struct btrfs_map_token *token, \ const void *ptr, unsigned long off) \ { \ const unsigned long member_offset = (unsigned long)ptr + off; \ - const unsigned long idx = member_offset >> PAGE_SHIFT; \ - const unsigned long oip = offset_in_page(member_offset); \ + const unsigned long idx = get_eb_page_index(member_offset); \ + const unsigned long oip = get_eb_page_offset(token->eb, \ + member_offset); \ const int size = sizeof(u##bits); \ u8 lebytes[sizeof(u##bits)]; \ const int part = PAGE_SIZE - oip; \ @@ -95,8 +96,8 @@ u##bits btrfs_get_##bits(const struct extent_buffer *eb, \ const void *ptr, unsigned long off) \ { \ const unsigned long member_offset = (unsigned long)ptr + off; \ - const unsigned long oip = offset_in_page(member_offset); \ - const unsigned long idx = member_offset >> PAGE_SHIFT; \ + const unsigned long oip = get_eb_page_offset(eb, member_offset);\ + const unsigned long idx = get_eb_page_index(member_offset); \ char *kaddr = page_address(eb->pages[idx]); \ const int size = sizeof(u##bits); \ const int part = PAGE_SIZE - oip; \ @@ -116,8 +117,9 @@ void btrfs_set_token_##bits(struct btrfs_map_token *token, \ u##bits val) \ { \ const unsigned long member_offset = (unsigned long)ptr + off; \ - const unsigned long idx = member_offset >> PAGE_SHIFT; \ - const unsigned long oip = offset_in_page(member_offset); \ + const unsigned long idx = get_eb_page_index(member_offset); \ + const unsigned long oip = get_eb_page_offset(token->eb, \ + member_offset); \ const int size = sizeof(u##bits); \ u8 lebytes[sizeof(u##bits)]; \ const int part = PAGE_SIZE - oip; \ @@ -146,8 +148,8 @@ void btrfs_set_##bits(const struct extent_buffer *eb, void *ptr, \ unsigned long off, u##bits val) \ { \ const unsigned long member_offset = (unsigned long)ptr + off; \ - const unsigned 
long oip = offset_in_page(member_offset); \ - const unsigned long idx = member_offset >> PAGE_SHIFT; \ + const unsigned long oip = get_eb_page_offset(eb, member_offset);\ + const unsigned long idx = get_eb_page_index(member_offset); \ char *kaddr = page_address(eb->pages[idx]); \ const int size = sizeof(u##bits); \ const int part = PAGE_SIZE - oip; \

From patchwork Wed Sep 30 01:55:20 2020 X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11807621 From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v3 30/49] btrfs: disk-io: only clear EXTENT_LOCK bit for extent_invalidatepage() Date: Wed, 30 Sep 2020 09:55:20 +0800 Message-Id: <20200930015539.48867-31-wqu@suse.com> In-Reply-To: <20200930015539.48867-1-wqu@suse.com> References: <20200930015539.48867-1-wqu@suse.com>

extent_invalidatepage() calls clear_extent_bit() with delete == 1, which tries to clear all existing bits. This is currently fine, since the btree io tree only utilizes the EXTENT_LOCKED bit. But it could be a problem for later subpage support, which will utilize extra io tree bits to represent extra info. This patch converts that clear_extent_bit() call to unlock_extent_cached(). As the btree io tree only utilizes the EXTENT_LOCKED bit, this doesn't change the behavior, but it provides a much cleaner basis for incoming subpage support.

Signed-off-by: Qu Wenruo --- fs/btrfs/disk-io.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 1ca121ca28aa..10bdb0a8a92f 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -996,8 +996,13 @@ static void extent_invalidatepage(struct extent_io_tree *tree, lock_extent_bits(tree, start, end, &cached_state); wait_on_page_writeback(page); - clear_extent_bit(tree, start, end, EXTENT_LOCKED | EXTENT_DELALLOC | - EXTENT_DO_ACCOUNTING, 1, 1, &cached_state); + + /* + * Currently for btree io tree, only EXTENT_LOCKED is utilized, + * so here we only need to unlock the extent range to free any + * existing extent state.
+ */ + unlock_extent_cached(tree, start, end, &cached_state); } static void btree_invalidatepage(struct page *page, unsigned int offset,

From patchwork Wed Sep 30 01:55:21 2020 X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11807623 From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v3 31/49] btrfs: extent-io: make type of extent_state::state to be at least 32 bits Date: Wed, 30 Sep 2020 09:55:21 +0800 Message-Id: <20200930015539.48867-32-wqu@suse.com> In-Reply-To: <20200930015539.48867-1-wqu@suse.com> References: <20200930015539.48867-1-wqu@suse.com>

Currently we use 'unsigned' for extent_state::state, which is only guaranteed to be at least 16 bits. But for incoming subpage support, we are going to introduce more bits, to at least match the following page bits: - PageUptodate - PagePrivate2 Thus we will go beyond 16 bits. To support this, make extent_state::state at least 32 bits, and to be more explicit, use "u32" to make the maximum supported bits clear. This doesn't increase the memory usage on x86_64, but may affect other architectures.

Signed-off-by: Qu Wenruo --- fs/btrfs/extent-io-tree.h | 36 +++++++++++++++------------- fs/btrfs/extent_io.c | 49 +++++++++++++++++++-------------------- fs/btrfs/extent_io.h | 2 +- 3 files changed, 45 insertions(+), 42 deletions(-) diff --git a/fs/btrfs/extent-io-tree.h b/fs/btrfs/extent-io-tree.h index 4d0dbb562a81..108b386118fe 100644 --- a/fs/btrfs/extent-io-tree.h +++ b/fs/btrfs/extent-io-tree.h @@ -22,6 +22,10 @@ struct io_failure_record; #define EXTENT_QGROUP_RESERVED (1U << 12) #define EXTENT_CLEAR_DATA_RESV (1U << 13) #define EXTENT_DELALLOC_NEW (1U << 14) + +/* For subpage btree io tree, to indicate there is an extent buffer */ +#define EXTENT_HAS_TREE_BLOCK (1U << 15) + #define EXTENT_DO_ACCOUNTING (EXTENT_CLEAR_META_RESV | \ EXTENT_CLEAR_DATA_RESV) #define EXTENT_CTLBITS (EXTENT_DO_ACCOUNTING) @@ -73,7 +77,7 @@ struct extent_state { /* ADD NEW ELEMENTS AFTER THIS */ wait_queue_head_t wq; refcount_t refs; - unsigned state; + u32 state; struct io_failure_record *failrec; @@ -105,19 +109,19 @@ void __cold extent_io_exit(void); u64 count_range_bits(struct extent_io_tree *tree, u64 *start, u64 search_end, - u64 max_bytes, unsigned bits, int
contig); + u64 max_bytes, u32 bits, int contig); void free_extent_state(struct extent_state *state); int test_range_bit(struct extent_io_tree *tree, u64 start, u64 end, - unsigned bits, int filled, + u32 bits, int filled, struct extent_state *cached_state); int clear_record_extent_bits(struct extent_io_tree *tree, u64 start, u64 end, - unsigned bits, struct extent_changeset *changeset); + u32 bits, struct extent_changeset *changeset); int clear_extent_bit(struct extent_io_tree *tree, u64 start, u64 end, - unsigned bits, int wake, int delete, + u32 bits, int wake, int delete, struct extent_state **cached); int __clear_extent_bit(struct extent_io_tree *tree, u64 start, u64 end, - unsigned bits, int wake, int delete, + u32 bits, int wake, int delete, struct extent_state **cached, gfp_t mask, struct extent_changeset *changeset); @@ -141,7 +145,7 @@ static inline int unlock_extent_cached_atomic(struct extent_io_tree *tree, } static inline int clear_extent_bits(struct extent_io_tree *tree, u64 start, - u64 end, unsigned bits) + u64 end, u32 bits) { int wake = 0; @@ -152,15 +156,15 @@ static inline int clear_extent_bits(struct extent_io_tree *tree, u64 start, } int set_record_extent_bits(struct extent_io_tree *tree, u64 start, u64 end, - unsigned bits, struct extent_changeset *changeset); + u32 bits, struct extent_changeset *changeset); int set_extent_bit(struct extent_io_tree *tree, u64 start, u64 end, - unsigned bits, u64 *failed_start, + u32 bits, u64 *failed_start, struct extent_state **cached_state, gfp_t mask); int set_extent_bits_nowait(struct extent_io_tree *tree, u64 start, u64 end, - unsigned bits); + u32 bits); static inline int set_extent_bits(struct extent_io_tree *tree, u64 start, - u64 end, unsigned bits) + u64 end, u32 bits) { return set_extent_bit(tree, start, end, bits, NULL, NULL, GFP_NOFS); } @@ -188,11 +192,11 @@ static inline int clear_extent_dirty(struct extent_io_tree *tree, u64 start, } int convert_extent_bit(struct extent_io_tree *tree, u64 
start, u64 end, - unsigned bits, unsigned clear_bits, + u32 bits, u32 clear_bits, struct extent_state **cached_state); static inline int set_extent_delalloc(struct extent_io_tree *tree, u64 start, - u64 end, unsigned int extra_bits, + u64 end, u32 extra_bits, struct extent_state **cached_state) { return set_extent_bit(tree, start, end, @@ -223,12 +227,12 @@ static inline int set_extent_uptodate(struct extent_io_tree *tree, u64 start, } int find_first_extent_bit(struct extent_io_tree *tree, u64 start, - u64 *start_ret, u64 *end_ret, unsigned bits, + u64 *start_ret, u64 *end_ret, u32 bits, bool exact_match, struct extent_state **cached_state); void find_first_clear_extent_bit(struct extent_io_tree *tree, u64 start, - u64 *start_ret, u64 *end_ret, unsigned bits); + u64 *start_ret, u64 *end_ret, u32 bits); int find_contiguous_extent_bit(struct extent_io_tree *tree, u64 start, - u64 *start_ret, u64 *end_ret, unsigned bits); + u64 *start_ret, u64 *end_ret, u32 bits); bool btrfs_find_delalloc_range(struct extent_io_tree *tree, u64 *start, u64 *end, u64 max_bytes, struct extent_state **cached_state); diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index e42a17039bf6..0c4ce0b1f4ce 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -142,7 +142,7 @@ struct extent_page_data { unsigned int sync_io:1; }; -static int add_extent_changeset(struct extent_state *state, unsigned bits, +static int add_extent_changeset(struct extent_state *state, u32 bits, struct extent_changeset *changeset, int set) { @@ -530,7 +530,7 @@ static void merge_state(struct extent_io_tree *tree, } static void set_state_bits(struct extent_io_tree *tree, - struct extent_state *state, unsigned *bits, + struct extent_state *state, u32 *bits, struct extent_changeset *changeset); /* @@ -547,7 +547,7 @@ static int insert_state(struct extent_io_tree *tree, struct extent_state *state, u64 start, u64 end, struct rb_node ***p, struct rb_node **parent, - unsigned *bits, struct extent_changeset 
*changeset) + u32 *bits, struct extent_changeset *changeset) { struct rb_node *node; @@ -628,11 +628,11 @@ static struct extent_state *next_state(struct extent_state *state) */ static struct extent_state *clear_state_bit(struct extent_io_tree *tree, struct extent_state *state, - unsigned *bits, int wake, + u32 *bits, int wake, struct extent_changeset *changeset) { struct extent_state *next; - unsigned bits_to_clear = *bits & ~EXTENT_CTLBITS; + u32 bits_to_clear = *bits & ~EXTENT_CTLBITS; int ret; if ((bits_to_clear & EXTENT_DIRTY) && (state->state & EXTENT_DIRTY)) { @@ -695,7 +695,7 @@ static void extent_io_tree_panic(struct extent_io_tree *tree, int err) * This takes the tree lock, and returns 0 on success and < 0 on error. */ int __clear_extent_bit(struct extent_io_tree *tree, u64 start, u64 end, - unsigned bits, int wake, int delete, + u32 bits, int wake, int delete, struct extent_state **cached_state, gfp_t mask, struct extent_changeset *changeset) { @@ -868,7 +868,7 @@ static void wait_on_state(struct extent_io_tree *tree, * The tree lock is taken by this function */ static void wait_extent_bit(struct extent_io_tree *tree, u64 start, u64 end, - unsigned long bits) + u32 bits) { struct extent_state *state; struct rb_node *node; @@ -915,9 +915,9 @@ static void wait_extent_bit(struct extent_io_tree *tree, u64 start, u64 end, static void set_state_bits(struct extent_io_tree *tree, struct extent_state *state, - unsigned *bits, struct extent_changeset *changeset) + u32 *bits, struct extent_changeset *changeset) { - unsigned bits_to_set = *bits & ~EXTENT_CTLBITS; + u32 bits_to_set = *bits & ~EXTENT_CTLBITS; int ret; if (tree->private_data && is_data_inode(tree->private_data)) @@ -964,7 +964,7 @@ static void cache_state(struct extent_state *state, static int __must_check __set_extent_bit(struct extent_io_tree *tree, u64 start, u64 end, - unsigned bits, unsigned exclusive_bits, + u32 bits, u32 exclusive_bits, u64 *failed_start, struct extent_state **cached_state, gfp_t 
mask, struct extent_changeset *changeset) { @@ -1180,7 +1180,7 @@ __set_extent_bit(struct extent_io_tree *tree, u64 start, u64 end, } int set_extent_bit(struct extent_io_tree *tree, u64 start, u64 end, - unsigned bits, u64 * failed_start, + u32 bits, u64 * failed_start, struct extent_state **cached_state, gfp_t mask) { return __set_extent_bit(tree, start, end, bits, 0, failed_start, @@ -1207,7 +1207,7 @@ int set_extent_bit(struct extent_io_tree *tree, u64 start, u64 end, * All allocations are done with GFP_NOFS. */ int convert_extent_bit(struct extent_io_tree *tree, u64 start, u64 end, - unsigned bits, unsigned clear_bits, + u32 bits, u32 clear_bits, struct extent_state **cached_state) { struct extent_state *state; @@ -1408,7 +1408,7 @@ int convert_extent_bit(struct extent_io_tree *tree, u64 start, u64 end, /* wrappers around set/clear extent bit */ int set_record_extent_bits(struct extent_io_tree *tree, u64 start, u64 end, - unsigned bits, struct extent_changeset *changeset) + u32 bits, struct extent_changeset *changeset) { /* * We don't support EXTENT_LOCKED yet, as current changeset will @@ -1423,14 +1423,14 @@ int set_record_extent_bits(struct extent_io_tree *tree, u64 start, u64 end, } int set_extent_bits_nowait(struct extent_io_tree *tree, u64 start, u64 end, - unsigned bits) + u32 bits) { return __set_extent_bit(tree, start, end, bits, 0, NULL, NULL, GFP_NOWAIT, NULL); } int clear_extent_bit(struct extent_io_tree *tree, u64 start, u64 end, - unsigned bits, int wake, int delete, + u32 bits, int wake, int delete, struct extent_state **cached) { return __clear_extent_bit(tree, start, end, bits, wake, delete, @@ -1438,7 +1438,7 @@ int clear_extent_bit(struct extent_io_tree *tree, u64 start, u64 end, } int clear_record_extent_bits(struct extent_io_tree *tree, u64 start, u64 end, - unsigned bits, struct extent_changeset *changeset) + u32 bits, struct extent_changeset *changeset) { /* * Don't support EXTENT_LOCKED case, same reason as @@ -1521,7 +1521,7 @@ void 
extent_range_redirty_for_io(struct inode *inode, u64 start, u64 end) } } -static bool match_extent_state(struct extent_state *state, unsigned bits, +static bool match_extent_state(struct extent_state *state, u32 bits, bool exact_match) { if (exact_match) @@ -1541,7 +1541,7 @@ static bool match_extent_state(struct extent_state *state, unsigned bits, */ static struct extent_state * find_first_extent_bit_state(struct extent_io_tree *tree, - u64 start, unsigned bits, bool exact_match) + u64 start, u32 bits, bool exact_match) { struct rb_node *node; struct extent_state *state; @@ -1577,7 +1577,7 @@ find_first_extent_bit_state(struct extent_io_tree *tree, * Return 1 if we found nothing. */ int find_first_extent_bit(struct extent_io_tree *tree, u64 start, - u64 *start_ret, u64 *end_ret, unsigned bits, + u64 *start_ret, u64 *end_ret, u32 bits, bool exact_match, struct extent_state **cached_state) { struct extent_state *state; @@ -1629,7 +1629,7 @@ int find_first_extent_bit(struct extent_io_tree *tree, u64 start, * returned will be the full contiguous area with the bits set. */ int find_contiguous_extent_bit(struct extent_io_tree *tree, u64 start, - u64 *start_ret, u64 *end_ret, unsigned bits) + u64 *start_ret, u64 *end_ret, u32 bits) { struct extent_state *state; int ret = 1; @@ -1666,7 +1666,7 @@ int find_contiguous_extent_bit(struct extent_io_tree *tree, u64 start, * trim @end_ret to the appropriate size. 
*/ void find_first_clear_extent_bit(struct extent_io_tree *tree, u64 start, - u64 *start_ret, u64 *end_ret, unsigned bits) + u64 *start_ret, u64 *end_ret, u32 bits) { struct extent_state *state; struct rb_node *node, *prev = NULL, *next; @@ -2056,8 +2056,7 @@ noinline_for_stack bool find_lock_delalloc_range(struct inode *inode, void extent_clear_unlock_delalloc(struct btrfs_inode *inode, u64 start, u64 end, struct page *locked_page, - unsigned clear_bits, - unsigned long page_ops) + u32 clear_bits, unsigned long page_ops) { clear_extent_bit(&inode->io_tree, start, end, clear_bits, 1, 0, NULL); @@ -2072,7 +2071,7 @@ void extent_clear_unlock_delalloc(struct btrfs_inode *inode, u64 start, u64 end, */ u64 count_range_bits(struct extent_io_tree *tree, u64 *start, u64 search_end, u64 max_bytes, - unsigned bits, int contig) + u32 bits, int contig) { struct rb_node *node; struct extent_state *state; @@ -2192,7 +2191,7 @@ struct io_failure_record *get_state_failrec(struct extent_io_tree *tree, u64 sta * range is found set. 
 */
 int test_range_bit(struct extent_io_tree *tree, u64 start, u64 end,
-		   unsigned bits, int filled, struct extent_state *cached)
+		   u32 bits, int filled, struct extent_state *cached)
 {
	struct extent_state *state = NULL;
	struct rb_node *node;
diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
index 552afc1c0bbc..602d6568c8ea 100644
--- a/fs/btrfs/extent_io.h
+++ b/fs/btrfs/extent_io.h
@@ -288,7 +288,7 @@ void extent_range_clear_dirty_for_io(struct inode *inode, u64 start, u64 end);
 void extent_range_redirty_for_io(struct inode *inode, u64 start, u64 end);
 void extent_clear_unlock_delalloc(struct btrfs_inode *inode, u64 start, u64 end,
				  struct page *locked_page,
-				  unsigned bits_to_clear,
+				  u32 bits_to_clear,
				  unsigned long page_ops);
 struct bio *btrfs_bio_alloc(u64 first_byte);
 struct bio *btrfs_io_bio_alloc(unsigned int nr_iovecs);

From patchwork Wed Sep 30 01:55:22 2020
X-Patchwork-Submitter: Qu Wenruo
X-Patchwork-Id: 11807625
From: Qu Wenruo
To: linux-btrfs@vger.kernel.org
Subject: [PATCH v3 32/49] btrfs: extent_io: use extent_io_tree to handle subpage extent buffer allocation
Date: Wed, 30 Sep 2020 09:55:22 +0800
Message-Id: <20200930015539.48867-33-wqu@suse.com>
In-Reply-To: <20200930015539.48867-1-wqu@suse.com>
References: <20200930015539.48867-1-wqu@suse.com>

Currently btrfs uses page::private as an indicator of which extent
buffer owns the page. This method won't work for subpage support, as
one page can contain several tree blocks (up to 16 for 4K node size
and 64K page size). Instead, utilize the btree extent io tree to
handle them.

For the btree io tree, we introduce a new bit, EXTENT_HAS_TREE_BLOCK,
to indicate that we have an in-tree extent buffer for the range.

This will affect the following functions:
- alloc_extent_buffer()
  Now for subpage we never use page->private to grab an existing eb.
  Instead, we rely on an extra safety net in alloc_extent_buffer() to
  detect two callers allocating the same eb.
- btrfs_release_extent_buffer_pages()
  Now for subpage, we clear the EXTENT_HAS_TREE_BLOCK bit first, then
  check if the remaining range in the page still has the
  EXTENT_HAS_TREE_BLOCK bit. If not, clear the private bit for the
  page.
- attach_extent_buffer_page()
  Now we set the EXTENT_HAS_TREE_BLOCK bit for the new extent buffer to
  be attached, and set the page Private bit with NULL stored in
  page::private.

Signed-off-by: Qu Wenruo
---
 fs/btrfs/btrfs_inode.h    | 12 ++++++
 fs/btrfs/extent-io-tree.h |  2 +-
 fs/btrfs/extent_io.c      | 80 ++++++++++++++++++++++++++++++++++++++-
 3 files changed, 91 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/btrfs_inode.h b/fs/btrfs/btrfs_inode.h
index c47b6c6fea9f..cff818e0c406 100644
--- a/fs/btrfs/btrfs_inode.h
+++ b/fs/btrfs/btrfs_inode.h
@@ -217,6 +217,18 @@ static inline struct btrfs_inode *BTRFS_I(const struct inode *inode)
	return container_of(inode, struct btrfs_inode, vfs_inode);
 }

+static inline struct btrfs_fs_info *page_to_fs_info(struct page *page)
+{
+	ASSERT(page->mapping);
+	return BTRFS_I(page->mapping->host)->root->fs_info;
+}
+
+static inline struct extent_io_tree
+*info_to_btree_io_tree(struct btrfs_fs_info *fs_info)
+{
+	return &BTRFS_I(fs_info->btree_inode)->io_tree;
+}
+
 static inline unsigned long btrfs_inode_hash(u64 objectid,
					     const struct btrfs_root *root)
 {
diff --git a/fs/btrfs/extent-io-tree.h b/fs/btrfs/extent-io-tree.h
index 108b386118fe..c4e73c84ba34 100644
--- a/fs/btrfs/extent-io-tree.h
+++ b/fs/btrfs/extent-io-tree.h
@@ -23,7 +23,7 @@ struct io_failure_record;
 #define EXTENT_CLEAR_DATA_RESV	(1U << 13)
 #define EXTENT_DELALLOC_NEW	(1U << 14)

-/* For subpage btree io tree, to indicate there is an extent buffer */
+/* For subpage btree io tree, indicates there is an in-tree extent buffer */
 #define EXTENT_HAS_TREE_BLOCK	(1U << 15)

 #define EXTENT_DO_ACCOUNTING    (EXTENT_CLEAR_META_RESV | \
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 0c4ce0b1f4ce..4dbc0b79c4ce 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -3134,6 +3134,18 @@ static void attach_extent_buffer_page(struct extent_buffer *eb,
	if (page->mapping)
		assert_spin_locked(&page->mapping->private_lock);

+	if (btrfs_is_subpage(eb->fs_info) && page->mapping) {
+
struct extent_io_tree *io_tree = + info_to_btree_io_tree(eb->fs_info); + + if (!PagePrivate(page)) + attach_page_private(page, NULL); + + set_extent_bit(io_tree, eb->start, eb->start + eb->len - 1, + EXTENT_HAS_TREE_BLOCK, NULL, NULL, GFP_ATOMIC); + return; + } + if (!PagePrivate(page)) attach_page_private(page, eb); else @@ -4955,6 +4967,36 @@ int extent_buffer_under_io(const struct extent_buffer *eb) test_bit(EXTENT_BUFFER_DIRTY, &eb->bflags)); } +static void detach_extent_buffer_subpage(struct extent_buffer *eb) +{ + struct btrfs_fs_info *fs_info = eb->fs_info; + struct extent_io_tree *io_tree = info_to_btree_io_tree(fs_info); + struct page *page = eb->pages[0]; + bool mapped = !test_bit(EXTENT_BUFFER_UNMAPPED, &eb->bflags); + int ret; + + if (!page) + return; + + if (mapped) + spin_lock(&page->mapping->private_lock); + + __clear_extent_bit(io_tree, eb->start, eb->start + eb->len - 1, + EXTENT_HAS_TREE_BLOCK, 0, 0, NULL, GFP_ATOMIC, NULL); + + /* Test if we still have other extent buffer in the page range */ + ret = test_range_bit(io_tree, round_down(eb->start, PAGE_SIZE), + round_down(eb->start, PAGE_SIZE) + PAGE_SIZE - 1, + EXTENT_HAS_TREE_BLOCK, 0, NULL); + if (!ret) + detach_page_private(eb->pages[0]); + if (mapped) + spin_unlock(&page->mapping->private_lock); + + /* One for when we allocated the page */ + put_page(page); +} + /* * Release all pages attached to the extent buffer. 
*/ @@ -4966,6 +5008,9 @@ static void btrfs_release_extent_buffer_pages(struct extent_buffer *eb) BUG_ON(extent_buffer_under_io(eb)); + if (btrfs_is_subpage(eb->fs_info) && mapped) + return detach_extent_buffer_subpage(eb); + num_pages = num_extent_pages(eb); for (i = 0; i < num_pages; i++) { struct page *page = eb->pages[i]; @@ -5260,6 +5305,7 @@ struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info, struct extent_buffer *exists = NULL; struct page *p; struct address_space *mapping = fs_info->btree_inode->i_mapping; + bool subpage = btrfs_is_subpage(fs_info); int uptodate = 1; int ret; @@ -5292,7 +5338,12 @@ struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info, } spin_lock(&mapping->private_lock); - if (PagePrivate(p)) { + /* + * Subpage support doesn't use page::private at all, so we + * completely rely on the radix insert lock to prevent two + * ebs allocated for the same bytenr. + */ + if (PagePrivate(p) && !subpage) { /* * We could have already allocated an eb for this page * and attached one so lets see if we can get a ref on @@ -5333,8 +5384,21 @@ struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info, * we could crash. */ } - if (uptodate) + if (uptodate) { set_bit(EXTENT_BUFFER_UPTODATE, &eb->bflags); + } else if (subpage) { + /* + * For subpage, we must check extent_io_tree to get if the eb + * is really uptodate, as the page uptodate is only set if the + * whole page is uptodate. + * We can still have uptodate range in the page. 
+	 */
+	struct extent_io_tree *io_tree = info_to_btree_io_tree(fs_info);
+
+	if (test_range_bit(io_tree, eb->start, eb->start + eb->len - 1,
+			   EXTENT_UPTODATE, 1, NULL))
+		set_bit(EXTENT_BUFFER_UPTODATE, &eb->bflags);
+	}
 again:
	ret = radix_tree_preload(GFP_NOFS);
	if (ret) {
@@ -5373,6 +5437,18 @@ struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info,
		if (eb->pages[i])
			unlock_page(eb->pages[i]);
	}
+	/*
+	 * For subpage case, btrfs_release_extent_buffer() will clear the
+	 * EXTENT_HAS_TREE_BLOCK bit if there is a page.
+	 *
+	 * Since we're here because we hit a race with another caller, who
+	 * succeeded in inserting the eb, we shouldn't clear that
+	 * EXTENT_HAS_TREE_BLOCK bit. So here we cleanup the page manually.
+	 */
+	if (subpage) {
+		put_page(eb->pages[0]);
+		eb->pages[i] = NULL;
+	}
	btrfs_release_extent_buffer(eb);
	return exists;

From patchwork Wed Sep 30 01:55:23 2020
X-Patchwork-Submitter: Qu Wenruo
X-Patchwork-Id: 11807627
From: Qu Wenruo
To: linux-btrfs@vger.kernel.org
Subject: [PATCH v3 33/49] btrfs: extent_io: make set/clear_extent_buffer_uptodate() to support subpage size
Date: Wed, 30 Sep 2020 09:55:23 +0800
Message-Id: <20200930015539.48867-34-wqu@suse.com>
In-Reply-To: <20200930015539.48867-1-wqu@suse.com>
References: <20200930015539.48867-1-wqu@suse.com>

For those two functions, supporting subpage size just needs the
following work:
- set/clear the EXTENT_UPTODATE bits in the io_tree
- set the page Uptodate if the full range of the page is uptodate

Signed-off-by: Qu Wenruo
---
 fs/btrfs/extent_io.c | 28 ++++++++++++++++++++++++++--
 1 file changed, 26 insertions(+), 2 deletions(-)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 4dbc0b79c4ce..c9bbb91c6155 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -5602,10 +5602,18 @@ bool set_extent_buffer_dirty(struct extent_buffer *eb)
 void clear_extent_buffer_uptodate(struct extent_buffer *eb)
 {
	int i;
-	struct page *page;
+	struct page *page = eb->pages[0];
	int num_pages;

	clear_bit(EXTENT_BUFFER_UPTODATE, &eb->bflags);
+
+	if (btrfs_is_subpage(eb->fs_info) && page->mapping) {
+		struct extent_io_tree *io_tree =
+			info_to_btree_io_tree(eb->fs_info);
+
+		clear_extent_uptodate(io_tree, eb->start,
+
			      eb->start + eb->len - 1, NULL);
+	}
	num_pages = num_extent_pages(eb);
	for (i = 0; i < num_pages; i++) {
		page = eb->pages[i];
@@ -5617,10 +5625,26 @@ void clear_extent_buffer_uptodate(struct extent_buffer *eb)
 void set_extent_buffer_uptodate(struct extent_buffer *eb)
 {
	int i;
-	struct page *page;
+	struct page *page = eb->pages[0];
	int num_pages;

	set_bit(EXTENT_BUFFER_UPTODATE, &eb->bflags);
+
+	if (btrfs_is_subpage(eb->fs_info) && page->mapping) {
+		struct extent_state *cached = NULL;
+		struct extent_io_tree *io_tree =
+			info_to_btree_io_tree(eb->fs_info);
+		u64 page_start = page_offset(page);
+		u64 page_end = page_offset(page) + PAGE_SIZE - 1;
+
+		set_extent_uptodate(io_tree, eb->start, eb->start + eb->len - 1,
+				    &cached, GFP_NOFS);
+		if (test_range_bit(io_tree, page_start, page_end,
+				   EXTENT_UPTODATE, 1, cached))
+			SetPageUptodate(page);
+		free_extent_state(cached);
+		return;
+	}
	num_pages = num_extent_pages(eb);
	for (i = 0; i < num_pages; i++) {
		page = eb->pages[i];

From patchwork Wed Sep 30 01:55:24 2020
X-Patchwork-Submitter: Qu Wenruo
X-Patchwork-Id: 11807629
From: Qu Wenruo
To: linux-btrfs@vger.kernel.org
Subject: [PATCH v3 34/49] btrfs: extent_io: make the assert test on page uptodate able to handle subpage
Date: Wed, 30 Sep 2020 09:55:24 +0800
Message-Id: <20200930015539.48867-35-wqu@suse.com>
In-Reply-To: <20200930015539.48867-1-wqu@suse.com>
References: <20200930015539.48867-1-wqu@suse.com>

There are quite a few assertions on page uptodate in the extent buffer
write accessors. They ensure the destination page is already uptodate.

This is fine for the regular sector size case, but not for the subpage
case, as for subpage we only mark the page uptodate if the page
contains no hole and all its extent buffers are uptodate.

So instead of checking PageUptodate(), for the subpage case we check
the EXTENT_UPTODATE bit for the range covered by the extent buffer.

To make the check more elegant, introduce a helper,
assert_eb_range_uptodate(), to do the check for both subpage and
regular sector size cases.
The following functions are involved: - write_extent_buffer_chunk_tree_uuid() - write_extent_buffer_fsid() - write_extent_buffer() - memzero_extent_buffer() - copy_extent_buffer() - extent_buffer_test_bit() - extent_buffer_bitmap_set() - extent_buffer_bitmap_clear() Signed-off-by: Qu Wenruo --- fs/btrfs/extent_io.c | 44 ++++++++++++++++++++++++++++++++++---------- 1 file changed, 34 insertions(+), 10 deletions(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index c9bbb91c6155..210ae3349108 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -5867,12 +5867,36 @@ int memcmp_extent_buffer(const struct extent_buffer *eb, const void *ptrv, return ret; } +/* + * A helper to ensure that the extent buffer is uptodate. + * + * For regular sector size == PAGE_SIZE case, check if @page is uptodate. + * For subpage case, check if the range covered by the eb has EXTENT_UPTODATE. + */ +static void assert_eb_range_uptodate(const struct extent_buffer *eb, + struct page *page) +{ + struct btrfs_fs_info *fs_info = eb->fs_info; + + if (btrfs_is_subpage(fs_info) && page->mapping) { + struct extent_io_tree *io_tree = info_to_btree_io_tree(fs_info); + + /* For subpage and mapped eb, check the EXTENT_UPTODATE bit. 
*/ + WARN_ON(!test_range_bit(io_tree, eb->start, + eb->start + eb->len - 1, EXTENT_UPTODATE, 1, + NULL)); + } else { + /* For regular eb or dummy eb, check the page status directly */ + WARN_ON(!PageUptodate(page)); + } +} + void write_extent_buffer_chunk_tree_uuid(const struct extent_buffer *eb, const void *srcv) { char *kaddr; - WARN_ON(!PageUptodate(eb->pages[0])); + assert_eb_range_uptodate(eb, eb->pages[0]); kaddr = page_address(eb->pages[0]) + get_eb_page_offset(eb, 0); memcpy(kaddr + offsetof(struct btrfs_header, chunk_tree_uuid), srcv, BTRFS_FSID_SIZE); @@ -5882,7 +5906,7 @@ void write_extent_buffer_fsid(const struct extent_buffer *eb, const void *srcv) { char *kaddr; - WARN_ON(!PageUptodate(eb->pages[0])); + assert_eb_range_uptodate(eb, eb->pages[0]); kaddr = page_address(eb->pages[0]) + get_eb_page_offset(eb, 0); memcpy(kaddr + offsetof(struct btrfs_header, fsid), srcv, BTRFS_FSID_SIZE); @@ -5905,7 +5929,7 @@ void write_extent_buffer(const struct extent_buffer *eb, const void *srcv, while (len > 0) { page = eb->pages[i]; - WARN_ON(!PageUptodate(page)); + assert_eb_range_uptodate(eb, page); cur = min(len, PAGE_SIZE - offset); kaddr = page_address(page); @@ -5934,7 +5958,7 @@ void memzero_extent_buffer(const struct extent_buffer *eb, unsigned long start, while (len > 0) { page = eb->pages[i]; - WARN_ON(!PageUptodate(page)); + assert_eb_range_uptodate(eb, page); cur = min(len, PAGE_SIZE - offset); kaddr = page_address(page); @@ -5990,7 +6014,7 @@ void copy_extent_buffer(const struct extent_buffer *dst, while (len > 0) { page = dst->pages[i]; - WARN_ON(!PageUptodate(page)); + assert_eb_range_uptodate(dst, page); cur = min(len, (unsigned long)(PAGE_SIZE - offset)); @@ -6052,7 +6076,7 @@ int extent_buffer_test_bit(const struct extent_buffer *eb, unsigned long start, eb_bitmap_offset(eb, start, nr, &i, &offset); page = eb->pages[i]; - WARN_ON(!PageUptodate(page)); + assert_eb_range_uptodate(eb, page); kaddr = page_address(page); return 1U & (kaddr[offset] >> (nr 
& (BITS_PER_BYTE - 1)));
 }
@@ -6077,7 +6101,7 @@ void extent_buffer_bitmap_set(const struct extent_buffer *eb, unsigned long star
	eb_bitmap_offset(eb, start, pos, &i, &offset);
	page = eb->pages[i];
-	WARN_ON(!PageUptodate(page));
+	assert_eb_range_uptodate(eb, page);
	kaddr = page_address(page);

	while (len >= bits_to_set) {
@@ -6088,7 +6112,7 @@ void extent_buffer_bitmap_set(const struct extent_buffer *eb, unsigned long star
		if (++offset >= PAGE_SIZE && len > 0) {
			offset = 0;
			page = eb->pages[++i];
-			WARN_ON(!PageUptodate(page));
+			assert_eb_range_uptodate(eb, page);
			kaddr = page_address(page);
		}
	}
@@ -6120,7 +6144,7 @@ void extent_buffer_bitmap_clear(const struct extent_buffer *eb,
	eb_bitmap_offset(eb, start, pos, &i, &offset);
	page = eb->pages[i];
-	WARN_ON(!PageUptodate(page));
+	assert_eb_range_uptodate(eb, page);
	kaddr = page_address(page);

	while (len >= bits_to_clear) {
@@ -6131,7 +6155,7 @@ void extent_buffer_bitmap_clear(const struct extent_buffer *eb,
		if (++offset >= PAGE_SIZE && len > 0) {
			offset = 0;
			page = eb->pages[++i];
-			WARN_ON(!PageUptodate(page));
+			assert_eb_range_uptodate(eb, page);
			kaddr = page_address(page);
		}
	}

From patchwork Wed Sep 30 01:55:25 2020
X-Patchwork-Submitter: Qu Wenruo
X-Patchwork-Id: 11807631
From: Qu Wenruo
To: linux-btrfs@vger.kernel.org
Subject: [PATCH v3 35/49] btrfs: extent_io: implement subpage metadata read and its endio function
Date: Wed, 30 Sep 2020 09:55:25 +0800
Message-Id: <20200930015539.48867-36-wqu@suse.com>
In-Reply-To: <20200930015539.48867-1-wqu@suse.com>
References: <20200930015539.48867-1-wqu@suse.com>

For subpage metadata read, since we're relying completely on the io
tree rather than page bits, the read submission and endio functions
differ from the regular page size case.

For the submission part:
- Do extent locking/waiting
  In addition to page locking, we do extra extent io tree locking,
  which provides more accurate range locking. Since we're still
  utilizing page locking, this means a higher delay for reading tree
  blocks in the same page (reading extent buffers in the same page is
  forced to be sequential).
- Submit the extent page directly
  To simplify the process, as all the metadata reads are always
  contained in one page.
For the endio part:

- Do extent locking/waiting
  The same as the submission part.

This behavior has a small problem: extent locking/waiting will allocate memory, so either can fail. Currently we rely on the BUG_ON() in the various set_extent_bits() calls. But once we start handling errors from them, this approach makes it more complex to pass all the ENOMEM errors upwards.

Signed-off-by: Qu Wenruo
---
fs/btrfs/disk-io.c | 81 ++++++++++++++++++++++++++++++++++++++++ fs/btrfs/extent_io.c | 88 ++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 169 insertions(+)
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 10bdb0a8a92f..89021e552da0 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c
@@ -651,6 +651,84 @@ static int btrfs_check_extent_buffer(struct extent_buffer *eb) return ret; } +static int btree_read_subpage_endio_hook(struct page *page, u64 start, u64 end, + int mirror) +{ + struct btrfs_fs_info *fs_info = page_to_fs_info(page); + struct extent_buffer *eb; + int reads_done; + int ret = 0; + + if (!IS_ALIGNED(start, fs_info->sectorsize) || + !IS_ALIGNED(end - start + 1, fs_info->sectorsize) || + !IS_ALIGNED(end - start + 1, fs_info->nodesize)) { + WARN_ON(IS_ENABLED(CONFIG_BTRFS_DEBUG)); + btrfs_err(fs_info, "invalid tree read bytenr"); + return -EUCLEAN; + } + + /* + * We don't allow bio merge for subpage metadata read, so we should + * only get one eb for each endio hook. + */ + ASSERT(end == start + fs_info->nodesize - 1); + ASSERT(PagePrivate(page)); + + rcu_read_lock(); + eb = radix_tree_lookup(&fs_info->buffer_radix, + start / fs_info->sectorsize); + rcu_read_unlock(); + + /* + * When we are reading one tree block, eb must have been + * inserted into the radix tree. If not, something is wrong.
+ */ + if (!eb) { + WARN_ON(IS_ENABLED(CONFIG_BTRFS_DEBUG)); + btrfs_err(fs_info, + "can't find extent buffer for bytenr %llu", + start); + return -EUCLEAN; + } + /* + * The pending IO might have been the only thing that kept + * this buffer in memory. Make sure we have a ref for all + * the other checks. + */ + atomic_inc(&eb->refs); + + reads_done = atomic_dec_and_test(&eb->io_pages); + /* Subpage read must finish in page read */ + ASSERT(reads_done); + + eb->read_mirror = mirror; + if (test_bit(EXTENT_BUFFER_READ_ERR, &eb->bflags)) { + ret = -EIO; + goto err; + } + ret = btrfs_check_extent_buffer(eb); + if (ret < 0) + goto err; + + if (test_and_clear_bit(EXTENT_BUFFER_READAHEAD, &eb->bflags)) + btree_readahead_hook(eb, ret); + + set_extent_buffer_uptodate(eb); + + free_extent_buffer(eb); + return ret; +err: + /* + * Our io error hook is going to dec the io pages + * again, so we have to make sure it has something to + * decrement. + */ + atomic_inc(&eb->io_pages); + clear_extent_buffer_uptodate(eb); + free_extent_buffer(eb); + return ret; +}
static int btree_readpage_end_io_hook(struct btrfs_io_bio *io_bio, u64 phy_offset, struct page *page, u64 start, u64 end, int mirror)
@@ -659,6 +737,9 @@ static int btree_readpage_end_io_hook(struct btrfs_io_bio *io_bio, int ret = 0; bool reads_done; + if (btrfs_is_subpage(page_to_fs_info(page))) + return btree_read_subpage_endio_hook(page, start, end, mirror); + /* Metadata pages that go through IO should all have private set */ ASSERT(PagePrivate(page) && page->private); eb = (struct extent_buffer *)page->private;
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 210ae3349108..1423f69bc210 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c
@@ -3082,6 +3082,15 @@ static int submit_extent_page(unsigned int opf, else contig = bio_end_sector(bio) == sector; + /* + * For subpage metadata read, never merge requests, so that + * we get the endio hook called on each metadata read.
+ */ + if (btrfs_is_subpage(page_to_fs_info(page)) && + tree->owner == IO_TREE_BTREE_INODE_IO && + (opf & REQ_OP_READ)) + ASSERT(force_bio_submit); + ASSERT(tree->ops); if (btrfs_bio_fits_in_stripe(page, io_size, bio, bio_flags)) can_merge = false; @@ -5652,6 +5661,82 @@ void set_extent_buffer_uptodate(struct extent_buffer *eb) } } +static int read_extent_buffer_subpage(struct extent_buffer *eb, int wait, + int mirror_num) +{ + struct btrfs_fs_info *fs_info = eb->fs_info; + struct extent_io_tree *io_tree = info_to_btree_io_tree(fs_info); + struct page *page = eb->pages[0]; + struct bio *bio = NULL; + int ret = 0; + + ASSERT(!test_bit(EXTENT_BUFFER_UNMAPPED, &eb->bflags)); + + /* Lock page first then lock extent range */ + if (wait == WAIT_NONE) { + if (!trylock_page(page)) + return 0; + } else { + lock_page(page); + } + + if (wait == WAIT_NONE) { + ret = try_lock_extent(io_tree, eb->start, + eb->start + eb->len - 1); + if (ret <= 0) { + unlock_page(page); + return ret; + } + } else { + ret = lock_extent(io_tree, eb->start, eb->start + eb->len - 1); + if (ret < 0) { + unlock_page(page); + return ret; + } + } + + ret = 0; + if (test_bit(EXTENT_BUFFER_UPTODATE, &eb->bflags) || + PageUptodate(page) || + test_range_bit(io_tree, eb->start, eb->start + eb->len - 1, + EXTENT_UPTODATE, 1, NULL)) { + set_bit(EXTENT_BUFFER_UPTODATE, &eb->bflags); + unlock_page(page); + unlock_extent(io_tree, eb->start, eb->start + eb->len - 1); + return ret; + } + atomic_set(&eb->io_pages, 1); + + ret = submit_extent_page(REQ_OP_READ | REQ_META, NULL, page, eb->start, + eb->len, eb->start - page_offset(page), &bio, + end_bio_extent_readpage, mirror_num, 0, 0, + true); + if (ret) { + /* + * In the endio function, if we hit something wrong we will + * increase the io_pages, so here we need to decrease it for error + * path. 
+ */ + atomic_dec(&eb->io_pages); + } + if (bio) { + int tmp; + + tmp = submit_one_bio(bio, mirror_num, 0); + if (tmp < 0) + return tmp; + } + if (ret || wait != WAIT_COMPLETE) + return ret; + + wait_on_page_locked(page); + wait_extent_bit(io_tree, eb->start, eb->start + eb->len - 1, + EXTENT_LOCKED); + if (!test_bit(EXTENT_BUFFER_UPTODATE, &eb->bflags)) + ret = -EIO; + return ret; +}
int read_extent_buffer_pages(struct extent_buffer *eb, int wait, int mirror_num) { int i;
@@ -5668,6 +5753,9 @@ int read_extent_buffer_pages(struct extent_buffer *eb, int wait, int mirror_num) if (test_bit(EXTENT_BUFFER_UPTODATE, &eb->bflags)) return 0; + if (btrfs_is_subpage(eb->fs_info)) + return read_extent_buffer_subpage(eb, wait, mirror_num); + num_pages = num_extent_pages(eb); for (i = 0; i < num_pages; i++) { page = eb->pages[i];

From patchwork Wed Sep 30 01:55:26 2020
X-Patchwork-Submitter: Qu Wenruo
X-Patchwork-Id: 11807633
From: Qu Wenruo
To: linux-btrfs@vger.kernel.org
Subject: [PATCH v3 36/49] btrfs: extent_io: implement try_release_extent_buffer() for subpage metadata support
Date: Wed, 30 Sep 2020 09:55:26 +0800
Message-Id: <20200930015539.48867-37-wqu@suse.com>

For try_release_extent_buffer(), we just iterate through all the ranges with EXTENT_HAS_TREE_BLOCK set and try freeing each extent buffer.

Also introduce a helper, find_first_subpage_eb(), to locate the first eb in a range. This helper will also be utilized by later subpage patches.

Signed-off-by: Qu Wenruo
---
fs/btrfs/disk-io.c | 6 ++++ fs/btrfs/extent_io.c | 83 ++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 89 insertions(+)
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 89021e552da0..efbe12e4f952 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c
@@ -1047,6 +1047,12 @@ static int btree_writepages(struct address_space *mapping, static int btree_readpage(struct file *file, struct page *page) { + /* + * For subpage, we don't support VFS calling btree_readpage() + * directly.
+ */ + if (btrfs_is_subpage(page_to_fs_info(page))) + return -ENOTTY; return extent_read_full_page(page, btree_get_extent, 0); }
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 1423f69bc210..6aa25681aea4 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c
@@ -2743,6 +2743,48 @@ blk_status_t btrfs_submit_read_repair(struct inode *inode, return status; } +/* + * A helper to locate a subpage extent buffer. + * + * NOTE: the returned extent buffer won't have its ref increased. + * + * @extra_bits: Extra bits to match. + * The returned eb range will match all extra_bits. + * + * Return 0 if we found one extent buffer and recorded it in @eb_ret. + * Return 1 if there is no extent buffer in the range. + */ +static int find_first_subpage_eb(struct btrfs_fs_info *fs_info, + struct extent_buffer **eb_ret, u64 start, + u64 end, u32 extra_bits) +{ + struct extent_io_tree *io_tree = info_to_btree_io_tree(fs_info); + u64 found_start; + u64 found_end; + int ret; + + ASSERT(btrfs_is_subpage(fs_info) && eb_ret); + + ret = find_first_extent_bit(io_tree, start, &found_start, &found_end, + EXTENT_HAS_TREE_BLOCK | extra_bits, true, NULL); + if (ret > 0 || found_start > end) + return 1; + + /* found_start can be smaller than start */ + start = max(start, found_start); + + /* + * Here we can't call find_extent_buffer(), which would increase + * eb->refs.
+ */ + rcu_read_lock(); + *eb_ret = radix_tree_lookup(&fs_info->buffer_radix, + start / fs_info->sectorsize); + rcu_read_unlock(); + ASSERT(*eb_ret); + return 0; +}
/* lots and lots of room for performance fixes in the end_bio funcs */ void end_extent_writepage(struct page *page, int err, u64 start, u64 end)
@@ -6374,10 +6416,51 @@ void memmove_extent_buffer(const struct extent_buffer *dst, } } +static int try_release_subpage_eb(struct page *page) +{ + struct btrfs_fs_info *fs_info = page_to_fs_info(page); + struct extent_io_tree *io_tree = info_to_btree_io_tree(fs_info); + u64 cur = page_offset(page); + u64 end = page_offset(page) + PAGE_SIZE - 1; + int ret; + + while (cur <= end) { + struct extent_buffer *eb; + + ret = find_first_subpage_eb(fs_info, &eb, cur, end, 0); + if (ret > 0) + break; + + cur = eb->start + eb->len; + + spin_lock(&eb->refs_lock); + if (atomic_read(&eb->refs) != 1 || extent_buffer_under_io(eb) || + !test_and_clear_bit(EXTENT_BUFFER_TREE_REF, &eb->bflags)) { + spin_unlock(&eb->refs_lock); + continue; + } + /* + * Here we don't care about the return value, as we will always + * check the EXTENT_HAS_TREE_BLOCK bit at the end. + */ + release_extent_buffer(eb); + } + + /* Finally check if there is any EXTENT_HAS_TREE_BLOCK bit remaining */ + if (test_range_bit(io_tree, page_offset(page), end, + EXTENT_HAS_TREE_BLOCK, 0, NULL)) + ret = 0; + else + ret = 1; + return ret; +}
int try_release_extent_buffer(struct page *page) { struct extent_buffer *eb; + if (btrfs_is_subpage(page_to_fs_info(page))) + return try_release_subpage_eb(page); /* * We need to make sure nobody is attaching this page to an eb right * now.
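The loop pattern used by try_release_subpage_eb() above — ask the io tree for the first extent buffer at or after `cur`, advance `cur` past it, and release it only when nothing but the tree reference remains — can be modeled in plain userspace C. This is a sketch for illustration only: `model_eb`, `find_first_eb()` and `try_release_page()` are made-up stand-ins for the kernel's extent io tree, buffer radix tree and refcounting, not real btrfs APIs.

```c
#include <stddef.h>

/* Stand-in for a subpage extent buffer tracked in the btree io tree. */
struct model_eb {
	unsigned long long start;
	unsigned long long len;
	int refs;	/* 1 == only the tree reference is left */
};

/*
 * Model of find_first_subpage_eb(): return the first eb (array is sorted
 * by start) whose range intersects [start, end], or NULL if none does.
 */
static struct model_eb *find_first_eb(struct model_eb *ebs, size_t nr,
				      unsigned long long start,
				      unsigned long long end)
{
	for (size_t i = 0; i < nr; i++)
		if (ebs[i].start + ebs[i].len > start && ebs[i].start <= end)
			return &ebs[i];
	return NULL;
}

/*
 * Model of try_release_subpage_eb(): walk every eb inside one page and
 * release those whose only remaining reference is the tree's.
 * Returns the number of ebs released.
 */
static int try_release_page(struct model_eb *ebs, size_t nr,
			    unsigned long long page_start,
			    unsigned long long page_size)
{
	unsigned long long cur = page_start;
	unsigned long long end = page_start + page_size - 1;
	int released = 0;

	while (cur <= end) {
		struct model_eb *eb = find_first_eb(ebs, nr, cur, end);

		if (!eb)
			break;
		cur = eb->start + eb->len;	/* advance past this eb */
		if (eb->refs != 1)		/* still in use, skip it */
			continue;
		eb->refs = 0;			/* "release" the buffer */
		released++;
	}
	return released;
}
```

With a 64K page holding four 16K tree blocks where one is still referenced, the loop releases the other three and leaves the busy one alone — mirroring how the kernel version afterwards re-checks EXTENT_HAS_TREE_BLOCK to decide whether the page itself can be freed.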
From patchwork Wed Sep 30 01:55:27 2020
X-Patchwork-Submitter: Qu Wenruo
X-Patchwork-Id: 11807635
From: Qu Wenruo
To: linux-btrfs@vger.kernel.org
Subject: [PATCH v3 37/49] btrfs: set btree inode track_uptodate for subpage support
Date: Wed, 30 Sep 2020 09:55:27 +0800
Message-Id: <20200930015539.48867-38-wqu@suse.com>
Let the btree io tree track the EXTENT_UPTODATE bit, so that for subpage metadata IO we don't need to track the UPTODATE status manually through the bio submission/endio functions.

Currently only the subpage case will clean up the extra bits utilized (EXTENT_HAS_TREE_BLOCK, EXTENT_UPTODATE, EXTENT_LOCKED), while the regular page size case will only clean up EXTENT_LOCKED.

This still allows the regular page size case to avoid the extra delay in extent io tree operations, while allowing the subpage case to be sector size aligned.

Signed-off-by: Qu Wenruo
---
fs/btrfs/disk-io.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index efbe12e4f952..97c44f518a49 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c
@@ -2244,7 +2244,14 @@ static void btrfs_init_btree_inode(struct btrfs_fs_info *fs_info) RB_CLEAR_NODE(&BTRFS_I(inode)->rb_node); extent_io_tree_init(fs_info, &BTRFS_I(inode)->io_tree, IO_TREE_BTREE_INODE_IO, inode); - BTRFS_I(inode)->io_tree.track_uptodate = false; + /* + * For subpage size support, the btree inode tracks EXTENT_UPTODATE for + * its IO.
+ */ + if (btrfs_is_subpage(fs_info)) + BTRFS_I(inode)->io_tree.track_uptodate = true; + else + BTRFS_I(inode)->io_tree.track_uptodate = false; extent_map_tree_init(&BTRFS_I(inode)->extent_tree); BTRFS_I(inode)->io_tree.ops = &btree_extent_io_ops;

From patchwork Wed Sep 30 01:55:28 2020
X-Patchwork-Submitter: Qu Wenruo
X-Patchwork-Id: 11807637
From: Qu Wenruo
To: linux-btrfs@vger.kernel.org
Subject: [PATCH v3 38/49] btrfs: allow RO mount of 4K sector size fs on 64K page system
Date: Wed, 30 Sep 2020 09:55:28 +0800
Message-Id: <20200930015539.48867-39-wqu@suse.com>

This adds basic RO mount ability for 4K sector size on a 64K page size system. Currently we only plan to support 4K and 64K page sizes.

Signed-off-by: Qu Wenruo
---
fs/btrfs/disk-io.c | 24 +++++++++++++++++++++--- fs/btrfs/super.c | 7 +++++++ 2 files changed, 28 insertions(+), 3 deletions(-)
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 97c44f518a49..e0dc7b92411e 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c
@@ -2565,13 +2565,21 @@ static int validate_super(struct btrfs_fs_info *fs_info, btrfs_err(fs_info, "invalid sectorsize %llu", sectorsize); ret = -EINVAL; } - /* Only PAGE SIZE is supported yet */ - if (sectorsize != PAGE_SIZE) { + + /* + * For 4K page size, we only support 4K sector size. + * For 64K page size, we support RW for 64K sector size, and RO for + * 4K sector size.
+ */ + if ((PAGE_SIZE == SZ_4K && sectorsize != PAGE_SIZE) || + (PAGE_SIZE == SZ_64K && (sectorsize != SZ_4K && + sectorsize != SZ_64K))) { btrfs_err(fs_info, - "sectorsize %llu not supported yet, only support %lu", + "sectorsize %llu not supported yet for page size %lu", sectorsize, PAGE_SIZE); ret = -EINVAL; } + if (!is_power_of_2(nodesize) || nodesize < sectorsize || nodesize > BTRFS_MAX_METADATA_BLOCKSIZE) { btrfs_err(fs_info, "invalid nodesize %llu", nodesize);
@@ -3219,6 +3227,16 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device goto fail_alloc; } + /* For 4K sector size support, it's only read-only yet */ + if (PAGE_SIZE == SZ_64K && sectorsize == SZ_4K) { + if (!sb_rdonly(sb) || btrfs_super_log_root(disk_super)) { + btrfs_err(fs_info, + "subpage sector size only support RO yet"); + err = -EINVAL; + goto fail_alloc; + } + } + ret = btrfs_init_workqueues(fs_info, fs_devices); if (ret) { err = ret;
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c index 25967ecaaf0a..743a2fadf4ee 100644 --- a/fs/btrfs/super.c +++ b/fs/btrfs/super.c
@@ -1922,6 +1922,13 @@ static int btrfs_remount(struct super_block *sb, int *flags, char *data) ret = -EINVAL; goto restore; } + if (btrfs_is_subpage(fs_info)) { + btrfs_warn(fs_info, + "read-write mount is not yet allowed for sector size %u page size %lu", + fs_info->sectorsize, PAGE_SIZE); + ret = -EINVAL; + goto restore; + } ret = btrfs_cleanup_fs_roots(fs_info); if (ret)

From patchwork Wed Sep 30 01:55:29 2020
X-Patchwork-Submitter: Qu Wenruo
X-Patchwork-Id: 11807639
From: Qu Wenruo
To: linux-btrfs@vger.kernel.org
Subject: [PATCH v3 39/49] btrfs: disk-io: allow btree_set_page_dirty() to do more sanity check on subpage metadata
Date: Wed, 30 Sep 2020 09:55:29 +0800
Message-Id: <20200930015539.48867-40-wqu@suse.com>

For btree_set_page_dirty(), we should also check the extent buffer sanity for subpage support.

Unlike the regular sector size case, one page can contain multiple extent buffers, and page::private no longer contains the pointer to an extent buffer.
So this patch will iterate through the extent_io_tree to find any EXTENT_HAS_TREE_BLOCK bit, and check whether the extent buffers in the page range have EXTENT_BUFFER_DIRTY set and proper refs.

Also, since we need to find subpage extent buffers outside of extent_io.c, export find_first_subpage_eb() as btrfs_find_first_subpage_eb().

Signed-off-by: Qu Wenruo
---
fs/btrfs/disk-io.c | 36 ++++++++++++++++++++++++++++++------ fs/btrfs/extent_io.c | 8 ++++---- fs/btrfs/extent_io.h | 4 ++++ 3 files changed, 38 insertions(+), 10 deletions(-)
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index e0dc7b92411e..d31999978821 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c
@@ -1110,14 +1110,38 @@ static void btree_invalidatepage(struct page *page, unsigned int offset, static int btree_set_page_dirty(struct page *page) { #ifdef DEBUG + struct btrfs_fs_info *fs_info = page_to_fs_info(page); struct extent_buffer *eb; - BUG_ON(!PagePrivate(page)); - eb = (struct extent_buffer *)page->private; - BUG_ON(!eb); - BUG_ON(!test_bit(EXTENT_BUFFER_DIRTY, &eb->bflags)); - BUG_ON(!atomic_read(&eb->refs)); - btrfs_assert_tree_locked(eb); + if (fs_info->sectorsize == PAGE_SIZE) { + BUG_ON(!PagePrivate(page)); + eb = (struct extent_buffer *)page->private; + BUG_ON(!eb); + BUG_ON(!test_bit(EXTENT_BUFFER_DIRTY, &eb->bflags)); + BUG_ON(!atomic_read(&eb->refs)); + btrfs_assert_tree_locked(eb); + } else { + u64 page_start = page_offset(page); + u64 page_end = page_start + PAGE_SIZE - 1; + u64 cur = page_start; + bool found_dirty_eb = false; + int ret; + + ASSERT(btrfs_is_subpage(fs_info)); + while (cur <= page_end) { + ret = btrfs_find_first_subpage_eb(fs_info, &eb, cur, + page_end, 0); + if (ret > 0) + break; + cur = eb->start + eb->len; + if (test_bit(EXTENT_BUFFER_DIRTY, &eb->bflags)) { + found_dirty_eb = true; + ASSERT(atomic_read(&eb->refs)); + btrfs_assert_tree_locked(eb); + } + } + BUG_ON(!found_dirty_eb); + } #endif return __set_page_dirty_nobuffers(page); } diff --git a/fs/btrfs/extent_io.c
b/fs/btrfs/extent_io.c index 6aa25681aea4..5750a3b92777 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c
@@ -2754,9 +2754,9 @@ blk_status_t btrfs_submit_read_repair(struct inode *inode, * Return 0 if we found one extent buffer and record it in @eb_ret. * Return 1 if there is no extent buffer in the range. */ -static int find_first_subpage_eb(struct btrfs_fs_info *fs_info, - struct extent_buffer **eb_ret, u64 start, - u64 end, u32 extra_bits) +int btrfs_find_first_subpage_eb(struct btrfs_fs_info *fs_info, + struct extent_buffer **eb_ret, u64 start, + u64 end, u32 extra_bits) { struct extent_io_tree *io_tree = info_to_btree_io_tree(fs_info); u64 found_start;
@@ -6427,7 +6427,7 @@ static int try_release_subpage_eb(struct page *page) while (cur <= end) { struct extent_buffer *eb; - ret = find_first_subpage_eb(fs_info, &eb, cur, end, 0); + ret = btrfs_find_first_subpage_eb(fs_info, &eb, cur, end, 0); if (ret > 0) break;
diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h index 602d6568c8ea..f527b6fa258d 100644 --- a/fs/btrfs/extent_io.h +++ b/fs/btrfs/extent_io.h
@@ -298,6 +298,10 @@ struct bio *btrfs_bio_clone_partial(struct bio *orig, int offset, int size); struct btrfs_fs_info; struct btrfs_inode; +int btrfs_find_first_subpage_eb(struct btrfs_fs_info *fs_info, + struct extent_buffer **eb_ret, u64 start, + u64 end, unsigned int extra_bits); + int repair_io_failure(struct btrfs_fs_info *fs_info, u64 ino, u64 start, u64 length, u64 logical, struct page *page, unsigned int pg_offset, int mirror_num);

From patchwork Wed Sep 30 01:55:30 2020
X-Patchwork-Submitter: Qu Wenruo
X-Patchwork-Id: 11807641
From: Qu Wenruo
To: linux-btrfs@vger.kernel.org
Subject: [PATCH v3 40/49] btrfs: disk-io: support subpage metadata csum calculation at write time
Date: Wed, 30 Sep 2020 09:55:30 +0800
Message-Id: <20200930015539.48867-41-wqu@suse.com>

Add a new helper, csum_dirty_subpage_buffers(), to iterate through all possible extent buffers in one bvec.
Also extract the code to calculate csum for one extent buffer into csum_one_extent_buffer(), so that both the existing csum_dirty_buffer() and the new csum_dirty_subpage_buffers() can reuse the same routine. Signed-off-by: Qu Wenruo --- fs/btrfs/disk-io.c | 103 ++++++++++++++++++++++++++++++++++----------- 1 file changed, 79 insertions(+), 24 deletions(-) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index d31999978821..9aa68e2344e1 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -490,35 +490,13 @@ static int btree_read_extent_buffer_pages(struct extent_buffer *eb, return ret; } -/* - * checksum a dirty tree block before IO. This has extra checks to make sure - * we only fill in the checksum field in the first page of a multi-page block - */ - -static int csum_dirty_buffer(struct btrfs_fs_info *fs_info, struct bio_vec *bvec) +static int csum_one_extent_buffer(struct extent_buffer *eb) { - struct extent_buffer *eb; - struct page *page = bvec->bv_page; - u64 start = page_offset(page); - u64 found_start; + struct btrfs_fs_info *fs_info = eb->fs_info; u8 result[BTRFS_CSUM_SIZE]; u16 csum_size = btrfs_super_csum_size(fs_info->super_copy); int ret; - eb = (struct extent_buffer *)page->private; - if (page != eb->pages[0]) - return 0; - - found_start = btrfs_header_bytenr(eb); - /* - * Please do not consolidate these warnings into a single if. - * It is useful to know what went wrong. - */ - if (WARN_ON(found_start != start)) - return -EUCLEAN; - if (WARN_ON(!PageUptodate(page))) - return -EUCLEAN; - ASSERT(memcmp_extent_buffer(eb, fs_info->fs_devices->metadata_uuid, offsetof(struct btrfs_header, fsid), BTRFS_FSID_SIZE) == 0); @@ -543,6 +521,83 @@ static int csum_dirty_buffer(struct btrfs_fs_info *fs_info, struct bio_vec *bvec return 0; } +/* + * Do all the csum calculation and extra sanity checks on all extent + * buffers in the bvec. 
+ */ +static int csum_dirty_subpage_buffers(struct btrfs_fs_info *fs_info, + struct bio_vec *bvec) +{ + struct page *page = bvec->bv_page; + u64 page_start = page_offset(page); + u64 start = page_start + bvec->bv_offset; + u64 end = start + bvec->bv_len - 1; + u64 cur = start; + int ret = 0; + + while (cur <= end) { + struct extent_io_tree *io_tree = info_to_btree_io_tree(fs_info); + struct extent_buffer *eb; + + ret = btrfs_find_first_subpage_eb(fs_info, &eb, cur, end, 0); + if (ret > 0) { + ret = 0; + break; + } + + /* + * Here we can't use PageUptodate() to check the status, as one + * page is uptodate only when all its extent buffers are + * uptodate, with no holes between them. So here we use the + * EXTENT_UPTODATE bit to make sure the extent buffer is + * uptodate. + */ + if (WARN_ON(test_range_bit(io_tree, eb->start, + eb->start + eb->len - 1, EXTENT_UPTODATE, 1, + NULL) == 0)) + return -EUCLEAN; + if (WARN_ON(cur != btrfs_header_bytenr(eb))) + return -EUCLEAN; + + ret = csum_one_extent_buffer(eb); + if (ret < 0) + return ret; + cur = eb->start + eb->len; + } + return ret; +} + +/* + * checksum a dirty tree block before IO. This has extra checks to make sure + * we only fill in the checksum field in the first page of a multi-page block + */ +static int csum_dirty_buffer(struct btrfs_fs_info *fs_info, struct bio_vec *bvec) +{ + struct extent_buffer *eb; + struct page *page = bvec->bv_page; + u64 start = page_offset(page) + bvec->bv_offset; + u64 found_start; + + if (btrfs_is_subpage(fs_info)) + return csum_dirty_subpage_buffers(fs_info, bvec); + + eb = (struct extent_buffer *)page->private; + if (page != eb->pages[0]) + return 0; + + found_start = btrfs_header_bytenr(eb); + /* + * Please do not consolidate these warnings into a single if. + * It is useful to know what went wrong.
+ */ + if (WARN_ON(found_start != start)) + return -EUCLEAN; + if (WARN_ON(!PageUptodate(page))) + return -EUCLEAN; + + return csum_one_extent_buffer(eb); +} + static int check_tree_block_fsid(struct extent_buffer *eb) { struct btrfs_fs_info *fs_info = eb->fs_info; From patchwork Wed Sep 30 01:55:31 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11807643 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4EE4D618 for ; Wed, 30 Sep 2020 01:57:12 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 2FF6C2145D for ; Wed, 30 Sep 2020 01:57:12 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="O59lWBor" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729969AbgI3B5L (ORCPT ); Tue, 29 Sep 2020 21:57:11 -0400 Received: from mx2.suse.de ([195.135.220.15]:50998 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729777AbgI3B5K (ORCPT ); Tue, 29 Sep 2020 21:57:10 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1601431029; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=1SB1pqWVr7NbERMw+v2ieT057OZaaR08zrX57by8fIA=; b=O59lWBorFAMsXV3dXhSJUjhD1x/RxgtKLXP4oBJGqltxtm7tBEUgVHfOzCJ5PZNlsCRSQ3 oq0N1VMkrHv5iL2g6UgEEliV6xMsXVSJhXtSKFfCKVTIdV8fO+ec61tePrX5mKXmNylS3A 0+cy7L8WD5KXDdhwZs7YinKfYx5k9xc= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 181CDAE07 for ; Wed, 30 Sep 2020 01:57:09 +0000 
(UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v3 41/49] btrfs: extent_io: prevent extent_state from being merged for btree io tree Date: Wed, 30 Sep 2020 09:55:31 +0800 Message-Id: <20200930015539.48867-42-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20200930015539.48867-1-wqu@suse.com> References: <20200930015539.48867-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org For incoming subpage metadata rw support, prevent extent_state from being merged for the btree io tree. The main cause is set_extent_buffer_dirty(). In the following call chain, we can end up calling set_extent_dirty() in atomic context: alloc_reserved_tree_block() |- path->leave_spinning = 1; |- btrfs_insert_empty_item() |- btrfs_search_slot() | Now the path has all its tree blocks spinning locked |- setup_items_for_insert(); |- btrfs_unlock_up_safe(path, 1); | Now path->nodes[0] is still spin locked |- btrfs_mark_buffer_dirty(leaf); |- set_extent_buffer_dirty() Since set_extent_buffer_dirty() is in fact a pretty common call, simply falling back to the GFP_ATOMIC allocation used in __set_extent_bit() may exhaust the pool sooner than expected. So this patch goes in another direction, by not merging any extent_state for the subpage btree io tree. Since for the subpage btree io tree all in-tree extent buffers have the EXTENT_HAS_TREE_BLOCK bit set during their lifespan, as long as extent_state is never merged, each extent buffer keeps its own extent_state, so that set/clear_extent_bit() can reuse the existing extent_state without allocating new memory. The cost is obvious, around 150 bytes per subpage extent buffer. But considering that a subpage extent buffer saves 15 page pointers (120 bytes), the net cost is just about 30 bytes per subpage extent buffer, which should be acceptable.
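The reuse argument above can be illustrated with a toy model (this is NOT the kernel implementation; all names such as io_tree, set_bits and the slot array are invented for the sketch): when never_merge is set, adjacent states with identical bits stay separate, so each extent buffer range keeps its own state and setting another bit on that range reuses the existing state instead of allocating.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_STATES 8

/* One state covers [start, end] with a bit mask, like a simplified extent_state. */
struct state { unsigned long start, end; unsigned bits; bool used; };
struct io_tree { struct state states[MAX_STATES]; bool never_merge; int allocs; };

/* Find the state covering exactly [start, end], if any. */
static struct state *find_state(struct io_tree *t, unsigned long start,
				unsigned long end)
{
	for (int i = 0; i < MAX_STATES; i++)
		if (t->states[i].used && t->states[i].start == start &&
		    t->states[i].end == end)
			return &t->states[i];
	return NULL;
}

/* Merge adjacent states with identical bits; skipped when never_merge. */
static void merge_states(struct io_tree *t)
{
	if (t->never_merge)	/* subpage btree io tree: keep one state per eb */
		return;
	for (int i = 0; i < MAX_STATES; i++)
		for (int j = 0; j < MAX_STATES; j++) {
			if (i == j || !t->states[i].used || !t->states[j].used)
				continue;
			if (t->states[i].end + 1 == t->states[j].start &&
			    t->states[i].bits == t->states[j].bits) {
				t->states[i].end = t->states[j].end;
				t->states[j].used = false;
			}
		}
}

/* Set bits on [start, end]; a new state is "allocated" only if none exists. */
static void set_bits(struct io_tree *t, unsigned long start, unsigned long end,
		     unsigned bits)
{
	struct state *s = find_state(t, start, end);

	if (!s) {	/* this models the GFP_ATOMIC allocation we want to avoid */
		for (int i = 0; i < MAX_STATES && !s; i++)
			if (!t->states[i].used)
				s = &t->states[i];
		assert(s);	/* demo only: no out-of-slots handling */
		s->used = true;
		s->start = start;
		s->end = end;
		s->bits = 0;
		t->allocs++;
	}
	s->bits |= bits;
	merge_states(t);
}

static int count_states(struct io_tree *t)
{
	int n = 0;

	for (int i = 0; i < MAX_STATES; i++)
		n += t->states[i].used;
	return n;
}
```

With never_merge, two adjacent 16K extent buffers that both carry the same bit keep two states, and dirtying one of them later costs no allocation; a merging tree would collapse them into one state that later set/clear operations must split again.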
Signed-off-by: Qu Wenruo --- fs/btrfs/disk-io.c | 14 ++++++++++++-- fs/btrfs/extent-io-tree.h | 14 ++++++++++++++ fs/btrfs/extent_io.c | 19 ++++++++++++++----- 3 files changed, 40 insertions(+), 7 deletions(-) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 9aa68e2344e1..e466c30b52c8 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -2326,11 +2326,21 @@ static void btrfs_init_btree_inode(struct btrfs_fs_info *fs_info) /* * For subpage size support, btree inode tracks EXTENT_UPTODATE for * its IO. + * + * Also never merge extent states, so that set/clear operations never + * need to allocate memory, except for the initial EXTENT_HAS_TREE_BLOCK + * bit. This adds an extra ~150 bytes for each extent buffer. + * + * TODO: Josef's rwsem rework on tree lock would kill the leave_spinning + * case, and then we can revert this behavior. */ - if (btrfs_is_subpage(fs_info)) + if (btrfs_is_subpage(fs_info)) { BTRFS_I(inode)->io_tree.track_uptodate = true; - else + BTRFS_I(inode)->io_tree.never_merge = true; + } else { BTRFS_I(inode)->io_tree.track_uptodate = false; + BTRFS_I(inode)->io_tree.never_merge = false; + } extent_map_tree_init(&BTRFS_I(inode)->extent_tree); BTRFS_I(inode)->io_tree.ops = &btree_extent_io_ops; diff --git a/fs/btrfs/extent-io-tree.h b/fs/btrfs/extent-io-tree.h index c4e73c84ba34..5c0a66146f05 100644 --- a/fs/btrfs/extent-io-tree.h +++ b/fs/btrfs/extent-io-tree.h @@ -62,6 +62,20 @@ struct extent_io_tree { u64 dirty_bytes; bool track_uptodate; + /* + * Never merge extent_state. + * + * This allows any set/clear function to be executed in atomic context + * without allocating extra memory. + * The cost is extra memory usage. + * + * Should only be used for the subpage btree io tree, where it mostly + * adds per extent buffer memory usage. + * + * Default: false.
+ */ + bool never_merge; + /* Who owns this io tree, should be one of IO_TREE_* */ u8 owner; diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 5750a3b92777..d9a05979396d 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -285,6 +285,7 @@ void extent_io_tree_init(struct btrfs_fs_info *fs_info, spin_lock_init(&tree->lock); tree->private_data = private_data; tree->owner = owner; + tree->never_merge = false; if (owner == IO_TREE_INODE_FILE_EXTENT) lockdep_set_class(&tree->lock, &file_extent_tree_class); } @@ -480,11 +481,18 @@ static inline struct rb_node *tree_search(struct extent_io_tree *tree, } /* - * utility function to look for merge candidates inside a given range. + * Utility function to look for merge candidates inside a given range. * Any extents with matching state are merged together into a single - * extent in the tree. Extents with EXTENT_IO in their state field - * are not merged because the end_io handlers need to be able to do - * operations on them without sleeping (or doing allocations/splits). + * extent in the tree. + * + * Except the following cases: + * - extent_state with EXTENT_LOCKED or EXTENT_BOUNDARY bit set + * Those extents are not merged because end_io handlers need to be able + * to do operations on them without sleeping (or doing allocations/splits) + * + * - extent_io_tree with never_merge bit set + * Same reason as above, but extra call sites may have a spinlock/rwlock + * held, and we don't want to abuse GFP_ATOMIC. * * This should be called with the tree lock held.
*/ @@ -494,7 +502,8 @@ static void merge_state(struct extent_io_tree *tree, struct extent_state *other; struct rb_node *other_node; - if (state->state & (EXTENT_LOCKED | EXTENT_BOUNDARY)) + if (state->state & (EXTENT_LOCKED | EXTENT_BOUNDARY) || + tree->never_merge) return; other_node = rb_prev(&state->rb_node); From patchwork Wed Sep 30 01:55:32 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11807645 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2C3356CB for ; Wed, 30 Sep 2020 01:57:14 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 0E2D221531 for ; Wed, 30 Sep 2020 01:57:14 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="i2aF//8O" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729974AbgI3B5N (ORCPT ); Tue, 29 Sep 2020 21:57:13 -0400 Received: from mx2.suse.de ([195.135.220.15]:51032 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729777AbgI3B5M (ORCPT ); Tue, 29 Sep 2020 21:57:12 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1601431030; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ZfWv7koY0dyceo8JE3xCSKQAF7wUxt/o89+5X+bjTrg=; b=i2aF//8OFdKjoYSQbifEfyw0g9VaZlxJr2uTThJ3NeTRuTiW6Cpnffs4y5/07t5BKW/LWf wKrsx2XKf0sq9SVI/UYuMahCD/MZ2IdXn1stHVebGPP87Nrxm9pGkwdapkVMWst8aLM9qO YQfYwUMSNV29Y+OqL3O/8epAkxNJmU8= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 
From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v3 42/49] btrfs: extent_io: make set_extent_buffer_dirty() to support subpage sized metadata Date: Wed, 30 Sep 2020 09:55:32 +0800 Message-Id: <20200930015539.48867-43-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20200930015539.48867-1-wqu@suse.com> References: <20200930015539.48867-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org For set_extent_buffer_dirty() to support subpage sized metadata, we only need to call set_extent_dirty(). As any dirty extent buffer in the page makes the whole page dirty, we can reuse the existing routine without problem, and only need to add the set_extent_dirty() call to set_extent_buffer_dirty(). Now, since a page is dirty if any extent buffer in it is dirty, the WARN_ON(PageDirty()) in alloc_extent_buffer() can be falsely triggered, so move that check into a new helper, assert_eb_range_not_dirty(), to support the subpage case.
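The rule the patch relies on — a page is dirty as long as any extent buffer inside it is dirty — can be sketched with a minimal model (illustrative names only, not kernel code; here four 16K extent buffers share one 64K page):

```c
#include <assert.h>
#include <stdbool.h>

#define EBS_PER_PAGE 4	/* e.g. 16K nodesize on a 64K page */

struct subpage_page {
	unsigned dirty_ebs;	/* one bit per extent buffer in the page */
	bool page_dirty;	/* what PageDirty() would report */
};

static void set_eb_dirty(struct subpage_page *p, int eb)
{
	p->dirty_ebs |= 1u << eb;
	p->page_dirty = true;	/* any dirty eb dirties the whole page */
}

static void clear_eb_dirty(struct subpage_page *p, int eb)
{
	p->dirty_ebs &= ~(1u << eb);
	if (p->dirty_ebs == 0)	/* only the last dirty eb clears the page */
		p->page_dirty = false;
}
```

This also shows why the bare WARN_ON(PageDirty()) could trigger falsely: a freshly allocated, clean extent buffer can live in a page that a sibling extent buffer already made dirty, which is exactly what the range-based check in assert_eb_range_not_dirty() accounts for.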
Signed-off-by: Qu Wenruo --- fs/btrfs/extent_io.c | 35 ++++++++++++++++++++++++++++++++++- 1 file changed, 34 insertions(+), 1 deletion(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index d9a05979396d..ae7ab7364115 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -5354,6 +5354,22 @@ struct extent_buffer *alloc_test_extent_buffer(struct btrfs_fs_info *fs_info, } #endif +static void assert_eb_range_not_dirty(struct extent_buffer *eb, + struct page *page) +{ + struct btrfs_fs_info *fs_info = eb->fs_info; + + if (btrfs_is_subpage(fs_info) && page->mapping) { + struct extent_io_tree *io_tree = info_to_btree_io_tree(fs_info); + + WARN_ON(test_range_bit(io_tree, eb->start, + eb->start + eb->len - 1, EXTENT_DIRTY, 0, + NULL)); + } else { + WARN_ON(PageDirty(page)); + } +} + struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info, u64 start) { @@ -5426,12 +5442,13 @@ struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info, * drop the ref the old guy had. */ ClearPagePrivate(p); + assert_eb_range_not_dirty(eb, p); WARN_ON(PageDirty(p)); put_page(p); } attach_extent_buffer_page(eb, p); spin_unlock(&mapping->private_lock); - WARN_ON(PageDirty(p)); + assert_eb_range_not_dirty(eb, p); eb->pages[i] = p; if (!PageUptodate(p)) uptodate = 0; @@ -5651,6 +5668,22 @@ bool set_extent_buffer_dirty(struct extent_buffer *eb) for (i = 0; i < num_pages; i++) set_page_dirty(eb->pages[i]); + /* + * For subpage size, also set the sector aligned EXTENT_DIRTY range for + * btree io tree + */ + if (btrfs_is_subpage(eb->fs_info)) { + struct extent_io_tree *io_tree = + info_to_btree_io_tree(eb->fs_info); + + /* + * set_extent_buffer_dirty() can be called with + * path->leave_spinning == 1, in that case we can't sleep. 
+ */ + set_extent_dirty(io_tree, eb->start, eb->start + eb->len - 1, + GFP_ATOMIC); + } + #ifdef CONFIG_BTRFS_DEBUG for (i = 0; i < num_pages; i++) ASSERT(PageDirty(eb->pages[i])); From patchwork Wed Sep 30 01:55:33 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11807647 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2009A618 for ; Wed, 30 Sep 2020 01:57:16 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 0017D21531 for ; Wed, 30 Sep 2020 01:57:15 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="QsPJ4M/7" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729977AbgI3B5P (ORCPT ); Tue, 29 Sep 2020 21:57:15 -0400 Received: from mx2.suse.de ([195.135.220.15]:51056 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729777AbgI3B5O (ORCPT ); Tue, 29 Sep 2020 21:57:14 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1601431033; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=OCsQBMcW4SJhU+/UsLUy+uWecoTQnfXw0RIjNfGJC0A=; b=QsPJ4M/7bf3oNZXXiNgoNDA9cL3qYs7P6kpM8eyCO1nQv9u0YBow0lwK3IT10quuFLAbu5 klFX+mkssJapfO/gsqGFKNU/Z1FE+pRlL+bhAJSI2sls6Mzllm+MCVNJeuI2sEXJ9JIvl3 blwBrvoBVfB5mpNoGifKeaH1uD2m0Co= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 046C2AE07 for ; Wed, 30 Sep 2020 01:57:13 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v3 43/49] btrfs: 
extent_io: add subpage support for clear_extent_buffer_dirty() Date: Wed, 30 Sep 2020 09:55:33 +0800 Message-Id: <20200930015539.48867-44-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20200930015539.48867-1-wqu@suse.com> References: <20200930015539.48867-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org To support subpage metadata, clear_extent_buffer_dirty() needs to clear the page dirty bit if and only if all extent buffers in the page range are no longer dirty. This is pretty different from the existing clear_extent_buffer_dirty() routine, so add a new helper function, clear_subpage_extent_buffer_dirty(), to do this for subpage metadata. Also, since the main part of the page dirty clearing code is still the same, extract it into btree_clear_page_dirty() so that it can be used for both cases. Signed-off-by: Qu Wenruo --- fs/btrfs/extent_io.c | 47 +++++++++++++++++++++++++++++++++----------- 1 file changed, 35 insertions(+), 12 deletions(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index ae7ab7364115..07dec345f662 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -5622,30 +5622,53 @@ void free_extent_buffer_stale(struct extent_buffer *eb) release_extent_buffer(eb); } +static void btree_clear_page_dirty(struct page *page) +{ + ASSERT(PageDirty(page)); + + lock_page(page); + clear_page_dirty_for_io(page); + xa_lock_irq(&page->mapping->i_pages); + if (!PageDirty(page)) + __xa_clear_mark(&page->mapping->i_pages, + page_index(page), PAGECACHE_TAG_DIRTY); + xa_unlock_irq(&page->mapping->i_pages); + ClearPageError(page); + unlock_page(page); +} + +static void clear_subpage_extent_buffer_dirty(const struct extent_buffer *eb) +{ + struct btrfs_fs_info *fs_info = eb->fs_info; + struct extent_io_tree *io_tree = info_to_btree_io_tree(fs_info); + struct page *page = eb->pages[0]; + u64 page_start = page_offset(page); + u64 page_end = page_start + PAGE_SIZE - 1; + int ret; + +
clear_extent_dirty(io_tree, eb->start, eb->start + eb->len - 1, NULL); + ret = test_range_bit(io_tree, page_start, page_end, EXTENT_DIRTY, 0, NULL); + /* All extent buffers in the page range is cleared now */ + if (ret == 0 && PageDirty(page)) + btree_clear_page_dirty(page); + WARN_ON(atomic_read(&eb->refs) == 0); +} + void clear_extent_buffer_dirty(const struct extent_buffer *eb) { int i; int num_pages; struct page *page; + if (btrfs_is_subpage(eb->fs_info)) + return clear_subpage_extent_buffer_dirty(eb); num_pages = num_extent_pages(eb); for (i = 0; i < num_pages; i++) { page = eb->pages[i]; if (!PageDirty(page)) continue; - - lock_page(page); - WARN_ON(!PagePrivate(page)); - - clear_page_dirty_for_io(page); - xa_lock_irq(&page->mapping->i_pages); - if (!PageDirty(page)) - __xa_clear_mark(&page->mapping->i_pages, - page_index(page), PAGECACHE_TAG_DIRTY); - xa_unlock_irq(&page->mapping->i_pages); - ClearPageError(page); - unlock_page(page); + btree_clear_page_dirty(page); } WARN_ON(atomic_read(&eb->refs) == 0); } From patchwork Wed Sep 30 01:55:34 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11807649 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 70F0B6CB for ; Wed, 30 Sep 2020 01:57:19 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 5532821531 for ; Wed, 30 Sep 2020 01:57:19 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="A5IGfnoG" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729984AbgI3B5S (ORCPT ); Tue, 29 Sep 2020 21:57:18 -0400 Received: from mx2.suse.de ([195.135.220.15]:51092 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id 
From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v3 44/49] btrfs: extent_io: make set_btree_ioerr() accept extent buffer Date: Wed, 30 Sep 2020 09:55:34 +0800 Message-Id: <20200930015539.48867-45-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20200930015539.48867-1-wqu@suse.com> References: <20200930015539.48867-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org The current set_btree_ioerr() only accepts a @page parameter and grabs the extent buffer from page::private. This works fine for the sector size == PAGE_SIZE case, but not for the subpage case. Add an extra parameter, @eb, for callers to pass the extent buffer to this function, so that subpage code can reuse it. Also, since we are here, access "fs_info->flags" through the local fs_info variable directly.
Signed-off-by: Qu Wenruo --- fs/btrfs/extent_io.c | 16 +++++++--------- 1 file changed, 7 insertions(+), 9 deletions(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 07dec345f662..f80ba4c13fe6 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -3907,10 +3907,9 @@ static noinline_for_stack int lock_extent_buffer_for_io(struct extent_buffer *eb return ret; } -static void set_btree_ioerr(struct page *page) +static void set_btree_ioerr(struct page *page, struct extent_buffer *eb) { - struct extent_buffer *eb = (struct extent_buffer *)page->private; - struct btrfs_fs_info *fs_info; + struct btrfs_fs_info *fs_info = eb->fs_info; SetPageError(page); if (test_and_set_bit(EXTENT_BUFFER_WRITE_ERR, &eb->bflags)) @@ -3920,7 +3919,6 @@ static void set_btree_ioerr(struct page *page) * If we error out, we should add back the dirty_metadata_bytes * to make it consistent. */ - fs_info = eb->fs_info; percpu_counter_add_batch(&fs_info->dirty_metadata_bytes, eb->len, fs_info->dirty_metadata_batch); @@ -3964,13 +3962,13 @@ static void set_btree_ioerr(struct page *page) */ switch (eb->log_index) { case -1: - set_bit(BTRFS_FS_BTREE_ERR, &eb->fs_info->flags); + set_bit(BTRFS_FS_BTREE_ERR, &fs_info->flags); break; case 0: - set_bit(BTRFS_FS_LOG1_ERR, &eb->fs_info->flags); + set_bit(BTRFS_FS_LOG1_ERR, &fs_info->flags); break; case 1: - set_bit(BTRFS_FS_LOG2_ERR, &eb->fs_info->flags); + set_bit(BTRFS_FS_LOG2_ERR, &fs_info->flags); break; default: BUG(); /* unexpected, logic error */ @@ -3995,7 +3993,7 @@ static void end_bio_extent_buffer_writepage(struct bio *bio) if (bio->bi_status || test_bit(EXTENT_BUFFER_WRITE_ERR, &eb->bflags)) { ClearPageUptodate(page); - set_btree_ioerr(page); + set_btree_ioerr(page, eb); } end_page_writeback(page); @@ -4051,7 +4049,7 @@ static noinline_for_stack int write_one_eb(struct extent_buffer *eb, end_bio_extent_buffer_writepage, 0, 0, 0, false); if (ret) { - set_btree_ioerr(p); + set_btree_ioerr(p, eb); if (PageWriteback(p)) 
end_page_writeback(p); if (atomic_sub_and_test(num_pages - i, &eb->io_pages)) From patchwork Wed Sep 30 01:55:35 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11807651 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0F78A6CB for ; Wed, 30 Sep 2020 01:57:21 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id DFCD22145D for ; Wed, 30 Sep 2020 01:57:20 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="drOCzSIJ" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729992AbgI3B5U (ORCPT ); Tue, 29 Sep 2020 21:57:20 -0400 Received: from mx2.suse.de ([195.135.220.15]:51114 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729982AbgI3B5T (ORCPT ); Tue, 29 Sep 2020 21:57:19 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=susede1; t=1601431037; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=NLPSJIfF4OGBpgVkLYPrraboZachCT7dFv+9cncXCJo=; b=drOCzSIJYm7Qw742VjC/+AcmA0RyDh7Ilgu32rXkvClhHFEkrmNGiUJdPWIZ2uAnDVd3S0 xune9dki6sFMqcvsRvScJZKKuY20KvKzg/ncWb8CdWcAxW4LMU4o4I1IynUqIw98b5GkOp ODgFPiRZk+iqtYyG1ZK+cNBIqxlQITM= Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id D3230AF99 for ; Wed, 30 Sep 2020 01:57:17 +0000 (UTC) From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v3 45/49] btrfs: extent_io: introduce write_one_subpage_eb() function Date: Wed, 30 Sep 2020 09:55:35 +0800 Message-Id: 
<20200930015539.48867-46-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20200930015539.48867-1-wqu@suse.com> References: <20200930015539.48867-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org The new function, write_one_subpage_eb(), as a subroutine for subpage metadata write, will handle the extent buffer bio submission. The main differences between the new write_one_subpage_eb() and write_one_eb() are: - Page unlock write_one_subpage_eb() will not unlock the page; it's up to the caller to lock the page, submit all extent buffers in the page, then unlock the page. - Extra EXTENT_* bits along with page status update A new EXTENT_WRITEBACK bit is introduced to trace extent buffer writeback. The page dirty bit will only be cleared if all dirty extent buffers in the page range have been cleaned. The page writeback bit will be set anyway, and cleared in the error path if no other extent buffers are under writeback. Signed-off-by: Qu Wenruo --- fs/btrfs/extent-io-tree.h | 3 ++ fs/btrfs/extent_io.c | 75 +++++++++++++++++++++++++++++++++++++++ 2 files changed, 78 insertions(+) diff --git a/fs/btrfs/extent-io-tree.h b/fs/btrfs/extent-io-tree.h index 5c0a66146f05..12673bd50378 100644 --- a/fs/btrfs/extent-io-tree.h +++ b/fs/btrfs/extent-io-tree.h @@ -26,6 +26,9 @@ struct io_failure_record; /* For subpage btree io tree, indicates there is an in-tree extent buffer */ #define EXTENT_HAS_TREE_BLOCK (1U << 15) +/* For subpage btree io tree, indicates the range is under writeback */ +#define EXTENT_WRITEBACK (1U << 16) + #define EXTENT_DO_ACCOUNTING (EXTENT_CLEAR_META_RESV | \ EXTENT_CLEAR_DATA_RESV) #define EXTENT_CTLBITS (EXTENT_DO_ACCOUNTING) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index f80ba4c13fe6..736bc33a0e64 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -3124,6 +3124,7 @@ static int submit_extent_page(unsigned int opf, ASSERT(bio_ret); if (*bio_ret) { + bool
force_merge = false; bool contig; bool can_merge = true; @@ -3149,6 +3150,7 @@ static int submit_extent_page(unsigned int opf, if (prev_bio_flags != bio_flags || !contig || !can_merge || force_bio_submit || bio_add_page(bio, page, io_size, pg_offset) < io_size) { + ASSERT(!force_merge); ret = submit_one_bio(bio, mirror_num, prev_bio_flags); if (ret < 0) { *bio_ret = NULL; @@ -4007,6 +4009,76 @@ static void end_bio_extent_buffer_writepage(struct bio *bio) bio_put(bio); } +/* + * Unlike write_one_eb(), we won't unlock the page even if we succeeded in + * submitting the extent buffer. + * It's the caller's responsibility to unlock the page after all extent + * buffers in it are submitted. + * + * Callers should still call write_one_eb() rather than this function + * directly, as write_one_eb() has extra preparation before submitting the + * extent buffer. + */ +static int write_one_subpage_eb(struct extent_buffer *eb, + struct writeback_control *wbc, + struct extent_page_data *epd) +{ + struct btrfs_fs_info *fs_info = eb->fs_info; + struct extent_state *cached = NULL; + struct extent_io_tree *io_tree = info_to_btree_io_tree(fs_info); + struct page *page = eb->pages[0]; + u64 page_start = page_offset(page); + u64 page_end = page_start + PAGE_SIZE - 1; + unsigned int write_flags = wbc_to_write_flags(wbc) | REQ_META; + bool no_dirty_ebs = false; + int ret; + + ASSERT(PageLocked(page)); + + /* Convert the EXTENT_DIRTY to EXTENT_WRITEBACK for this eb */ + ret = convert_extent_bit(io_tree, eb->start, eb->start + eb->len - 1, + EXTENT_WRITEBACK, EXTENT_DIRTY, &cached); + if (ret < 0) + return ret; + /* + * Only clear page dirty if there is no dirty extent buffer in the + * page range + */ + if (!test_range_bit(io_tree, page_start, page_end, EXTENT_DIRTY, 0, + cached)) { + clear_page_dirty_for_io(page); + no_dirty_ebs = true; + } + /* Any extent buffer writeback will mark the full page writeback */ + set_page_writeback(page); + + ret = submit_extent_page(REQ_OP_WRITE | write_flags, wbc, page, + eb->start, eb->len,
eb->start - page_offset(page), + &epd->bio, end_bio_extent_buffer_writepage, 0, 0, 0, + false); + if (ret) { + clear_extent_bit(io_tree, eb->start, eb->start + eb->len - 1, + EXTENT_WRITEBACK, 0, 0, &cached); + set_btree_ioerr(page, eb); + if (PageWriteback(page) && + !test_range_bit(io_tree, page_start, page_end, + EXTENT_WRITEBACK, 0, cached)) + end_page_writeback(page); + + if (atomic_dec_and_test(&eb->io_pages)) + end_extent_buffer_writeback(eb); + free_extent_state(cached); + return -EIO; + } + free_extent_state(cached); + /* + * Submission finishes without problem, if no eb is dirty anymore, we + * have submitted a page. + * Update the nr_written in wbc. + */ + if (no_dirty_ebs) + update_nr_written(wbc, 1); + return ret; +} + static noinline_for_stack int write_one_eb(struct extent_buffer *eb, struct writeback_control *wbc, struct extent_page_data *epd) @@ -4038,6 +4110,9 @@ static noinline_for_stack int write_one_eb(struct extent_buffer *eb, memzero_extent_buffer(eb, start, end - start); } + if (btrfs_is_subpage(eb->fs_info)) + return write_one_subpage_eb(eb, wbc, epd); + for (i = 0; i < num_pages; i++) { struct page *p = eb->pages[i]; From patchwork Wed Sep 30 01:55:36 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qu Wenruo X-Patchwork-Id: 11807653 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id CFCDA618 for ; Wed, 30 Sep 2020 01:57:22 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id B0C1F21531 for ; Wed, 30 Sep 2020 01:57:22 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=suse.com header.i=@suse.com header.b="JSYyByXh" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729995AbgI3B5V (ORCPT ); Tue, 29 Sep 2020 21:57:21 -0400 Received: 
From: Qu Wenruo To: linux-btrfs@vger.kernel.org Subject: [PATCH v3 46/49] btrfs: extent_io: make lock_extent_buffer_for_io() subpage compatible Date: Wed, 30 Sep 2020 09:55:36 +0800 Message-Id: <20200930015539.48867-47-wqu@suse.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20200930015539.48867-1-wqu@suse.com> References: <20200930015539.48867-1-wqu@suse.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org To support subpage metadata locking, the following aspects are modified: - Locking sequence For regular sectorsize, we lock the extent buffer first, then lock each page. For subpage sectorsize, we can't do that anymore, but let the caller lock the whole page first, then lock each extent buffer in the page. - Extent io tree locking For subpage metadata, we also lock the range in the btree io tree. This allows the endio function to get an unmerged extent_state, so that the endio function doesn't need to allocate memory in atomic context. This also follows the behavior of the metadata read path.
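The subpage locking sequence described above can be sketched as a toy event log (illustrative only; the real code uses page locks, btrfs_tree_lock() and lock_extent(), not these helpers): the caller takes the page lock once, then locks and releases each extent buffer range inside it, and only unlocks the page after every extent buffer in it has been handled.

```c
#include <assert.h>

/* Hypothetical event markers for the locking order, not kernel symbols. */
enum lock_ev { PAGE_LOCK, EB_LOCK, EB_UNLOCK, PAGE_UNLOCK };

/* Record the subpage write locking order for nr_ebs extent buffers
 * sharing one page; returns the number of events logged. */
static int log_subpage_write(enum lock_ev *log, int nr_ebs)
{
	int n = 0;

	log[n++] = PAGE_LOCK;		/* caller locks the whole page first */
	for (int i = 0; i < nr_ebs; i++) {
		log[n++] = EB_LOCK;	/* lock_extent() on this eb's range */
		log[n++] = EB_UNLOCK;	/* range unlocked after submission */
	}
	log[n++] = PAGE_UNLOCK;		/* page unlocked only after all ebs */
	return n;
}
```

The regular-sectorsize path is the inverse: the extent buffer is locked first and each of its pages afterwards, which is why the two cases cannot share one locking sequence.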
Signed-off-by: Qu Wenruo --- fs/btrfs/extent_io.c | 47 +++++++++++++++++++++++++++++++++++++++----- 1 file changed, 42 insertions(+), 5 deletions(-) diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c index 736bc33a0e64..be8c863f7806 100644 --- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -3803,6 +3803,9 @@ static void end_extent_buffer_writeback(struct extent_buffer *eb) * Lock extent buffer status and pages for write back. * * May try to flush write bio if we can't get the lock. + * For subpage extent buffer, caller is responsible to lock the page, we won't + * flush write bio, which can cause extent buffers in the same page submitted + * to different bios. * * Return 0 if the extent buffer doesn't need to be submitted. * (E.g. the extent buffer is not dirty) @@ -3813,26 +3816,47 @@ static noinline_for_stack int lock_extent_buffer_for_io(struct extent_buffer *eb struct extent_page_data *epd) { struct btrfs_fs_info *fs_info = eb->fs_info; + struct extent_io_tree *io_tree = info_to_btree_io_tree(fs_info); int i, num_pages, failed_page_nr; + bool extent_locked = false; int flush = 0; int ret = 0; + if (btrfs_is_subpage(fs_info)) { + /* + * For subpage extent buffer write, caller is responsible to + * lock the page first. + */ + ASSERT(PageLocked(eb->pages[0])); + + /* + * Also lock the range so that endio can always get unmerged + * extent_state. 
+		 */
+		ret = lock_extent(io_tree, eb->start, eb->start + eb->len - 1);
+		if (ret < 0)
+			goto out;
+		extent_locked = true;
+	}
+
 	if (!btrfs_try_tree_write_lock(eb)) {
 		ret = flush_write_bio(epd);
 		if (ret < 0)
-			return ret;
+			goto out;
 		flush = 1;
 		btrfs_tree_lock(eb);
 	}
 
 	if (test_bit(EXTENT_BUFFER_WRITEBACK, &eb->bflags)) {
 		btrfs_tree_unlock(eb);
-		if (!epd->sync_io)
-			return 0;
+		if (!epd->sync_io) {
+			ret = 0;
+			goto out;
+		}
 		if (!flush) {
 			ret = flush_write_bio(epd);
 			if (ret < 0)
-				return ret;
+				goto out;
 			flush = 1;
 		}
 		while (1) {
@@ -3860,11 +3884,19 @@ static noinline_for_stack int lock_extent_buffer_for_io(struct extent_buffer *eb
 		ret = 1;
 	} else {
 		spin_unlock(&eb->refs_lock);
+		if (extent_locked)
+			unlock_extent(io_tree, eb->start,
+				      eb->start + eb->len - 1);
 	}
 	btrfs_tree_unlock(eb);
 
-	if (!ret)
+	/*
+	 * Either the tree does not need to be submitted, or we're
+	 * submitting a subpage extent buffer.
+	 * Either way we don't need to lock the page(s).
+	 */
+	if (!ret || btrfs_is_subpage(fs_info))
 		return ret;
 
 	num_pages = num_extent_pages(eb);
@@ -3906,6 +3938,11 @@ static noinline_for_stack int lock_extent_buffer_for_io(struct extent_buffer *eb
 				 fs_info->dirty_metadata_batch);
 	btrfs_clear_header_flag(eb, BTRFS_HEADER_FLAG_WRITTEN);
 	btrfs_tree_unlock(eb);
+	/* Subpage should never reach this routine */
+	ASSERT(!btrfs_is_subpage(fs_info));
+out:
+	if (extent_locked)
+		unlock_extent(io_tree, eb->start, eb->start + eb->len - 1);
 	return ret;
 }

From patchwork Wed Sep 30 01:55:37 2020
X-Patchwork-Submitter: Qu Wenruo
X-Patchwork-Id: 11807655
From: Qu Wenruo
To: linux-btrfs@vger.kernel.org
Subject: [PATCH v3 47/49] btrfs: extent_io: introduce submit_btree_subpage() to submit a page for subpage metadata write
Date: Wed, 30 Sep 2020 09:55:37 +0800
Message-Id: <20200930015539.48867-48-wqu@suse.com>
In-Reply-To: <20200930015539.48867-1-wqu@suse.com>
References: <20200930015539.48867-1-wqu@suse.com>

The new function, submit_btree_subpage(), will submit all the dirty extent buffers in the page.

The major difference from submit_btree_page() is:

- Page locking sequence
  Now we lock the page first, then lock each extent buffer, so we don't
  need to unlock the page just after writing one extent buffer.
The page gets unlocked after we have submitted all extent buffers.

- Bio submission
  Since one extent buffer is ensured to be contained in one page, we call
  submit_extent_page() directly.

Signed-off-by: Qu Wenruo
---
 fs/btrfs/extent_io.c | 69 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 69 insertions(+)

diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index be8c863f7806..bd79b3531a75 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -4185,6 +4185,72 @@ static noinline_for_stack int write_one_eb(struct extent_buffer *eb,
 	return ret;
 }
 
+/*
+ * A helper to submit one subpage btree page.
+ *
+ * The main differences from submit_btree_page() are:
+ * - Page locking sequence
+ *   Pages are locked first, then extent buffers.
+ *
+ * - Flush write bio
+ *   We only flush the bio if we may be unable to fit the current extent
+ *   buffers into the current bio.
+ *
+ * Return >=0 for the number of submitted extent buffers.
+ * Return <0 for fatal error.
+ */
+static int submit_btree_subpage(struct page *page,
+				struct writeback_control *wbc,
+				struct extent_page_data *epd)
+{
+	struct btrfs_fs_info *fs_info = page_to_fs_info(page);
+	int submitted = 0;
+	u64 page_start = page_offset(page);
+	u64 page_end = page_start + PAGE_SIZE - 1;
+	u64 cur = page_start;
+	int ret;
+
+	/* Lock the page first */
+	lock_page(page);
+
+	/* Then lock and write each extent buffer in the range */
+	while (cur <= page_end) {
+		struct extent_buffer *eb;
+
+		ret = btrfs_find_first_subpage_eb(fs_info, &eb, cur, page_end,
+						  EXTENT_DIRTY);
+		if (ret > 0)
+			break;
+		ret = atomic_inc_not_zero(&eb->refs);
+		if (!ret)
+			continue;
+
+		cur = eb->start + eb->len;
+		ret = lock_extent_buffer_for_io(eb, epd);
+		if (ret == 0) {
+			free_extent_buffer(eb);
+			continue;
+		}
+		if (ret < 0) {
+			free_extent_buffer(eb);
+			goto cleanup;
+		}
+		ret = write_one_eb(eb, wbc, epd);
+		free_extent_buffer(eb);
+		if (ret < 0)
+			goto cleanup;
+		submitted++;
+	}
+	unlock_page(page);
+	return submitted;
+
+cleanup:
+	unlock_page(page);
+	/* We hit an error; end the bio for the submitted extent buffers */
+	end_write_bio(epd, ret);
+	return ret;
+}
+
 /*
  * A helper to submit a btree page.
  *
@@ -4210,6 +4276,9 @@ static int submit_btree_page(struct page *page, struct writeback_control *wbc,
 	if (!PagePrivate(page))
 		return 0;
 
+	if (btrfs_is_subpage(page_to_fs_info(page)))
+		return submit_btree_subpage(page, wbc, epd);
+
 	spin_lock(&mapping->private_lock);
 	if (!PagePrivate(page)) {
 		spin_unlock(&mapping->private_lock);

From patchwork Wed Sep 30 01:55:38 2020
X-Patchwork-Submitter: Qu Wenruo
X-Patchwork-Id: 11807657
From: Qu Wenruo
To: linux-btrfs@vger.kernel.org
Subject: [PATCH v3 48/49] btrfs: extent_io: introduce end_bio_subpage_eb_writepage() function
Date: Wed, 30 Sep 2020 09:55:38 +0800
Message-Id: <20200930015539.48867-49-wqu@suse.com>
In-Reply-To: <20200930015539.48867-1-wqu@suse.com>
References: <20200930015539.48867-1-wqu@suse.com>

The new function, end_bio_subpage_eb_writepage(), will handle the metadata writeback endio.

The major differences involved are:

- Page writeback clearing
  We will only clear the page writeback bit after all extent buffers in
  the same page have finished their writeback.
  This means we need to check the EXTENT_WRITEBACK bit for the page range.

- Clearing the EXTENT_WRITEBACK bit for the btree inode
  This is the new bit for the btree inode io tree. It emulates the same
  page status, but in sectorsize-aligned ranges.
  The new bit is remapped from EXTENT_DEFRAG; as defrag is impossible for
  the btree inode, it should be pretty safe to reuse.

Also, since the new endio function needs quite a few extent io tree operations, change btree_submit_bio_hook() to queue the endio work into the metadata endio workqueue.
Signed-off-by: Qu Wenruo
---
 fs/btrfs/disk-io.c   | 21 ++++++++++++-
 fs/btrfs/extent_io.c | 70 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 90 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index e466c30b52c8..2ac980f739dc 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -961,6 +961,7 @@ blk_status_t btrfs_wq_submit_bio(struct inode *inode, struct bio *bio,
 	async->mirror_num = mirror_num;
 	async->submit_bio_start = submit_bio_start;
+
 	btrfs_init_work(&async->work, run_one_async_start, run_one_async_done,
 			run_one_async_free);
@@ -1031,7 +1032,25 @@ static blk_status_t btree_submit_bio_hook(struct inode *inode, struct bio *bio,
 		if (ret)
 			goto out_w_error;
 		ret = btrfs_map_bio(fs_info, bio, mirror_num);
-	} else if (!async) {
+		if (ret < 0)
+			goto out_w_error;
+		return ret;
+	}
+
+	/*
+	 * For subpage metadata write, the endio involves several
+	 * extent_io_tree operations, which are not suitable for endio
+	 * context.
+	 * Thus we need to queue them into an endio workqueue.
+	 */
+	if (btrfs_is_subpage(fs_info)) {
+		ret = btrfs_bio_wq_end_io(fs_info, bio,
+					  BTRFS_WQ_ENDIO_METADATA);
+		if (ret)
+			goto out_w_error;
+	}
+
+	if (!async) {
 		ret = btree_csum_one_bio(bio);
 		if (ret)
 			goto out_w_error;
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index bd79b3531a75..fc882daf6899 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -4014,6 +4014,73 @@ static void set_btree_ioerr(struct page *page, struct extent_buffer *eb)
 	}
 }
 
+/*
+ * The endio function for subpage extent buffer write.
+ *
+ * Unlike end_bio_extent_buffer_writepage(), we only call end_page_writeback()
+ * after all extent buffers in the page have finished their writeback.
+ */
+static void end_bio_subpage_eb_writepage(struct bio *bio)
+{
+	struct bio_vec *bvec;
+	struct bvec_iter_all iter_all;
+
+	ASSERT(!bio_flagged(bio, BIO_CLONED));
+	bio_for_each_segment_all(bvec, bio, iter_all) {
+		struct page *page = bvec->bv_page;
+		struct btrfs_fs_info *fs_info = page_to_fs_info(page);
+		struct extent_buffer *eb;
+		u64 page_start = page_offset(page);
+		u64 page_end = page_start + PAGE_SIZE - 1;
+		u64 bvec_start = page_offset(page) + bvec->bv_offset;
+		u64 bvec_end = bvec_start + bvec->bv_len - 1;
+		u64 cur_bytenr = bvec_start;
+
+		ASSERT(IS_ALIGNED(bvec->bv_len, fs_info->nodesize));
+
+		/* Iterate through all extent buffers in the range */
+		while (cur_bytenr <= bvec_end) {
+			struct extent_state *cached = NULL;
+			struct extent_io_tree *io_tree =
+				info_to_btree_io_tree(fs_info);
+			int done;
+			int ret;
+
+			ret = btrfs_find_first_subpage_eb(fs_info, &eb,
+					cur_bytenr, bvec_end, 0);
+			if (ret > 0)
+				break;
+
+			cur_bytenr = eb->start + eb->len;
+
+			ASSERT(test_bit(EXTENT_BUFFER_WRITEBACK, &eb->bflags));
+			done = atomic_dec_and_test(&eb->io_pages);
+			ASSERT(done);
+
+			if (bio->bi_status ||
+			    test_bit(EXTENT_BUFFER_WRITE_ERR, &eb->bflags)) {
+				ClearPageUptodate(page);
+				set_btree_ioerr(page, eb);
+			}
+
+			clear_extent_bit(io_tree, eb->start,
+					eb->start + eb->len - 1,
+					EXTENT_WRITEBACK | EXTENT_LOCKED, 1, 0,
+					&cached);
+			/*
+			 * Only end the page writeback if there is no extent
+			 * buffer under writeback in the page anymore
+			 */
+			if (!test_range_bit(io_tree, page_start, page_end,
+					    EXTENT_WRITEBACK, 0, cached))
+				end_page_writeback(page);
+			free_extent_state(cached);
+			end_extent_buffer_writeback(eb);
+		}
+	}
+	bio_put(bio);
+}
+
 static void end_bio_extent_buffer_writepage(struct bio *bio)
 {
 	struct bio_vec *bvec;
@@ -4021,6 +4088,9 @@ static void end_bio_extent_buffer_writepage(struct bio *bio)
 	int done;
 	struct bvec_iter_all iter_all;
 
+	if (btrfs_is_subpage(page_to_fs_info(bio_first_page_all(bio))))
+		return end_bio_subpage_eb_writepage(bio);
+
 	ASSERT(!bio_flagged(bio, BIO_CLONED));
 	bio_for_each_segment_all(bvec, bio, iter_all) {
 		struct page *page = bvec->bv_page;

From patchwork Wed Sep 30 01:55:39 2020
X-Patchwork-Submitter: Qu Wenruo
X-Patchwork-Id: 11807659
From: Qu Wenruo
To: linux-btrfs@vger.kernel.org
Subject: [PATCH v3 49/49] btrfs: support metadata read write for test
Date: Wed, 30 Sep 2020 09:55:39 +0800
Message-Id: <20200930015539.48867-50-wqu@suse.com>
In-Reply-To: <20200930015539.48867-1-wqu@suse.com>
References: <20200930015539.48867-1-wqu@suse.com>

Signed-off-by: Qu Wenruo
---
 fs/btrfs/disk-io.c | 10 ----------
 fs/btrfs/file.c    |  4 ++++
 fs/btrfs/super.c   |  7 -------
 3 files changed, 4 insertions(+), 17 deletions(-)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 2ac980f739dc..8b5f65e6c5fa 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -3335,16 +3335,6 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
 		goto fail_alloc;
 	}
 
-	/* For 4K sector size support, it's only read-only yet */
-	if (PAGE_SIZE == SZ_64K && sectorsize == SZ_4K) {
-		if (!sb_rdonly(sb) || btrfs_super_log_root(disk_super)) {
-			btrfs_err(fs_info,
-				"subpage sector size only support RO yet");
-			err = -EINVAL;
-			goto fail_alloc;
-		}
-	}
-
 	ret = btrfs_init_workqueues(fs_info, fs_devices);
 	if (ret) {
 		err = ret;
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 4507c3d09399..0785e16ba243 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1937,6 +1937,10 @@ static ssize_t btrfs_file_write_iter(struct kiocb *iocb,
 	loff_t oldsize;
 	int clean_page = 0;
 
+	/* Don't support data write yet */
+	if (btrfs_is_subpage(fs_info))
+		return -EOPNOTSUPP;
+
 	if (!(iocb->ki_flags & IOCB_DIRECT) &&
 	    (iocb->ki_flags & IOCB_NOWAIT))
 		return -EOPNOTSUPP;
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 743a2fadf4ee..25967ecaaf0a 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1922,13 +1922,6 @@ static int btrfs_remount(struct super_block *sb, int *flags, char *data)
 		ret = -EINVAL;
 		goto restore;
 	}
-	if (btrfs_is_subpage(fs_info)) {
-		btrfs_warn(fs_info,
-	"read-write mount is not yet allowed for sector size %u page size %lu",
-			   fs_info->sectorsize, PAGE_SIZE);
-		ret = -EINVAL;
-		goto restore;
-	}
 	ret =
btrfs_cleanup_fs_roots(fs_info); if (ret)