From patchwork Tue Jun 21 06:40:59 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Naohiro Aota X-Patchwork-Id: 12888716 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1FE69C433EF for ; Tue, 21 Jun 2022 06:44:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1345976AbiFUGoH (ORCPT ); Tue, 21 Jun 2022 02:44:07 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56988 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1345836AbiFUGoG (ORCPT ); Tue, 21 Jun 2022 02:44:06 -0400 Received: from esa3.hgst.iphmx.com (esa3.hgst.iphmx.com [216.71.153.141]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 145FC1B7B3 for ; Mon, 20 Jun 2022 23:44:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1655793843; x=1687329843; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=0ye5mulVRtQm8sh5Cog5V39eofpB8oOVMFQt6emy1IQ=; b=HHCZjpvtQJVnEbCdyqVFwvn9oHA/GZYtot6zA2hrJpiistW6jm02GByU y1Y4l36cGCLTDzSj2ryDNkqnjHjpXOpx9LJCBwS4TLNNGYfIMsP23csts DkApFm+l5JyxsWLBiLJaIUklf3kXEgmK4aBD2JGYcXJWaRiJGmwNGX3/s xkN2v+GJt5d0ev3JnMw95Alf/ufzlRz1YX/kgkwxp0Hxx3XVxCngnFpJn rTxzMrgbIho18r1beMr4T3Afv4OxHVZct6pBY4Imd2QlTVcsQ5AF4Ex61 /COHOd3Isgc7alPpGV2pqBNoCAeUzK7XUw7TKi7k7QX2PJ+EL/jAo4+rd w==; X-IronPort-AV: E=Sophos;i="5.92,209,1650902400"; d="scan'208";a="208550409" Received: from uls-op-cesaip02.wdc.com (HELO uls-op-cesaep02.wdc.com) ([199.255.45.15]) by ob1.hgst.iphmx.com with ESMTP; 21 Jun 2022 14:44:02 +0800 IronPort-SDR: T2y7C5o1whEkery/Crkjb8QSI6+kCdN+JLqDkj99XW8zm8HH6e5eYrvvQnRflZWOiWVJ5ItKI9 
Y0oxCCfZ2q9CQC/T/cFMs7Zr60TCrFSpO+WIs5ENEznsJ8vVj5QxQgmyztpd8u2qmcam7iONzE eZj6lkR1L4HKfw8jeQ/ewRN8aagciulI+iw0S3eQSAOc6eWaz5cpKm4GsvvBgf7y7xtHL57A3G B78dBLAIFAOhTZyH3mwB7w6JtFOhHASgLhg9wHXfco1lj4+WUbyxK3lcjACUdmi9D5gjlviCNZ ocNMhuEETSJ2LFqR1bVcJn1/ Received: from uls-op-cesaip02.wdc.com ([10.248.3.37]) by uls-op-cesaep02.wdc.com with ESMTP/TLS/ECDHE-RSA-AES128-GCM-SHA256; 20 Jun 2022 23:02:11 -0700 IronPort-SDR: 40LYZqPjiQFWaSlqdUZnPKAgFZLJHDbrDcqu7/05YUwQyER/z+S66FY2Gn3C/DuXBi0U2s1CNv rwxVqG7LWcadpwtn60eiDLmZ3ok9Dz60q3sG8QnoDeNaBfMuEiYev5egss5C2Brfdq3VJKJ7n7 eiPtRmp/YlX7RJV099EyzZ2DuXb1ODla1Umz0q4V1P6anUo5okgmPjd7U03tuG4TpErT5poETC crUjmpVrojIUrB8UxMgaanrD/iFvVBjRAAQjHXIyx/WebgIDkfDEqs5AKOC/HXvOCeNE6VWPjR pb8= WDCIronportException: Internal Received: from dtjcyy2.ad.shared (HELO naota-xeon.wdc.com) ([10.225.48.142]) by uls-op-cesaip02.wdc.com with ESMTP; 20 Jun 2022 23:44:04 -0700 From: Naohiro Aota To: linux-btrfs@vger.kernel.org Cc: fdmanana@kernel.org, Naohiro Aota Subject: [PATCH v2 1/4] btrfs: ensure pages are unlocked on cow_file_range() failure Date: Tue, 21 Jun 2022 15:40:59 +0900 Message-Id: <3ae46fc143bf820a39e7cf425d675d6825af849d.1655791781.git.naohiro.aota@wdc.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org There is a hung_task report on zoned btrfs like below. https://github.com/naota/linux/issues/59 [ 726.328648] INFO: task rocksdb:high0:11085 blocked for more than 241 seconds. [ 726.329839] Not tainted 5.16.0-rc1+ #1 [ 726.330484] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 726.331603] task:rocksdb:high0 state:D stack: 0 pid:11085 ppid: 11082 flags:0x00000000 [ 726.331608] Call Trace: [ 726.331611] [ 726.331614] __schedule+0x2e5/0x9d0 [ 726.331622] schedule+0x58/0xd0 [ 726.331626] io_schedule+0x3f/0x70 [ 726.331629] __folio_lock+0x125/0x200 [ 726.331634] ? find_get_entries+0x1bc/0x240 [ 726.331638] ? 
filemap_invalidate_unlock_two+0x40/0x40
[ 726.331642] truncate_inode_pages_range+0x5b2/0x770
[ 726.331649] truncate_inode_pages_final+0x44/0x50
[ 726.331653] btrfs_evict_inode+0x67/0x480
[ 726.331658] evict+0xd0/0x180
[ 726.331661] iput+0x13f/0x200
[ 726.331664] do_unlinkat+0x1c0/0x2b0
[ 726.331668] __x64_sys_unlink+0x23/0x30
[ 726.331670] do_syscall_64+0x3b/0xc0
[ 726.331674] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 726.331677] RIP: 0033:0x7fb9490a171b
[ 726.331681] RSP: 002b:00007fb943ffac68 EFLAGS: 00000246 ORIG_RAX: 0000000000000057
[ 726.331684] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fb9490a171b
[ 726.331686] RDX: 00007fb943ffb040 RSI: 000055a6bbe6ec20 RDI: 00007fb94400d300
[ 726.331687] RBP: 00007fb943ffad00 R08: 0000000000000000 R09: 0000000000000000
[ 726.331688] R10: 0000000000000031 R11: 0000000000000246 R12: 00007fb943ffb000
[ 726.331690] R13: 00007fb943ffb040 R14: 0000000000000000 R15: 00007fb943ffd260
[ 726.331693]

While debugging the issue, we found that running fstests generic/551 on
a 5GB non-zoned null_blk device in the emulated zoned mode also hits a
similar hang. We can also reproduce the same symptom with an
error-injected cow_file_range() setup.

The hang occurs when cow_file_range() fails in the middle of allocation.
cow_file_range() called from do_allocation_zoned() can split the given
region ([start, end]) for allocation depending on current block group
usages. When btrfs can allocate bytes for one part of the split regions
but fails for the other region (e.g. because of -ENOSPC), we return the
error leaving the pages in the succeeded regions locked. Technically,
this occurs only when @unlock == 0. Otherwise, we unlock the pages in an
allocated region after creating an ordered extent.

Considering the callers of cow_file_range(unlock=0) won't write out the
pages, we can unlock the pages on error exit from cow_file_range(). So,
we can ensure all the pages except @locked_page are unlocked in the
error case.
In summary, cow_file_range() now behaves like this:

- page_started == 1 (return value)
  - All the pages are unlocked. IO is started.
- unlock == 1
  - All the pages except @locked_page are unlocked in any case
- unlock == 0
  - On success, all the pages are locked for writing them out
  - On failure, all the pages except @locked_page are unlocked

Fixes: 42c011000963 ("btrfs: zoned: introduce dedicated data write path for zoned filesystems")
CC: stable@vger.kernel.org # 5.12+
Signed-off-by: Naohiro Aota
---
 fs/btrfs/inode.c | 72 ++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 64 insertions(+), 8 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 6d5351454f11..4f453f6077fe 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -1133,6 +1133,28 @@ static u64 get_extent_allocation_hint(struct btrfs_inode *inode, u64 start,
  * *page_started is set to one if we unlock locked_page and do everything
  * required to start IO on it. It may be clean and already done with
  * IO when we return.
+ *
+ * When unlock == 1, we unlock the pages in successfully allocated regions.
+ * When unlock == 0, we leave them locked for writing them out.
+ *
+ * However, we unlock all the pages except @locked_page in case of failure.
+ *
+ * In summary, page locking state will be as follow:
+ *
+ * - page_started == 1 (return value)
+ *     - All the pages are unlocked. IO is started.
+ *     - Note that this can happen only on success
+ * - unlock == 1
+ *     - All the pages except @locked_page are unlocked in any case
+ * - unlock == 0
+ *     - On success, all the pages are locked for writing out them
+ *     - On failure, all the pages except @locked_page are unlocked
+ *
+ * When a failure happens in the second or later iteration of the
+ * while-loop, the ordered extents created in previous iterations are kept
+ * intact. So, the caller must clean them up by calling
+ * btrfs_cleanup_ordered_extents(). See btrfs_run_delalloc_range() for
+ * example.
*/ static noinline int cow_file_range(struct btrfs_inode *inode, struct page *locked_page, @@ -1142,6 +1164,7 @@ static noinline int cow_file_range(struct btrfs_inode *inode, struct btrfs_root *root = inode->root; struct btrfs_fs_info *fs_info = root->fs_info; u64 alloc_hint = 0; + u64 orig_start = start; u64 num_bytes; unsigned long ram_size; u64 cur_alloc_size = 0; @@ -1329,18 +1352,44 @@ static noinline int cow_file_range(struct btrfs_inode *inode, btrfs_dec_block_group_reservations(fs_info, ins.objectid); btrfs_free_reserved_extent(fs_info, ins.objectid, ins.offset, 1); out_unlock: + /* + * Now, we have three regions to clean up, as shown below. + * + * |-------(1)----|---(2)---|-------------(3)----------| + * `- orig_start `- start `- start + cur_alloc_size `- end + * + * We process each region below. + */ + clear_bits = EXTENT_LOCKED | EXTENT_DELALLOC | EXTENT_DELALLOC_NEW | EXTENT_DEFRAG | EXTENT_CLEAR_META_RESV; page_ops = PAGE_UNLOCK | PAGE_START_WRITEBACK | PAGE_END_WRITEBACK; + /* - * If we reserved an extent for our delalloc range (or a subrange) and - * failed to create the respective ordered extent, then it means that - * when we reserved the extent we decremented the extent's size from - * the data space_info's bytes_may_use counter and incremented the - * space_info's bytes_reserved counter by the same amount. We must make - * sure extent_clear_unlock_delalloc() does not try to decrement again - * the data space_info's bytes_may_use counter, therefore we do not pass - * it the flag EXTENT_CLEAR_DATA_RESV. + * For the range (1). We have already instantiated the ordered extents + * for this region. They are cleaned up by + * btrfs_cleanup_ordered_extents() in e.g, + * btrfs_run_delalloc_range(). EXTENT_LOCKED | EXTENT_DELALLOC are + * already cleared in the above loop. And, EXTENT_DELALLOC_NEW | + * EXTENT_DEFRAG | EXTENT_CLEAR_META_RESV are handled by the cleanup + * function. 
+ * + * However, in case of unlock == 0, we still need to unlock the pages + * (except @locked_page) to ensure all the pages are unlocked. + */ + if (!unlock && orig_start < start) + extent_clear_unlock_delalloc(inode, orig_start, start - 1, + locked_page, 0, page_ops); + + /* + * For the range (2). If we reserved an extent for our delalloc range + * (or a subrange) and failed to create the respective ordered extent, + * then it means that when we reserved the extent we decremented the + * extent's size from the data space_info's bytes_may_use counter and + * incremented the space_info's bytes_reserved counter by the same + * amount. We must make sure extent_clear_unlock_delalloc() does not try + * to decrement again the data space_info's bytes_may_use counter, + * therefore we do not pass it the flag EXTENT_CLEAR_DATA_RESV. */ if (extent_reserved) { extent_clear_unlock_delalloc(inode, start, @@ -1352,6 +1401,13 @@ static noinline int cow_file_range(struct btrfs_inode *inode, if (start >= end) goto out; } + + /* + * For the range (3). We never touched the region. In addition to the + * clear_bits above, we add EXTENT_CLEAR_DATA_RESV to release the data + * space_info's bytes_may_use counter, reserved in + * btrfs_check_data_free_space(). 
+ */ extent_clear_unlock_delalloc(inode, start, end, locked_page, clear_bits | EXTENT_CLEAR_DATA_RESV, page_ops); From patchwork Tue Jun 21 06:41:00 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Naohiro Aota X-Patchwork-Id: 12888717 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3EE60C43334 for ; Tue, 21 Jun 2022 06:44:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1346024AbiFUGoH (ORCPT ); Tue, 21 Jun 2022 02:44:07 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57000 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1345498AbiFUGoG (ORCPT ); Tue, 21 Jun 2022 02:44:06 -0400 Received: from esa3.hgst.iphmx.com (esa3.hgst.iphmx.com [216.71.153.141]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1335E1B7B1 for ; Mon, 20 Jun 2022 23:44:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1655793845; x=1687329845; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=uEsr0emZcvqZpNaKlHb50eTZgkPIqmlIf01Dsc7MAsQ=; b=F44u9SFWmSRcQNovfSJVaQRtZtp1yBsvJsvPi6RzKqZJgVGrEgCxQ9UO FlXPuJzFJ4zhRl81DO0/92p4Kx3PRtWvB+kCrvj7KhUcUCtx9JiIf5Ri3 OmBaIinabi2S9QRB0yNi+i/eDZVRDM3xG+znJGaghfJisZMY4Scm9Am+a B+dwwms2rqSeInvZz2wtC4w21QDTjHwB300rK1iPAMUQpYful1rnt7M4P P7harNNUUnrEcEBpzWEhXNORiXvpWxgst/7w52yb0LSAkap5aeBkkgnCH OmooUgl8HdLsgPBrPkdYYxoX7rStcb+UIGgphvMnpHZs94XDJnULUvMcZ Q==; X-IronPort-AV: E=Sophos;i="5.92,209,1650902400"; d="scan'208";a="208550411" Received: from uls-op-cesaip02.wdc.com (HELO uls-op-cesaep02.wdc.com) ([199.255.45.15]) by ob1.hgst.iphmx.com with ESMTP; 21 Jun 2022 14:44:03 +0800 IronPort-SDR: 
52nx6uL9Dpnopo3yh/TYDn1oTh0uihv1W9CGE4kY+sR/BiBNWzpei+0d6vRm3m61qWrbc8nsSA hb5GLxnsOQALkRl59BoIeYw+VTaK4PHW632gdS1xoh7aE9ijygOw8xDTdz7LKhJ/egZchTU2F5 mtdCtBWgXtK15sUcf8O53QaTVK+I7MicNMnAaBs7CSXx1lLdaPiOLOEtvkmyzyI0mZJhLLly22 zP5OkvWS8GmELo59k7ntQWQsySJmiVrOH/iKmFnedtVefHt+nzkNQSIUkX3/LXLVXz1yl/94Gj otsaXSMih7saAofFqRqea10O
Received: from uls-op-cesaip02.wdc.com ([10.248.3.37]) by uls-op-cesaep02.wdc.com with ESMTP/TLS/ECDHE-RSA-AES128-GCM-SHA256; 20 Jun 2022 23:02:12 -0700
IronPort-SDR: SWIStqUcT8oi9lqWIWD6RwaP+MhQ7R7q+sAxyd0eVJWC5pd2vhpUSffmAs6FRJvUHiMuszdH55 7aF0Hv6xLwMJrTE2E/4Uxk0nxAqqru4ctnuoRFlvfwj3+C77nj5wJz0IY7Tpc2/iye1yfYCBG8 u1+itkEegaEyGjsLO1NoxK4NcZNQ28kgrlUWYCGAzgCCua3DCuOsj3IlUAFRRxXpwe6tYJutNE e2iHBwoVnUQVP/LASr9ad3YthGF8gW/Mzy1W4zyz9JyQVH1/QYumpaI3y6PG1EMBJHaUnJcVDN KD4=
WDCIronportException: Internal
Received: from dtjcyy2.ad.shared (HELO naota-xeon.wdc.com) ([10.225.48.142]) by uls-op-cesaip02.wdc.com with ESMTP; 20 Jun 2022 23:44:04 -0700
From: Naohiro Aota
To: linux-btrfs@vger.kernel.org
Cc: fdmanana@kernel.org, Naohiro Aota
Subject: [PATCH v2 2/4] btrfs: extend btrfs_cleanup_ordered_extens for NULL locked_page
Date: Tue, 21 Jun 2022 15:41:00 +0900
Message-Id:
X-Mailer: git-send-email 2.35.1
In-Reply-To:
References:
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org

btrfs_cleanup_ordered_extents() assumes locked_page to be non-NULL, so
it is not usable for submit_uncompressed_range(), which can have a NULL
locked_page.

This commit supports the locked_page == NULL case. Also, it rewrites the
redundant "page_offset(locked_page)" calls.
Signed-off-by: Naohiro Aota --- fs/btrfs/inode.c | 36 +++++++++++++++++++++--------------- 1 file changed, 21 insertions(+), 15 deletions(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 4f453f6077fe..326150552e57 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -195,11 +195,14 @@ static inline void btrfs_cleanup_ordered_extents(struct btrfs_inode *inode, { unsigned long index = offset >> PAGE_SHIFT; unsigned long end_index = (offset + bytes - 1) >> PAGE_SHIFT; - u64 page_start = page_offset(locked_page); - u64 page_end = page_start + PAGE_SIZE - 1; - + u64 page_start, page_end; struct page *page; + if (locked_page) { + page_start = page_offset(locked_page); + page_end = page_start + PAGE_SIZE - 1; + } + while (index <= end_index) { /* * For locked page, we will call end_extent_writepage() on it @@ -212,7 +215,7 @@ static inline void btrfs_cleanup_ordered_extents(struct btrfs_inode *inode, * btrfs_mark_ordered_io_finished() would skip the accounting * for the page range, and the ordered extent will never finish. 
*/ - if (index == (page_offset(locked_page) >> PAGE_SHIFT)) { + if (locked_page && index == (page_start >> PAGE_SHIFT)) { index++; continue; } @@ -231,17 +234,20 @@ static inline void btrfs_cleanup_ordered_extents(struct btrfs_inode *inode, put_page(page); } - /* The locked page covers the full range, nothing needs to be done */ - if (bytes + offset <= page_offset(locked_page) + PAGE_SIZE) - return; - /* - * In case this page belongs to the delalloc range being instantiated - * then skip it, since the first page of a range is going to be - * properly cleaned up by the caller of run_delalloc_range - */ - if (page_start >= offset && page_end <= (offset + bytes - 1)) { - bytes = offset + bytes - page_offset(locked_page) - PAGE_SIZE; - offset = page_offset(locked_page) + PAGE_SIZE; + if (locked_page) { + /* The locked page covers the full range, nothing needs to be done */ + if (bytes + offset <= page_start + PAGE_SIZE) + return; + /* + * In case this page belongs to the delalloc range being + * instantiated then skip it, since the first page of a range is + * going to be properly cleaned up by the caller of + * run_delalloc_range + */ + if (page_start >= offset && page_end <= (offset + bytes - 1)) { + bytes = offset + bytes - page_offset(locked_page) - PAGE_SIZE; + offset = page_offset(locked_page) + PAGE_SIZE; + } } return __endio_write_update_ordered(inode, offset, bytes, false); From patchwork Tue Jun 21 06:41:01 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Naohiro Aota X-Patchwork-Id: 12888718 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D6E02CCA473 for ; Tue, 21 Jun 2022 06:44:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1345498AbiFUGoI (ORCPT ); Tue, 21 Jun 2022 
02:44:08 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57006 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1345836AbiFUGoH (ORCPT ); Tue, 21 Jun 2022 02:44:07 -0400 Received: from esa3.hgst.iphmx.com (esa3.hgst.iphmx.com [216.71.153.141]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id ACDCF1B7B3 for ; Mon, 20 Jun 2022 23:44:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1655793846; x=1687329846; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=LKqaWlqTZiWZBdfoT4tAItaDYJo4IZ0507qTs4YTTSQ=; b=mlL48zgh28Szzq/rcYfsfajgMyuQn54epiMiIPFcnOEqd0ExjTfeid+6 XiJKHt2gpXV+G8aWDu6FTt+qxW5mJmhjeWF0aRyNK+e6uVhdPDfnLA3D4 kyePElXyYlzIDESZMNURt6p3dtDurMOkyowrPPvFbINUUutyUuiRmo1Fo Qb0iALK+yHax0XWr2G+j/Ab6Grbb/FYonb0g2UoUAtCEHMApYVcRGHXmS SY1TS9dM7G+cvDnDpTO6avNccna1Hd/a3ST0BFvvSvdKsgjZk08qs2JIZ a0k9QS3mJUeeXiZi5y04ej5SUJEueZIcdu3BGfXllkcQ4GZuwKnMolvWs Q==; X-IronPort-AV: E=Sophos;i="5.92,209,1650902400"; d="scan'208";a="208550415" Received: from uls-op-cesaip02.wdc.com (HELO uls-op-cesaep02.wdc.com) ([199.255.45.15]) by ob1.hgst.iphmx.com with ESMTP; 21 Jun 2022 14:44:04 +0800 IronPort-SDR: 4SOPKNm2JE29Iw8x1E/ocvW1BMAejPfUMzzs37WHE7zU2wkK+nM6cFP6eX+LJr7UhVFT6XpLst PqqYAPu/086YHcqtoVKVwr+N5ZwscYRZdW8+Ej30btChJmjXlGM1FLfLV07WqwT8wBLyWP/GjD pyHJEd7lMtK+10jxvZNkX3d9fG4QCbxiERGjAL/z8JFP2x8RYvVk77mqoz5NzjX4mtDrlmPQV4 jcYBIYm49MPuhAGKTa3FnnxGOwCh6WRIGVAP5DAZJskEgAnSPmxh9opPxROz1hpFdEbzAsKXq/ mHWqniaXgIkvCIe58beghD4d Received: from uls-op-cesaip02.wdc.com ([10.248.3.37]) by uls-op-cesaep02.wdc.com with ESMTP/TLS/ECDHE-RSA-AES128-GCM-SHA256; 20 Jun 2022 23:02:13 -0700 IronPort-SDR: /nYkH5hbUGNpVKaGbnm0Fi81YE4l5gf/Muc/zhGUIY9jfvNZ+aKxWF4J64PJKxijnCySvTjz2G ftpZp/atat/lK6Lp4JiV2CUmh8d7Q+3JO00y79uCgSygLlRKDDYc5HQphizW+cA3qLhwxOTeAL 
F4xO2T/W04HFFsx9XQsbaYX4C2wqElmyv5MfYrPrNgRdKiA4bnZ+IZrLwfxAuRXBo1PVabnN0L lKIBhEILCXsX1chmMjw7EqAEiWoAscH+whQYP/eZBlPWo7vVVAQmzaUMZMzYlo3xrf03xK0aj8 8FQ=
WDCIronportException: Internal
Received: from dtjcyy2.ad.shared (HELO naota-xeon.wdc.com) ([10.225.48.142]) by uls-op-cesaip02.wdc.com with ESMTP; 20 Jun 2022 23:44:05 -0700
From: Naohiro Aota
To: linux-btrfs@vger.kernel.org
Cc: fdmanana@kernel.org, Naohiro Aota
Subject: [PATCH v2 3/4] btrfs: fix error handling of fallbacked uncompress write
Date: Tue, 21 Jun 2022 15:41:01 +0900
Message-Id: <7347f1de449c3a3f36690b816c2ded133508c5c2.1655791781.git.naohiro.aota@wdc.com>
X-Mailer: git-send-email 2.35.1
In-Reply-To:
References:
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org

When cow_file_range() fails in the middle of the allocation loop, it
unlocks the pages but leaves the ordered extents intact. Thus, we need
to call btrfs_cleanup_ordered_extents() to finish the created ordered
extents.

Also, we need to call end_extent_writepage() if locked_page is available
because btrfs_cleanup_ordered_extents() never processes the region on
the locked_page.

Furthermore, if locked_page is unavailable, we need to set the error on
the mapping before unlocking the pages, so that the errno is properly
propagated to userland.

CC: stable@vger.kernel.org # 5.18+
Signed-off-by: Naohiro Aota
---
I chose 5.18+ as the target because those versions come after the
refactoring, so the series applies cleanly. Technically, older versions
potentially have the same issue, but it might not actually happen there.
So, let's choose the targets that are easy to apply to.
--- fs/btrfs/inode.c | 17 +++++++++++++++-- 1 file changed, 15 insertions(+), 2 deletions(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 326150552e57..38d8e6d78e77 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -933,8 +933,18 @@ static int submit_uncompressed_range(struct btrfs_inode *inode, goto out; } if (ret < 0) { - if (locked_page) + btrfs_cleanup_ordered_extents(inode, locked_page, start, end - start + 1); + if (locked_page) { + u64 page_start = page_offset(locked_page); + u64 page_end = page_start + PAGE_SIZE - 1; + + btrfs_page_set_error(inode->root->fs_info, locked_page, + page_start, PAGE_SIZE); + set_page_writeback(locked_page); + end_page_writeback(locked_page); + end_extent_writepage(locked_page, ret, page_start, page_end); unlock_page(locked_page); + } goto out; } @@ -1383,9 +1393,12 @@ static noinline int cow_file_range(struct btrfs_inode *inode, * However, in case of unlock == 0, we still need to unlock the pages * (except @locked_page) to ensure all the pages are unlocked. */ - if (!unlock && orig_start < start) + if (!unlock && orig_start < start) { + if (!locked_page) + mapping_set_error(inode->vfs_inode.i_mapping, ret); extent_clear_unlock_delalloc(inode, orig_start, start - 1, locked_page, 0, page_ops); + } /* * For the range (2). 
If we reserved an extent for our delalloc range From patchwork Tue Jun 21 06:41:02 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Naohiro Aota X-Patchwork-Id: 12888719 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id ECF51C433EF for ; Tue, 21 Jun 2022 06:44:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1346045AbiFUGoJ (ORCPT ); Tue, 21 Jun 2022 02:44:09 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57014 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1346026AbiFUGoI (ORCPT ); Tue, 21 Jun 2022 02:44:08 -0400 Received: from esa3.hgst.iphmx.com (esa3.hgst.iphmx.com [216.71.153.141]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6655A1B7B1 for ; Mon, 20 Jun 2022 23:44:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1655793846; x=1687329846; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=0Pv8MSGCq1umYzKQ+m2R3VG4oIT0Y7tt8Hrc1rom7To=; b=iM+bKjJXsCeYuA4hOSVrwzYUkIWY62MHkfA7MS4jMDVp47fl7NADGBKS KimPC6mRCaOMgJVZSv7M4zj1GqiVwrStzzFOy5gK+0+Uexm+87osyscKN iAECIaK7A5Pxm1OqdhA8VLHYSnzWMFesJObSqmcyeayviCzDwbbRSXQY0 Xh6K4wKj/FwqOp/nMJQFmnc7rTvRbbQv0QWTIxNSrreSKgO1GoAv5lczC 2Tps1hUliD5yuQTjSvZ2ToOFHBSXqLw5IjdsFYuBWtbj54aGUFH5YaZE5 pWYObZfgY0QB9qGNXCmvpZcWesydWnx+3U2yDcm0GmxVTO4Av5s9GX8wH Q==; X-IronPort-AV: E=Sophos;i="5.92,209,1650902400"; d="scan'208";a="208550417" Received: from uls-op-cesaip02.wdc.com (HELO uls-op-cesaep02.wdc.com) ([199.255.45.15]) by ob1.hgst.iphmx.com with ESMTP; 21 Jun 2022 14:44:06 +0800 IronPort-SDR: 
tK5iB3fr5EOpCa8CPPge9jZWYqTUETRReBw4zuxpwci9wDojOgLPftc5JTLWB7sLlh3Xyfn/pI 8NFHPR6ALg3WC9vDTynWc/JuIS8HzwrGxSlwZ7F+sdsaJi/+o+CUWB8dqqmkY2zqtSGWMXgy3q bLG9U+jkq1WNWJfQJpudRUvjbiiA8fGbo8qlO+y31anfUC9diSBEYnQ0QJ6EUL6o7gQfs4hspL Thy/hoIg32fOtaWQHWAta8pxT6POXOf5OmbOZW9eDj+jmWJUaxccnDWb0UCof5hcHyfbbp2o0t iR0pzIjnN+nd+njhMrLeRY9D
Received: from uls-op-cesaip02.wdc.com ([10.248.3.37]) by uls-op-cesaep02.wdc.com with ESMTP/TLS/ECDHE-RSA-AES128-GCM-SHA256; 20 Jun 2022 23:02:14 -0700
IronPort-SDR: 7V+C7vRHSNo7Xwh8rfQqkcOB36xcpur5FF+IoK2utGT4n0SHRMCPa5CpIkzXnkOYtbb+rXfXdm QWYFaHLLVJYgQOtUtHO0Zr+p8qkZF6/ZV2yhcFbKJMnNAkg6NKLljXkADz3qoT2i6eD5FGOh9V gUPpeU2IYHujjV+m/lnXTMKQkHV9Ij6HliO9O7DcmxY6EVFf5LTeO2VyZw+KYg5Z/l+PkTihxz 9uYuGBryIV+MEG4Y5oEArUAN9vyQN57gi+caTa3zm5/eBNfcjTvmrLb83aZ72jMHDUa/3WCuE0 9A4=
WDCIronportException: Internal
Received: from dtjcyy2.ad.shared (HELO naota-xeon.wdc.com) ([10.225.48.142]) by uls-op-cesaip02.wdc.com with ESMTP; 20 Jun 2022 23:44:06 -0700
From: Naohiro Aota
To: linux-btrfs@vger.kernel.org
Cc: fdmanana@kernel.org, Naohiro Aota, Filipe Manana
Subject: [PATCH v2 4/4] btrfs: replace unnecessary goto with direct return at cow_file_range()
Date: Tue, 21 Jun 2022 15:41:02 +0900
Message-Id:
X-Mailer: git-send-email 2.35.1
In-Reply-To:
References:
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org

The "goto out;"s in cow_file_range() just result in a simple
"return ret;", which is not really useful. Replace them with proper
direct "return"s. This also makes the success path vs. the failure path
stand out.

Signed-off-by: Naohiro Aota
Reviewed-by: Filipe Manana
---
 fs/btrfs/inode.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 38d8e6d78e77..b9633b35a35f 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -1250,7 +1250,7 @@ static noinline int cow_file_range(struct btrfs_inode *inode, * inline extent or a compressed extent.
*/ unlock_page(locked_page); - goto out; + return 0; } else if (ret < 0) { goto out_unlock; } @@ -1359,8 +1359,7 @@ static noinline int cow_file_range(struct btrfs_inode *inode, if (ret) goto out_unlock; } -out: - return ret; + return 0; out_drop_extent_cache: btrfs_drop_extent_cache(inode, start, start + ram_size - 1, 0); @@ -1418,7 +1417,7 @@ static noinline int cow_file_range(struct btrfs_inode *inode, page_ops); start += cur_alloc_size; if (start >= end) - goto out; + return ret; } /* @@ -1430,7 +1429,7 @@ static noinline int cow_file_range(struct btrfs_inode *inode, extent_clear_unlock_delalloc(inode, start, end, locked_page, clear_bits | EXTENT_CLEAR_DATA_RESV, page_ops); - goto out; + return ret; } /*