From patchwork Mon Jun 6 14:32:48 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Amir Goldstein
X-Patchwork-Id: 12870426
From: Amir Goldstein
To: Greg Kroah-Hartman
Cc: Sasha Levin, Dave Chinner, "Darrick J . Wong", Christoph Hellwig,
	Brian Foster, Christian Brauner, Luis Chamberlain, Leah Rumancik,
	Adam Manzanares, linux-xfs@vger.kernel.org, stable@vger.kernel.org,
	Jeffrey Mitchell
Subject: [PATCH 5.10 v2 1/8] xfs: set inode size after creating symlink
Date: Mon, 6 Jun 2022 17:32:48 +0300
Message-Id: <20220606143255.685988-2-amir73il@gmail.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20220606143255.685988-1-amir73il@gmail.com>
References: <20220606143255.685988-1-amir73il@gmail.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-xfs@vger.kernel.org

From: Jeffrey Mitchell

commit 8aa921a95335d0a8c8e2be35a44467e7c91ec3e4 upstream.

When XFS creates a new symlink, it writes its size to disk but not to
the VFS inode.
This causes i_size_read() to return 0 for that symlink until it is
re-read from disk, for example when the system is rebooted.

I found this inconsistency while protecting directories with eCryptFS.
The command "stat path/to/symlink/in/ecryptfs" will report "Size: 0" if
the symlink was created after the last reboot on an XFS root.

Call i_size_write() in xfs_symlink()

Signed-off-by: Jeffrey Mitchell
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
Reviewed-by: Christoph Hellwig
Reviewed-by: Brian Foster
Signed-off-by: Amir Goldstein
---
 fs/xfs/xfs_symlink.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/xfs/xfs_symlink.c b/fs/xfs/xfs_symlink.c
index 8e88a7ca387e..8d3abf06c54f 100644
--- a/fs/xfs/xfs_symlink.c
+++ b/fs/xfs/xfs_symlink.c
@@ -300,6 +300,7 @@ xfs_symlink(
 		}
 		ASSERT(pathlen == 0);
 	}
+	i_size_write(VFS_I(ip), ip->i_d.di_size);
 
 	/*
 	 * Create the directory entry for the symlink.

From patchwork Mon Jun 6 14:32:49 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Amir Goldstein
X-Patchwork-Id: 12870427
From: Amir Goldstein
To: Greg Kroah-Hartman
Cc: Sasha Levin, Dave Chinner, "Darrick J . Wong", Christoph Hellwig,
	Brian Foster, Christian Brauner, Luis Chamberlain, Leah Rumancik,
	Adam Manzanares, linux-xfs@vger.kernel.org, stable@vger.kernel.org,
	Gao Xiang, Allison Henderson, "Darrick J . Wong", Bill O'Donnell
Subject: [PATCH 5.10 v2 2/8] xfs: sync lazy sb accounting on quiesce of read-only mounts
Date: Mon, 6 Jun 2022 17:32:49 +0300
Message-Id: <20220606143255.685988-3-amir73il@gmail.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20220606143255.685988-1-amir73il@gmail.com>
References: <20220606143255.685988-1-amir73il@gmail.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-xfs@vger.kernel.org

From: Brian Foster

commit 50d25484bebe94320c49dd1347d3330c7063bbdb upstream.

xfs_log_sbcount() syncs the superblock specifically to accumulate the
in-core percpu superblock counters and commit them to disk. This is
required to maintain filesystem consistency across quiesce (freeze,
read-only mount/remount) or unmount when lazy superblock accounting is
enabled because individual transactions do not update the superblock
directly.

This mechanism works as expected for writable mounts, but
xfs_log_sbcount() skips the update for read-only mounts. Read-only
mounts otherwise still allow log recovery and write out an unmount
record during log quiesce. If a read-only mount performs log recovery,
it can modify the in-core superblock counters and write an unmount
record when the filesystem unmounts without ever syncing the in-core
counters. This leaves the filesystem with a clean log but in an
inconsistent state with regard to lazy sb counters.

Update xfs_log_sbcount() to use the same logic xfs_log_unmount_write()
uses to determine when to write an unmount record. This ensures that
lazy accounting is always synced before the log is cleaned. Refactor
this logic into a new helper to distinguish between a writable
filesystem and a writable log. Specifically, the log is writable unless
the filesystem is mounted with the norecovery mount option, the
underlying log device is read-only, or the filesystem is shutdown. Drop
the freeze state check because the update is already allowed during the
freezing process and no context calls this function on an already
frozen fs. Also, retain the shutdown check in xfs_log_unmount_write()
to catch the case where the preceding log force might have triggered a
shutdown.

Signed-off-by: Brian Foster
Reviewed-by: Gao Xiang
Reviewed-by: Allison Henderson
Reviewed-by: Darrick J. Wong
Reviewed-by: Bill O'Donnell
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
Signed-off-by: Amir Goldstein
---
 fs/xfs/xfs_log.c   | 28 ++++++++++++++++++++--------
 fs/xfs/xfs_log.h   |  1 +
 fs/xfs/xfs_mount.c |  3 +--
 3 files changed, 22 insertions(+), 10 deletions(-)

diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
index fa2d05e65ff1..b445e63cbc3c 100644
--- a/fs/xfs/xfs_log.c
+++ b/fs/xfs/xfs_log.c
@@ -347,6 +347,25 @@ xlog_tic_add_region(xlog_ticket_t *tic, uint len, uint type)
 	tic->t_res_num++;
 }
 
+bool
+xfs_log_writable(
+	struct xfs_mount *mp)
+{
+	/*
+	 * Never write to the log on norecovery mounts, if the block device is
+	 * read-only, or if the filesystem is shutdown. Read-only mounts still
+	 * allow internal writes for log recovery and unmount purposes, so don't
+	 * restrict that case here.
+	 */
+	if (mp->m_flags & XFS_MOUNT_NORECOVERY)
+		return false;
+	if (xfs_readonly_buftarg(mp->m_log->l_targ))
+		return false;
+	if (XFS_FORCED_SHUTDOWN(mp))
+		return false;
+	return true;
+}
+
 /*
  * Replenish the byte reservation required by moving the grant write head.
  */
@@ -886,15 +905,8 @@ xfs_log_unmount_write(
 {
 	struct xlog *log = mp->m_log;
 
-	/*
-	 * Don't write out unmount record on norecovery mounts or ro devices.
-	 * Or, if we are doing a forced umount (typically because of IO errors).
-	 */
-	if (mp->m_flags & XFS_MOUNT_NORECOVERY ||
-	    xfs_readonly_buftarg(log->l_targ)) {
-		ASSERT(mp->m_flags & XFS_MOUNT_RDONLY);
+	if (!xfs_log_writable(mp))
 		return;
-	}
 
 	xfs_log_force(mp, XFS_LOG_SYNC);
 
diff --git a/fs/xfs/xfs_log.h b/fs/xfs/xfs_log.h
index 58c3fcbec94a..98c913da7587 100644
--- a/fs/xfs/xfs_log.h
+++ b/fs/xfs/xfs_log.h
@@ -127,6 +127,7 @@ int xfs_log_reserve(struct xfs_mount *mp,
 int xfs_log_regrant(struct xfs_mount *mp, struct xlog_ticket *tic);
 void xfs_log_unmount(struct xfs_mount *mp);
 int xfs_log_force_umount(struct xfs_mount *mp, int logerror);
+bool xfs_log_writable(struct xfs_mount *mp);
 
 struct xlog_ticket *xfs_log_ticket_get(struct xlog_ticket *ticket);
 void xfs_log_ticket_put(struct xlog_ticket *ticket);
diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index 7110507a2b6b..a62b8a574409 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -1176,8 +1176,7 @@ xfs_fs_writable(
 int
 xfs_log_sbcount(xfs_mount_t *mp)
 {
-	/* allow this to proceed during the freeze sequence... */
-	if (!xfs_fs_writable(mp, SB_FREEZE_COMPLETE))
+	if (!xfs_log_writable(mp))
 		return 0;
 
 	/*

From patchwork Mon Jun 6 14:32:50 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Amir Goldstein
X-Patchwork-Id: 12870428
From: Amir Goldstein
To: Greg Kroah-Hartman
Cc: Sasha Levin, Dave Chinner, "Darrick J . Wong", Christoph Hellwig,
	Brian Foster, Christian Brauner, Luis Chamberlain, Leah Rumancik,
	Adam Manzanares, linux-xfs@vger.kernel.org, stable@vger.kernel.org
Subject: [PATCH 5.10 v2 3/8] xfs: fix chown leaking delalloc quota blocks when fssetxattr fails
Date: Mon, 6 Jun 2022 17:32:50 +0300
Message-Id: <20220606143255.685988-4-amir73il@gmail.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20220606143255.685988-1-amir73il@gmail.com>
References: <20220606143255.685988-1-amir73il@gmail.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-xfs@vger.kernel.org

From: "Darrick J. Wong"

commit 1aecf3734a95f3c167d1495550ca57556d33f7ec upstream.

While refactoring the quota code to create a function to allocate inode
change transactions, I noticed that xfs_qm_vop_chown_reserve does more
than just make reservations: it also *modifies* the incore counts
directly to handle the owner id change for the delalloc blocks.

I then observed that the fssetxattr code continues validating input
arguments after making the quota reservation but before dirtying the
transaction. If the routine decides to error out, it fails to undo the
accounting switch! This leads to incorrect quota reservation and
failure down the line.

We can fix this by making the reservation function do only that -- for
the new dquot, it reserves ondisk and delalloc blocks to the
transaction, and the old dquot hangs on to its incore reservation for
now. Once we actually switch the dquots, we can then update the incore
reservations because we've dirtied the transaction and it's too late to
turn back now.

No fixes tag because this has been broken since the start of git.

Signed-off-by: Darrick J. Wong
Reviewed-by: Christoph Hellwig
Reviewed-by: Brian Foster
Signed-off-by: Amir Goldstein
---
 fs/xfs/xfs_qm.c | 92 +++++++++++++++++++------------------------------
 1 file changed, 35 insertions(+), 57 deletions(-)

diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
index b2a9abee8b2b..64e5da33733b 100644
--- a/fs/xfs/xfs_qm.c
+++ b/fs/xfs/xfs_qm.c
@@ -1785,6 +1785,29 @@ xfs_qm_vop_chown(
 	xfs_trans_mod_dquot(tp, newdq, bfield, ip->i_d.di_nblocks);
 	xfs_trans_mod_dquot(tp, newdq, XFS_TRANS_DQ_ICOUNT, 1);
 
+	/*
+	 * Back when we made quota reservations for the chown, we reserved the
+	 * ondisk blocks + delalloc blocks with the new dquot. Now that we've
+	 * switched the dquots, decrease the new dquot's block reservation
+	 * (having already bumped up the real counter) so that we don't have
+	 * any reservation to give back when we commit.
+	 */
+	xfs_trans_mod_dquot(tp, newdq, XFS_TRANS_DQ_RES_BLKS,
+			-ip->i_delayed_blks);
+
+	/*
+	 * Give the incore reservation for delalloc blocks back to the old
+	 * dquot. We don't normally handle delalloc quota reservations
+	 * transactionally, so just lock the dquot and subtract from the
+	 * reservation. Dirty the transaction because it's too late to turn
+	 * back now.
+	 */
+	tp->t_flags |= XFS_TRANS_DIRTY;
+	xfs_dqlock(prevdq);
+	ASSERT(prevdq->q_blk.reserved >= ip->i_delayed_blks);
+	prevdq->q_blk.reserved -= ip->i_delayed_blks;
+	xfs_dqunlock(prevdq);
+
 	/*
 	 * Take an extra reference, because the inode is going to keep
 	 * this dquot pointer even after the trans_commit.
@@ -1807,84 +1830,39 @@ xfs_qm_vop_chown_reserve(
 	uint flags)
 {
 	struct xfs_mount *mp = ip->i_mount;
-	uint64_t delblks;
 	unsigned int blkflags;
-	struct xfs_dquot *udq_unres = NULL;
-	struct xfs_dquot *gdq_unres = NULL;
-	struct xfs_dquot *pdq_unres = NULL;
 	struct xfs_dquot *udq_delblks = NULL;
 	struct xfs_dquot *gdq_delblks = NULL;
 	struct xfs_dquot *pdq_delblks = NULL;
-	int error;
 
-
 	ASSERT(xfs_isilocked(ip, XFS_ILOCK_EXCL|XFS_ILOCK_SHARED));
 	ASSERT(XFS_IS_QUOTA_RUNNING(mp));
 
-	delblks = ip->i_delayed_blks;
 	blkflags = XFS_IS_REALTIME_INODE(ip) ?
 			XFS_QMOPT_RES_RTBLKS : XFS_QMOPT_RES_REGBLKS;
 
 	if (XFS_IS_UQUOTA_ON(mp) && udqp &&
-	    i_uid_read(VFS_I(ip)) != udqp->q_id) {
+	    i_uid_read(VFS_I(ip)) != udqp->q_id)
 		udq_delblks = udqp;
-		/*
-		 * If there are delayed allocation blocks, then we have to
-		 * unreserve those from the old dquot, and add them to the
-		 * new dquot.
-		 */
-		if (delblks) {
-			ASSERT(ip->i_udquot);
-			udq_unres = ip->i_udquot;
-		}
-	}
+
 	if (XFS_IS_GQUOTA_ON(ip->i_mount) && gdqp &&
-	    i_gid_read(VFS_I(ip)) != gdqp->q_id) {
+	    i_gid_read(VFS_I(ip)) != gdqp->q_id)
 		gdq_delblks = gdqp;
-		if (delblks) {
-			ASSERT(ip->i_gdquot);
-			gdq_unres = ip->i_gdquot;
-		}
-	}
 
 	if (XFS_IS_PQUOTA_ON(ip->i_mount) && pdqp &&
-	    ip->i_d.di_projid != pdqp->q_id) {
+	    ip->i_d.di_projid != pdqp->q_id)
 		pdq_delblks = pdqp;
-		if (delblks) {
-			ASSERT(ip->i_pdquot);
-			pdq_unres = ip->i_pdquot;
-		}
-	}
-
-	error = xfs_trans_reserve_quota_bydquots(tp, ip->i_mount,
-				udq_delblks, gdq_delblks, pdq_delblks,
-				ip->i_d.di_nblocks, 1, flags | blkflags);
-	if (error)
-		return error;
 
 	/*
-	 * Do the delayed blks reservations/unreservations now. Since, these
-	 * are done without the help of a transaction, if a reservation fails
-	 * its previous reservations won't be automatically undone by trans
-	 * code. So, we have to do it manually here.
+	 * Reserve enough quota to handle blocks on disk and reserved for a
+	 * delayed allocation. We'll actually transfer the delalloc
+	 * reservation between dquots at chown time, even though that part is
+	 * only semi-transactional.
 	 */
-	if (delblks) {
-		/*
-		 * Do the reservations first. Unreservation can't fail.
-		 */
-		ASSERT(udq_delblks || gdq_delblks || pdq_delblks);
-		ASSERT(udq_unres || gdq_unres || pdq_unres);
-		error = xfs_trans_reserve_quota_bydquots(NULL, ip->i_mount,
-			    udq_delblks, gdq_delblks, pdq_delblks,
-			    (xfs_qcnt_t)delblks, 0, flags | blkflags);
-		if (error)
-			return error;
-		xfs_trans_reserve_quota_bydquots(NULL, ip->i_mount,
-				udq_unres, gdq_unres, pdq_unres,
-				-((xfs_qcnt_t)delblks), 0, blkflags);
-	}
-
-	return 0;
+	return xfs_trans_reserve_quota_bydquots(tp, ip->i_mount, udq_delblks,
+			gdq_delblks, pdq_delblks,
+			ip->i_d.di_nblocks + ip->i_delayed_blks,
+			1, blkflags | flags);
 }
 
 int

From patchwork Mon Jun 6 14:32:51 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Amir Goldstein
X-Patchwork-Id: 12870429
From: Amir Goldstein
To: Greg Kroah-Hartman
Cc: Sasha Levin, Dave Chinner, "Darrick J . Wong", Christoph Hellwig,
	Brian Foster, Christian Brauner, Luis Chamberlain, Leah Rumancik,
	Adam Manzanares, linux-xfs@vger.kernel.org, stable@vger.kernel.org,
	Chandan Babu R
Subject: [PATCH 5.10 v2 4/8] xfs: fix incorrect root dquot corruption error when switching group/project quota types
Date: Mon, 6 Jun 2022 17:32:51 +0300
Message-Id: <20220606143255.685988-5-amir73il@gmail.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20220606143255.685988-1-amir73il@gmail.com>
References: <20220606143255.685988-1-amir73il@gmail.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-xfs@vger.kernel.org

From: "Darrick J. Wong"

commit 45068063efb7dd0a8d115c106aa05d9ab0946257 upstream.

While writing up a regression test for broken behavior when a chprojid
request fails, I noticed that we were logging corruption notices about
the root dquot of the group/project quota file at mount time when
testing V4 filesystems.

In commit afeda6000b0c, I was trying to improve ondisk dquot validation
by making sure that when we load an ondisk dquot into memory on behalf
of an incore dquot, the dquot id and type matches. Unfortunately, I
forgot that V4 filesystems only have two quota files, and can switch
that file between group and project quota types at mount time. When we
perform that switch, we'll try to load the default quota limits from
the root dquot prior to running quotacheck and log a corruption error
when the types don't match.

This is inconsequential because quotacheck will reset the second quota
file as part of doing the switch, but we shouldn't leave scary messages
in the kernel log.

Fixes: afeda6000b0c ("xfs: validate ondisk/incore dquot flags")
Signed-off-by: Darrick J. Wong
Reviewed-by: Brian Foster
Reviewed-by: Chandan Babu R
Signed-off-by: Amir Goldstein
---
 fs/xfs/xfs_dquot.c | 39 +++++++++++++++++++++++++++++++++++++--
 1 file changed, 37 insertions(+), 2 deletions(-)

diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c
index 1d95ed387d66..80c4579d6835 100644
--- a/fs/xfs/xfs_dquot.c
+++ b/fs/xfs/xfs_dquot.c
@@ -500,6 +500,42 @@ xfs_dquot_alloc(
 	return dqp;
 }
 
+/* Check the ondisk dquot's id and type match what the incore dquot expects. */
+static bool
+xfs_dquot_check_type(
+	struct xfs_dquot *dqp,
+	struct xfs_disk_dquot *ddqp)
+{
+	uint8_t ddqp_type;
+	uint8_t dqp_type;
+
+	ddqp_type = ddqp->d_type & XFS_DQTYPE_REC_MASK;
+	dqp_type = xfs_dquot_type(dqp);
+
+	if (be32_to_cpu(ddqp->d_id) != dqp->q_id)
+		return false;
+
+	/*
+	 * V5 filesystems always expect an exact type match. V4 filesystems
+	 * expect an exact match for user dquots and for non-root group and
+	 * project dquots.
+	 */
+	if (xfs_sb_version_hascrc(&dqp->q_mount->m_sb) ||
+	    dqp_type == XFS_DQTYPE_USER || dqp->q_id != 0)
+		return ddqp_type == dqp_type;
+
+	/*
+	 * V4 filesystems support either group or project quotas, but not both
+	 * at the same time. The non-user quota file can be switched between
+	 * group and project quota uses depending on the mount options, which
+	 * means that we can encounter the other type when we try to load quota
+	 * defaults. Quotacheck will soon reset the the entire quota file
+	 * (including the root dquot) anyway, but don't log scary corruption
+	 * reports to dmesg.
+	 */
+	return ddqp_type == XFS_DQTYPE_GROUP || ddqp_type == XFS_DQTYPE_PROJ;
+}
+
 /* Copy the in-core quota fields in from the on-disk buffer. */
 STATIC int
 xfs_dquot_from_disk(
@@ -512,8 +548,7 @@ xfs_dquot_from_disk(
 	 * Ensure that we got the type and ID we were looking for.
 	 * Everything else was checked by the dquot buffer verifier.
*/ - if ((ddqp->d_type & XFS_DQTYPE_REC_MASK) != xfs_dquot_type(dqp) || - be32_to_cpu(ddqp->d_id) != dqp->q_id) { + if (!xfs_dquot_check_type(dqp, ddqp)) { xfs_alert_tag(bp->b_mount, XFS_PTAG_VERIFIER_ERROR, "Metadata corruption detected at %pS, quota %u", __this_address, dqp->q_id); From patchwork Mon Jun 6 14:32:52 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Amir Goldstein X-Patchwork-Id: 12870430 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id F2B32CCA482 for ; Mon, 6 Jun 2022 14:33:14 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239766AbiFFOdO (ORCPT ); Mon, 6 Jun 2022 10:33:14 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51630 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239749AbiFFOdN (ORCPT ); Mon, 6 Jun 2022 10:33:13 -0400 Received: from mail-wm1-x32a.google.com (mail-wm1-x32a.google.com [IPv6:2a00:1450:4864:20::32a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2F6412B248; Mon, 6 Jun 2022 07:33:12 -0700 (PDT) Received: by mail-wm1-x32a.google.com with SMTP id l2-20020a05600c1d0200b0039c35ef94c4so5902227wms.4; Mon, 06 Jun 2022 07:33:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=SzyQsb1o4fSk85fpWvWY8/HRYw+PK4Y1RJ618LPBXOE=; b=T5CRrkmg3r1UzZG2LLjrDRF/l413u2riFgZL8u86FhGjCdozXKk/HYAuRGaOlB4Fol jfwsZ86tRuox/xpaMePq/T5DwznaWubv7whYvT13uSeGC9gNRJ71Bu1+ytkmoSG0lpcO bhNL/3ejdmfsi4gnQIHaXHFwlFaxnjK66X6BGzi68pWImdQnA40n60M1+n/AgcPme7hL 60dJMVG58x1SNdn82ToW6mOQL5TRfNQSGuauYW/VQPDljWU3SN5jHNO1EjEqIejcFSIK 
6cgV6AFBT6XWWm/ubTUQWDKWHC+kFY7dF3bUhcFeGf1zSmNBEGxZqSVFdvGja6Yn7j6F eUfA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=SzyQsb1o4fSk85fpWvWY8/HRYw+PK4Y1RJ618LPBXOE=; b=n6y4F+FZnvCXpDlfB/8NMcbEjPAY+pHkmLOYpuknjILSnwWm0e3YUboGS6oklzgoED D4Zx3/lKEqDWKTD08+O+ma+toGPAoOmFiu7CmvzxEbqJZvGBA2pzRGQdM+wIHY7wuPFY XSGm51RFTNFbTtcxYCLcr9QBANidRsfZbxGy8WaLZsvx2UB4C5AuZDWVhSTqGFPMyOaJ XABdxvlMPvwgQ0sJLwZlH+hzRd8VwjB+UfzoRE8hav0pB+PbbygK7osovkcVJ1iKO2/8 fbE3WMTonrfPWTBkH4Dqb1357AnURYr8oU/UL8wIH+4wJOpDT5LdEcQcxl8Q08CWXZAD Q2cg== X-Gm-Message-State: AOAM532hudFGnOk2RBSkadBna4nx495pgOmj5HIeDKjG8TXBLkrDCtdB btjiMRteES8XHvnCPBkjxOQ= X-Google-Smtp-Source: ABdhPJxU0b1GxwHBraCdgM9PVN1ZncTUb4oO3fgL83YJfb6c1ByhYEyN2dSYBm/Cg0S2jShQuyHAlw== X-Received: by 2002:a7b:c758:0:b0:39c:44ce:f00f with SMTP id w24-20020a7bc758000000b0039c44cef00fmr13674785wmk.167.1654525990476; Mon, 06 Jun 2022 07:33:10 -0700 (PDT) Received: from amir-ThinkPad-T480.ctera.local (bzq-166-168-31-246.red.bezeqint.net. [31.168.166.246]) by smtp.gmail.com with ESMTPSA id h24-20020a05600c145800b0039c54bb28f2sm1622958wmi.36.2022.06.06.07.33.08 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 06 Jun 2022 07:33:09 -0700 (PDT) From: Amir Goldstein To: Greg Kroah-Hartman Cc: Sasha Levin , Dave Chinner , "Darrick J . 
Wong" , Christoph Hellwig , Brian Foster , Christian Brauner , Luis Chamberlain , Leah Rumancik , Adam Manzanares , linux-xfs@vger.kernel.org, stable@vger.kernel.org, Eric Sandeen
Subject: [PATCH 5.10 v2 5/8] xfs: restore shutdown check in mapped write fault path
Date: Mon, 6 Jun 2022 17:32:52 +0300
Message-Id: <20220606143255.685988-6-amir73il@gmail.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20220606143255.685988-1-amir73il@gmail.com>
References: <20220606143255.685988-1-amir73il@gmail.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-xfs@vger.kernel.org

From: Brian Foster

commit e4826691cc7e5458bcb659935d0092bcf3f08c20 upstream.

XFS triggers an iomap warning in the write fault path due to a
!PageUptodate() page if a write fault happens to occur on a page
that recently failed writeback. The iomap writeback error handling
code can clear the Uptodate flag if no portion of the page is
submitted for I/O. This is reproduced by fstest generic/019, which
combines various forms of I/O with simulated disk failures that
inevitably lead to filesystem shutdown (which then unconditionally
fails page writeback).

This is a regression introduced by commit f150b4234397 ("xfs: split
the iomap ops for buffered vs direct writes") due to the removal of
a shutdown check and explicit error return in the ->iomap_begin()
path used by the write fault path. The explicit error return
historically translated to a SIGBUS, but now carries on with iomap
processing where it complains about the unexpected state. Restore
the shutdown check to xfs_buffered_write_iomap_begin() to restore
historical behavior.

Fixes: f150b4234397 ("xfs: split the iomap ops for buffered vs direct writes")
Signed-off-by: Brian Foster
Reviewed-by: Eric Sandeen
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
Signed-off-by: Amir Goldstein
---
 fs/xfs/xfs_iomap.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index 7b9ff824e82d..74bc2beadc23 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -870,6 +870,9 @@ xfs_buffered_write_iomap_begin(
 	int			allocfork = XFS_DATA_FORK;
 	int			error = 0;
 
+	if (XFS_FORCED_SHUTDOWN(mp))
+		return -EIO;
+
 	/* we can't use delayed allocations when using extent size hints */
 	if (xfs_get_extsz_hint(ip))
 		return xfs_direct_write_iomap_begin(inode, offset, count,

From patchwork Mon Jun 6 14:32:53 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Amir Goldstein
X-Patchwork-Id: 12870431
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 932E6C43334 for ; Mon, 6 Jun 2022 14:33:16 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239749AbiFFOdP (ORCPT ); Mon, 6 Jun 2022 10:33:15 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51750 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239732AbiFFOdP (ORCPT ); Mon, 6 Jun 2022 10:33:15 -0400
Received: from mail-wm1-x32e.google.com (mail-wm1-x32e.google.com [IPv6:2a00:1450:4864:20::32e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A19952B26A; Mon, 6 Jun 2022 07:33:13 -0700 (PDT)
Received: by mail-wm1-x32e.google.com with SMTP id l2-20020a05600c1d0200b0039c35ef94c4so5902275wms.4; Mon, 06 Jun 2022 07:33:13 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=b2pvl9UHk0Z4rkTZyge+oGFzFbTSQ4HA4DjZnHNbMzc=; b=g8urBqKX4RBTuND0IYv3jK3ybEgxB+uZRdqw6s9x3R7/5VQJYkMMNvF9wNKJNjlpp9
VozOfbUy5Dg1tdnRaqQwPi2EAMNq1uNRupobKHXlLjVD+niwu5NnuF8ZTiVsl7+Gr0a1 1XqST1Pl9U5P+C8xN1WcLV0msfNNI1MNhQkY3QFLzywgF4RYygXM6596QQPUxC1kxaE3 D6dGLK9xp7n83BJDfIjX1UgFCDphJ/ajLDy+7LG0PX7zYj5EyF+ZJyc3SiAOBooQI2HQ aYrMbw+zahFSrg4RAYAa3OhyWX3u9sUeRj57q21motqpEBRT2Z5khAR5LuaGFxjYbNZw PRyQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=b2pvl9UHk0Z4rkTZyge+oGFzFbTSQ4HA4DjZnHNbMzc=; b=6d8SVDW+4Cku6YnRs07ANWZyfEIfBxDaGT3C6TTNvoOJtmsnEjvEnrpaQMufiBrqsd Xpp2ouZgcjp1AESc9HBn4xc4P34y0hP4cF+tejEPGlEyltG3ZeCWKxgFgQnmK0dW+qTw rbtmKrc+LYM3it/A2jVudthMTwVzSCENkBU7mPtISY/JBMHVjw55C+81Zv/kjG9J/AGb b0VJdG30QpgCGp+u192l/D3YfZvod0YjEZV6fwG41DOrNLFLiWmDHx4wR4A6LP2duKQm aq/AkGH6GfdrjMkp9tQwQsH611dz2e1UCYffloGhTDTvgsdvclG987GZ+3DJ9hs/Bq2L MfLw== X-Gm-Message-State: AOAM5321AmV/UZ3nb9Thb2WVNbhkNusWZBFTX/j7uzWxJlB+BwBtb1Wn APAVfw6r6nDUZxOsr1TJdg8= X-Google-Smtp-Source: ABdhPJwbzjT61Mfy/LXqd9xaTaRZrjZCrBtwYWm4KRbnpFOnKqPKK+KChB85Fzms7yz0jfcWaCO2Ew== X-Received: by 2002:a05:600c:ac4:b0:39c:4f54:9c5f with SMTP id c4-20020a05600c0ac400b0039c4f549c5fmr5223211wmr.135.1654525992079; Mon, 06 Jun 2022 07:33:12 -0700 (PDT) Received: from amir-ThinkPad-T480.ctera.local (bzq-166-168-31-246.red.bezeqint.net. [31.168.166.246]) by smtp.gmail.com with ESMTPSA id h24-20020a05600c145800b0039c54bb28f2sm1622958wmi.36.2022.06.06.07.33.10 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 06 Jun 2022 07:33:11 -0700 (PDT) From: Amir Goldstein To: Greg Kroah-Hartman Cc: Sasha Levin , Dave Chinner , "Darrick J . 
Wong" , Christoph Hellwig , Brian Foster , Christian Brauner , Luis Chamberlain , Leah Rumancik , Adam Manzanares , linux-xfs@vger.kernel.org, stable@vger.kernel.org, Dave Chinner
Subject: [PATCH 5.10 v2 6/8] xfs: force log and push AIL to clear pinned inodes when aborting mount
Date: Mon, 6 Jun 2022 17:32:53 +0300
Message-Id: <20220606143255.685988-7-amir73il@gmail.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20220606143255.685988-1-amir73il@gmail.com>
References: <20220606143255.685988-1-amir73il@gmail.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-xfs@vger.kernel.org

From: "Darrick J. Wong"

commit d336f7ebc65007f5831e2297e6f3383ae8dbf8ed upstream.

If we allocate quota inodes in the process of mounting a filesystem but
then decide to abort the mount, it's possible that the quota inodes are
sitting around pinned by the log. Now that inode reclaim relies on the
AIL to flush inodes, we have to force the log and push the AIL in
between releasing the quota inodes and kicking off reclaim to tear down
all the incore inodes. Do this by extracting the bits we need from the
unmount path and reusing them. As an added bonus, failed writes during
a failed mount will not retry forever now.

This was originally found during a fuzz test of metadata directories
(xfs/1546), but the actual symptom was that reclaim hung up on the quota
inodes.

Signed-off-by: Darrick J. Wong
Reviewed-by: Christoph Hellwig
Reviewed-by: Dave Chinner
Signed-off-by: Amir Goldstein
---
 fs/xfs/xfs_mount.c | 90 +++++++++++++++++++++++-----------------------
 1 file changed, 44 insertions(+), 46 deletions(-)

diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index a62b8a574409..44b05e1d5d32 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -631,6 +631,47 @@ xfs_check_summary_counts(
 	return xfs_initialize_perag_data(mp, mp->m_sb.sb_agcount);
 }
 
+/*
+ * Flush and reclaim dirty inodes in preparation for unmount. Inodes and
+ * internal inode structures can be sitting in the CIL and AIL at this point,
+ * so we need to unpin them, write them back and/or reclaim them before unmount
+ * can proceed.
+ *
+ * An inode cluster that has been freed can have its buffer still pinned in
+ * memory because the transaction is still sitting in a iclog. The stale inodes
+ * on that buffer will be pinned to the buffer until the transaction hits the
+ * disk and the callbacks run. Pushing the AIL will skip the stale inodes and
+ * may never see the pinned buffer, so nothing will push out the iclog and
+ * unpin the buffer.
+ *
+ * Hence we need to force the log to unpin everything first. However, log
+ * forces don't wait for the discards they issue to complete, so we have to
+ * explicitly wait for them to complete here as well.
+ *
+ * Then we can tell the world we are unmounting so that error handling knows
+ * that the filesystem is going away and we should error out anything that we
+ * have been retrying in the background. This will prevent never-ending
+ * retries in AIL pushing from hanging the unmount.
+ *
+ * Finally, we can push the AIL to clean all the remaining dirty objects, then
+ * reclaim the remaining inodes that are still in memory at this point in time.
+ */
+static void
+xfs_unmount_flush_inodes(
+	struct xfs_mount	*mp)
+{
+	xfs_log_force(mp, XFS_LOG_SYNC);
+	xfs_extent_busy_wait_all(mp);
+	flush_workqueue(xfs_discard_wq);
+
+	mp->m_flags |= XFS_MOUNT_UNMOUNTING;
+
+	xfs_ail_push_all_sync(mp->m_ail);
+	cancel_delayed_work_sync(&mp->m_reclaim_work);
+	xfs_reclaim_inodes(mp);
+	xfs_health_unmount(mp);
+}
+
 /*
  * This function does the following on an initial mount of a file system:
  *	- reads the superblock from disk and init the mount struct
@@ -1005,7 +1046,7 @@ xfs_mountfs(
 	/* Clean out dquots that might be in memory after quotacheck. */
 	xfs_qm_unmount(mp);
 	/*
-	 * Cancel all delayed reclaim work and reclaim the inodes directly.
+	 * Flush all inode reclamation work and flush the log.
 	 * We have to do this /after/ rtunmount and qm_unmount because those
 	 * two will have scheduled delayed reclaim for the rt/quota inodes.
 	 *
@@ -1015,11 +1056,8 @@ xfs_mountfs(
 	 * qm_unmount_quotas and therefore rely on qm_unmount to release the
 	 * quota inodes.
 	 */
-	cancel_delayed_work_sync(&mp->m_reclaim_work);
-	xfs_reclaim_inodes(mp);
-	xfs_health_unmount(mp);
+	xfs_unmount_flush_inodes(mp);
  out_log_dealloc:
-	mp->m_flags |= XFS_MOUNT_UNMOUNTING;
 	xfs_log_mount_cancel(mp);
  out_fail_wait:
 	if (mp->m_logdev_targp && mp->m_logdev_targp != mp->m_ddev_targp)
@@ -1060,47 +1098,7 @@ xfs_unmountfs(
 	xfs_rtunmount_inodes(mp);
 	xfs_irele(mp->m_rootip);
 
-	/*
-	 * We can potentially deadlock here if we have an inode cluster
-	 * that has been freed has its buffer still pinned in memory because
-	 * the transaction is still sitting in a iclog. The stale inodes
-	 * on that buffer will be pinned to the buffer until the
-	 * transaction hits the disk and the callbacks run. Pushing the AIL will
-	 * skip the stale inodes and may never see the pinned buffer, so
-	 * nothing will push out the iclog and unpin the buffer. Hence we
-	 * need to force the log here to ensure all items are flushed into the
-	 * AIL before we go any further.
-	 */
-	xfs_log_force(mp, XFS_LOG_SYNC);
-
-	/*
-	 * Wait for all busy extents to be freed, including completion of
-	 * any discard operation.
-	 */
-	xfs_extent_busy_wait_all(mp);
-	flush_workqueue(xfs_discard_wq);
-
-	/*
-	 * We now need to tell the world we are unmounting. This will allow
-	 * us to detect that the filesystem is going away and we should error
-	 * out anything that we have been retrying in the background. This will
-	 * prevent neverending retries in AIL pushing from hanging the unmount.
-	 */
-	mp->m_flags |= XFS_MOUNT_UNMOUNTING;
-
-	/*
-	 * Flush all pending changes from the AIL.
-	 */
-	xfs_ail_push_all_sync(mp->m_ail);
-
-	/*
-	 * Reclaim all inodes. At this point there should be no dirty inodes and
-	 * none should be pinned or locked. Stop background inode reclaim here
-	 * if it is still running.
-	 */
-	cancel_delayed_work_sync(&mp->m_reclaim_work);
-	xfs_reclaim_inodes(mp);
-	xfs_health_unmount(mp);
+	xfs_unmount_flush_inodes(mp);
 
 	xfs_qm_unmount(mp);
 

From patchwork Mon Jun 6 14:32:54 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Amir Goldstein
X-Patchwork-Id: 12870432
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D8FA1CCA481 for ; Mon, 6 Jun 2022 14:33:17 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239768AbiFFOdQ (ORCPT ); Mon, 6 Jun 2022 10:33:16 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51818 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239732AbiFFOdQ (ORCPT ); Mon, 6 Jun 2022 10:33:16 -0400
Received: from mail-wm1-x32a.google.com (mail-wm1-x32a.google.com [IPv6:2a00:1450:4864:20::32a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 46C3D2BB1B; Mon, 6 Jun 2022 07:33:15 -0700 (PDT)
Received: by mail-wm1-x32a.google.com with SMTP id d5-20020a05600c34c500b0039776acee62so7262809wmq.1; Mon, 06 Jun 2022 07:33:15 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=H8xbwTR16tvG4U/ddVZgdCLGD5EiTHqSz9fp9YwAxOk=; b=FXdeYNKMfp+gJSwO9mbfKLVfWObqC6NgNrZ7zFrCml4j+L8qsc+wNZzX7T7sFz0vh/ nHnf4e1tVkgF5QG+5PS5uoAWch4HIBhNUo0gW19IfIof0ThXbauTxVH/3WjhEdj02Nei oHz+TP5WsYJlOZEslT1yqyBemXzOgwIMWZf6gqwzy/o6SWzatmdP02EZcvmq4pJwb8q7 vcM+qvmYUzKvSMko6bsxn25d4RJoPAznVc/8vyNnht+DVLbIXHCxQ5Pj4dtohGvXWEih
5N5ph5SpAkb0DVqe0yFWCY1/7uCtX2PPKQFMlkaVSmx8lX4V+P/Nq1A9OHeA/DOeDgiR 9kYg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=H8xbwTR16tvG4U/ddVZgdCLGD5EiTHqSz9fp9YwAxOk=; b=Ll3LgZuvUH/ry62Xt7YjxwOKe43em8RkvUQl8BcXzXAxT+zQJAoVTN8UHGIa9imLW/ Y4VW3PO8L/KGLxuQc4ks8p36Gr8z/7ChHyxTa6bGfK5a4dkcMgicTEjV5zGhrkB25AUB r0ssin9+XioCspxisaNE5/+fAFxq0Fc3M2qU8AiskmwyTbG/Pyszh2hJFUu+9GgAl2rT xikoBmoqEQMBNcVloAq9Z/lVsD7Lfqqlc8lGEz2g64/FqtAJ5tNpY31nMFmKkyOsWo0n 7DYkt4z/WxqbwN5yQ6hlDSY9KrWmUMvmA3wRv+crDFNdxZdiRMmdOG/sFrKDBEZ97Wma h4Zg== X-Gm-Message-State: AOAM531l97Pj1IZ0t9R+mdwlFKlZqme8pEomjamlkVZ4CCQxmmVAAtmr YFA4DYmwrQgFoK6Ut/YchJk= X-Google-Smtp-Source: ABdhPJyaTYvxYliaA9PJPhTp779qgoAR8pWs0axIijaQA40qR6W1oUXK1V27x3BigtBSNzS5KhyVTQ== X-Received: by 2002:a05:600c:354a:b0:39c:4ebf:fb4c with SMTP id i10-20020a05600c354a00b0039c4ebffb4cmr6121926wmq.142.1654525993737; Mon, 06 Jun 2022 07:33:13 -0700 (PDT) Received: from amir-ThinkPad-T480.ctera.local (bzq-166-168-31-246.red.bezeqint.net. [31.168.166.246]) by smtp.gmail.com with ESMTPSA id h24-20020a05600c145800b0039c54bb28f2sm1622958wmi.36.2022.06.06.07.33.12 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 06 Jun 2022 07:33:13 -0700 (PDT) From: Amir Goldstein To: Greg Kroah-Hartman Cc: Sasha Levin , Dave Chinner , "Darrick J . 
Wong" , Christoph Hellwig , Brian Foster , Christian Brauner , Luis Chamberlain , Leah Rumancik , Adam Manzanares , linux-xfs@vger.kernel.org, stable@vger.kernel.org
Subject: [PATCH 5.10 v2 7/8] xfs: consider shutdown in bmapbt cursor delete assert
Date: Mon, 6 Jun 2022 17:32:54 +0300
Message-Id: <20220606143255.685988-8-amir73il@gmail.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20220606143255.685988-1-amir73il@gmail.com>
References: <20220606143255.685988-1-amir73il@gmail.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-xfs@vger.kernel.org

From: Brian Foster

commit 1cd738b13ae9b29e03d6149f0246c61f76e81fcf upstream.

The assert in xfs_btree_del_cursor() checks that the bmapbt block
allocation field has been handled correctly before the cursor is
freed. This field is used for accurate calculation of indirect block
reservation requirements (for delayed allocations), for example.
generic/019 reproduces a scenario where this assert fails because
the filesystem has shutdown while in the middle of a bmbt record
insertion. This occurs after a bmbt block has been allocated via the
cursor but before the higher level bmap function (i.e.
xfs_bmap_add_extent_hole_real()) completes and resets the field.

Update the assert to accommodate the transient state if the
filesystem has shutdown. While here, clean up the indentation and
comments in the function.

Signed-off-by: Brian Foster
Reviewed-by: Darrick J. Wong
Signed-off-by: Darrick J. Wong
Signed-off-by: Amir Goldstein
---
 fs/xfs/libxfs/xfs_btree.c | 33 ++++++++++++---------------------
 1 file changed, 12 insertions(+), 21 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c
index 2d25bab68764..9f9f9feccbcd 100644
--- a/fs/xfs/libxfs/xfs_btree.c
+++ b/fs/xfs/libxfs/xfs_btree.c
@@ -353,20 +353,17 @@ xfs_btree_free_block(
  */
 void
 xfs_btree_del_cursor(
-	xfs_btree_cur_t	*cur,		/* btree cursor */
-	int		error)		/* del because of error */
+	struct xfs_btree_cur	*cur,		/* btree cursor */
+	int			error)		/* del because of error */
 {
-	int		i;		/* btree level */
+	int			i;		/* btree level */
 
 	/*
-	 * Clear the buffer pointers, and release the buffers.
-	 * If we're doing this in the face of an error, we
-	 * need to make sure to inspect all of the entries
-	 * in the bc_bufs array for buffers to be unlocked.
-	 * This is because some of the btree code works from
-	 * level n down to 0, and if we get an error along
-	 * the way we won't have initialized all the entries
-	 * down to 0.
+	 * Clear the buffer pointers and release the buffers. If we're doing
+	 * this because of an error, inspect all of the entries in the bc_bufs
+	 * array for buffers to be unlocked. This is because some of the btree
+	 * code works from level n down to 0, and if we get an error along the
+	 * way we won't have initialized all the entries down to 0.
 	 */
 	for (i = 0; i < cur->bc_nlevels; i++) {
 		if (cur->bc_bufs[i])
@@ -374,17 +371,11 @@ xfs_btree_del_cursor(
 		else if (!error)
 			break;
 	}
-	/*
-	 * Can't free a bmap cursor without having dealt with the
-	 * allocated indirect blocks' accounting.
-	 */
-	ASSERT(cur->bc_btnum != XFS_BTNUM_BMAP ||
-	       cur->bc_ino.allocated == 0);
-	/*
-	 * Free the cursor.
-	 */
+
+	ASSERT(cur->bc_btnum != XFS_BTNUM_BMAP || cur->bc_ino.allocated == 0 ||
+	       XFS_FORCED_SHUTDOWN(cur->bc_mp));
 	if (unlikely(cur->bc_flags & XFS_BTREE_STAGING))
-		kmem_free((void *)cur->bc_ops);
+		kmem_free(cur->bc_ops);
 	kmem_cache_free(xfs_btree_cur_zone, cur);
 }
 

From patchwork Mon Jun 6 14:32:55 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Amir Goldstein
X-Patchwork-Id: 12870433
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8D8A9C43334 for ; Mon, 6 Jun 2022 14:33:21 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239772AbiFFOdU (ORCPT ); Mon, 6 Jun 2022 10:33:20 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51948 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239732AbiFFOdS (ORCPT ); Mon, 6 Jun 2022 10:33:18 -0400
Received: from mail-wr1-x42b.google.com (mail-wr1-x42b.google.com [IPv6:2a00:1450:4864:20::42b]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DEEAA2BB23; Mon, 6 Jun 2022 07:33:16 -0700 (PDT)
Received: by mail-wr1-x42b.google.com with SMTP id d14so11126400wra.10; Mon, 06 Jun 2022 07:33:16 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=4ADZbtAzVYEXGEOSgrcAKT0QzFIP7E8T2DQIIPmKyqo=; b=n91mLdVImBJysmLKm/dk0x82GvFIUs8jAIrBxtIpsme3svrhknEFt/CQX0vN7VNG0W n654VVxKLreeeJMabslKb/Oong6DQaEqSmo78rexJjwVDxMKKAXd6vVWzQfMHe5Vb5Uc ln8rbLe+7Nb4aZCKY78Gp2r+ll7oUMj6jP+KyjX8KL4I7lXbQnb3Zt8DxCDGLi1PfZjP bayefeFOPXxWraz3HXPAoKdhKfOrBKKRsmdD+l1o8P+TaLlaQ11d9ZYLjYVz7c23HzOW LSkKbY0VLXrpFYi+6eGCiWTzXCHSTEr2WJbPcaElJK2HZlNjNL5IflXh53cVuR6l8RzC n9OQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=4ADZbtAzVYEXGEOSgrcAKT0QzFIP7E8T2DQIIPmKyqo=; b=AxkrQp3CdSsIXFbyrtc2yk1UPcS1wK2gvA2Upb+OFESMHZE6DyyOahakMTOwGe4NA4 a1YeCIos6oXj0K12ASI7+PgiegBXGF5wf5Nt/rfrVJExTQ5X0zdpadclLSctlNnrUig7 1ZPdm1o7DO1lSLWqm6xbQA/bA95NWpFHEMD9CFcn5ahRwIUV/+PgNrl7CAGpMUQIxMm+ XVK3sSNFMnNIv8m+t71zNyG8EzDdIQO2vyQaRU8g0A3b1il9xQxIy7LgiaJvndEc7QHJ ot9b+irxNTkqol/Lch5smtgcCH14IOdjfGyHTXzKWtMHsbDz9LidD/JIPhA4oDytgykM HvtQ== X-Gm-Message-State: AOAM532uY1R55p3pX2noIl/rMWS1gx/i9ttXnp5bALnfsGM2Spkyv9aE 6PM3P1/3TTP4g61Q+DBW9To= X-Google-Smtp-Source: ABdhPJwxKROoHDO8Ew1abJrLoRdz5x87Zlc+FxcuDafnpdXSBzE/qqKnsVAz4ahyAv6DpKnkpLF4xw== X-Received: by 2002:a5d:648a:0:b0:217:3552:eb2d with SMTP id o10-20020a5d648a000000b002173552eb2dmr9085512wri.78.1654525995382; Mon, 06 Jun 2022 07:33:15 -0700 (PDT) Received: from amir-ThinkPad-T480.ctera.local (bzq-166-168-31-246.red.bezeqint.net. [31.168.166.246]) by smtp.gmail.com with ESMTPSA id h24-20020a05600c145800b0039c54bb28f2sm1622958wmi.36.2022.06.06.07.33.13 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 06 Jun 2022 07:33:14 -0700 (PDT) From: Amir Goldstein To: Greg Kroah-Hartman Cc: Sasha Levin , Dave Chinner , "Darrick J . 
Wong" , Christoph Hellwig , Brian Foster , Christian Brauner , Luis Chamberlain , Leah Rumancik , Adam Manzanares , linux-xfs@vger.kernel.org, stable@vger.kernel.org, Dave Chinner
Subject: [PATCH 5.10 v2 8/8] xfs: assert in xfs_btree_del_cursor should take into account error
Date: Mon, 6 Jun 2022 17:32:55 +0300
Message-Id: <20220606143255.685988-9-amir73il@gmail.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20220606143255.685988-1-amir73il@gmail.com>
References: <20220606143255.685988-1-amir73il@gmail.com>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: linux-xfs@vger.kernel.org

From: Dave Chinner

commit 56486f307100e8fc66efa2ebd8a71941fa10bf6f upstream.

xfs/538 on a 1kB block filesystem failed with this assert:

XFS: Assertion failed: cur->bc_btnum != XFS_BTNUM_BMAP || cur->bc_ino.allocated == 0 || xfs_is_shutdown(cur->bc_mp), file: fs/xfs/libxfs/xfs_btree.c, line: 448

The problem was that an allocation failed unexpectedly in
xfs_bmbt_alloc_block() after roughly 150,000 minlen allocation error
injections, resulting in an EFSCORRUPTED error being returned to
xfs_bmapi_write(). The error occurred on extent-to-btree format
conversion allocating the new root block:

 RIP: 0010:xfs_bmbt_alloc_block+0x177/0x210
 Call Trace:
  xfs_btree_new_iroot+0xdf/0x520
  xfs_btree_make_block_unfull+0x10d/0x1c0
  xfs_btree_insrec+0x364/0x790
  xfs_btree_insert+0xaa/0x210
  xfs_bmap_add_extent_hole_real+0x1fe/0x9a0
  xfs_bmapi_allocate+0x34c/0x420
  xfs_bmapi_write+0x53c/0x9c0
  xfs_alloc_file_space+0xee/0x320
  xfs_file_fallocate+0x36b/0x450
  vfs_fallocate+0x148/0x340
  __x64_sys_fallocate+0x3c/0x70
  do_syscall_64+0x35/0x80
  entry_SYSCALL_64_after_hwframe+0x44/0xa

Why the allocation failed at this point is unknown, but is likely
that we ran the transaction out of reserved space and filesystem out
of space with bmbt blocks because of all the minlen allocations
being done causing worst case fragmentation of a large allocation.

Regardless of the cause, we've then called xfs_bmapi_finish() which
calls xfs_btree_del_cursor(cur, error) to tear down the cursor.

So we have a failed operation, error != 0, cur->bc_ino.allocated > 0
and the filesystem is still up. The assert fails to take into
account that allocation can fail with an error and the transaction
teardown will shut the filesystem down if necessary. i.e. the
assert needs to check "|| error != 0" as well, because at this point
shutdown is pending because the current transaction is dirty....

Signed-off-by: Dave Chinner
Reviewed-by: Darrick J. Wong
Reviewed-by: Christoph Hellwig
Signed-off-by: Dave Chinner
Signed-off-by: Amir Goldstein
---
 fs/xfs/libxfs/xfs_btree.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c
index 9f9f9feccbcd..98c82f4935e1 100644
--- a/fs/xfs/libxfs/xfs_btree.c
+++ b/fs/xfs/libxfs/xfs_btree.c
@@ -372,8 +372,14 @@ xfs_btree_del_cursor(
 			break;
 	}
 
+	/*
+	 * If we are doing a BMBT update, the number of unaccounted blocks
+	 * allocated during this cursor life time should be zero. If it's not
+	 * zero, then we should be shut down or on our way to shutdown due to
+	 * cancelling a dirty transaction on error.
+	 */
 	ASSERT(cur->bc_btnum != XFS_BTNUM_BMAP || cur->bc_ino.allocated == 0 ||
-	       XFS_FORCED_SHUTDOWN(cur->bc_mp));
+	       XFS_FORCED_SHUTDOWN(cur->bc_mp) || error != 0);
 	if (unlikely(cur->bc_flags & XFS_BTREE_STAGING))
 		kmem_free(cur->bc_ops);
 	kmem_cache_free(xfs_btree_cur_zone, cur);
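
Editorial note, not part of the patch: the way the assert condition in
xfs_btree_del_cursor() changes across patches 7/8 and 8/8 can be sketched in
userspace as below. This is only a rough model under stated assumptions. The
struct and function names are hypothetical and do not exist in the kernel;
the three fields merely stand in for cur->bc_btnum, cur->bc_ino.allocated and
XFS_FORCED_SHUTDOWN(cur->bc_mp), and the error value is a placeholder for a
negative errno.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical userspace model of the state the ASSERT inspects.
 * None of these names come from the kernel sources.
 */
struct cursor_model {
	bool	is_bmap;	/* models cur->bc_btnum == XFS_BTNUM_BMAP */
	int	allocated;	/* models cur->bc_ino.allocated */
	bool	shutdown;	/* models XFS_FORCED_SHUTDOWN(cur->bc_mp) */
};

/* Before patch 7/8: a bmap cursor must have no unaccounted blocks. */
bool assert_before(const struct cursor_model *c, int error)
{
	(void)error;	/* the original condition ignores the error code */
	return !c->is_bmap || c->allocated == 0;
}

/* After patch 7/8: a filesystem that has already shut down is tolerated. */
bool assert_after_patch7(const struct cursor_model *c, int error)
{
	(void)error;	/* still ignored at this stage */
	return !c->is_bmap || c->allocated == 0 || c->shutdown;
}

/* After patch 8/8: teardown with a non-zero error is tolerated as well. */
bool assert_after_patch8(const struct cursor_model *c, int error)
{
	return !c->is_bmap || c->allocated == 0 || c->shutdown || error != 0;
}

/*
 * The failure described for xfs/538: the allocation failed (error != 0),
 * blocks are still unaccounted, and the filesystem is still up.
 */
void check_failed_allocation_scenario(void)
{
	struct cursor_model c = {
		.is_bmap = true, .allocated = 1, .shutdown = false,
	};
	int error = -1;	/* placeholder for a negative errno */

	assert(!assert_before(&c, error));	 /* old assert would trip */
	assert(!assert_after_patch7(&c, error)); /* 7/8 alone still trips */
	assert(assert_after_patch8(&c, error));	 /* 8/8 accepts the state */
}
```

Calling check_failed_allocation_scenario() from a test harness exercises the
one state that patch 7/8 does not cover and patch 8/8 does. The model leaves
out everything else the real function does (buffer release, cursor freeing).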