From patchwork Mon Jun  6 14:32:48 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Amir Goldstein <amir73il@gmail.com>
X-Patchwork-Id: 12870426
From: Amir Goldstein <amir73il@gmail.com>
To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Sasha Levin <sashal@kernel.org>,
        Dave Chinner <david@fromorbit.com>,
        "Darrick J . Wong" <djwong@kernel.org>,
        Christoph Hellwig <hch@lst.de>,
        Brian Foster <bfoster@redhat.com>,
        Christian Brauner <brauner@kernel.org>,
        Luis Chamberlain <mcgrof@kernel.org>,
        Leah Rumancik <leah.rumancik@gmail.com>,
        Adam Manzanares <a.manzanares@samsung.com>,
        linux-xfs@vger.kernel.org, stable@vger.kernel.org,
        Jeffrey Mitchell <jeffrey.mitchell@starlab.io>
Subject: [PATCH 5.10 v2 1/8] xfs: set inode size after creating symlink
Date: Mon,  6 Jun 2022 17:32:48 +0300
Message-Id: <20220606143255.685988-2-amir73il@gmail.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20220606143255.685988-1-amir73il@gmail.com>
References: <20220606143255.685988-1-amir73il@gmail.com>
Precedence: bulk
List-ID: <linux-xfs.vger.kernel.org>
X-Mailing-List: linux-xfs@vger.kernel.org

From: Jeffrey Mitchell <jeffrey.mitchell@starlab.io>

commit 8aa921a95335d0a8c8e2be35a44467e7c91ec3e4 upstream.

When XFS creates a new symlink, it writes its size to disk but not to the
VFS inode. This causes i_size_read() to return 0 for that symlink until
it is re-read from disk, for example when the system is rebooted.

I found this inconsistency while protecting directories with eCryptFS.
The command "stat path/to/symlink/in/ecryptfs" will report "Size: 0" if
the symlink was created after the last reboot on an XFS root.

Fix this by calling i_size_write() in xfs_symlink().

Signed-off-by: Jeffrey Mitchell <jeffrey.mitchell@starlab.io>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/xfs/xfs_symlink.c | 1 +
 1 file changed, 1 insertion(+)
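
Illustration only, not part of the commit: a minimal userspace sketch of
the inconsistency described above. The names toy_inode and toy_symlink are
hypothetical and exist only for this example; the sync_vfs flag stands in
for the i_size_write() call the patch adds.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model: one size field per copy of the inode. */
struct toy_inode {
	long long disk_size;	/* stands in for ip->i_d.di_size */
	long long vfs_size;	/* stands in for what i_size_read() returns */
};

/*
 * Model symlink creation.  The on-disk size is always recorded; the VFS
 * copy is only updated when sync_vfs is nonzero, which stands in for the
 * i_size_write() call added by this patch.
 */
static void toy_symlink(struct toy_inode *ip, const char *target, int sync_vfs)
{
	assert(ip != NULL && target != NULL);
	ip->disk_size = (long long)strlen(target);
	if (sync_vfs)
		ip->vfs_size = ip->disk_size;
}
```

With sync_vfs set to 0 the model keeps reporting a VFS size of 0 while the
on-disk size is already correct, which is the "Size: 0" symptom above.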

diff --git a/fs/xfs/xfs_symlink.c b/fs/xfs/xfs_symlink.c
index 8e88a7ca387e..8d3abf06c54f 100644
--- a/fs/xfs/xfs_symlink.c
+++ b/fs/xfs/xfs_symlink.c
@@ -300,6 +300,7 @@ xfs_symlink(
 		}
 		ASSERT(pathlen == 0);
 	}
+	i_size_write(VFS_I(ip), ip->i_d.di_size);
 
 	/*
 	 * Create the directory entry for the symlink.

From patchwork Mon Jun  6 14:32:49 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Amir Goldstein <amir73il@gmail.com>
X-Patchwork-Id: 12870427
From: Amir Goldstein <amir73il@gmail.com>
To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Sasha Levin <sashal@kernel.org>,
        Dave Chinner <david@fromorbit.com>,
        "Darrick J . Wong" <djwong@kernel.org>,
        Christoph Hellwig <hch@lst.de>,
        Brian Foster <bfoster@redhat.com>,
        Christian Brauner <brauner@kernel.org>,
        Luis Chamberlain <mcgrof@kernel.org>,
        Leah Rumancik <leah.rumancik@gmail.com>,
        Adam Manzanares <a.manzanares@samsung.com>,
        linux-xfs@vger.kernel.org, stable@vger.kernel.org,
        Gao Xiang <hsiangkao@redhat.com>,
        Allison Henderson <allison.henderson@oracle.com>,
        "Darrick J . Wong" <darrick.wong@oracle.com>,
        Bill O'Donnell <billodo@redhat.com>
Subject: [PATCH 5.10 v2 2/8] xfs: sync lazy sb accounting on quiesce of
 read-only mounts
Date: Mon,  6 Jun 2022 17:32:49 +0300
Message-Id: <20220606143255.685988-3-amir73il@gmail.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20220606143255.685988-1-amir73il@gmail.com>
References: <20220606143255.685988-1-amir73il@gmail.com>
Precedence: bulk
List-ID: <linux-xfs.vger.kernel.org>
X-Mailing-List: linux-xfs@vger.kernel.org

From: Brian Foster <bfoster@redhat.com>

commit 50d25484bebe94320c49dd1347d3330c7063bbdb upstream.

xfs_log_sbcount() syncs the superblock specifically to accumulate
the in-core percpu superblock counters and commit them to disk. This
is required to maintain filesystem consistency across quiesce
(freeze, read-only mount/remount) or unmount when lazy superblock
accounting is enabled because individual transactions do not update
the superblock directly.

This mechanism works as expected for writable mounts, but
xfs_log_sbcount() skips the update for read-only mounts. Read-only
mounts otherwise still allow log recovery and write out an unmount
record during log quiesce. If a read-only mount performs log
recovery, it can modify the in-core superblock counters and write an
unmount record when the filesystem unmounts without ever syncing the
in-core counters. This leaves the filesystem with a clean log but in
an inconsistent state with regard to lazy sb counters.

Update xfs_log_sbcount() to use the same logic
xfs_log_unmount_write() uses to determine when to write an unmount
record. This ensures that lazy accounting is always synced before
the log is cleaned. Refactor this logic into a new helper to
distinguish between a writable filesystem and a writable log.
Specifically, the log is writable unless the filesystem is mounted
with the norecovery mount option, the underlying log device is
read-only, or the filesystem is shut down. Drop the freeze state
check because the update is already allowed during the freezing
process and no context calls this function on an already frozen fs.
Also, retain the shutdown check in xfs_log_unmount_write() to catch
the case where the preceding log force might have triggered a
shutdown.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Gao Xiang <hsiangkao@redhat.com>
Reviewed-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Bill O'Donnell <billodo@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/xfs/xfs_log.c   | 28 ++++++++++++++++++++--------
 fs/xfs/xfs_log.h   |  1 +
 fs/xfs/xfs_mount.c |  3 +--
 3 files changed, 22 insertions(+), 10 deletions(-)
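
Illustration only, not part of the commit: a minimal userspace sketch of
the "writable log" decision described above. The struct toy_mount and its
field names are hypothetical stand-ins for the mount flags, the log
buftarg state and the shutdown state; only the order and meaning of the
three checks follow the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the mount state consulted by the helper. */
struct toy_mount {
	bool norecovery;	/* mounted with the norecovery option */
	bool log_dev_readonly;	/* underlying log device is read-only */
	bool shutdown;		/* filesystem has been shut down */
	bool rdonly;		/* read-only mount; deliberately not checked */
};

/*
 * The log is writable unless one of the three conditions named in the
 * commit message holds.  A read-only mount alone does not make the log
 * unwritable, because recovery and unmount still write to it.
 */
static bool toy_log_writable(const struct toy_mount *mp)
{
	assert(mp != NULL);
	if (mp->norecovery)
		return false;
	if (mp->log_dev_readonly)
		return false;
	if (mp->shutdown)
		return false;
	return true;
}
```

The point of the model is the ignored rdonly field: a read-only mount
that ran log recovery still reaches the counter sync.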

diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
index fa2d05e65ff1..b445e63cbc3c 100644
--- a/fs/xfs/xfs_log.c
+++ b/fs/xfs/xfs_log.c
@@ -347,6 +347,25 @@ xlog_tic_add_region(xlog_ticket_t *tic, uint len, uint type)
 	tic->t_res_num++;
 }
 
+bool
+xfs_log_writable(
+	struct xfs_mount	*mp)
+{
+	/*
+	 * Never write to the log on norecovery mounts, if the block device is
+	 * read-only, or if the filesystem is shutdown. Read-only mounts still
+	 * allow internal writes for log recovery and unmount purposes, so don't
+	 * restrict that case here.
+	 */
+	if (mp->m_flags & XFS_MOUNT_NORECOVERY)
+		return false;
+	if (xfs_readonly_buftarg(mp->m_log->l_targ))
+		return false;
+	if (XFS_FORCED_SHUTDOWN(mp))
+		return false;
+	return true;
+}
+
 /*
  * Replenish the byte reservation required by moving the grant write head.
  */
@@ -886,15 +905,8 @@ xfs_log_unmount_write(
 {
 	struct xlog		*log = mp->m_log;
 
-	/*
-	 * Don't write out unmount record on norecovery mounts or ro devices.
-	 * Or, if we are doing a forced umount (typically because of IO errors).
-	 */
-	if (mp->m_flags & XFS_MOUNT_NORECOVERY ||
-	    xfs_readonly_buftarg(log->l_targ)) {
-		ASSERT(mp->m_flags & XFS_MOUNT_RDONLY);
+	if (!xfs_log_writable(mp))
 		return;
-	}
 
 	xfs_log_force(mp, XFS_LOG_SYNC);
 
diff --git a/fs/xfs/xfs_log.h b/fs/xfs/xfs_log.h
index 58c3fcbec94a..98c913da7587 100644
--- a/fs/xfs/xfs_log.h
+++ b/fs/xfs/xfs_log.h
@@ -127,6 +127,7 @@ int	  xfs_log_reserve(struct xfs_mount *mp,
 int	  xfs_log_regrant(struct xfs_mount *mp, struct xlog_ticket *tic);
 void      xfs_log_unmount(struct xfs_mount *mp);
 int	  xfs_log_force_umount(struct xfs_mount *mp, int logerror);
+bool	xfs_log_writable(struct xfs_mount *mp);
 
 struct xlog_ticket *xfs_log_ticket_get(struct xlog_ticket *ticket);
 void	  xfs_log_ticket_put(struct xlog_ticket *ticket);
diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index 7110507a2b6b..a62b8a574409 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -1176,8 +1176,7 @@ xfs_fs_writable(
 int
 xfs_log_sbcount(xfs_mount_t *mp)
 {
-	/* allow this to proceed during the freeze sequence... */
-	if (!xfs_fs_writable(mp, SB_FREEZE_COMPLETE))
+	if (!xfs_log_writable(mp))
 		return 0;
 
 	/*

From patchwork Mon Jun  6 14:32:50 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Amir Goldstein <amir73il@gmail.com>
X-Patchwork-Id: 12870428
From: Amir Goldstein <amir73il@gmail.com>
To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Sasha Levin <sashal@kernel.org>,
        Dave Chinner <david@fromorbit.com>,
        "Darrick J . Wong" <djwong@kernel.org>,
        Christoph Hellwig <hch@lst.de>,
        Brian Foster <bfoster@redhat.com>,
        Christian Brauner <brauner@kernel.org>,
        Luis Chamberlain <mcgrof@kernel.org>,
        Leah Rumancik <leah.rumancik@gmail.com>,
        Adam Manzanares <a.manzanares@samsung.com>,
        linux-xfs@vger.kernel.org, stable@vger.kernel.org
Subject: [PATCH 5.10 v2 3/8] xfs: fix chown leaking delalloc quota blocks when
 fssetxattr fails
Date: Mon,  6 Jun 2022 17:32:50 +0300
Message-Id: <20220606143255.685988-4-amir73il@gmail.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20220606143255.685988-1-amir73il@gmail.com>
References: <20220606143255.685988-1-amir73il@gmail.com>
Precedence: bulk
List-ID: <linux-xfs.vger.kernel.org>
X-Mailing-List: linux-xfs@vger.kernel.org

From: "Darrick J. Wong" <djwong@kernel.org>

commit 1aecf3734a95f3c167d1495550ca57556d33f7ec upstream.

While refactoring the quota code to create a function to allocate inode
change transactions, I noticed that xfs_qm_vop_chown_reserve does more
than just make reservations: it also *modifies* the incore counts
directly to handle the owner id change for the delalloc blocks.

I then observed that the fssetxattr code continues validating input
arguments after making the quota reservation but before dirtying the
transaction.  If the routine decides to error out, it fails to undo the
accounting switch!  This leads to incorrect quota reservation and
failure down the line.

We can fix this by making the reservation function do only that -- for
the new dquot, it reserves ondisk and delalloc blocks to the
transaction, and the old dquot hangs on to its incore reservation for
now.  Once we actually switch the dquots, we can then update the incore
reservations because we've dirtied the transaction and it's too late to
turn back now.

No fixes tag because this has been broken since the start of git.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/xfs/xfs_qm.c | 92 +++++++++++++++++++------------------------------
 1 file changed, 35 insertions(+), 57 deletions(-)
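
Illustration only, not part of the commit: a heavily simplified userspace
model of the ordering described above. All names (toy_dquot, toy_trans,
toy_chown_reserve, toy_cancel, toy_chown) are hypothetical. The model
assumes, as the commit message implies, that a reservation attached to
the transaction is handed back if the transaction is cancelled, and it
tracks only the old dquot's delalloc reservation, not its on-disk counts.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical dquot: a single reservation counter. */
struct toy_dquot {
	long reserved;
};

/* Hypothetical transaction: remembers what it reserved and from whom. */
struct toy_trans {
	struct toy_dquot *newdq;
	long resv;
	int dirty;
};

/* Reservation step: charge only the new dquot, tracked by the transaction. */
static void toy_chown_reserve(struct toy_trans *tp, struct toy_dquot *newdq,
			      long nblocks, long delalloc)
{
	tp->newdq = newdq;
	tp->resv = nblocks + delalloc;
	tp->dirty = 0;
	newdq->reserved += tp->resv;
}

/* Error path before the switch: give the tracked reservation back. */
static void toy_cancel(struct toy_trans *tp)
{
	assert(!tp->dirty);
	tp->newdq->reserved -= tp->resv;
	tp->resv = 0;
}

/* Ownership switch: past this point the old dquot's delalloc is released. */
static void toy_chown(struct toy_trans *tp, struct toy_dquot *olddq,
		      long delalloc)
{
	tp->dirty = 1;
	assert(olddq->reserved >= delalloc);
	olddq->reserved -= delalloc;
}
```

Because the old dquot is only touched in toy_chown(), a failure between
the reservation and the switch leaves both counters as they were.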

diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
index b2a9abee8b2b..64e5da33733b 100644
--- a/fs/xfs/xfs_qm.c
+++ b/fs/xfs/xfs_qm.c
@@ -1785,6 +1785,29 @@ xfs_qm_vop_chown(
 	xfs_trans_mod_dquot(tp, newdq, bfield, ip->i_d.di_nblocks);
 	xfs_trans_mod_dquot(tp, newdq, XFS_TRANS_DQ_ICOUNT, 1);
 
+	/*
+	 * Back when we made quota reservations for the chown, we reserved the
+	 * ondisk blocks + delalloc blocks with the new dquot.  Now that we've
+	 * switched the dquots, decrease the new dquot's block reservation
+	 * (having already bumped up the real counter) so that we don't have
+	 * any reservation to give back when we commit.
+	 */
+	xfs_trans_mod_dquot(tp, newdq, XFS_TRANS_DQ_RES_BLKS,
+			-ip->i_delayed_blks);
+
+	/*
+	 * Give the incore reservation for delalloc blocks back to the old
+	 * dquot.  We don't normally handle delalloc quota reservations
+	 * transactionally, so just lock the dquot and subtract from the
+	 * reservation.  Dirty the transaction because it's too late to turn
+	 * back now.
+	 */
+	tp->t_flags |= XFS_TRANS_DIRTY;
+	xfs_dqlock(prevdq);
+	ASSERT(prevdq->q_blk.reserved >= ip->i_delayed_blks);
+	prevdq->q_blk.reserved -= ip->i_delayed_blks;
+	xfs_dqunlock(prevdq);
+
 	/*
 	 * Take an extra reference, because the inode is going to keep
 	 * this dquot pointer even after the trans_commit.
@@ -1807,84 +1830,39 @@ xfs_qm_vop_chown_reserve(
 	uint			flags)
 {
 	struct xfs_mount	*mp = ip->i_mount;
-	uint64_t		delblks;
 	unsigned int		blkflags;
-	struct xfs_dquot	*udq_unres = NULL;
-	struct xfs_dquot	*gdq_unres = NULL;
-	struct xfs_dquot	*pdq_unres = NULL;
 	struct xfs_dquot	*udq_delblks = NULL;
 	struct xfs_dquot	*gdq_delblks = NULL;
 	struct xfs_dquot	*pdq_delblks = NULL;
-	int			error;
-
 
 	ASSERT(xfs_isilocked(ip, XFS_ILOCK_EXCL|XFS_ILOCK_SHARED));
 	ASSERT(XFS_IS_QUOTA_RUNNING(mp));
 
-	delblks = ip->i_delayed_blks;
 	blkflags = XFS_IS_REALTIME_INODE(ip) ?
 			XFS_QMOPT_RES_RTBLKS : XFS_QMOPT_RES_REGBLKS;
 
 	if (XFS_IS_UQUOTA_ON(mp) && udqp &&
-	    i_uid_read(VFS_I(ip)) != udqp->q_id) {
+	    i_uid_read(VFS_I(ip)) != udqp->q_id)
 		udq_delblks = udqp;
-		/*
-		 * If there are delayed allocation blocks, then we have to
-		 * unreserve those from the old dquot, and add them to the
-		 * new dquot.
-		 */
-		if (delblks) {
-			ASSERT(ip->i_udquot);
-			udq_unres = ip->i_udquot;
-		}
-	}
+
 	if (XFS_IS_GQUOTA_ON(ip->i_mount) && gdqp &&
-	    i_gid_read(VFS_I(ip)) != gdqp->q_id) {
+	    i_gid_read(VFS_I(ip)) != gdqp->q_id)
 		gdq_delblks = gdqp;
-		if (delblks) {
-			ASSERT(ip->i_gdquot);
-			gdq_unres = ip->i_gdquot;
-		}
-	}
 
 	if (XFS_IS_PQUOTA_ON(ip->i_mount) && pdqp &&
-	    ip->i_d.di_projid != pdqp->q_id) {
+	    ip->i_d.di_projid != pdqp->q_id)
 		pdq_delblks = pdqp;
-		if (delblks) {
-			ASSERT(ip->i_pdquot);
-			pdq_unres = ip->i_pdquot;
-		}
-	}
-
-	error = xfs_trans_reserve_quota_bydquots(tp, ip->i_mount,
-				udq_delblks, gdq_delblks, pdq_delblks,
-				ip->i_d.di_nblocks, 1, flags | blkflags);
-	if (error)
-		return error;
 
 	/*
-	 * Do the delayed blks reservations/unreservations now. Since, these
-	 * are done without the help of a transaction, if a reservation fails
-	 * its previous reservations won't be automatically undone by trans
-	 * code. So, we have to do it manually here.
+	 * Reserve enough quota to handle blocks on disk and reserved for a
+	 * delayed allocation.  We'll actually transfer the delalloc
+	 * reservation between dquots at chown time, even though that part is
+	 * only semi-transactional.
 	 */
-	if (delblks) {
-		/*
-		 * Do the reservations first. Unreservation can't fail.
-		 */
-		ASSERT(udq_delblks || gdq_delblks || pdq_delblks);
-		ASSERT(udq_unres || gdq_unres || pdq_unres);
-		error = xfs_trans_reserve_quota_bydquots(NULL, ip->i_mount,
-			    udq_delblks, gdq_delblks, pdq_delblks,
-			    (xfs_qcnt_t)delblks, 0, flags | blkflags);
-		if (error)
-			return error;
-		xfs_trans_reserve_quota_bydquots(NULL, ip->i_mount,
-				udq_unres, gdq_unres, pdq_unres,
-				-((xfs_qcnt_t)delblks), 0, blkflags);
-	}
-
-	return 0;
+	return xfs_trans_reserve_quota_bydquots(tp, ip->i_mount, udq_delblks,
+			gdq_delblks, pdq_delblks,
+			ip->i_d.di_nblocks + ip->i_delayed_blks,
+			1, blkflags | flags);
 }
 
 int

From patchwork Mon Jun  6 14:32:51 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Amir Goldstein <amir73il@gmail.com>
X-Patchwork-Id: 12870429
From: Amir Goldstein <amir73il@gmail.com>
To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Sasha Levin <sashal@kernel.org>,
        Dave Chinner <david@fromorbit.com>,
        "Darrick J . Wong" <djwong@kernel.org>,
        Christoph Hellwig <hch@lst.de>,
        Brian Foster <bfoster@redhat.com>,
        Christian Brauner <brauner@kernel.org>,
        Luis Chamberlain <mcgrof@kernel.org>,
        Leah Rumancik <leah.rumancik@gmail.com>,
        Adam Manzanares <a.manzanares@samsung.com>,
        linux-xfs@vger.kernel.org, stable@vger.kernel.org,
        Chandan Babu R <chandanrlinux@gmail.com>
Subject: [PATCH 5.10 v2 4/8] xfs: fix incorrect root dquot corruption error
 when switching group/project quota types
Date: Mon,  6 Jun 2022 17:32:51 +0300
Message-Id: <20220606143255.685988-5-amir73il@gmail.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20220606143255.685988-1-amir73il@gmail.com>
References: <20220606143255.685988-1-amir73il@gmail.com>
Precedence: bulk
List-ID: <linux-xfs.vger.kernel.org>
X-Mailing-List: linux-xfs@vger.kernel.org

From: "Darrick J. Wong" <djwong@kernel.org>

commit 45068063efb7dd0a8d115c106aa05d9ab0946257 upstream.

While writing up a regression test for broken behavior when a chprojid
request fails, I noticed that we were logging corruption notices about
the root dquot of the group/project quota file at mount time when
testing V4 filesystems.

In commit afeda6000b0c, I was trying to improve ondisk dquot validation
by making sure that when we load an ondisk dquot into memory on behalf
of an incore dquot, the dquot id and type matches.  Unfortunately, I
forgot that V4 filesystems only have two quota files, and can switch
that file between group and project quota types at mount time.  When we
perform that switch, we'll try to load the default quota limits from the
root dquot prior to running quotacheck and log a corruption error when
the types don't match.

This is inconsequential because quotacheck will reset the second quota
file as part of doing the switch, but we shouldn't leave scary messages
in the kernel log.

Fixes: afeda6000b0c ("xfs: validate ondisk/incore dquot flags")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/xfs/xfs_dquot.c | 39 +++++++++++++++++++++++++++++++++++++--
 1 file changed, 37 insertions(+), 2 deletions(-)
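
Illustration only, not part of the commit: a minimal userspace sketch of
the relaxed type check described above. The enum values and the function
toy_dquot_check_type are hypothetical and do not reproduce the kernel's
XFS_DQTYPE_* constants; only the decision rules follow the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical quota types; the numeric values are arbitrary. */
enum toy_dqtype {
	TOY_DQTYPE_USER = 1,
	TOY_DQTYPE_PROJ = 2,
	TOY_DQTYPE_GROUP = 4,
};

/*
 * Return true if the on-disk record is acceptable for the incore dquot.
 * V5 filesystems, user dquots and non-root dquots need an exact type
 * match.  On V4, the root group/project dquot may carry either non-user
 * type because the shared quota file can switch roles at mount time.
 */
static bool toy_dquot_check_type(bool is_v5, enum toy_dqtype want_type,
				 uint32_t want_id, enum toy_dqtype disk_type,
				 uint32_t disk_id)
{
	assert(want_type == TOY_DQTYPE_USER || want_type == TOY_DQTYPE_PROJ ||
	       want_type == TOY_DQTYPE_GROUP);

	if (disk_id != want_id)
		return false;

	if (is_v5 || want_type == TOY_DQTYPE_USER || want_id != 0)
		return disk_type == want_type;

	return disk_type == TOY_DQTYPE_GROUP || disk_type == TOY_DQTYPE_PROJ;
}
```

The only case that changes relative to an exact-match check is the V4
root dquot with id 0 and a group/project type on either side.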

diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c
index 1d95ed387d66..80c4579d6835 100644
--- a/fs/xfs/xfs_dquot.c
+++ b/fs/xfs/xfs_dquot.c
@@ -500,6 +500,42 @@ xfs_dquot_alloc(
 	return dqp;
 }
 
+/* Check the ondisk dquot's id and type match what the incore dquot expects. */
+static bool
+xfs_dquot_check_type(
+	struct xfs_dquot	*dqp,
+	struct xfs_disk_dquot	*ddqp)
+{
+	uint8_t			ddqp_type;
+	uint8_t			dqp_type;
+
+	ddqp_type = ddqp->d_type & XFS_DQTYPE_REC_MASK;
+	dqp_type = xfs_dquot_type(dqp);
+
+	if (be32_to_cpu(ddqp->d_id) != dqp->q_id)
+		return false;
+
+	/*
+	 * V5 filesystems always expect an exact type match.  V4 filesystems
+	 * expect an exact match for user dquots and for non-root group and
+	 * project dquots.
+	 */
+	if (xfs_sb_version_hascrc(&dqp->q_mount->m_sb) ||
+	    dqp_type == XFS_DQTYPE_USER || dqp->q_id != 0)
+		return ddqp_type == dqp_type;
+
+	/*
+	 * V4 filesystems support either group or project quotas, but not both
+	 * at the same time.  The non-user quota file can be switched between
+	 * group and project quota uses depending on the mount options, which
+	 * means that we can encounter the other type when we try to load quota
+	 * defaults.  Quotacheck will soon reset the entire quota file
+	 * (including the root dquot) anyway, but don't log scary corruption
+	 * reports to dmesg.
+	 */
+	return ddqp_type == XFS_DQTYPE_GROUP || ddqp_type == XFS_DQTYPE_PROJ;
+}
+
 /* Copy the in-core quota fields in from the on-disk buffer. */
 STATIC int
 xfs_dquot_from_disk(
@@ -512,8 +548,7 @@ xfs_dquot_from_disk(
 	 * Ensure that we got the type and ID we were looking for.
 	 * Everything else was checked by the dquot buffer verifier.
 	 */
-	if ((ddqp->d_type & XFS_DQTYPE_REC_MASK) != xfs_dquot_type(dqp) ||
-	    be32_to_cpu(ddqp->d_id) != dqp->q_id) {
+	if (!xfs_dquot_check_type(dqp, ddqp)) {
 		xfs_alert_tag(bp->b_mount, XFS_PTAG_VERIFIER_ERROR,
 			  "Metadata corruption detected at %pS, quota %u",
 			  __this_address, dqp->q_id);

From patchwork Mon Jun  6 14:32:52 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Amir Goldstein <amir73il@gmail.com>
X-Patchwork-Id: 12870430
From: Amir Goldstein <amir73il@gmail.com>
To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Sasha Levin <sashal@kernel.org>,
        Dave Chinner <david@fromorbit.com>,
        "Darrick J . Wong" <djwong@kernel.org>,
        Christoph Hellwig <hch@lst.de>,
        Brian Foster <bfoster@redhat.com>,
        Christian Brauner <brauner@kernel.org>,
        Luis Chamberlain <mcgrof@kernel.org>,
        Leah Rumancik <leah.rumancik@gmail.com>,
        Adam Manzanares <a.manzanares@samsung.com>,
        linux-xfs@vger.kernel.org, stable@vger.kernel.org,
        Eric Sandeen <sandeen@redhat.com>
Subject: [PATCH 5.10 v2 5/8] xfs: restore shutdown check in mapped write fault
 path
Date: Mon,  6 Jun 2022 17:32:52 +0300
Message-Id: <20220606143255.685988-6-amir73il@gmail.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20220606143255.685988-1-amir73il@gmail.com>
References: <20220606143255.685988-1-amir73il@gmail.com>
MIME-Version: 1.0
Precedence: bulk
List-ID: <linux-xfs.vger.kernel.org>
X-Mailing-List: linux-xfs@vger.kernel.org

From: Brian Foster <bfoster@redhat.com>

commit e4826691cc7e5458bcb659935d0092bcf3f08c20 upstream.

XFS triggers an iomap warning in the write fault path due to a
!PageUptodate() page if a write fault happens to occur on a page
that recently failed writeback. The iomap writeback error handling
code can clear the Uptodate flag if no portion of the page is
submitted for I/O. This is reproduced by fstest generic/019, which
combines various forms of I/O with simulated disk failures that
inevitably lead to filesystem shutdown (which then unconditionally
fails page writeback).

This is a regression introduced by commit f150b4234397 ("xfs: split
the iomap ops for buffered vs direct writes") due to the removal of
a shutdown check and explicit error return in the ->iomap_begin()
path used by the write fault path. The explicit error return
historically translated to a SIGBUS, but the fault now carries on with
iomap processing, which complains about the unexpected state. Restore
the shutdown check to xfs_buffered_write_iomap_begin() to restore
historical behavior.
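
Illustration only, not part of the upstream commit: a minimal userspace
sketch of the early-return pattern being restored. struct model_mount
and model_buffered_write_iomap_begin() are hypothetical stand-ins; the
real check uses XFS_FORCED_SHUTDOWN(mp) in
xfs_buffered_write_iomap_begin(), as shown in the diff below.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the bits of struct xfs_mount we care about. */
struct model_mount {
	bool forced_shutdown;
};

/*
 * Model of an ->iomap_begin() handler: bail out with -EIO before doing
 * any mapping work if the filesystem has already been shut down.
 */
int model_buffered_write_iomap_begin(const struct model_mount *mp)
{
	if (mp->forced_shutdown)
		return -EIO;

	/* Normal extent lookup / delayed allocation would happen here. */
	return 0;
}
```

The point of the sketch is only the ordering: the shutdown test comes
first, so no later code sees a page left in an unexpected state by a
failed writeback.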

Fixes: f150b4234397 ("xfs: split the iomap ops for buffered vs direct writes")
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/xfs/xfs_iomap.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index 7b9ff824e82d..74bc2beadc23 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -870,6 +870,9 @@ xfs_buffered_write_iomap_begin(
 	int			allocfork = XFS_DATA_FORK;
 	int			error = 0;
 
+	if (XFS_FORCED_SHUTDOWN(mp))
+		return -EIO;
+
 	/* we can't use delayed allocations when using extent size hints */
 	if (xfs_get_extsz_hint(ip))
 		return xfs_direct_write_iomap_begin(inode, offset, count,

From patchwork Mon Jun  6 14:32:53 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Amir Goldstein <amir73il@gmail.com>
X-Patchwork-Id: 12870431
From: Amir Goldstein <amir73il@gmail.com>
To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Sasha Levin <sashal@kernel.org>,
        Dave Chinner <david@fromorbit.com>,
        "Darrick J . Wong" <djwong@kernel.org>,
        Christoph Hellwig <hch@lst.de>,
        Brian Foster <bfoster@redhat.com>,
        Christian Brauner <brauner@kernel.org>,
        Luis Chamberlain <mcgrof@kernel.org>,
        Leah Rumancik <leah.rumancik@gmail.com>,
        Adam Manzanares <a.manzanares@samsung.com>,
        linux-xfs@vger.kernel.org, stable@vger.kernel.org,
        Dave Chinner <dchinner@redhat.com>
Subject: [PATCH 5.10 v2 6/8] xfs: force log and push AIL to clear pinned
 inodes when aborting mount
Date: Mon,  6 Jun 2022 17:32:53 +0300
Message-Id: <20220606143255.685988-7-amir73il@gmail.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20220606143255.685988-1-amir73il@gmail.com>
References: <20220606143255.685988-1-amir73il@gmail.com>
MIME-Version: 1.0
Precedence: bulk
List-ID: <linux-xfs.vger.kernel.org>
X-Mailing-List: linux-xfs@vger.kernel.org

From: "Darrick J. Wong" <djwong@kernel.org>

commit d336f7ebc65007f5831e2297e6f3383ae8dbf8ed upstream.

If we allocate quota inodes in the process of mounting a filesystem but
then decide to abort the mount, it's possible that the quota inodes are
sitting around pinned by the log.  Now that inode reclaim relies on the
AIL to flush inodes, we have to force the log and push the AIL in
between releasing the quota inodes and kicking off reclaim to tear down
all the incore inodes.  Do this by extracting the bits we need from the
unmount path and reusing them.  As an added bonus, failed writes during
a failed mount will not retry forever now.

This was originally found during a fuzz test of metadata directories
(xfs/1546), but the actual symptom was that reclaim hung up on the quota
inodes.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/xfs/xfs_mount.c | 90 +++++++++++++++++++++++-----------------------
 1 file changed, 44 insertions(+), 46 deletions(-)

diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index a62b8a574409..44b05e1d5d32 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -631,6 +631,47 @@ xfs_check_summary_counts(
 	return xfs_initialize_perag_data(mp, mp->m_sb.sb_agcount);
 }
 
+/*
+ * Flush and reclaim dirty inodes in preparation for unmount. Inodes and
+ * internal inode structures can be sitting in the CIL and AIL at this point,
+ * so we need to unpin them, write them back and/or reclaim them before unmount
+ * can proceed.
+ *
+ * An inode cluster that has been freed can have its buffer still pinned in
+ * memory because the transaction is still sitting in an iclog. The stale inodes
+ * on that buffer will be pinned to the buffer until the transaction hits the
+ * disk and the callbacks run. Pushing the AIL will skip the stale inodes and
+ * may never see the pinned buffer, so nothing will push out the iclog and
+ * unpin the buffer.
+ *
+ * Hence we need to force the log to unpin everything first. However, log
+ * forces don't wait for the discards they issue to complete, so we have to
+ * explicitly wait for them to complete here as well.
+ *
+ * Then we can tell the world we are unmounting so that error handling knows
+ * that the filesystem is going away and we should error out anything that we
+ * have been retrying in the background.  This will prevent never-ending
+ * retries in AIL pushing from hanging the unmount.
+ *
+ * Finally, we can push the AIL to clean all the remaining dirty objects, then
+ * reclaim the remaining inodes that are still in memory at this point in time.
+ */
+static void
+xfs_unmount_flush_inodes(
+	struct xfs_mount	*mp)
+{
+	xfs_log_force(mp, XFS_LOG_SYNC);
+	xfs_extent_busy_wait_all(mp);
+	flush_workqueue(xfs_discard_wq);
+
+	mp->m_flags |= XFS_MOUNT_UNMOUNTING;
+
+	xfs_ail_push_all_sync(mp->m_ail);
+	cancel_delayed_work_sync(&mp->m_reclaim_work);
+	xfs_reclaim_inodes(mp);
+	xfs_health_unmount(mp);
+}
+
 /*
  * This function does the following on an initial mount of a file system:
  *	- reads the superblock from disk and init the mount struct
@@ -1005,7 +1046,7 @@ xfs_mountfs(
 	/* Clean out dquots that might be in memory after quotacheck. */
 	xfs_qm_unmount(mp);
 	/*
-	 * Cancel all delayed reclaim work and reclaim the inodes directly.
+	 * Flush all inode reclamation work and flush the log.
 	 * We have to do this /after/ rtunmount and qm_unmount because those
 	 * two will have scheduled delayed reclaim for the rt/quota inodes.
 	 *
@@ -1015,11 +1056,8 @@ xfs_mountfs(
 	 * qm_unmount_quotas and therefore rely on qm_unmount to release the
 	 * quota inodes.
 	 */
-	cancel_delayed_work_sync(&mp->m_reclaim_work);
-	xfs_reclaim_inodes(mp);
-	xfs_health_unmount(mp);
+	xfs_unmount_flush_inodes(mp);
  out_log_dealloc:
-	mp->m_flags |= XFS_MOUNT_UNMOUNTING;
 	xfs_log_mount_cancel(mp);
  out_fail_wait:
 	if (mp->m_logdev_targp && mp->m_logdev_targp != mp->m_ddev_targp)
@@ -1060,47 +1098,7 @@ xfs_unmountfs(
 	xfs_rtunmount_inodes(mp);
 	xfs_irele(mp->m_rootip);
 
-	/*
-	 * We can potentially deadlock here if we have an inode cluster
-	 * that has been freed has its buffer still pinned in memory because
-	 * the transaction is still sitting in a iclog. The stale inodes
-	 * on that buffer will be pinned to the buffer until the
-	 * transaction hits the disk and the callbacks run. Pushing the AIL will
-	 * skip the stale inodes and may never see the pinned buffer, so
-	 * nothing will push out the iclog and unpin the buffer. Hence we
-	 * need to force the log here to ensure all items are flushed into the
-	 * AIL before we go any further.
-	 */
-	xfs_log_force(mp, XFS_LOG_SYNC);
-
-	/*
-	 * Wait for all busy extents to be freed, including completion of
-	 * any discard operation.
-	 */
-	xfs_extent_busy_wait_all(mp);
-	flush_workqueue(xfs_discard_wq);
-
-	/*
-	 * We now need to tell the world we are unmounting. This will allow
-	 * us to detect that the filesystem is going away and we should error
-	 * out anything that we have been retrying in the background. This will
-	 * prevent neverending retries in AIL pushing from hanging the unmount.
-	 */
-	mp->m_flags |= XFS_MOUNT_UNMOUNTING;
-
-	/*
-	 * Flush all pending changes from the AIL.
-	 */
-	xfs_ail_push_all_sync(mp->m_ail);
-
-	/*
-	 * Reclaim all inodes. At this point there should be no dirty inodes and
-	 * none should be pinned or locked. Stop background inode reclaim here
-	 * if it is still running.
-	 */
-	cancel_delayed_work_sync(&mp->m_reclaim_work);
-	xfs_reclaim_inodes(mp);
-	xfs_health_unmount(mp);
+	xfs_unmount_flush_inodes(mp);
 
 	xfs_qm_unmount(mp);
 

From patchwork Mon Jun  6 14:32:54 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Amir Goldstein <amir73il@gmail.com>
X-Patchwork-Id: 12870432
From: Amir Goldstein <amir73il@gmail.com>
To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Sasha Levin <sashal@kernel.org>,
        Dave Chinner <david@fromorbit.com>,
        "Darrick J . Wong" <djwong@kernel.org>,
        Christoph Hellwig <hch@lst.de>,
        Brian Foster <bfoster@redhat.com>,
        Christian Brauner <brauner@kernel.org>,
        Luis Chamberlain <mcgrof@kernel.org>,
        Leah Rumancik <leah.rumancik@gmail.com>,
        Adam Manzanares <a.manzanares@samsung.com>,
        linux-xfs@vger.kernel.org, stable@vger.kernel.org
Subject: [PATCH 5.10 v2 7/8] xfs: consider shutdown in bmapbt cursor delete
 assert
Date: Mon,  6 Jun 2022 17:32:54 +0300
Message-Id: <20220606143255.685988-8-amir73il@gmail.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20220606143255.685988-1-amir73il@gmail.com>
References: <20220606143255.685988-1-amir73il@gmail.com>
MIME-Version: 1.0
Precedence: bulk
List-ID: <linux-xfs.vger.kernel.org>
X-Mailing-List: linux-xfs@vger.kernel.org

From: Brian Foster <bfoster@redhat.com>

commit 1cd738b13ae9b29e03d6149f0246c61f76e81fcf upstream.

The assert in xfs_btree_del_cursor() checks that the bmapbt block
allocation field has been handled correctly before the cursor is
freed. This field is used for accurate calculation of indirect block
reservation requirements (for delayed allocations), for example.
generic/019 reproduces a scenario where this assert fails because
the filesystem has shut down while in the middle of a bmbt record
insertion. This occurs after a bmbt block has been allocated via the
cursor but before the higher level bmap function (i.e.
xfs_bmap_add_extent_hole_real()) completes and resets the field.

Update the assert to accommodate the transient state if the
filesystem has shut down. While here, clean up the indentation and
comments in the function.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/xfs/libxfs/xfs_btree.c | 33 ++++++++++++---------------------
 1 file changed, 12 insertions(+), 21 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c
index 2d25bab68764..9f9f9feccbcd 100644
--- a/fs/xfs/libxfs/xfs_btree.c
+++ b/fs/xfs/libxfs/xfs_btree.c
@@ -353,20 +353,17 @@ xfs_btree_free_block(
  */
 void
 xfs_btree_del_cursor(
-	xfs_btree_cur_t	*cur,		/* btree cursor */
-	int		error)		/* del because of error */
+	struct xfs_btree_cur	*cur,		/* btree cursor */
+	int			error)		/* del because of error */
 {
-	int		i;		/* btree level */
+	int			i;		/* btree level */
 
 	/*
-	 * Clear the buffer pointers, and release the buffers.
-	 * If we're doing this in the face of an error, we
-	 * need to make sure to inspect all of the entries
-	 * in the bc_bufs array for buffers to be unlocked.
-	 * This is because some of the btree code works from
-	 * level n down to 0, and if we get an error along
-	 * the way we won't have initialized all the entries
-	 * down to 0.
+	 * Clear the buffer pointers and release the buffers. If we're doing
+	 * this because of an error, inspect all of the entries in the bc_bufs
+	 * array for buffers to be unlocked. This is because some of the btree
+	 * code works from level n down to 0, and if we get an error along the
+	 * way we won't have initialized all the entries down to 0.
 	 */
 	for (i = 0; i < cur->bc_nlevels; i++) {
 		if (cur->bc_bufs[i])
@@ -374,17 +371,11 @@ xfs_btree_del_cursor(
 		else if (!error)
 			break;
 	}
-	/*
-	 * Can't free a bmap cursor without having dealt with the
-	 * allocated indirect blocks' accounting.
-	 */
-	ASSERT(cur->bc_btnum != XFS_BTNUM_BMAP ||
-	       cur->bc_ino.allocated == 0);
-	/*
-	 * Free the cursor.
-	 */
+
+	ASSERT(cur->bc_btnum != XFS_BTNUM_BMAP || cur->bc_ino.allocated == 0 ||
+	       XFS_FORCED_SHUTDOWN(cur->bc_mp));
 	if (unlikely(cur->bc_flags & XFS_BTREE_STAGING))
-		kmem_free((void *)cur->bc_ops);
+		kmem_free(cur->bc_ops);
 	kmem_cache_free(xfs_btree_cur_zone, cur);
 }
 

From patchwork Mon Jun  6 14:32:55 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Amir Goldstein <amir73il@gmail.com>
X-Patchwork-Id: 12870433
From: Amir Goldstein <amir73il@gmail.com>
To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Sasha Levin <sashal@kernel.org>,
        Dave Chinner <david@fromorbit.com>,
        "Darrick J . Wong" <djwong@kernel.org>,
        Christoph Hellwig <hch@lst.de>,
        Brian Foster <bfoster@redhat.com>,
        Christian Brauner <brauner@kernel.org>,
        Luis Chamberlain <mcgrof@kernel.org>,
        Leah Rumancik <leah.rumancik@gmail.com>,
        Adam Manzanares <a.manzanares@samsung.com>,
        linux-xfs@vger.kernel.org, stable@vger.kernel.org,
        Dave Chinner <dchinner@redhat.com>
Subject: [PATCH 5.10 v2 8/8] xfs: assert in xfs_btree_del_cursor should take
 into account error
Date: Mon,  6 Jun 2022 17:32:55 +0300
Message-Id: <20220606143255.685988-9-amir73il@gmail.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20220606143255.685988-1-amir73il@gmail.com>
References: <20220606143255.685988-1-amir73il@gmail.com>
MIME-Version: 1.0
Precedence: bulk
List-ID: <linux-xfs.vger.kernel.org>
X-Mailing-List: linux-xfs@vger.kernel.org

From: Dave Chinner <dchinner@redhat.com>

commit 56486f307100e8fc66efa2ebd8a71941fa10bf6f upstream.

xfs/538 on a 1kB block filesystem failed with this assert:

XFS: Assertion failed: cur->bc_btnum != XFS_BTNUM_BMAP || cur->bc_ino.allocated == 0 || xfs_is_shutdown(cur->bc_mp), file: fs/xfs/libxfs/xfs_btree.c, line: 448

The problem was that an allocation failed unexpectedly in
xfs_bmbt_alloc_block() after roughly 150,000 minlen allocation error
injections, resulting in an EFSCORRUPTED error being returned to
xfs_bmapi_write(). The error occurred on extent-to-btree format
conversion allocating the new root block:

 RIP: 0010:xfs_bmbt_alloc_block+0x177/0x210
 Call Trace:
  <TASK>
  xfs_btree_new_iroot+0xdf/0x520
  xfs_btree_make_block_unfull+0x10d/0x1c0
  xfs_btree_insrec+0x364/0x790
  xfs_btree_insert+0xaa/0x210
  xfs_bmap_add_extent_hole_real+0x1fe/0x9a0
  xfs_bmapi_allocate+0x34c/0x420
  xfs_bmapi_write+0x53c/0x9c0
  xfs_alloc_file_space+0xee/0x320
  xfs_file_fallocate+0x36b/0x450
  vfs_fallocate+0x148/0x340
  __x64_sys_fallocate+0x3c/0x70
  do_syscall_64+0x35/0x80
  entry_SYSCALL_64_after_hwframe+0x44/0xa

Why the allocation failed at this point is unknown, but it is likely
that we ran the transaction out of reserved space and the filesystem
out of space with bmbt blocks, because all the minlen allocations
being done cause worst-case fragmentation of a large allocation.

Regardless of the cause, we've then called xfs_bmapi_finish() which
calls xfs_btree_del_cursor(cur, error) to tear down the cursor.

So we have a failed operation, error != 0, cur->bc_ino.allocated > 0
and the filesystem is still up. The assert fails to take into
account that allocation can fail with an error and that the
transaction teardown will shut the filesystem down if necessary, i.e.
the assert needs to check "|| error != 0" as well, because at this
point shutdown is pending because the current transaction is dirty.
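
Illustration only, not part of the upstream commit: a minimal userspace
sketch of the condition the updated assert accepts.
model_del_cursor_state_ok() and its parameters are hypothetical
stand-ins for cur->bc_btnum, cur->bc_ino.allocated, the shutdown state
and the error argument of xfs_btree_del_cursor().

```c
#include <stdbool.h>

/*
 * Return true if tearing down a btree cursor in this state is expected.
 * A bmap btree cursor with unaccounted allocated blocks is only
 * acceptable if the filesystem is shut down or the operation failed.
 */
bool model_del_cursor_state_ok(bool is_bmap_cursor, int allocated,
			       bool fs_shutdown, int error)
{
	return !is_bmap_cursor || allocated == 0 || fs_shutdown ||
	       error != 0;
}
```

The failing case from the report maps to (true, nonzero, false,
nonzero): without the "error != 0" term the predicate is false even
though the transaction cancel is about to shut the filesystem down.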

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/xfs/libxfs/xfs_btree.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/libxfs/xfs_btree.c b/fs/xfs/libxfs/xfs_btree.c
index 9f9f9feccbcd..98c82f4935e1 100644
--- a/fs/xfs/libxfs/xfs_btree.c
+++ b/fs/xfs/libxfs/xfs_btree.c
@@ -372,8 +372,14 @@ xfs_btree_del_cursor(
 			break;
 	}
 
+	/*
+	 * If we are doing a BMBT update, the number of unaccounted blocks
+	 * allocated during this cursor life time should be zero. If it's not
+	 * zero, then we should be shut down or on our way to shutdown due to
+	 * cancelling a dirty transaction on error.
+	 */
 	ASSERT(cur->bc_btnum != XFS_BTNUM_BMAP || cur->bc_ino.allocated == 0 ||
-	       XFS_FORCED_SHUTDOWN(cur->bc_mp));
+	       XFS_FORCED_SHUTDOWN(cur->bc_mp) || error != 0);
 	if (unlikely(cur->bc_flags & XFS_BTREE_STAGING))
 		kmem_free(cur->bc_ops);
 	kmem_cache_free(xfs_btree_cur_zone, cur);