Infinite loop in pnfs_update_layout()

Message ID	170770037721.13976.15585299201457800900@noble.neil.brown.name (mailing list archive)
State	New
Headers	show Received: from smtp-out2.suse.de (smtp-out2.suse.de [195.135.223.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4A10DECC for <linux-nfs@vger.kernel.org>; Mon, 12 Feb 2024 01:13:03 +0000 (UTC) Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Precedence: bulk MIME-Version: 1.0 From: "NeilBrown" <neilb@suse.de> To: Trond Myklebust <trond.myklebust@hammerspace.com>, Anna Schumaker <anna@kernel.org> Cc: linux-nfs@vger.kernel.org Subject: Infinite loop in pnfs_update_layout() Date: Mon, 12 Feb 2024 12:12:57 +1100 Message-id: <170770037721.13976.15585299201457800900@noble.neil.brown.name>
Series	Infinite loop in pnfs_update_layout() \| expand Infinite loop in pnfs_update_layout()

Message ID

170770037721.13976.15585299201457800900@noble.neil.brown.name (mailing list archive)

State

New

Headers

Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Precedence: bulk
MIME-Version: 1.0
From: "NeilBrown" <neilb@suse.de>
To: Trond Myklebust <trond.myklebust@hammerspace.com>,
 Anna Schumaker <anna@kernel.org>
Cc: linux-nfs@vger.kernel.org
Subject: Infinite loop in pnfs_update_layout()
Date: Mon, 12 Feb 2024 12:12:57 +1100
Message-id: <170770037721.13976.15585299201457800900@noble.neil.brown.name>

Series

Infinite loop in pnfs_update_layout() | expand

Commit Message

NeilBrown Feb. 12, 2024, 1:12 a.m. UTC

hi,
 I have evidence from a customer of an infinite loop in
 pnfs_update_layout().  This has only happened once and I suspect it is
 unlikely to recur often.  We don't have a lot of tracing data, but I
 think we have enough...
 The evidence I do have is repeated "BUG: workqueue lockup" errors
 with sufficiently many samples that I can determine the code path of
 the loop (see below), and a message:

  NFSv4: state recovery failed for open file SVC_rapid7_dc33/.bash_history, error = -116

 The loop involves the "lookup_again" label and the "goto" on line 2112.
 This is the code where NFS_LAYOUT_INVALID_STID was found to be true and
 nfs4_select_rw_stateid() returned non-zero.

 I deduce that ctx->state is not a valid open stateid.  This leads to
 nfs4_select_rw_stateid() returned -EIO and
 nfs4_schedule_stateid_recovery() doing nothing.  This "doing nothing"
 is the only explanation I can find for the
 nfs4_client_recover_expired_lease() call at the top of the loop not
 waiting at all (if it did wait, we wouldn't get a workqueue lockup).

 The state being invalid also perfectly matches the "state recovery
 failed" error.

 So it seems likely that we should test
    nfs4_valid_open_stateid(ctx->state)
 somewhere in the loop, and return either NULL or and error.  I'm not
 certain what is best.
 My inclination is



Thoughts?

Thanks,
NeilBrown

diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index 0c0fed1ecd0b..e702ac518205 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -2002,6 +2002,12 @@  pnfs_update_layout(struct inode *ino,
 	lseg = ERR_PTR(nfs4_client_recover_expired_lease(clp));
 	if (IS_ERR(lseg))
 		goto out;
+	if (!nfs4_valid_open_stateid(ctx->state)) {
+		lseq = ERR_PTR(-EIO);
+		trace_pnfs_update_layout(ino, pos, count, iomode, lo, lseg,
+					 PNFS_UPDATE_LAYOUT_INVALID_OPEN);
+		goto out;
+	}
 	first = false;
 	spin_lock(&ino->i_lock);
 	lo = pnfs_find_alloc_layout(ino, ctx, gfp_flags);


Does that seem reasonable?
Another possibility would be to check the status from
nfs4_select_rw_stateid() and "goto out_unlock" if it is EIO.

diff --git a/fs/nfs/pnfs.c b/fs/nfs/pnfs.c
index 0c0fed1ecd0b..7cc90ee86882 100644
--- a/fs/nfs/pnfs.c
+++ b/fs/nfs/pnfs.c
@@ -2106,6 +2106,8 @@  pnfs_update_layout(struct inode *ino,
 			trace_pnfs_update_layout(ino, pos, count,
 					iomode, lo, lseg,
 					PNFS_UPDATE_LAYOUT_INVALID_OPEN);
+			if (status == -EIO)
+				goto out_unlock;
 			nfs4_schedule_stateid_recovery(server, ctx->state);
 			pnfs_clear_first_layoutget(lo);
 			pnfs_put_layout_hdr(lo);

Infinite loop in pnfs_update_layout()

Commit Message

Patch