diff mbox series

[v2] NFSv4: fairly test all delegations on a SEQ4_ revocation

Message ID 6c85bc9cc6bd8d70fe0e74b096c72944f61874a7.1692903063.git.bcodding@redhat.com (mailing list archive)
State New, archived
Headers show
Series [v2] NFSv4: fairly test all delegations on a SEQ4_ revocation | expand

Commit Message

Benjamin Coddington Aug. 24, 2023, 6:52 p.m. UTC
When the client is required to use TEST_STATEID to discover which
delegation(s) have been revoked, it may continually test delegations at the
head of the list if the server continues to be unsatisfied and send
SEQ4_STATUS_RECALLABLE_STATE_REVOKED.  For a large number of delegations
this behavior is prone to live-lock because the client may never be able to
test and free revoked state at the end of the list since the
SEQ4_STATUS_RECALLABLE_STATE_REVOKED will cause us to flag delegations at
the head of the list to be tested.  This problem is further exacerbated by
the state manager's willingness to be scheduled out on a busy system while
testing the list of delegations.

Keep a generation counter for each attempt to test all delegations, and
skip delegations that have already been tested in the current pass.

Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
---
 fs/nfs/delegation.c       | 7 ++++++-
 fs/nfs/delegation.h       | 1 +
 include/linux/nfs_fs_sb.h | 1 +
 3 files changed, 8 insertions(+), 1 deletion(-)

--

Changed on v2: remove extra brackets that had my debug statements

Comments

Benjamin Coddington Aug. 25, 2023, 10:55 a.m. UTC | #1
On 24 Aug 2023, at 14:52, Benjamin Coddington wrote:

> When the client is required to use TEST_STATEID to discover which
> delegation(s) have been revoked, it may continually test delegations at the
> head of the list if the server continues to be unsatisfied and send
> SEQ4_STATUS_RECALLABLE_STATE_REVOKED.  For a large number of delegations
> this behavior is prone to live-lock because the client may never be able to
> test and free revoked state at the end of the list since the
> SEQ4_STATUS_RECALLABLE_STATE_REVOKED will cause us to flag delegations at
> the head of the list to be tested.  This problem is further exacerbated by
> the state manager's willingness to be scheduled out on a busy system while
> testing the list of delegations.
>
> Keep a generation counter for each attempt to test all delegations, and
> skip delegations that have already been tested in the current pass.
>
> Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
> ---
>  fs/nfs/delegation.c       | 7 ++++++-
>  fs/nfs/delegation.h       | 1 +
>  include/linux/nfs_fs_sb.h | 1 +
>  3 files changed, 8 insertions(+), 1 deletion(-)
>
> --
>
> Changed on v2: remove extra brackets that had my debug statements

Hi potential reviewers - this can also be fixed with the simple addition of
a flag instead of a counter on every delegation, the only cost being that
nfs_server_reap_expired_delegations() would need to walk the whole list a
final time to remove the flag before returning.

The unsigned long counter is probably overkill.  Any thoughts?

Ben
Benjamin Coddington Sept. 14, 2023, 1:18 p.m. UTC | #2
On 24 Aug 2023, at 14:52, Benjamin Coddington wrote:

> When the client is required to use TEST_STATEID to discover which
> delegation(s) have been revoked, it may continually test delegations at the
> head of the list if the server continues to be unsatisfied and send
> SEQ4_STATUS_RECALLABLE_STATE_REVOKED.  For a large number of delegations
> this behavior is prone to live-lock because the client may never be able to
> test and free revoked state at the end of the list since the
> SEQ4_STATUS_RECALLABLE_STATE_REVOKED will cause us to flag delegations at
> the head of the list to be tested.  This problem is further exacerbated by
> the state manager's willingness to be scheduled out on a busy system while
> testing the list of delegations.
>
> Keep a generation counter for each attempt to test all delegations, and
> skip delegations that have already been tested in the current pass.
>
> Signed-off-by: Benjamin Coddington <bcodding@redhat.com>

This one went through the ringer in an environment that saw multiple clients
live-locking, and resolves the problem for them.  They asked me to add:

Tested-by: Torkil Svensgaard <torkil@drcmr.dk>
Tested-by: Ruben Vestergaard <rubenv@drcmr.dk>

Ben
Benjamin Coddington Sept. 18, 2023, 8:36 p.m. UTC | #3
On 14 Sep 2023, at 9:18, Benjamin Coddington wrote:

> On 24 Aug 2023, at 14:52, Benjamin Coddington wrote:
>
>> When the client is required to use TEST_STATEID to discover which
>> delegation(s) have been revoked, it may continually test delegations at the
>> head of the list if the server continues to be unsatisfied and send
>> SEQ4_STATUS_RECALLABLE_STATE_REVOKED.  For a large number of delegations
>> this behavior is prone to live-lock because the client may never be able to
>> test and free revoked state at the end of the list since the
>> SEQ4_STATUS_RECALLABLE_STATE_REVOKED will cause us to flag delegations at
>> the head of the list to be tested.  This problem is further exacerbated by
>> the state manager's willingness to be scheduled out on a busy system while
>> testing the list of delegations.
>>
>> Keep a generation counter for each attempt to test all delegations, and
>> skip delegations that have already been tested in the current pass.
>>
>> Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
>
> This one went through the ringer in an environment that saw multiple clients
> live-locking, and resolves the problem for them.  They asked me to add:
>
> Tested-by: Torkil Svensgaard <torkil@drcmr.dk>
> Tested-by: Ruben Vestergaard <rubenv@drcmr.dk>
>
> Ben

Did this one get rejected with a reason?  This fix could also be implemented
with flag (as I mentioned in a reply on v2).

Ben
Trond Myklebust Sept. 18, 2023, 9:58 p.m. UTC | #4
On Mon, 2023-09-18 at 16:36 -0400, Benjamin Coddington wrote:
> On 14 Sep 2023, at 9:18, Benjamin Coddington wrote:
> 
> > On 24 Aug 2023, at 14:52, Benjamin Coddington wrote:
> > 
> > > When the client is required to use TEST_STATEID to discover which
> > > delegation(s) have been revoked, it may continually test
> > > delegations at the
> > > head of the list if the server continues to be unsatisfied and
> > > send
> > > SEQ4_STATUS_RECALLABLE_STATE_REVOKED.  For a large number of
> > > delegations
> > > this behavior is prone to live-lock because the client may never
> > > be able to
> > > test and free revoked state at the end of the list since the
> > > SEQ4_STATUS_RECALLABLE_STATE_REVOKED will cause us to flag
> > > delegations at
> > > the head of the list to be tested.  This problem is further
> > > exacerbated by
> > > the state manager's willingness to be scheduled out on a busy
> > > system while
> > > testing the list of delegations.
> > > 
> > > Keep a generation counter for each attempt to test all
> > > delegations, and
> > > skip delegations that have already been tested in the current
> > > pass.
> > > 
> > > Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
> > 
> > This one went through the ringer in an environment that saw
> > multiple clients
> > live-locking, and resolves the problem for them.  They asked me to
> > add:
> > 
> > Tested-by: Torkil Svensgaard <torkil@drcmr.dk>
> > Tested-by: Ruben Vestergaard <rubenv@drcmr.dk>
> > 
> > Ben
> 
> Did this one get rejected with a reason?  This fix could also be
> implemented
> with flag (as I mentioned in a reply on v2).
> 
> Ben
> 

Speaking only for myself, I haven't yet had time to properly review it.
So not yet rejected by me.
diff mbox series

Patch

diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c
index cf7365581031..fa1a14def45c 100644
--- a/fs/nfs/delegation.c
+++ b/fs/nfs/delegation.c
@@ -448,6 +448,7 @@  int nfs_inode_set_delegation(struct inode *inode, const struct cred *cred,
 	delegation->cred = get_cred(cred);
 	delegation->inode = inode;
 	delegation->flags = 1<<NFS_DELEGATION_REFERENCED;
+	delegation->test_gen = 0;
 	spin_lock_init(&delegation->lock);
 
 	spin_lock(&clp->cl_lock);
@@ -1294,6 +1295,8 @@  static int nfs_server_reap_expired_delegations(struct nfs_server *server,
 	struct inode *inode;
 	const struct cred *cred;
 	nfs4_stateid stateid;
+	unsigned long gen = ++server->delegation_gen;
+
 restart:
 	rcu_read_lock();
 restart_locked:
@@ -1303,7 +1306,8 @@  static int nfs_server_reap_expired_delegations(struct nfs_server *server,
 		    test_bit(NFS_DELEGATION_RETURNING,
 					&delegation->flags) ||
 		    test_bit(NFS_DELEGATION_TEST_EXPIRED,
-					&delegation->flags) == 0)
+					&delegation->flags) == 0 ||
+			delegation->test_gen == gen)
 			continue;
 		inode = nfs_delegation_grab_inode(delegation);
 		if (inode == NULL)
@@ -1312,6 +1316,7 @@  static int nfs_server_reap_expired_delegations(struct nfs_server *server,
 		cred = get_cred_rcu(delegation->cred);
 		nfs4_stateid_copy(&stateid, &delegation->stateid);
 		spin_unlock(&delegation->lock);
+		delegation->test_gen = gen;
 		clear_bit(NFS_DELEGATION_TEST_EXPIRED, &delegation->flags);
 		rcu_read_unlock();
 		nfs_delegation_test_free_expired(inode, &stateid, cred);
diff --git a/fs/nfs/delegation.h b/fs/nfs/delegation.h
index 1c378992b7c0..a6f495d012cf 100644
--- a/fs/nfs/delegation.h
+++ b/fs/nfs/delegation.h
@@ -21,6 +21,7 @@  struct nfs_delegation {
 	fmode_t type;
 	unsigned long pagemod_limit;
 	__u64 change_attr;
+	unsigned long test_gen;
 	unsigned long flags;
 	refcount_t refcount;
 	spinlock_t lock;
diff --git a/include/linux/nfs_fs_sb.h b/include/linux/nfs_fs_sb.h
index 20eeba8b009d..2f9d380b3439 100644
--- a/include/linux/nfs_fs_sb.h
+++ b/include/linux/nfs_fs_sb.h
@@ -238,6 +238,7 @@  struct nfs_server {
 	struct list_head	delegations;
 	struct list_head	ss_copies;
 
+	unsigned long		delegation_gen;
 	unsigned long		mig_gen;
 	unsigned long		mig_status;
 #define NFS_MIG_IN_TRANSITION		(1)