[07/11] nfsd: allow admin-revoked NFSv4.0 state to be freed.

Message ID 20231124002925.1816-8-neilb@suse.de (mailing list archive)
State New
Series nfsd: support admin-revocation of v4 state

Commit Message

NeilBrown Nov. 24, 2023, 12:28 a.m. UTC
For NFSv4.1 and later the client easily discovers if there is any
admin-revoked state and will then find and explicitly free it.

For NFSv4.0 there is no such mechanism.  The client can only find that
state is admin-revoked if it tries to use that state, and there is no
way for it to explicitly free the state.  So the server must hold on to
the stateid (at least) for an indefinite amount of time.  A
RELEASE_LOCKOWNER request might justify forgetting some of these
stateids, as would the whole client's lease lapsing, but these are not
reliable.

This patch takes two approaches.

Whenever a client uses a revoked stateid, that stateid is then
discarded and will not be recognised again.  This might confuse a
client which expects to get NFS4ERR_ADMIN_REVOKED consistently once it
gets it at all, but should mostly work.  Hopefully one error will lead
to other resources being closed (e.g. process exits), which will result
in more stateids being freed when a CLOSE attempt gets
NFS4ERR_ADMIN_REVOKED.
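
As an illustration only (not part of the patch), the "report once, then
forget" behaviour described above can be sketched in user-space C.  All
type and function names here are hypothetical stand-ins, not the
kernel's, and returning a "bad stateid" status for a forgotten stateid
is an assumption about what "not recognised again" means in practice.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the protocol status codes. */
enum status { ST_OK, ST_ADMIN_REVOKED, ST_BAD_STATEID };

/* Hypothetical, simplified stateid record; not the kernel's nfs4_stid. */
struct stateid_entry {
	unsigned int id;
	bool in_use;		/* the server still remembers this stateid */
	bool admin_revoked;	/* an administrator revoked this state */
};

struct client {
	int minorversion;	/* 0 for NFSv4.0, 1 or more for later versions */
	struct stateid_entry table[4];
};

static struct stateid_entry *find_entry(struct client *cl, unsigned int id)
{
	for (size_t i = 0; i < sizeof(cl->table) / sizeof(cl->table[0]); i++)
		if (cl->table[i].in_use && cl->table[i].id == id)
			return &cl->table[i];
	return NULL;
}

/* Check a stateid presented by the client. */
static enum status use_stateid(struct client *cl, unsigned int id)
{
	struct stateid_entry *e = find_entry(cl, id);

	if (!e)
		return ST_BAD_STATEID;
	if (!e->admin_revoked)
		return ST_OK;
	/*
	 * A v4.0 client cannot free the stateid explicitly, so forget it
	 * as soon as the error has been reported once.  Later minor
	 * versions keep it until the client asks for it to be freed.
	 */
	if (cl->minorversion == 0)
		e->in_use = false;
	return ST_ADMIN_REVOKED;
}
```

In this sketch a v4.0 client sees the revocation error exactly once per
stateid, while a v4.1 client keeps seeing it on every use.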

Also, any admin-revoked stateids that have been that way for more than
one lease time are periodically discarded.
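
The lease-time gate can be sketched in user-space C as follows.  This
is an illustration only: the struct and function names are
hypothetical, and it assumes the cutoff is simply "now minus one lease
period", with 0 used as a "nothing pending" marker as in the patch's
nfs40_last_revoke field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bookkeeping, loosely modelled on nfs40_last_revoke. */
struct revoke_clock {
	int64_t last_revoke;	/* seconds; 0 means no cleanup is pending */
};

/* Record that an admin-revoke of v4.0 state happened at time 'now'. */
static void note_revoke(struct revoke_clock *rc, int64_t now)
{
	rc->last_revoke = now;
}

/*
 * Decide whether a periodic cleanup pass should run.  It runs only if
 * a revoke was recorded and at least one full lease period has passed
 * since the most recent one.  The marker is cleared so that the next
 * pass is skipped unless another revoke happens in the meantime.
 */
static bool cleanup_due(struct revoke_clock *rc, int64_t now, int64_t lease)
{
	int64_t cutoff = now - lease;

	if (rc->last_revoke == 0 || rc->last_revoke > cutoff)
		return false;
	rc->last_revoke = 0;
	return true;
}
```

With a 90 second lease and a revoke recorded at t=1000, cleanup_due()
stays false at t=1050 and becomes true at t=1090.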

No actual freeing of state happens in this patch.  That will come in
future patches which handle the different sorts of revoked state.

Signed-off-by: NeilBrown <neilb@suse.de>
---
 fs/nfsd/netns.h     |  4 ++
 fs/nfsd/nfs4state.c | 97 ++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 100 insertions(+), 1 deletion(-)

Comments

Chuck Lever Nov. 26, 2023, 5:54 p.m. UTC | #1
On Fri, Nov 24, 2023 at 11:28:42AM +1100, NeilBrown wrote:
> For NFSv4.1 and later the client easily discovers if there is any
> admin-revoked state and will then find and explicitly free it.
> 
> For NFSv4.0 there is no such mechanism.  The client can only find that
> state is admin-revoked if it tries to use that state, and there is no
> way for it to explicitly free the state.  So the server must hold on to
> the stateid (at least) for an indefinite amount of time.  A
> RELEASE_LOCKOWNER request might justify forgetting some of these
> stateids, as would the whole clients lease lapsing, but these are not
> reliable.

They aren't reliable, but what are the consequences of revoked
state left on the server? Seems like our implementation has a
number of mechanisms for cleaning up state over time. Do you feel
this is a denial-of-service vector?


> This patch takes two approaches.
> 
> Whenever a client uses an revoked stateid, that stateid is then
> discarded and will not be recognised again.  This might confuse a client
> which expect to get NFS4ERR_ADMIN_REVOKED consistently once it get it at
> all, but should mostly work.  Hopefully one error will lead to other
> resources being closed (e.g.  process exits), which will result in more
> stateid being freed when a CLOSE attempt gets NFS4ERR_ADMIN_REVOKED.

I'm leery of this: "This might confuse..." and "Hopefully..." suggest
we're not real sure how this will behave in practice with the current
cohort of client implementations.

Also, this paragraph in Section 10.2.1 of RFC 7530 is concerning:

>  A client normally finds out about revocation of a delegation when it
>  uses a stateid associated with a delegation and receives one of the
>  errors NFS4ERR_EXPIRED, NFS4ERR_BAD_STATEID, or NFS4ERR_ADMIN_REVOKED
>  (NFS4ERR_EXPIRED indicates that all lock state associated with the
>  client has been lost).  It also may find out about delegation
>  revocation after a client reboot when it attempts to reclaim a
>  delegation and receives NFS4ERR_EXPIRED.  Note that in the case of a
>  revoked OPEN_DELEGATE_WRITE delegation, there are issues because data
>  may have been modified by the client whose delegation is revoked and,
>  separately, by other clients.  See Section 10.5.1 for a discussion of
>  such issues.  Note also that when delegations are revoked,
>  information about the revoked delegation will be written by the
>  server to stable storage (as described in Section 9.6).  This is done
>  to deal with the case in which a server reboots after revoking a
>  delegation but before the client holding the revoked delegation is
>  notified about the revocation.

The text here suggests that the server persists the ADMIN_REVOKED
status, which suggests to me that the server is supposed to continue
returning ADMIN_REVOKED when presented with the revoked state,
until the state is freed.

AFAICT NFSD isn't recording this status persistently... Is there a
plan to add that (later) or some words suggesting that it is safe
and reasonable not to record it?


> Also, any admin-revoked stateids that have been that way for more than
> one lease time are periodically revoke.
> 
> No actual freeing of state happens in this patch.  That will come in
> future patches which handle the different sorts of revoked state.
>
> Signed-off-by: NeilBrown <neilb@suse.de>
> ---
>  fs/nfsd/netns.h     |  4 ++
>  fs/nfsd/nfs4state.c | 97 ++++++++++++++++++++++++++++++++++++++++++++-
>  2 files changed, 100 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
> index ab303a8b77d5..7458f672b33e 100644
> --- a/fs/nfsd/netns.h
> +++ b/fs/nfsd/netns.h
> @@ -197,6 +197,10 @@ struct nfsd_net {
>  	atomic_t		nfsd_courtesy_clients;
>  	struct shrinker		*nfsd_client_shrinker;
>  	struct work_struct	nfsd_shrinker_work;
> +
> +	/* last time an admin-revoke happened for NFSv4.0 */
> +	time64_t		nfs40_last_revoke;
> +
>  };
>  
>  /* Simple check to find out if a given net was properly initialized */
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 52e680235afe..c57f2ff954cb 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -1724,6 +1724,14 @@ void nfsd4_revoke_states(struct net *net, struct super_block *sb)
>  				}
>  				nfs4_put_stid(stid);
>  				spin_lock(&nn->client_lock);
> +				if (clp->cl_minorversion == 0)
> +					/* Allow cleanup after a lease period.
> +					 * store_release ensures cleanup will
> +					 * see any newly revoked states if it
> +					 * sees the time updated.
> +					 */
> +					nn->nfs40_last_revoke =
> +						ktime_get_boottime_seconds();
>  				goto retry;
>  			}
>  		}
> @@ -4648,6 +4656,39 @@ nfsd4_find_existing_open(struct nfs4_file *fp, struct nfsd4_open *open)
>  	return ret;
>  }
>  
> +static void nfsd_drop_revoked_stid(struct nfs4_stid *s)
> +{
> +	struct nfs4_client *cl = s->sc_client;
> +
> +	switch (s->sc_type) {
> +	default:
> +		spin_unlock(&cl->cl_lock);
> +	}
> +}
> +
> +static void nfs40_drop_revoked_stid(struct nfs4_client *cl,
> +				    stateid_t *stid)

Nits: I'd prefer nfsd4_drop_revoked_stid() and nfsd40_drop_revoked_stid()


> +{
> +	/* NFSv4.0 has no way for the client to tell the server
> +	 * that it can forget an admin-revoked stateid.
> +	 * So we keep it around until the first time that the
> +	 * client uses it, and drop it the first time
> +	 * nfserr_admin_revoked is returned.
> +	 * For v4.1 and later we wait until explicitly told
> +	 * to free the stateid.
> +	 */
> +	if (cl->cl_minorversion == 0) {
> +		struct nfs4_stid *st;
> +
> +		spin_lock(&cl->cl_lock);
> +		st = find_stateid_locked(cl, stid);
> +		if (st)
> +			nfsd_drop_revoked_stid(st);
> +		else
> +			spin_unlock(&cl->cl_lock);
> +	}
> +}
> +
>  static __be32
>  nfsd4_verify_open_stid(struct nfs4_stid *s)
>  {
> @@ -4670,6 +4711,10 @@ nfsd4_lock_ol_stateid(struct nfs4_ol_stateid *stp)
>  
>  	mutex_lock_nested(&stp->st_mutex, LOCK_STATEID_MUTEX);
>  	ret = nfsd4_verify_open_stid(&stp->st_stid);
> +	if (ret == nfserr_admin_revoked)
> +		nfs40_drop_revoked_stid(stp->st_stid.sc_client,
> +					&stp->st_stid.sc_stateid);
> +
>  	if (ret != nfs_ok)
>  		mutex_unlock(&stp->st_mutex);
>  	return ret;
> @@ -5253,6 +5298,7 @@ nfs4_check_deleg(struct nfs4_client *cl, struct nfsd4_open *open,
>  	}
>  	if (deleg->dl_stid.sc_status & NFS4_STID_REVOKED) {
>  		nfs4_put_stid(&deleg->dl_stid);
> +		nfs40_drop_revoked_stid(cl, &open->op_delegate_stateid);
>  		status = nfserr_deleg_revoked;
>  		goto out;
>  	}
> @@ -6251,6 +6297,43 @@ nfs4_process_client_reaplist(struct list_head *reaplist)
>  	}
>  }
>  
> +static void nfs40_clean_admin_revoked(struct nfsd_net *nn,
> +				      struct laundry_time *lt)
> +{
> +	struct nfs4_client *clp;
> +
> +	spin_lock(&nn->client_lock);
> +	if (nn->nfs40_last_revoke == 0 ||
> +	    nn->nfs40_last_revoke > lt->cutoff) {
> +		spin_unlock(&nn->client_lock);
> +		return;
> +	}
> +	nn->nfs40_last_revoke = 0;
> +
> +retry:
> +	list_for_each_entry(clp, &nn->client_lru, cl_lru) {
> +		unsigned long id, tmp;
> +		struct nfs4_stid *stid;
> +
> +		if (atomic_read(&clp->cl_admin_revoked) == 0)
> +			continue;
> +
> +		spin_lock(&clp->cl_lock);
> +		idr_for_each_entry_ul(&clp->cl_stateids, stid, tmp, id)
> +			if (stid->sc_status & NFS4_STID_ADMIN_REVOKED) {
> +				refcount_inc(&stid->sc_count);
> +				spin_unlock(&nn->client_lock);
> +				/* this function drops ->cl_lock */
> +				nfsd_drop_revoked_stid(stid);
> +				nfs4_put_stid(stid);
> +				spin_lock(&nn->client_lock);
> +				goto retry;
> +			}
> +		spin_unlock(&clp->cl_lock);
> +	}
> +	spin_unlock(&nn->client_lock);
> +}
> +
>  static time64_t
>  nfs4_laundromat(struct nfsd_net *nn)
>  {
> @@ -6284,6 +6367,8 @@ nfs4_laundromat(struct nfsd_net *nn)
>  	nfs4_get_client_reaplist(nn, &reaplist, &lt);
>  	nfs4_process_client_reaplist(&reaplist);
>  
> +	nfs40_clean_admin_revoked(nn, &lt);
> +
>  	spin_lock(&state_lock);
>  	list_for_each_safe(pos, next, &nn->del_recall_lru) {
>  		dp = list_entry (pos, struct nfs4_delegation, dl_recall_lru);
> @@ -6502,6 +6587,9 @@ static __be32 nfsd4_stid_check_stateid_generation(stateid_t *in, struct nfs4_sti
>  	if (ret == nfs_ok)
>  		ret = check_stateid_generation(in, &s->sc_stateid, has_session);
>  	spin_unlock(&s->sc_lock);
> +	if (ret == nfserr_admin_revoked)
> +		nfs40_drop_revoked_stid(s->sc_client,
> +					&s->sc_stateid);
>  	return ret;
>  }
>  
> @@ -6546,6 +6634,8 @@ static __be32 nfsd4_validate_stateid(struct nfs4_client *cl, stateid_t *stateid)
>  	}
>  out_unlock:
>  	spin_unlock(&cl->cl_lock);
> +	if (status == nfserr_admin_revoked)
> +		nfs40_drop_revoked_stid(cl, stateid);
>  	return status;
>  }
>  
> @@ -6592,6 +6682,7 @@ nfsd4_lookup_stateid(struct nfsd4_compound_state *cstate,
>  		return nfserr_deleg_revoked;
>  	}
>  	if (stid->sc_type & NFS4_STID_ADMIN_REVOKED) {
> +		nfs40_drop_revoked_stid(cstate->clp, stateid);
>  		nfs4_put_stid(stid);
>  		return nfserr_admin_revoked;
>  	}
> @@ -6884,6 +6975,11 @@ nfsd4_free_stateid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>  	s = find_stateid_locked(cl, stateid);
>  	if (!s || s->sc_status & NFS4_STID_CLOSED)
>  		goto out_unlock;
> +	if (s->sc_status & NFS4_STID_ADMIN_REVOKED) {
> +		nfsd_drop_revoked_stid(s);
> +		ret = nfs_ok;
> +		goto out;
> +	}
>  	spin_lock(&s->sc_lock);
>  	switch (s->sc_type) {
>  	case NFS4_DELEG_STID:
> @@ -6910,7 +7006,6 @@ nfsd4_free_stateid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>  		spin_unlock(&cl->cl_lock);
>  		ret = nfsd4_free_lock_stateid(stateid, s);
>  		goto out;
> -	/* Default falls through and returns nfserr_bad_stateid */
>  	}
>  	spin_unlock(&s->sc_lock);
>  out_unlock:
> -- 
> 2.42.1
>
Chuck Lever Nov. 26, 2023, 6:07 p.m. UTC | #2
On Fri, Nov 24, 2023 at 11:28:42AM +1100, NeilBrown wrote:
> For NFSv4.1 and later the client easily discovers if there is any
> admin-revoked state and will then find and explicitly free it.
> 
> For NFSv4.0 there is no such mechanism.  The client can only find that
> state is admin-revoked if it tries to use that state, and there is no
> way for it to explicitly free the state.  So the server must hold on to
> the stateid (at least) for an indefinite amount of time.  A
> RELEASE_LOCKOWNER request might justify forgetting some of these
> stateids, as would the whole clients lease lapsing, but these are not
> reliable.
> 
> This patch takes two approaches.
> 
> Whenever a client uses an revoked stateid, that stateid is then
> discarded and will not be recognised again.  This might confuse a client
> which expect to get NFS4ERR_ADMIN_REVOKED consistently once it get it at
> all, but should mostly work.  Hopefully one error will lead to other
> resources being closed (e.g.  process exits), which will result in more
> stateid being freed when a CLOSE attempt gets NFS4ERR_ADMIN_REVOKED.
> 
> Also, any admin-revoked stateids that have been that way for more than
> one lease time are periodically revoke.
> 
> No actual freeing of state happens in this patch.  That will come in
> future patches which handle the different sorts of revoked state.
> 
> Signed-off-by: NeilBrown <neilb@suse.de>
> ---
>  fs/nfsd/netns.h     |  4 ++
>  fs/nfsd/nfs4state.c | 97 ++++++++++++++++++++++++++++++++++++++++++++-
>  2 files changed, 100 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
> index ab303a8b77d5..7458f672b33e 100644
> --- a/fs/nfsd/netns.h
> +++ b/fs/nfsd/netns.h
> @@ -197,6 +197,10 @@ struct nfsd_net {
>  	atomic_t		nfsd_courtesy_clients;
>  	struct shrinker		*nfsd_client_shrinker;
>  	struct work_struct	nfsd_shrinker_work;
> +
> +	/* last time an admin-revoke happened for NFSv4.0 */
> +	time64_t		nfs40_last_revoke;
> +
>  };
>  
>  /* Simple check to find out if a given net was properly initialized */
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 52e680235afe..c57f2ff954cb 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -1724,6 +1724,14 @@ void nfsd4_revoke_states(struct net *net, struct super_block *sb)
>  				}
>  				nfs4_put_stid(stid);
>  				spin_lock(&nn->client_lock);
> +				if (clp->cl_minorversion == 0)
> +					/* Allow cleanup after a lease period.
> +					 * store_release ensures cleanup will
> +					 * see any newly revoked states if it
> +					 * sees the time updated.
> +					 */
> +					nn->nfs40_last_revoke =
> +						ktime_get_boottime_seconds();
>  				goto retry;
>  			}
>  		}
> @@ -4648,6 +4656,39 @@ nfsd4_find_existing_open(struct nfs4_file *fp, struct nfsd4_open *open)
>  	return ret;
>  }
>  
> +static void nfsd_drop_revoked_stid(struct nfs4_stid *s)
> +{
> +	struct nfs4_client *cl = s->sc_client;
> +
> +	switch (s->sc_type) {
> +	default:
> +		spin_unlock(&cl->cl_lock);
> +	}
> +}

I'm not in love with unlocking cl_lock inside nfsd_drop_revoked_stid,
but I understand why it's necessary. How about:

static void nfsd4_drop_revoked_stid_unlock(struct nfs4_client *cl,
					   struct nfs4_stid *s)
	__releases(&cl->cl_lock)
{
	....


> +
> +static void nfs40_drop_revoked_stid(struct nfs4_client *cl,
> +				    stateid_t *stid)
> +{
> +	/* NFSv4.0 has no way for the client to tell the server
> +	 * that it can forget an admin-revoked stateid.
> +	 * So we keep it around until the first time that the
> +	 * client uses it, and drop it the first time
> +	 * nfserr_admin_revoked is returned.
> +	 * For v4.1 and later we wait until explicitly told
> +	 * to free the stateid.
> +	 */
> +	if (cl->cl_minorversion == 0) {
> +		struct nfs4_stid *st;
> +
> +		spin_lock(&cl->cl_lock);
> +		st = find_stateid_locked(cl, stid);
> +		if (st)
> +			nfsd_drop_revoked_stid(st);
> +		else
> +			spin_unlock(&cl->cl_lock);
> +	}
> +}
> +
>  static __be32
>  nfsd4_verify_open_stid(struct nfs4_stid *s)
>  {
> @@ -4670,6 +4711,10 @@ nfsd4_lock_ol_stateid(struct nfs4_ol_stateid *stp)
>  
>  	mutex_lock_nested(&stp->st_mutex, LOCK_STATEID_MUTEX);
>  	ret = nfsd4_verify_open_stid(&stp->st_stid);
> +	if (ret == nfserr_admin_revoked)
> +		nfs40_drop_revoked_stid(stp->st_stid.sc_client,
> +					&stp->st_stid.sc_stateid);
> +
>  	if (ret != nfs_ok)
>  		mutex_unlock(&stp->st_mutex);
>  	return ret;
> @@ -5253,6 +5298,7 @@ nfs4_check_deleg(struct nfs4_client *cl, struct nfsd4_open *open,
>  	}
>  	if (deleg->dl_stid.sc_status & NFS4_STID_REVOKED) {
>  		nfs4_put_stid(&deleg->dl_stid);
> +		nfs40_drop_revoked_stid(cl, &open->op_delegate_stateid);
>  		status = nfserr_deleg_revoked;
>  		goto out;
>  	}
> @@ -6251,6 +6297,43 @@ nfs4_process_client_reaplist(struct list_head *reaplist)
>  	}
>  }
>  
> +static void nfs40_clean_admin_revoked(struct nfsd_net *nn,
> +				      struct laundry_time *lt)
> +{
> +	struct nfs4_client *clp;
> +
> +	spin_lock(&nn->client_lock);
> +	if (nn->nfs40_last_revoke == 0 ||
> +	    nn->nfs40_last_revoke > lt->cutoff) {
> +		spin_unlock(&nn->client_lock);
> +		return;
> +	}
> +	nn->nfs40_last_revoke = 0;
> +
> +retry:
> +	list_for_each_entry(clp, &nn->client_lru, cl_lru) {
> +		unsigned long id, tmp;
> +		struct nfs4_stid *stid;
> +
> +		if (atomic_read(&clp->cl_admin_revoked) == 0)
> +			continue;
> +
> +		spin_lock(&clp->cl_lock);
> +		idr_for_each_entry_ul(&clp->cl_stateids, stid, tmp, id)
> +			if (stid->sc_status & NFS4_STID_ADMIN_REVOKED) {
> +				refcount_inc(&stid->sc_count);
> +				spin_unlock(&nn->client_lock);
> +				/* this function drops ->cl_lock */
> +				nfsd_drop_revoked_stid(stid);
> +				nfs4_put_stid(stid);
> +				spin_lock(&nn->client_lock);
> +				goto retry;
> +			}
> +		spin_unlock(&clp->cl_lock);
> +	}
> +	spin_unlock(&nn->client_lock);
> +}
> +
>  static time64_t
>  nfs4_laundromat(struct nfsd_net *nn)
>  {
> @@ -6284,6 +6367,8 @@ nfs4_laundromat(struct nfsd_net *nn)
>  	nfs4_get_client_reaplist(nn, &reaplist, &lt);
>  	nfs4_process_client_reaplist(&reaplist);
>  
> +	nfs40_clean_admin_revoked(nn, &lt);
> +
>  	spin_lock(&state_lock);
>  	list_for_each_safe(pos, next, &nn->del_recall_lru) {
>  		dp = list_entry (pos, struct nfs4_delegation, dl_recall_lru);
> @@ -6502,6 +6587,9 @@ static __be32 nfsd4_stid_check_stateid_generation(stateid_t *in, struct nfs4_sti
>  	if (ret == nfs_ok)
>  		ret = check_stateid_generation(in, &s->sc_stateid, has_session);
>  	spin_unlock(&s->sc_lock);
> +	if (ret == nfserr_admin_revoked)
> +		nfs40_drop_revoked_stid(s->sc_client,
> +					&s->sc_stateid);
>  	return ret;
>  }
>  
> @@ -6546,6 +6634,8 @@ static __be32 nfsd4_validate_stateid(struct nfs4_client *cl, stateid_t *stateid)
>  	}
>  out_unlock:
>  	spin_unlock(&cl->cl_lock);
> +	if (status == nfserr_admin_revoked)
> +		nfs40_drop_revoked_stid(cl, stateid);
>  	return status;
>  }
>  
> @@ -6592,6 +6682,7 @@ nfsd4_lookup_stateid(struct nfsd4_compound_state *cstate,
>  		return nfserr_deleg_revoked;
>  	}
>  	if (stid->sc_type & NFS4_STID_ADMIN_REVOKED) {
> +		nfs40_drop_revoked_stid(cstate->clp, stateid);
>  		nfs4_put_stid(stid);
>  		return nfserr_admin_revoked;
>  	}
> @@ -6884,6 +6975,11 @@ nfsd4_free_stateid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>  	s = find_stateid_locked(cl, stateid);
>  	if (!s || s->sc_status & NFS4_STID_CLOSED)
>  		goto out_unlock;
> +	if (s->sc_status & NFS4_STID_ADMIN_REVOKED) {
> +		nfsd_drop_revoked_stid(s);
> +		ret = nfs_ok;
> +		goto out;
> +	}
>  	spin_lock(&s->sc_lock);
>  	switch (s->sc_type) {
>  	case NFS4_DELEG_STID:
> @@ -6910,7 +7006,6 @@ nfsd4_free_stateid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
>  		spin_unlock(&cl->cl_lock);
>  		ret = nfsd4_free_lock_stateid(stateid, s);
>  		goto out;
> -	/* Default falls through and returns nfserr_bad_stateid */
>  	}
>  	spin_unlock(&s->sc_lock);
>  out_unlock:
> -- 
> 2.42.1
>
NeilBrown Jan. 19, 2024, 1:41 a.m. UTC | #3
On Mon, 27 Nov 2023, Chuck Lever wrote:
> On Fri, Nov 24, 2023 at 11:28:42AM +1100, NeilBrown wrote:
> > For NFSv4.1 and later the client easily discovers if there is any
> > admin-revoked state and will then find and explicitly free it.
> > 
> > For NFSv4.0 there is no such mechanism.  The client can only find that
> > state is admin-revoked if it tries to use that state, and there is no
> > way for it to explicitly free the state.  So the server must hold on to
> > the stateid (at least) for an indefinite amount of time.  A
> > RELEASE_LOCKOWNER request might justify forgetting some of these
> > stateids, as would the whole clients lease lapsing, but these are not
> > reliable.
> 
> They aren't reliable, but what are the consequences of revoked
> state left on the server? Seems like our implementation has a
> number of mechanisms for cleaning up state over time. Do you feel
> this is a denial-of-service vector?

The consequence of revoked state being left on the server is only the
memory usage (and possible associated costs of indexing a data structure
that is larger than it needs to be).  The only existing mechanisms that
will clean it up are the cleanup when a client expires and when the
server exits.  These may not happen for years.

We might expect admin-revoke to be used rarely, but such expectations
are not reliable.  So the number of revoked states that accumulate could
grow without bound - unlikely though that is.

I don't think it is a denial-of-service "attack" vector as only the
admin can initiate the admin-revocation.  But an admin could unwittingly
bring denial of service upon themselves by revoking state repeatedly
over an extended period of time - if we did not proactively clean up old
revoked state.

> 
> 
> > This patch takes two approaches.
> > 
> > Whenever a client uses an revoked stateid, that stateid is then
> > discarded and will not be recognised again.  This might confuse a client
> > which expect to get NFS4ERR_ADMIN_REVOKED consistently once it get it at
> > all, but should mostly work.  Hopefully one error will lead to other
> > resources being closed (e.g.  process exits), which will result in more
> > stateid being freed when a CLOSE attempt gets NFS4ERR_ADMIN_REVOKED.
> 
> I'm leery of this: "This might confuse..." and "Hopefully..." suggest
> we're not real sure how this will behave in practice with the current
> cohort of client implementations.

It is true - we are not really sure.  There are many reasons why we have
NFSv4.1 and I suspect this is one of them.
I would be happy to only support admin-revoke for v4.1 and later and put
v4.0 in the "too hard" basket.  But I think this "best effort" without
strong guarantees is better than not supporting it at all.

> 
> Also, this paragraph in Section 10.2.1 of RFC 7530 is concerning:
> 
> >  A client normally finds out about revocation of a delegation when it
> >  uses a stateid associated with a delegation and receives one of the
> >  errors NFS4ERR_EXPIRED, NFS4ERR_BAD_STATEID, or NFS4ERR_ADMIN_REVOKED
> >  (NFS4ERR_EXPIRED indicates that all lock state associated with the
> >  client has been lost).  It also may find out about delegation
> >  revocation after a client reboot when it attempts to reclaim a
> >  delegation and receives NFS4ERR_EXPIRED.  Note that in the case of a
> >  revoked OPEN_DELEGATE_WRITE delegation, there are issues because data
> >  may have been modified by the client whose delegation is revoked and,
> >  separately, by other clients.  See Section 10.5.1 for a discussion of
> >  such issues.  Note also that when delegations are revoked,
> >  information about the revoked delegation will be written by the
> >  server to stable storage (as described in Section 9.6).  This is done
> >  to deal with the case in which a server reboots after revoking a
> >  delegation but before the client holding the revoked delegation is
> >  notified about the revocation.
> 
> The text here suggests that the server persists the ADMIN_REVOKED
> status, which suggests to me that the server is supposed to continue
> returning ADMIN_REVOKED when presented with the revoked state,
> until the state is freed.

I agree - but when can the state be freed?  NFSv4.0 has no mechanism to
do this.  So the text suggests that the server is supposed to continue
returning ADMIN_REVOKED when presented with the revoked state
indefinitely.

We could do that.  I just don't feel comfortable storing an indefinite
amount of state for an indefinite amount of time.

> 
> AFAICT NFSD isn't recording this status persistently... Is there a
> plan to add that (later) or some words suggesting that it is safe
> and reasonable not to record it?

I hadn't thought about this...

The expectation (well ...  my expectation) is that state will only be
admin-revoked after the filesystem has been un-exported (and shortly
before it is unmounted).  In this case there is no chance for another
client to open the file and violate expectations reasonably held by the
first client.

But people might use the functionality differently to my expectations
(it happens all the time...).

I cannot see that Linux nfsd currently saves any per-state info to
stable storage.  There is only per-client information.
Based on this, section 9.6.3.4.3 Handling Server Edge Conditions
seems to suggest that all attempts to reclaim locks should get
NFS4ERR_NO_GRACE.  But I don't think we do this.

So maybe I can justify the lack of recording admin-revoke state to
stable storage on the basis that the whole "record state to stable
storage" idea is currently ignored by nfsd?

> 
> 
> > Also, any admin-revoked stateids that have been that way for more than
> > one lease time are periodically revoke.
> > 
> > No actual freeing of state happens in this patch.  That will come in
> > future patches which handle the different sorts of revoked state.
> >
> > Signed-off-by: NeilBrown <neilb@suse.de>
> > ---
> >  fs/nfsd/netns.h     |  4 ++
> >  fs/nfsd/nfs4state.c | 97 ++++++++++++++++++++++++++++++++++++++++++++-
> >  2 files changed, 100 insertions(+), 1 deletion(-)
> > 
> > diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
> > index ab303a8b77d5..7458f672b33e 100644
> > --- a/fs/nfsd/netns.h
> > +++ b/fs/nfsd/netns.h
> > @@ -197,6 +197,10 @@ struct nfsd_net {
> >  	atomic_t		nfsd_courtesy_clients;
> >  	struct shrinker		*nfsd_client_shrinker;
> >  	struct work_struct	nfsd_shrinker_work;
> > +
> > +	/* last time an admin-revoke happened for NFSv4.0 */
> > +	time64_t		nfs40_last_revoke;
> > +
> >  };
> >  
> >  /* Simple check to find out if a given net was properly initialized */
> > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > index 52e680235afe..c57f2ff954cb 100644
> > --- a/fs/nfsd/nfs4state.c
> > +++ b/fs/nfsd/nfs4state.c
> > @@ -1724,6 +1724,14 @@ void nfsd4_revoke_states(struct net *net, struct super_block *sb)
> >  				}
> >  				nfs4_put_stid(stid);
> >  				spin_lock(&nn->client_lock);
> > +				if (clp->cl_minorversion == 0)
> > +					/* Allow cleanup after a lease period.
> > +					 * store_release ensures cleanup will
> > +					 * see any newly revoked states if it
> > +					 * sees the time updated.
> > +					 */
> > +					nn->nfs40_last_revoke =
> > +						ktime_get_boottime_seconds();
> >  				goto retry;
> >  			}
> >  		}
> > @@ -4648,6 +4656,39 @@ nfsd4_find_existing_open(struct nfs4_file *fp, struct nfsd4_open *open)
> >  	return ret;
> >  }
> >  
> > +static void nfsd_drop_revoked_stid(struct nfs4_stid *s)
> > +{
> > +	struct nfs4_client *cl = s->sc_client;
> > +
> > +	switch (s->sc_type) {
> > +	default:
> > +		spin_unlock(&cl->cl_lock);
> > +	}
> > +}
> > +
> > +static void nfs40_drop_revoked_stid(struct nfs4_client *cl,
> > +				    stateid_t *stid)
> 
> Nits: I'd prefer nfsd4_drop_revoked_stid() and nfsd40_drop_revoked_stid()
> 

I guess..

nfsd code sometimes uses "nfsd" and sometimes "nfs" and sometimes adds a
"4" and it doesn't seem to be at all consistent.  Standardising on the
longest such form doesn't fill me with joy.  I would prefer to remove
the prefix completely - at least for names local to a file and probably
for names local to the module.

I've made the change you suggest.

Thanks,
NeilBrown
NeilBrown Jan. 19, 2024, 1:43 a.m. UTC | #4
On Mon, 27 Nov 2023, Chuck Lever wrote:

> > +static void nfsd_drop_revoked_stid(struct nfs4_stid *s)
> > +{
> > +	struct nfs4_client *cl = s->sc_client;
> > +
> > +	switch (s->sc_type) {
> > +	default:
> > +		spin_unlock(&cl->cl_lock);
> > +	}
> > +}
> 
> I'm not in love with unlocking cl_lock inside nfsd_drop_revoked_stid,
> but I understand why it's necessary. How about:
> 
> static void nfsd4_drop_revoked_stid_unlock(struct nfs4_client *cl,
> 					   struct nfs4_stid *s)
> 	__releases(&cl->cl_lock)
> {
> 	....
> 

I made it
	__releases(&s->sc_client->cl_lock)

thanks.
NeilBrown
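
For readers unfamiliar with the convention discussed in this exchange,
here is a user-space sketch of a function that is entered with a lock
held and returns with it released.  It is an illustration only: the toy
lock, the struct names and drop_item_unlock() are hypothetical, and
__releases() is defined as an empty macro here because the real
annotation is only meaningful to the kernel's sparse checker.

```c
#include <assert.h>
#include <stdbool.h>

/* In the kernel this is a sparse annotation; here it expands to nothing. */
#ifndef __releases
#define __releases(x)
#endif

/* Hypothetical toy lock that only records whether it is held. */
struct toy_lock {
	bool held;
};

static void toy_acquire(struct toy_lock *l)
{
	assert(!l->held);
	l->held = true;
}

static void toy_release(struct toy_lock *l)
{
	assert(l->held);
	l->held = false;
}

struct owner {
	struct toy_lock lock;
	int nr_items;
};

struct item {
	struct owner *owner;
	bool hashed;
};

/*
 * Caller must hold it->owner->lock.  The lock is released before this
 * function returns; the "_unlock" suffix and the annotation both
 * document that for callers.
 */
static void drop_item_unlock(struct item *it)
	__releases(&it->owner->lock)
{
	struct owner *o = it->owner;

	if (it->hashed) {
		it->hashed = false;
		o->nr_items--;
	}
	toy_release(&o->lock);
	/* Work that must not run under the lock would go here. */
}
```

A caller takes the lock with toy_acquire(&o.lock), calls
drop_item_unlock(&it), and must not release the lock again afterwards.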

Patch

diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
index ab303a8b77d5..7458f672b33e 100644
--- a/fs/nfsd/netns.h
+++ b/fs/nfsd/netns.h
@@ -197,6 +197,10 @@  struct nfsd_net {
 	atomic_t		nfsd_courtesy_clients;
 	struct shrinker		*nfsd_client_shrinker;
 	struct work_struct	nfsd_shrinker_work;
+
+	/* last time an admin-revoke happened for NFSv4.0 */
+	time64_t		nfs40_last_revoke;
+
 };
 
 /* Simple check to find out if a given net was properly initialized */
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 52e680235afe..c57f2ff954cb 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1724,6 +1724,14 @@  void nfsd4_revoke_states(struct net *net, struct super_block *sb)
 				}
 				nfs4_put_stid(stid);
 				spin_lock(&nn->client_lock);
+				if (clp->cl_minorversion == 0)
+					/* Allow cleanup after a lease period.
+					 * store_release ensures cleanup will
+					 * see any newly revoked states if it
+					 * sees the time updated.
+					 */
+					nn->nfs40_last_revoke =
+						ktime_get_boottime_seconds();
 				goto retry;
 			}
 		}
@@ -4648,6 +4656,39 @@  nfsd4_find_existing_open(struct nfs4_file *fp, struct nfsd4_open *open)
 	return ret;
 }
 
+static void nfsd_drop_revoked_stid(struct nfs4_stid *s)
+{
+	struct nfs4_client *cl = s->sc_client;
+
+	switch (s->sc_type) {
+	default:
+		spin_unlock(&cl->cl_lock);
+	}
+}
+
+static void nfs40_drop_revoked_stid(struct nfs4_client *cl,
+				    stateid_t *stid)
+{
+	/* NFSv4.0 has no way for the client to tell the server
+	 * that it can forget an admin-revoked stateid.
+	 * So we keep it around until the first time that the
+	 * client uses it, and drop it the first time
+	 * nfserr_admin_revoked is returned.
+	 * For v4.1 and later we wait until explicitly told
+	 * to free the stateid.
+	 */
+	if (cl->cl_minorversion == 0) {
+		struct nfs4_stid *st;
+
+		spin_lock(&cl->cl_lock);
+		st = find_stateid_locked(cl, stid);
+		if (st)
+			nfsd_drop_revoked_stid(st);
+		else
+			spin_unlock(&cl->cl_lock);
+	}
+}
+
 static __be32
 nfsd4_verify_open_stid(struct nfs4_stid *s)
 {
@@ -4670,6 +4711,10 @@  nfsd4_lock_ol_stateid(struct nfs4_ol_stateid *stp)
 
 	mutex_lock_nested(&stp->st_mutex, LOCK_STATEID_MUTEX);
 	ret = nfsd4_verify_open_stid(&stp->st_stid);
+	if (ret == nfserr_admin_revoked)
+		nfs40_drop_revoked_stid(stp->st_stid.sc_client,
+					&stp->st_stid.sc_stateid);
+
 	if (ret != nfs_ok)
 		mutex_unlock(&stp->st_mutex);
 	return ret;
@@ -5253,6 +5298,7 @@  nfs4_check_deleg(struct nfs4_client *cl, struct nfsd4_open *open,
 	}
 	if (deleg->dl_stid.sc_status & NFS4_STID_REVOKED) {
 		nfs4_put_stid(&deleg->dl_stid);
+		nfs40_drop_revoked_stid(cl, &open->op_delegate_stateid);
 		status = nfserr_deleg_revoked;
 		goto out;
 	}
@@ -6251,6 +6297,43 @@  nfs4_process_client_reaplist(struct list_head *reaplist)
 	}
 }
 
+static void nfs40_clean_admin_revoked(struct nfsd_net *nn,
+				      struct laundry_time *lt)
+{
+	struct nfs4_client *clp;
+
+	spin_lock(&nn->client_lock);
+	if (nn->nfs40_last_revoke == 0 ||
+	    nn->nfs40_last_revoke > lt->cutoff) {
+		spin_unlock(&nn->client_lock);
+		return;
+	}
+	nn->nfs40_last_revoke = 0;
+
+retry:
+	list_for_each_entry(clp, &nn->client_lru, cl_lru) {
+		unsigned long id, tmp;
+		struct nfs4_stid *stid;
+
+		if (atomic_read(&clp->cl_admin_revoked) == 0)
+			continue;
+
+		spin_lock(&clp->cl_lock);
+		idr_for_each_entry_ul(&clp->cl_stateids, stid, tmp, id)
+			if (stid->sc_status & NFS4_STID_ADMIN_REVOKED) {
+				refcount_inc(&stid->sc_count);
+				spin_unlock(&nn->client_lock);
+				/* this function drops ->cl_lock */
+				nfsd_drop_revoked_stid(stid);
+				nfs4_put_stid(stid);
+				spin_lock(&nn->client_lock);
+				goto retry;
+			}
+		spin_unlock(&clp->cl_lock);
+	}
+	spin_unlock(&nn->client_lock);
+}
+
 static time64_t
 nfs4_laundromat(struct nfsd_net *nn)
 {
@@ -6284,6 +6367,8 @@  nfs4_laundromat(struct nfsd_net *nn)
 	nfs4_get_client_reaplist(nn, &reaplist, &lt);
 	nfs4_process_client_reaplist(&reaplist);
 
+	nfs40_clean_admin_revoked(nn, &lt);
+
 	spin_lock(&state_lock);
 	list_for_each_safe(pos, next, &nn->del_recall_lru) {
 		dp = list_entry (pos, struct nfs4_delegation, dl_recall_lru);
@@ -6502,6 +6587,9 @@  static __be32 nfsd4_stid_check_stateid_generation(stateid_t *in, struct nfs4_sti
 	if (ret == nfs_ok)
 		ret = check_stateid_generation(in, &s->sc_stateid, has_session);
 	spin_unlock(&s->sc_lock);
+	if (ret == nfserr_admin_revoked)
+		nfs40_drop_revoked_stid(s->sc_client,
+					&s->sc_stateid);
 	return ret;
 }
 
@@ -6546,6 +6634,8 @@  static __be32 nfsd4_validate_stateid(struct nfs4_client *cl, stateid_t *stateid)
 	}
 out_unlock:
 	spin_unlock(&cl->cl_lock);
+	if (status == nfserr_admin_revoked)
+		nfs40_drop_revoked_stid(cl, stateid);
 	return status;
 }
 
@@ -6592,6 +6682,7 @@  nfsd4_lookup_stateid(struct nfsd4_compound_state *cstate,
 		return nfserr_deleg_revoked;
 	}
 	if (stid->sc_type & NFS4_STID_ADMIN_REVOKED) {
+		nfs40_drop_revoked_stid(cstate->clp, stateid);
 		nfs4_put_stid(stid);
 		return nfserr_admin_revoked;
 	}
@@ -6884,6 +6975,11 @@  nfsd4_free_stateid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 	s = find_stateid_locked(cl, stateid);
 	if (!s || s->sc_status & NFS4_STID_CLOSED)
 		goto out_unlock;
+	if (s->sc_status & NFS4_STID_ADMIN_REVOKED) {
+		nfsd_drop_revoked_stid(s);
+		ret = nfs_ok;
+		goto out;
+	}
 	spin_lock(&s->sc_lock);
 	switch (s->sc_type) {
 	case NFS4_DELEG_STID:
@@ -6910,7 +7006,6 @@  nfsd4_free_stateid(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
 		spin_unlock(&cl->cl_lock);
 		ret = nfsd4_free_lock_stateid(stateid, s);
 		goto out;
-	/* Default falls through and returns nfserr_bad_stateid */
 	}
 	spin_unlock(&s->sc_lock);
 out_unlock: