[v4,5/8] NFSD check stateids against copy stateids

Message ID	20190708192352.12614-6-olga.kornievskaia@gmail.com (mailing list archive)
State	New, archived
Headers	Return-Path: <linux-nfs-owner@kernel.org>
	From: Olga Kornievskaia <olga.kornievskaia@gmail.com>
	To: bfields@redhat.com
	Cc: linux-nfs@vger.kernel.org
	Subject: [PATCH v4 5/8] NFSD check stateids against copy stateids
	Date: Mon, 8 Jul 2019 15:23:49 -0400
	Message-Id: <20190708192352.12614-6-olga.kornievskaia@gmail.com>
	In-Reply-To: <20190708192352.12614-1-olga.kornievskaia@gmail.com>
	References: <20190708192352.12614-1-olga.kornievskaia@gmail.com>
	Sender: linux-nfs-owner@vger.kernel.org
	Precedence: bulk
Series	server-side support for "inter" SSC copy
	[v4,0/8] server-side support for "inter" SSC copy
	[v4,1/8] NFSD fill-in netloc4 structure
	[v4,2/8] NFSD add ca_source_server<> to COPY
	[v4,3/8] NFSD return nfs4_stid in nfs4_preprocess_stateid_op
	[v4,4/8] NFSD add COPY_NOTIFY operation
	[v4,5/8] NFSD check stateids against copy stateids
	[v4,6/8] NFSD generalize nfsd4_compound_state flag names
	[v4,7/8] NFSD: allow inter server COPY to have a STALE source server fh
	[v4,8/8] NFSD add nfs4 inter ssc to nfsd4_copy

Olga Kornievskaia July 8, 2019, 7:23 p.m. UTC

Incoming stateid (used by a READ) could be a saved copy stateid.
On first use make it active and check that the copy has started
within the allowable lease time.

Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
---
 fs/nfsd/nfs4state.c | 45 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 45 insertions(+)

J. Bruce Fields July 19, 2019, 10:01 p.m. UTC | #1

On Mon, Jul 08, 2019 at 03:23:49PM -0400, Olga Kornievskaia wrote:
> Incoming stateid (used by a READ) could be a saved copy stateid.
> On first use make it active and check that the copy has started
> within the allowable lease time.
> 
> Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
> ---
>  fs/nfsd/nfs4state.c | 45 +++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 45 insertions(+)
> 
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 2555eb9..b786625 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -5232,6 +5232,49 @@ static __be32 nfsd4_validate_stateid(struct nfs4_client *cl, stateid_t *stateid)
>  
>  	return 0;
>  }
> +/*
> + * A READ from an inter server to server COPY will have a
> + * copy stateid. Return the parent nfs4_stid.
> + */
> +static __be32 _find_cpntf_state(struct nfsd_net *nn, stateid_t *st,
> +		     struct nfs4_cpntf_state **cps)
> +{
> +	struct nfs4_cpntf_state *state = NULL;
> +
> +	if (st->si_opaque.so_clid.cl_id != nn->s2s_cp_cl_id)
> +		return nfserr_bad_stateid;
> +	spin_lock(&nn->s2s_cp_lock);
> +	state = idr_find(&nn->s2s_cp_stateids, st->si_opaque.so_id);
> +	if (state)
> +		refcount_inc(&state->cp_p_stid->sc_count);
> +	spin_unlock(&nn->s2s_cp_lock);
> +	if (!state)
> +		return nfserr_bad_stateid;
> +	*cps = state;
> +	return 0;
> +}
> +
> +static __be32 find_cpntf_state(struct nfsd_net *nn, stateid_t *st,
> +			       struct nfs4_stid **stid)
> +{
> +	__be32 status;
> +	struct nfs4_cpntf_state *cps = NULL;
> +
> +	status = _find_cpntf_state(nn, st, &cps);
> +	if (status)
> +		return status;
> +
> +	/* Did the inter server to server copy start in time? */
> +	if (cps->cp_active == false && !time_after(cps->cp_timeout, jiffies)) {
> +		nfs4_put_stid(cps->cp_p_stid);
> +		return nfserr_partner_no_auth;

I wonder whether instead of checking the time we should instead be
destroying copy stateid's as they expire, so the fact that you were
still able to look up the stateid suggests that it's good.  Or would
that result in returning the wrong error here?  Just curious.

> +	} else
> +		cps->cp_active = true;
> +
> +	*stid = cps->cp_p_stid;

What guarantees that cp_p_stid still points to a valid stateid?  (E.g.
if this is an open stateid that has since been closed.)

--b.

> +
> +	return nfs_ok;
> +}
>  
>  /*
>   * Checks for stateid operations
> @@ -5264,6 +5307,8 @@ static __be32 nfsd4_validate_stateid(struct nfs4_client *cl, stateid_t *stateid)
>  	status = nfsd4_lookup_stateid(cstate, stateid,
>  				NFS4_DELEG_STID|NFS4_OPEN_STID|NFS4_LOCK_STID,
>  				&s, nn);
> +	if (status == nfserr_bad_stateid)
> +		status = find_cpntf_state(nn, stateid, &s);
>  	if (status)
>  		return status;
>  	status = nfsd4_stid_check_stateid_generation(stateid, s,
> -- 
> 1.8.3.1

Olga Kornievskaia July 22, 2019, 8:24 p.m. UTC | #2

On Fri, Jul 19, 2019 at 6:01 PM J. Bruce Fields <bfields@fieldses.org> wrote:
>
> On Mon, Jul 08, 2019 at 03:23:49PM -0400, Olga Kornievskaia wrote:
> > Incoming stateid (used by a READ) could be a saved copy stateid.
> > On first use make it active and check that the copy has started
> > within the allowable lease time.
> >
> > Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
> > ---
> >  fs/nfsd/nfs4state.c | 45 +++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 45 insertions(+)
> >
> > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > index 2555eb9..b786625 100644
> > --- a/fs/nfsd/nfs4state.c
> > +++ b/fs/nfsd/nfs4state.c
> > @@ -5232,6 +5232,49 @@ static __be32 nfsd4_validate_stateid(struct nfs4_client *cl, stateid_t *stateid)
> >
> >       return 0;
> >  }
> > +/*
> > + * A READ from an inter server to server COPY will have a
> > + * copy stateid. Return the parent nfs4_stid.
> > + */
> > +static __be32 _find_cpntf_state(struct nfsd_net *nn, stateid_t *st,
> > +                  struct nfs4_cpntf_state **cps)
> > +{
> > +     struct nfs4_cpntf_state *state = NULL;
> > +
> > +     if (st->si_opaque.so_clid.cl_id != nn->s2s_cp_cl_id)
> > +             return nfserr_bad_stateid;
> > +     spin_lock(&nn->s2s_cp_lock);
> > +     state = idr_find(&nn->s2s_cp_stateids, st->si_opaque.so_id);
> > +     if (state)
> > +             refcount_inc(&state->cp_p_stid->sc_count);
> > +     spin_unlock(&nn->s2s_cp_lock);
> > +     if (!state)
> > +             return nfserr_bad_stateid;
> > +     *cps = state;
> > +     return 0;
> > +}
> > +
> > +static __be32 find_cpntf_state(struct nfsd_net *nn, stateid_t *st,
> > +                            struct nfs4_stid **stid)
> > +{
> > +     __be32 status;
> > +     struct nfs4_cpntf_state *cps = NULL;
> > +
> > +     status = _find_cpntf_state(nn, st, &cps);
> > +     if (status)
> > +             return status;
> > +
> > +     /* Did the inter server to server copy start in time? */
> > +     if (cps->cp_active == false && !time_after(cps->cp_timeout, jiffies)) {
> > +             nfs4_put_stid(cps->cp_p_stid);
> > +             return nfserr_partner_no_auth;
>
> I wonder whether instead of checking the time we should instead be
> destroying copy stateid's as they expire, so the fact that you were
> still able to look up the stateid suggests that it's good.  Or would
> that result in returning the wrong error here?  Just curious.

In order to destroy copy stateids as they expire, we would need some thread
monitoring the copies and removing the expired ones. That seems like
a lot more work than what's currently there. The spec says that the
use of the copy has to start within a certain timeout, and that's what
this is supposed to enforce. If the client takes too long to start the
copy, it'll get an error. I don't think it matters which error code is
returned: BAD_STATEID and PARTNER_NO_AUTH both imply the stateid is bad.
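
A minimal sketch of that start-deadline check as a userspace model, not the
nfsd code itself. The names cpntf_model, tick_after and cpntf_read_allowed
are hypothetical; tick_after is assumed to mirror the wraparound-safe
comparison that the kernel's time_after() performs on jiffies.

```c
#include <stdbool.h>

/* Hypothetical stand-in for nfs4_cpntf_state; only the two fields the
 * check reads are modelled. */
struct cpntf_model {
	bool active;            /* set once the first READ has been accepted */
	unsigned long timeout;  /* tick by which the first READ must arrive */
};

/* Wraparound-safe "a is later than b", the same idea as time_after(a, b). */
static bool tick_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* Returns true if a READ arriving at tick `now` may use the stateid. */
static bool cpntf_read_allowed(struct cpntf_model *cps, unsigned long now)
{
	/* Not yet started and the deadline has been reached: refuse.
	 * The patch returns nfserr_partner_no_auth at this point. */
	if (!cps->active && !tick_after(cps->timeout, now))
		return false;

	cps->active = true;     /* later READs skip the deadline check */
	return true;
}
```

With a deadline of tick 100, a first READ at tick 50 is accepted and marks
the state active, so a later READ at tick 500 still passes; a first READ at
tick 100 or later is refused.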

>
> > +     } else
> > +             cps->cp_active = true;
> > +
> > +     *stid = cps->cp_p_stid;
>
> What guarantees that cp_p_stid still points to a valid stateid?  (E.g.
> if this is an open stateid that has since been closed.)

A copy (or copy_notify) stateid takes a reference on the parent, so
we are guaranteed that the pointer still refers to a valid stateid.

>
> --b.
>
> > +
> > +     return nfs_ok;
> > +}
> >
> >  /*
> >   * Checks for stateid operations
> > @@ -5264,6 +5307,8 @@ static __be32 nfsd4_validate_stateid(struct nfs4_client *cl, stateid_t *stateid)
> >       status = nfsd4_lookup_stateid(cstate, stateid,
> >                               NFS4_DELEG_STID|NFS4_OPEN_STID|NFS4_LOCK_STID,
> >                               &s, nn);
> > +     if (status == nfserr_bad_stateid)
> > +             status = find_cpntf_state(nn, stateid, &s);
> >       if (status)
> >               return status;
> >       status = nfsd4_stid_check_stateid_generation(stateid, s,
> > --
> > 1.8.3.1

J. Bruce Fields July 23, 2019, 8:58 p.m. UTC | #3

On Mon, Jul 22, 2019 at 04:24:08PM -0400, Olga Kornievskaia wrote:
> On Fri, Jul 19, 2019 at 6:01 PM J. Bruce Fields <bfields@fieldses.org> wrote:
> >
> > On Mon, Jul 08, 2019 at 03:23:49PM -0400, Olga Kornievskaia wrote:
> > > Incoming stateid (used by a READ) could be a saved copy stateid.
> > > On first use make it active and check that the copy has started
> > > within the allowable lease time.
> > >
> > > Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
> > > ---
> > >  fs/nfsd/nfs4state.c | 45 +++++++++++++++++++++++++++++++++++++++++++++
> > >  1 file changed, 45 insertions(+)
> > >
> > > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > > index 2555eb9..b786625 100644
> > > --- a/fs/nfsd/nfs4state.c
> > > +++ b/fs/nfsd/nfs4state.c
> > > @@ -5232,6 +5232,49 @@ static __be32 nfsd4_validate_stateid(struct nfs4_client *cl, stateid_t *stateid)
> > >
> > >       return 0;
> > >  }
> > > +/*
> > > + * A READ from an inter server to server COPY will have a
> > > + * copy stateid. Return the parent nfs4_stid.
> > > + */
> > > +static __be32 _find_cpntf_state(struct nfsd_net *nn, stateid_t *st,
> > > +                  struct nfs4_cpntf_state **cps)
> > > +{
> > > +     struct nfs4_cpntf_state *state = NULL;
> > > +
> > > +     if (st->si_opaque.so_clid.cl_id != nn->s2s_cp_cl_id)
> > > +             return nfserr_bad_stateid;
> > > +     spin_lock(&nn->s2s_cp_lock);
> > > +     state = idr_find(&nn->s2s_cp_stateids, st->si_opaque.so_id);
> > > +     if (state)
> > > +             refcount_inc(&state->cp_p_stid->sc_count);
> > > +     spin_unlock(&nn->s2s_cp_lock);
> > > +     if (!state)
> > > +             return nfserr_bad_stateid;
> > > +     *cps = state;
> > > +     return 0;
> > > +}
> > > +
> > > +static __be32 find_cpntf_state(struct nfsd_net *nn, stateid_t *st,
> > > +                            struct nfs4_stid **stid)
> > > +{
> > > +     __be32 status;
> > > +     struct nfs4_cpntf_state *cps = NULL;
> > > +
> > > +     status = _find_cpntf_state(nn, st, &cps);
> > > +     if (status)
> > > +             return status;
> > > +
> > > +     /* Did the inter server to server copy start in time? */
> > > +     if (cps->cp_active == false && !time_after(cps->cp_timeout, jiffies)) {
> > > +             nfs4_put_stid(cps->cp_p_stid);
> > > +             return nfserr_partner_no_auth;
> >
> > I wonder whether instead of checking the time we should instead be
> > destroying copy stateid's as they expire, so the fact that you were
> > still able to look up the stateid suggests that it's good.  Or would
> > that result in returning the wrong error here?  Just curious.
> 
> In order to destroy copy stateids as they expire, we would need some thread
> monitoring the copies and removing the expired ones.

It would be just another thing to do in the laundromat thread.

So when do we free these things?  The only free_cpntf_state() caller I
can find is in nfsd4_offload_cancel, but I think the client only calls
those in case of interrupts or other unusual events.  What about a copy
that terminates normally?

> That seems like
> a lot more work than what's currently there. The spec says that the
> use of the copy has to start within a certain timeout, and that's what
> this is supposed to enforce. If the client takes too long to start the
> copy, it'll get an error. I don't think it matters which error code is
> returned: BAD_STATEID and PARTNER_NO_AUTH both imply the stateid is bad.
> 
> >
> > > +     } else
> > > +             cps->cp_active = true;
> > > +
> > > +     *stid = cps->cp_p_stid;
> >
> > What guarantees that cp_p_stid still points to a valid stateid?  (E.g.
> > if this is an open stateid that has since been closed.)
> 
> A copy (or copy_notify) stateid takes a reference on the parent, so
> we are guaranteed that the pointer still refers to a valid stateid.

I only see a reference count taken when one is looked up, in
find_internal_cpntf_state.  That's too late.
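
A minimal userspace sketch of the ordering being asked for: the dependent
state pins its parent when it is created, so a later close of the parent
cannot free it underneath a lookup. All names here (parent_stid, cpntf,
parent_put and so on) are hypothetical, the plain counter stands in for the
refcount_t sc_count used in the patch, and no locking is modelled.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical model of the parent (open/lock/delegation) stateid. */
struct parent_stid {
	unsigned int refs;
	bool *freed;            /* test hook: set when the object is released */
};

/* Hypothetical model of the copy_notify state that points at it. */
struct cpntf {
	struct parent_stid *parent;
};

static struct parent_stid *parent_alloc(bool *freed)
{
	struct parent_stid *p = malloc(sizeof(*p));

	if (!p)
		return NULL;
	p->refs = 1;            /* reference held by the opener */
	p->freed = freed;
	*freed = false;
	return p;
}

static void parent_put(struct parent_stid *p)
{
	if (--p->refs == 0) {
		*p->freed = true;
		free(p);
	}
}

/* Take the parent reference at creation time, not at lookup time. */
static struct cpntf *cpntf_create(struct parent_stid *p)
{
	struct cpntf *c = malloc(sizeof(*c));

	if (!c)
		return NULL;
	p->refs++;              /* pin the parent for as long as c exists */
	c->parent = p;
	return c;
}

static void cpntf_destroy(struct cpntf *c)
{
	parent_put(c->parent);  /* drop the pin taken in cpntf_create() */
	free(c);
}
```

In this model, dropping the opener's reference after cpntf_create() leaves
the parent alive with one reference; it is released only by cpntf_destroy().
If the increment happened in the lookup path instead, the same close would
free the parent while c->parent still pointed at it.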

--b.

> 
> >
> > --b.
> >
> > > +
> > > +     return nfs_ok;
> > > +}
> > >
> > >  /*
> > >   * Checks for stateid operations
> > > @@ -5264,6 +5307,8 @@ static __be32 nfsd4_validate_stateid(struct nfs4_client *cl, stateid_t *stateid)
> > >       status = nfsd4_lookup_stateid(cstate, stateid,
> > >                               NFS4_DELEG_STID|NFS4_OPEN_STID|NFS4_LOCK_STID,
> > >                               &s, nn);
> > > +     if (status == nfserr_bad_stateid)
> > > +             status = find_cpntf_state(nn, stateid, &s);
> > >       if (status)
> > >               return status;
> > >       status = nfsd4_stid_check_stateid_generation(stateid, s,
> > > --
> > > 1.8.3.1

Olga Kornievskaia July 30, 2019, 4:03 p.m. UTC | #4

On Tue, Jul 23, 2019 at 4:59 PM J. Bruce Fields <bfields@fieldses.org> wrote:
>
> On Mon, Jul 22, 2019 at 04:24:08PM -0400, Olga Kornievskaia wrote:
> > On Fri, Jul 19, 2019 at 6:01 PM J. Bruce Fields <bfields@fieldses.org> wrote:
> > >
> > > On Mon, Jul 08, 2019 at 03:23:49PM -0400, Olga Kornievskaia wrote:
> > > > Incoming stateid (used by a READ) could be a saved copy stateid.
> > > > On first use make it active and check that the copy has started
> > > > within the allowable lease time.
> > > >
> > > > Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
> > > > ---
> > > >  fs/nfsd/nfs4state.c | 45 +++++++++++++++++++++++++++++++++++++++++++++
> > > >  1 file changed, 45 insertions(+)
> > > >
> > > > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > > > index 2555eb9..b786625 100644
> > > > --- a/fs/nfsd/nfs4state.c
> > > > +++ b/fs/nfsd/nfs4state.c
> > > > @@ -5232,6 +5232,49 @@ static __be32 nfsd4_validate_stateid(struct nfs4_client *cl, stateid_t *stateid)
> > > >
> > > >       return 0;
> > > >  }
> > > > +/*
> > > > + * A READ from an inter server to server COPY will have a
> > > > + * copy stateid. Return the parent nfs4_stid.
> > > > + */
> > > > +static __be32 _find_cpntf_state(struct nfsd_net *nn, stateid_t *st,
> > > > +                  struct nfs4_cpntf_state **cps)
> > > > +{
> > > > +     struct nfs4_cpntf_state *state = NULL;
> > > > +
> > > > +     if (st->si_opaque.so_clid.cl_id != nn->s2s_cp_cl_id)
> > > > +             return nfserr_bad_stateid;
> > > > +     spin_lock(&nn->s2s_cp_lock);
> > > > +     state = idr_find(&nn->s2s_cp_stateids, st->si_opaque.so_id);
> > > > +     if (state)
> > > > +             refcount_inc(&state->cp_p_stid->sc_count);
> > > > +     spin_unlock(&nn->s2s_cp_lock);
> > > > +     if (!state)
> > > > +             return nfserr_bad_stateid;
> > > > +     *cps = state;
> > > > +     return 0;
> > > > +}
> > > > +
> > > > +static __be32 find_cpntf_state(struct nfsd_net *nn, stateid_t *st,
> > > > +                            struct nfs4_stid **stid)
> > > > +{
> > > > +     __be32 status;
> > > > +     struct nfs4_cpntf_state *cps = NULL;
> > > > +
> > > > +     status = _find_cpntf_state(nn, st, &cps);
> > > > +     if (status)
> > > > +             return status;
> > > > +
> > > > +     /* Did the inter server to server copy start in time? */
> > > > +     if (cps->cp_active == false && !time_after(cps->cp_timeout, jiffies)) {
> > > > +             nfs4_put_stid(cps->cp_p_stid);
> > > > +             return nfserr_partner_no_auth;
> > >
> > > I wonder whether instead of checking the time we should instead be
> > > destroying copy stateid's as they expire, so the fact that you were
> > > still able to look up the stateid suggests that it's good.  Or would
> > > that result in returning the wrong error here?  Just curious.
> >
> > In order to destroy copy stateids as they expire, we would need some thread
> > monitoring the copies and removing the expired ones.
>
> It would be just another thing to do in the laundromat thread.

This still seems simpler. Wouldn't you need to traverse the list and do more
work? What's the advantage of the laundromat over this? Given that the
laundromat thread doesn't run all the time, there might still be a gap
between its last run and its next run in which a stateid expires.

>
> So when do we free these things?  The only free_cpntf_state() caller I
> can find is in nfsd4_offload_cancel,

There is a caller in nfs4_put_stid. Copy notify state is freed
when the associated stateid goes away.

> but I think the client only calls
> those in case of interrupts or other unusual events.  What about a copy
> that terminates normally?

At this point, are you asking about a copy state or a copy_notify
state? When the copy is done, the destination server will free
the copy state. However, the source server doesn't keep track of when the
destination server is done with the copy (I don't think we want to do that;
storing how much has been read and the state of the file seems
unnecessary).

>
> > That seems like
> > a lot more work than what's currently there. The spec says that the
> > use of the copy has to start within a certain timeout, and that's what
> > this is supposed to enforce. If the client takes too long to start the
> > copy, it'll get an error. I don't think it matters which error code is
> > returned: BAD_STATEID and PARTNER_NO_AUTH both imply the stateid is bad.
> >
> > >
> > > > +     } else
> > > > +             cps->cp_active = true;
> > > > +
> > > > +     *stid = cps->cp_p_stid;
> > >
> > > What guarantees that cp_p_stid still points to a valid stateid?  (E.g.
> > > if this is an open stateid that has since been closed.)
> >
> > A copy (or copy_notify) stateid takes a reference on the parent, so
> > we are guaranteed that the pointer still refers to a valid stateid.
>
> I only see a reference count taken when one is looked up, in
> find_internal_cpntf_state.  That's too late.

Hm, right so this is tricky. With copy_notify, if I were to take a
reference on the parent when copy_notify is processed, there is no way
to free this reference because the source server never knows when the
copy was done.



>
> --b.
>
> >
> > >
> > > --b.
> > >
> > > > +
> > > > +     return nfs_ok;
> > > > +}
> > > >
> > > >  /*
> > > >   * Checks for stateid operations
> > > > @@ -5264,6 +5307,8 @@ static __be32 nfsd4_validate_stateid(struct nfs4_client *cl, stateid_t *stateid)
> > > >       status = nfsd4_lookup_stateid(cstate, stateid,
> > > >                               NFS4_DELEG_STID|NFS4_OPEN_STID|NFS4_LOCK_STID,
> > > >                               &s, nn);
> > > > +     if (status == nfserr_bad_stateid)
> > > > +             status = find_cpntf_state(nn, stateid, &s);
> > > >       if (status)
> > > >               return status;
> > > >       status = nfsd4_stid_check_stateid_generation(stateid, s,
> > > > --
> > > > 1.8.3.1

Olga Kornievskaia July 31, 2019, 9:10 p.m. UTC | #5

On Tue, Jul 30, 2019 at 12:03 PM Olga Kornievskaia
<olga.kornievskaia@gmail.com> wrote:
>
> On Tue, Jul 23, 2019 at 4:59 PM J. Bruce Fields <bfields@fieldses.org> wrote:
> >
> > On Mon, Jul 22, 2019 at 04:24:08PM -0400, Olga Kornievskaia wrote:
> > > On Fri, Jul 19, 2019 at 6:01 PM J. Bruce Fields <bfields@fieldses.org> wrote:
> > > >
> > > > On Mon, Jul 08, 2019 at 03:23:49PM -0400, Olga Kornievskaia wrote:
> > > > > Incoming stateid (used by a READ) could be a saved copy stateid.
> > > > > On first use make it active and check that the copy has started
> > > > > within the allowable lease time.
> > > > >
> > > > > Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
> > > > > ---
> > > > >  fs/nfsd/nfs4state.c | 45 +++++++++++++++++++++++++++++++++++++++++++++
> > > > >  1 file changed, 45 insertions(+)
> > > > >
> > > > > diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> > > > > index 2555eb9..b786625 100644
> > > > > --- a/fs/nfsd/nfs4state.c
> > > > > +++ b/fs/nfsd/nfs4state.c
> > > > > @@ -5232,6 +5232,49 @@ static __be32 nfsd4_validate_stateid(struct nfs4_client *cl, stateid_t *stateid)
> > > > >
> > > > >       return 0;
> > > > >  }
> > > > > +/*
> > > > > + * A READ from an inter server to server COPY will have a
> > > > > + * copy stateid. Return the parent nfs4_stid.
> > > > > + */
> > > > > +static __be32 _find_cpntf_state(struct nfsd_net *nn, stateid_t *st,
> > > > > +                  struct nfs4_cpntf_state **cps)
> > > > > +{
> > > > > +     struct nfs4_cpntf_state *state = NULL;
> > > > > +
> > > > > +     if (st->si_opaque.so_clid.cl_id != nn->s2s_cp_cl_id)
> > > > > +             return nfserr_bad_stateid;
> > > > > +     spin_lock(&nn->s2s_cp_lock);
> > > > > +     state = idr_find(&nn->s2s_cp_stateids, st->si_opaque.so_id);
> > > > > +     if (state)
> > > > > +             refcount_inc(&state->cp_p_stid->sc_count);
> > > > > +     spin_unlock(&nn->s2s_cp_lock);
> > > > > +     if (!state)
> > > > > +             return nfserr_bad_stateid;
> > > > > +     *cps = state;
> > > > > +     return 0;
> > > > > +}
> > > > > +
> > > > > +static __be32 find_cpntf_state(struct nfsd_net *nn, stateid_t *st,
> > > > > +                            struct nfs4_stid **stid)
> > > > > +{
> > > > > +     __be32 status;
> > > > > +     struct nfs4_cpntf_state *cps = NULL;
> > > > > +
> > > > > +     status = _find_cpntf_state(nn, st, &cps);
> > > > > +     if (status)
> > > > > +             return status;
> > > > > +
> > > > > +     /* Did the inter server to server copy start in time? */
> > > > > +     if (cps->cp_active == false && !time_after(cps->cp_timeout, jiffies)) {
> > > > > +             nfs4_put_stid(cps->cp_p_stid);
> > > > > +             return nfserr_partner_no_auth;
> > > >
> > > > I wonder whether instead of checking the time we should instead be
> > > > destroying copy stateid's as they expire, so the fact that you were
> > > > still able to look up the stateid suggests that it's good.  Or would
> > > > that result in returning the wrong error here?  Just curious.
> > >
> > > In order to destroy copy stateids as they expire, we would need some thread
> > > monitoring the copies and removing the expired ones.
> >
> > It would be just another thing to do in the laundromat thread.
>
> This still seems simpler. Wouldn't you need to traverse the list and do more
> work? What's the advantage of the laundromat over this? Given that the
> laundromat thread doesn't run all the time, there might still be a gap
> between its last run and its next run in which a stateid expires.
>
> >
> > So when do we free these things?  The only free_cpntf_state() caller I
> > can find is in nfsd4_offload_cancel,
>
> There is a caller in nfs4_put_stid. Copy notify state is freed
> when the associated stateid goes away.
>
> > but I think the client only calls
> > those in case of interrupts or other unusual events.  What about a copy
> > that terminates normally?
>
> At this point, are you asking about a copy state or a copy_notify
> state? When the copy is done, the destination server will free
> the copy state. However, the source server doesn't keep track of when the
> destination server is done with the copy (I don't think we want to do that;
> storing how much has been read and the state of the file seems
> unnecessary).
>
> >
> > > That seems like
> > > a lot more work than what's currently there. The spec says that the
> > > use of the copy has to start within a certain timeout, and that's what
> > > this is supposed to enforce. If the client takes too long to start the
> > > copy, it'll get an error. I don't think it matters which error code is
> > > returned: BAD_STATEID and PARTNER_NO_AUTH both imply the stateid is bad.
> > >
> > > >
> > > > > +     } else
> > > > > +             cps->cp_active = true;
> > > > > +
> > > > > +     *stid = cps->cp_p_stid;
> > > >
> > > > What guarantees that cp_p_stid still points to a valid stateid?  (E.g.
> > > > if this is an open stateid that has since been closed.)
> > >
> > > A copy (or copy_notify) stateid takes a reference on the parent, so
> > > we are guaranteed that the pointer still refers to a valid stateid.
> >
> > I only see a reference count taken when one is looked up, in
> > find_internal_cpntf_state.  That's too late.
>
> Hm, right so this is tricky. With copy_notify, if I were to take a
> reference on the parent when copy_notify is processed, there is no way
> to free this reference because the source server never knows when the
> copy was done.

I'm having difficulty with this patch because there is no good way to
know when the copy_notify stateid can be freed. What I can propose is
to have the linux client send a FREE_STATEID with the copy_notify
stateid and use that as the trigger to free the state. In that case,
I'll keep a reference on the parent until the FREE_STATEID is
received.
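
A rough userspace sketch of that proposal, assuming a small fixed table in
place of the server's idr: COPY_NOTIFY installs an entry and pins the
parent, and FREE_STATEID is the only event that removes the entry and drops
the pin. The names cpntf_table, cpntf_install and cpntf_free_stateid are
hypothetical, and the -1 return stands in for a bad-stateid error.

```c
#include <stddef.h>

#define CPNTF_SLOTS 8

/* Hypothetical table entry for one copy_notify stateid. */
struct cpntf_slot {
	int in_use;
	unsigned int id;            /* stands in for si_opaque.so_id */
	unsigned int *parent_refs;  /* reference count of the parent stateid */
};

struct cpntf_table {
	struct cpntf_slot slots[CPNTF_SLOTS];
};

/* COPY_NOTIFY: record the stateid and pin the parent.
 * Returns 0 on success, -1 if the table is full. */
static int cpntf_install(struct cpntf_table *t, unsigned int id,
			 unsigned int *parent_refs)
{
	for (size_t i = 0; i < CPNTF_SLOTS; i++) {
		if (!t->slots[i].in_use) {
			t->slots[i].in_use = 1;
			t->slots[i].id = id;
			t->slots[i].parent_refs = parent_refs;
			(*parent_refs)++;   /* held until FREE_STATEID */
			return 0;
		}
	}
	return -1;
}

/* FREE_STATEID: drop the entry and the pin on the parent.
 * Returns 0 on success, -1 if the stateid is unknown. */
static int cpntf_free_stateid(struct cpntf_table *t, unsigned int id)
{
	for (size_t i = 0; i < CPNTF_SLOTS; i++) {
		if (t->slots[i].in_use && t->slots[i].id == id) {
			t->slots[i].in_use = 0;
			(*t->slots[i].parent_refs)--;
			return 0;
		}
	}
	return -1;
}
```

In this model a client that never sends FREE_STATEID leaves both the entry
and the parent reference in place indefinitely.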

This is not in the spec (though it seems like a good idea to tell the
source server it's OK to clean up), so other implementations might not
choose this approach, and we'd still have problems with stateids sticking
around.

Thoughts?

>
>
>
> >
> > --b.
> >
> > >
> > > >
> > > > --b.
> > > >
> > > > > +
> > > > > +     return nfs_ok;
> > > > > +}
> > > > >
> > > > >  /*
> > > > >   * Checks for stateid operations
> > > > > @@ -5264,6 +5307,8 @@ static __be32 nfsd4_validate_stateid(struct nfs4_client *cl, stateid_t *stateid)
> > > > >       status = nfsd4_lookup_stateid(cstate, stateid,
> > > > >                               NFS4_DELEG_STID|NFS4_OPEN_STID|NFS4_LOCK_STID,
> > > > >                               &s, nn);
> > > > > +     if (status == nfserr_bad_stateid)
> > > > > +             status = find_cpntf_state(nn, stateid, &s);
> > > > >       if (status)
> > > > >               return status;
> > > > >       status = nfsd4_stid_check_stateid_generation(stateid, s,
> > > > > --
> > > > > 1.8.3.1

Bruce Fields July 31, 2019, 9:51 p.m. UTC | #6

On Wed, Jul 31, 2019 at 05:10:01PM -0400, Olga Kornievskaia wrote:
> I'm having difficulty with this patch because there is no good way to
> know when the copy_notify stateid can be freed. What I can propose is
> to have the linux client send a FREE_STATEID with the copy_notify
> stateid and use that as the trigger to free the state. In that case,
> I'll keep a reference on the parent until the FREE_STATEID is
> received.
> 
> This is not in the spec (though seems like a good idea to tell the
> source server it's ok to clean up) so other implementations might not
> choose this approach so we'll have problems with stateids sticking
> around.

https://tools.ietf.org/html/rfc7862#page-71

	"If the cnr_lease_time expires while the destination server is
	still reading the source file, the destination server is allowed
	to finish reading the file.  If the cnr_lease_time expires
	before the destination server uses READ or READ_PLUS to begin
	the transfer, the source server can use NFS4ERR_PARTNER_NO_AUTH
	to inform the destination server that the cnr_lease_time has
	expired."

The spec doesn't really define what "is allowed to finish reading the
file" means, but I think the source server should decide somehow whether
the target's done.  And "hasn't sent a read in cnr_lease_time" seems
like a pretty good conservative definition that would be easy to
enforce.  Worst case, if the network goes down for a couple minutes and
the target tries to pick up a copy where it left off, it'll get
PARTNER_NO_AUTH.  I assume that results in the same error being returned
to the client, at which point the client knows that the copy_notify stateid
may have expired and can do what it chooses to recover (like send a
new copy_notify).

The FREE_STATEID might also be a good idea, but I guess we can't count
on it.

Maybe the spec could use some errata to clarify that FREE_STATEID is
allowed on copy_notify stateids, that clients should send it when
they're done, and that servers are allowed to expire copy_notify
stateid's even after their first use.

--b.

Olga Kornievskaia Aug. 1, 2019, 2:12 p.m. UTC | #7

On Wed, Jul 31, 2019 at 5:51 PM J. Bruce Fields <bfields@redhat.com> wrote:
>
> On Wed, Jul 31, 2019 at 05:10:01PM -0400, Olga Kornievskaia wrote:
> > I'm having difficulty with this patch because there is no good way to
> > know when the copy_notify stateid can be freed. What I can propose is
> > to have the linux client send a FREE_STATEID with the copy_notify
> > stateid and use that as the trigger to free the state. In that case,
> > I'll keep a reference on the parent until the FREE_STATEID is
> > received.
> >
> > This is not in the spec (though seems like a good idea to tell the
> > source server it's ok to clean up) so other implementations might not
> > choose this approach so we'll have problems with stateids sticking
> > around.
>
> https://tools.ietf.org/html/rfc7862#page-71
>
>         "If the cnr_lease_time expires while the destination server is
>         still reading the source file, the destination server is allowed
>         to finish reading the file.  If the cnr_lease_time expires
>         before the destination server uses READ or READ_PLUS to begin
>         the transfer, the source server can use NFS4ERR_PARTNER_NO_AUTH
>         to inform the destination server that the cnr_lease_time has
>         expired."
>
> The spec doesn't really define what "is allowed to finish reading the
> file" means, but I think the source server should decide somehow whether
> the target's done.  And "hasn't sent a read in cnr_lease_time" seems
> like a pretty good conservative definition that would be easy to
> enforce.

"hasn't sent a read in cnr_lease_time" is already enforced.

The problem is that when the copy did start in time, it might take an
unknown amount of time to complete. If we require all copies to be done
within a cnr_lease_time, or even some multiple of that, we'll get into
problems when files are large enough or the network is slow enough that it
will make this method unusable.

> Worst case, if the network goes down for a couple minutes and
> the target tries to pick up a copy where it left off, it'll get
> PARTNER_NO_AUTH.  I assume that results in the same error being returned
> to the client, at which point the client knows that the copy_notify stateid
> may have expired and can do what it chooses to recover (like send a
> new copy_notify).

Yes the client recovers but the cost of setting up the source server
to destination is huge so any retries would kill the performance.

>
> The FREE_STATEID might also be a good idea, but I guess we can't count
> on it.
>
> Maybe the spec could use some errata to clarify that FREE_STATEID is
> allowed on copy_notify stateids, that clients should send it when
> they're done, and that servers are allowed to expire copy_notify
> stateid's even after their first use.

FREE_STATEID is for a stateid, which a copy_notify (or copy) stateid is,
so I don't see anything that really needs any extra stating. I think
what's needed is specifying that for COPY_NOTIFY a client must do a
FREE_STATEID when it's done with a stateid.

>
> --b.

J. Bruce Fields Aug. 1, 2019, 3:12 p.m. UTC | #8

On Thu, Aug 01, 2019 at 10:12:11AM -0400, Olga Kornievskaia wrote:
> On Wed, Jul 31, 2019 at 5:51 PM J. Bruce Fields <bfields@redhat.com> wrote:
> >
> > On Wed, Jul 31, 2019 at 05:10:01PM -0400, Olga Kornievskaia wrote:
> > > I'm having difficulty with this patch because there is no good way to
> > > know when the copy_notify stateid can be freed. What I can propose is
> > > to have the linux client send a FREE_STATEID with the copy_notify
> > > stateid and use that as the trigger to free the state. In that case,
> > > I'll keep a reference on the parent until the FREE_STATEID is
> > > received.
> > >
> > > This is not in the spec (though seems like a good idea to tell the
> > > source server it's ok to clean up) so other implementations might not
> > > choose this approach so we'll have problems with stateids sticking
> > > around.
> >
> > https://tools.ietf.org/html/rfc7862#page-71
> >
> >         "If the cnr_lease_time expires while the destination server is
> >         still reading the source file, the destination server is allowed
> >         to finish reading the file.  If the cnr_lease_time expires
> >         before the destination server uses READ or READ_PLUS to begin
> >         the transfer, the source server can use NFS4ERR_PARTNER_NO_AUTH
> >         to inform the destination server that the cnr_lease_time has
> >         expired."
> >
> > The spec doesn't really define what "is allowed to finish reading the
> > file" means, but I think the source server should decide somehow whether
> > the target's done.  And "hasn't sent a read in cnr_lease_time" seems
> > like a pretty good conservative definition that would be easy to
> > enforce.
> 
> "hasn't send a read in cnr_lease_time" is already enforced.
> 
> The problem is when the copy did start in normal time, it might take
> unknown time to complete. If we limit copies to all be done with in a
> cnr_lease_time or even some number of that, we'll get into problems
> when files are large enough or network is slow enough that it will
> make this method unusable.

No, I'm just suggesting that if it's been more than cnr_lease_time since
the target server last sent a read using this stateid, then we could
free the stateid.
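
A minimal sketch of that rule, for illustration only. The structure and
function names below (cpn_state, cpn_touch, cpn_idle_expired) are
hypothetical and are not taken from the nfsd code; the only idea borrowed
from the thread is "free the stateid once more than cnr_lease_time has
passed since the last READ that used it".

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-in for a copy_notify stateid; not the nfsd structure. */
struct cpn_state {
	time_t last_read;	/* time of the most recent READ using this stateid */
};

/* Record that a READ has just used the stateid. */
static void cpn_touch(struct cpn_state *cps, time_t now)
{
	cps->last_read = now;
}

/* True if more than lease_secs have elapsed since the last READ. */
static bool cpn_idle_expired(const struct cpn_state *cps, time_t now,
			     time_t lease_secs)
{
	return now - cps->last_read > lease_secs;
}
```

Under this sketch a long-running copy never expires as long as each gap
between READs stays within the lease time, which is the distinction from
limiting the total copy duration.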

> > Worst case, if the network goes down for a couple minutes and
> > the target tries to pick up a copy where it left off, it'll get
> > PARTNER_NO_AUTH.  I assume that results in the same error being returned
> > the client, at which point the client knows that the copy_notify stateid
> > may have installed and can do what it chooses to recover (like send a
> > new copy_notify).
> 
> Yes the client recovers but the cost of setting up the source server
> to destination is huge so any retries would kill the performance.

In the rare case when the server goes an entire cnr_lease_time between
reads, the performance hit of recovery won't be an issue.

> > The FREE_STATEID might also be a good idea, but I guess we can't count
> > on it.
> >
> > Maybe the spec could use some errata to clarify that FREE_STATEID is
> > allowed on copy_notify stateids, that clients should send it when
> > they're done, and that servers are allowed to expire copy_notify
> > stateid's even after their first use.
> 
> FREE_STATEID is for a stateid

The discussion of FREE_STATEID in 4.1 says "The FREE_STATEID operation
is used to free a stateid that no longer has any associated locks
(including opens, byte-range locks, delegations, and layouts)."  A
clarification that it can be used for any stateid would be nice.  (Is
that true?  Do we want it for COPY stateid's too?)

--b.

> which a copy_notify (or copy) stateid is so I don't see anything that
> really needs any extra stating.
>
> I think what's needed is specifying that for COPY_NOTIFY a client must
> do a FREE_STATEID when its done with a stateid.

Olga Kornievskaia Aug. 1, 2019, 3:41 p.m. UTC | #9

On Thu, Aug 1, 2019 at 11:13 AM J. Bruce Fields <bfields@fieldses.org> wrote:
>
> On Thu, Aug 01, 2019 at 10:12:11AM -0400, Olga Kornievskaia wrote:
> > On Wed, Jul 31, 2019 at 5:51 PM J. Bruce Fields <bfields@redhat.com> wrote:
> > >
> > > On Wed, Jul 31, 2019 at 05:10:01PM -0400, Olga Kornievskaia wrote:
> > > > I'm having difficulty with this patch because there is no good way to
> > > > know when the copy_notify stateid can be freed. What I can propose is
> > > > to have the linux client send a FREE_STATEID with the copy_notify
> > > > stateid and use that as the trigger to free the state. In that case,
> > > > I'll keep a reference on the parent until the FREE_STATEID is
> > > > received.
> > > >
> > > > This is not in the spec (though seems like a good idea to tell the
> > > > source server it's ok to clean up) so other implementations might not
> > > > choose this approach so we'll have problems with stateids sticking
> > > > around.
> > >
> > > https://tools.ietf.org/html/rfc7862#page-71
> > >
> > >         "If the cnr_lease_time expires while the destination server is
> > >         still reading the source file, the destination server is allowed
> > >         to finish reading the file.  If the cnr_lease_time expires
> > >         before the destination server uses READ or READ_PLUS to begin
> > >         the transfer, the source server can use NFS4ERR_PARTNER_NO_AUTH
> > >         to inform the destination server that the cnr_lease_time has
> > >         expired."
> > >
> > > The spec doesn't really define what "is allowed to finish reading the
> > > file" means, but I think the source server should decide somehow whether
> > > the target's done.  And "hasn't sent a read in cnr_lease_time" seems
> > > like a pretty good conservative definition that would be easy to
> > > enforce.
> >
> > "hasn't send a read in cnr_lease_time" is already enforced.
> >
> > The problem is when the copy did start in normal time, it might take
> > unknown time to complete. If we limit copies to all be done with in a
> > cnr_lease_time or even some number of that, we'll get into problems
> > when files are large enough or network is slow enough that it will
> > make this method unusable.
>
> No, I'm just suggesting that if it's been more than cnr_lease_time since
> the target server last sent a read using this stateid, then we could
> free the stateid.

That's reasonable. Let me do that.

> > > Worst case, if the network goes down for a couple minutes and
> > > the target tries to pick up a copy where it left off, it'll get
> > > PARTNER_NO_AUTH.  I assume that results in the same error being returned
> > > the client, at which point the client knows that the copy_notify stateid
> > > may have installed and can do what it chooses to recover (like send a
> > > new copy_notify).
> >
> > Yes the client recovers but the cost of setting up the source server
> > to destination is huge so any retries would kill the performance.
>
> In the rare case when the server goes an entire cnr_lease_time between
> reads, the performance hit of recovery won't be an issue.
>
> > > The FREE_STATEID might also be a good idea, but I guess we can't count
> > > on it.
> > >
> > > Maybe the spec could use some errata to clarify that FREE_STATEID is
> > > allowed on copy_notify stateids, that clients should send it when
> > > they're done, and that servers are allowed to expire copy_notify
> > > stateid's even after their first use.
> >
> > FREE_STATEID is for a stateid
>
> The discussion of FREE_STATEID in 4.1 says "The FREE_STATEID operation
> is used to free a stateid that no longer has any associated locks
> (including opens, byte-range locks, delegations, and layouts)."  A
> clarification that it can be used for any stateid would be nice.  (Is
> that true?  Do we want it for COPY stateid's too?)

We don't need it for the COPY stateids, as there is an OFFLOAD_CANCEL if
the client wants to stop; otherwise, the destination server has no
problem knowing when to free the copy stateid.

>
> --b.
>
> > which a copy_notify (or copy) stateid is so I don't see anything that
> > really needs any extra stating.
> >
> > I think what's needed is specifying that for COPY_NOTIFY a client must
> > do a FREE_STATEID when its done with a stateid.

Olga Kornievskaia Aug. 1, 2019, 6:06 p.m. UTC | #10

On Thu, Aug 1, 2019 at 11:41 AM Olga Kornievskaia
<olga.kornievskaia@gmail.com> wrote:
>
> On Thu, Aug 1, 2019 at 11:13 AM J. Bruce Fields <bfields@fieldses.org> wrote:
> >
> > On Thu, Aug 01, 2019 at 10:12:11AM -0400, Olga Kornievskaia wrote:
> > > On Wed, Jul 31, 2019 at 5:51 PM J. Bruce Fields <bfields@redhat.com> wrote:
> > > >
> > > > On Wed, Jul 31, 2019 at 05:10:01PM -0400, Olga Kornievskaia wrote:
> > > > > I'm having difficulty with this patch because there is no good way to
> > > > > know when the copy_notify stateid can be freed. What I can propose is
> > > > > to have the linux client send a FREE_STATEID with the copy_notify
> > > > > stateid and use that as the trigger to free the state. In that case,
> > > > > I'll keep a reference on the parent until the FREE_STATEID is
> > > > > received.
> > > > >
> > > > > This is not in the spec (though seems like a good idea to tell the
> > > > > source server it's ok to clean up) so other implementations might not
> > > > > choose this approach so we'll have problems with stateids sticking
> > > > > around.
> > > >
> > > > https://tools.ietf.org/html/rfc7862#page-71
> > > >
> > > >         "If the cnr_lease_time expires while the destination server is
> > > >         still reading the source file, the destination server is allowed
> > > >         to finish reading the file.  If the cnr_lease_time expires
> > > >         before the destination server uses READ or READ_PLUS to begin
> > > >         the transfer, the source server can use NFS4ERR_PARTNER_NO_AUTH
> > > >         to inform the destination server that the cnr_lease_time has
> > > >         expired."
> > > >
> > > > The spec doesn't really define what "is allowed to finish reading the
> > > > file" means, but I think the source server should decide somehow whether
> > > > the target's done.  And "hasn't sent a read in cnr_lease_time" seems
> > > > like a pretty good conservative definition that would be easy to
> > > > enforce.
> > >
> > > "hasn't send a read in cnr_lease_time" is already enforced.
> > >
> > > The problem is when the copy did start in normal time, it might take
> > > unknown time to complete. If we limit copies to all be done with in a
> > > cnr_lease_time or even some number of that, we'll get into problems
> > > when files are large enough or network is slow enough that it will
> > > make this method unusable.
> >
> > No, I'm just suggesting that if it's been more than cnr_lease_time since
> > the target server last sent a read using this stateid, then we could
> > free the stateid.
>
> That's reasonable. Let me do that.

Now that I need a global list for the copy_notify stateids, do you
have a preference for keeping it off the nfs4_client structure or the
nfsd_net structure? I store async copies under the nfs4_client
structure, but the laundromat traverses things in the nfsd_net structure.

>
> > > > Worst case, if the network goes down for a couple minutes and
> > > > the target tries to pick up a copy where it left off, it'll get
> > > > PARTNER_NO_AUTH.  I assume that results in the same error being returned
> > > > the client, at which point the client knows that the copy_notify stateid
> > > > may have installed and can do what it chooses to recover (like send a
> > > > new copy_notify).
> > >
> > > Yes the client recovers but the cost of setting up the source server
> > > to destination is huge so any retries would kill the performance.
> >
> > In the rare case when the server goes an entire cnr_lease_time between
> > reads, the performance hit of recovery won't be an issue.
> >
> > > > The FREE_STATEID might also be a good idea, but I guess we can't count
> > > > on it.
> > > >
> > > > Maybe the spec could use some errata to clarify that FREE_STATEID is
> > > > allowed on copy_notify stateids, that clients should send it when
> > > > they're done, and that servers are allowed to expire copy_notify
> > > > stateid's even after their first use.
> > >
> > > FREE_STATEID is for a stateid
> >
> > The discussion of FREE_STATEID in 4.1 says "The FREE_STATEID operation
> > is used to free a stateid that no longer has any associated locks
> > (including opens, byte-range locks, delegations, and layouts)."  A
> > clarification that it can be used for any stateid would be nice.  (Is
> > that true?  Do we want it for COPY stateid's too?)
>
> We don't need it for the COPY stateids as there is a OFFLOAD_CANCEL if
> the client wants to stop, otherwise, the destination server has no
> problems with knowing when to free the copy stateid.
>
> >
> > --b.
> >
> > > which a copy_notify (or copy) stateid is so I don't see anything that
> > > really needs any extra stating.
> > >
> > > I think what's needed is specifying that for COPY_NOTIFY a client must
> > > do a FREE_STATEID when its done with a stateid.

J. Bruce Fields Aug. 1, 2019, 6:11 p.m. UTC | #11

On Thu, Aug 01, 2019 at 02:06:46PM -0400, Olga Kornievskaia wrote:
> On Thu, Aug 1, 2019 at 11:41 AM Olga Kornievskaia
> <olga.kornievskaia@gmail.com> wrote:
> >
> > On Thu, Aug 1, 2019 at 11:13 AM J. Bruce Fields <bfields@fieldses.org> wrote:
> > >
> > > On Thu, Aug 01, 2019 at 10:12:11AM -0400, Olga Kornievskaia wrote:
> > > > On Wed, Jul 31, 2019 at 5:51 PM J. Bruce Fields <bfields@redhat.com> wrote:
> > > > >
> > > > > On Wed, Jul 31, 2019 at 05:10:01PM -0400, Olga Kornievskaia wrote:
> > > > > > I'm having difficulty with this patch because there is no good way to
> > > > > > know when the copy_notify stateid can be freed. What I can propose is
> > > > > > to have the linux client send a FREE_STATEID with the copy_notify
> > > > > > stateid and use that as the trigger to free the state. In that case,
> > > > > > I'll keep a reference on the parent until the FREE_STATEID is
> > > > > > received.
> > > > > >
> > > > > > This is not in the spec (though seems like a good idea to tell the
> > > > > > source server it's ok to clean up) so other implementations might not
> > > > > > choose this approach so we'll have problems with stateids sticking
> > > > > > around.
> > > > >
> > > > > https://tools.ietf.org/html/rfc7862#page-71
> > > > >
> > > > >         "If the cnr_lease_time expires while the destination server is
> > > > >         still reading the source file, the destination server is allowed
> > > > >         to finish reading the file.  If the cnr_lease_time expires
> > > > >         before the destination server uses READ or READ_PLUS to begin
> > > > >         the transfer, the source server can use NFS4ERR_PARTNER_NO_AUTH
> > > > >         to inform the destination server that the cnr_lease_time has
> > > > >         expired."
> > > > >
> > > > > The spec doesn't really define what "is allowed to finish reading the
> > > > > file" means, but I think the source server should decide somehow whether
> > > > > the target's done.  And "hasn't sent a read in cnr_lease_time" seems
> > > > > like a pretty good conservative definition that would be easy to
> > > > > enforce.
> > > >
> > > > "hasn't send a read in cnr_lease_time" is already enforced.
> > > >
> > > > The problem is when the copy did start in normal time, it might take
> > > > unknown time to complete. If we limit copies to all be done with in a
> > > > cnr_lease_time or even some number of that, we'll get into problems
> > > > when files are large enough or network is slow enough that it will
> > > > make this method unusable.
> > >
> > > No, I'm just suggesting that if it's been more than cnr_lease_time since
> > > the target server last sent a read using this stateid, then we could
> > > free the stateid.
> >
> > That's reasonable. Let me do that.
> 
> Now that I need a global list for the copy_notify stateids, do you
> have a preference for either to keep it of the nfs4_client structure
> or the nfsd_net structure? I store async copies under the nfs4_client
> structure but the laundromat traverses things in nfsd_net structure.

If copy_notify stateids are associated with a client, then they must
already be reachable from the client somehow so they can be destroyed at
the time the client is, right?  I'm saying that without looking at the
code....

--b.

Olga Kornievskaia Aug. 1, 2019, 6:24 p.m. UTC | #12

On Thu, Aug 1, 2019 at 2:12 PM J. Bruce Fields <bfields@fieldses.org> wrote:
>
> On Thu, Aug 01, 2019 at 02:06:46PM -0400, Olga Kornievskaia wrote:
> > On Thu, Aug 1, 2019 at 11:41 AM Olga Kornievskaia
> > <olga.kornievskaia@gmail.com> wrote:
> > >
> > > On Thu, Aug 1, 2019 at 11:13 AM J. Bruce Fields <bfields@fieldses.org> wrote:
> > > >
> > > > On Thu, Aug 01, 2019 at 10:12:11AM -0400, Olga Kornievskaia wrote:
> > > > > On Wed, Jul 31, 2019 at 5:51 PM J. Bruce Fields <bfields@redhat.com> wrote:
> > > > > >
> > > > > > On Wed, Jul 31, 2019 at 05:10:01PM -0400, Olga Kornievskaia wrote:
> > > > > > > I'm having difficulty with this patch because there is no good way to
> > > > > > > know when the copy_notify stateid can be freed. What I can propose is
> > > > > > > to have the linux client send a FREE_STATEID with the copy_notify
> > > > > > > stateid and use that as the trigger to free the state. In that case,
> > > > > > > I'll keep a reference on the parent until the FREE_STATEID is
> > > > > > > received.
> > > > > > >
> > > > > > > This is not in the spec (though seems like a good idea to tell the
> > > > > > > source server it's ok to clean up) so other implementations might not
> > > > > > > choose this approach so we'll have problems with stateids sticking
> > > > > > > around.
> > > > > >
> > > > > > https://tools.ietf.org/html/rfc7862#page-71
> > > > > >
> > > > > >         "If the cnr_lease_time expires while the destination server is
> > > > > >         still reading the source file, the destination server is allowed
> > > > > >         to finish reading the file.  If the cnr_lease_time expires
> > > > > >         before the destination server uses READ or READ_PLUS to begin
> > > > > >         the transfer, the source server can use NFS4ERR_PARTNER_NO_AUTH
> > > > > >         to inform the destination server that the cnr_lease_time has
> > > > > >         expired."
> > > > > >
> > > > > > The spec doesn't really define what "is allowed to finish reading the
> > > > > > file" means, but I think the source server should decide somehow whether
> > > > > > the target's done.  And "hasn't sent a read in cnr_lease_time" seems
> > > > > > like a pretty good conservative definition that would be easy to
> > > > > > enforce.
> > > > >
> > > > > "hasn't send a read in cnr_lease_time" is already enforced.
> > > > >
> > > > > The problem is when the copy did start in normal time, it might take
> > > > > unknown time to complete. If we limit copies to all be done with in a
> > > > > cnr_lease_time or even some number of that, we'll get into problems
> > > > > when files are large enough or network is slow enough that it will
> > > > > make this method unusable.
> > > >
> > > > No, I'm just suggesting that if it's been more than cnr_lease_time since
> > > > the target server last sent a read using this stateid, then we could
> > > > free the stateid.
> > >
> > > That's reasonable. Let me do that.
> >
> > Now that I need a global list for the copy_notify stateids, do you
> > have a preference for either to keep it of the nfs4_client structure
> > or the nfsd_net structure? I store async copies under the nfs4_client
> > structure but the laundromat traverses things in nfsd_net structure.
>
> If copy_notify stateids are associated with a client, then they must
> already be reachable from the client somehow so they can be destroyed at
> the time the client is, right?  I'm saying that without looking at the
> code....

Yes, I agree. But since we are taking a reference on a parent stateid
and the copy_notify state is destroyed at the destruction of the
parent id, we'll never get there (or shouldn't get there). But I
can add something to the client destruction to make sure to delete
anything if it's there.

I was just looking at close_lru and delegation_lru, but I guess those
are not lists of delegation or open stateids but rather some more complex
scheme of not deleting the stateid right away but moving it to
nfs4_ol_stateid and the list on the nfsd_net. Are you looking for
something similar for the copy_notify state, or can I just keep a global
list off the nfs4_client and add to and delete from that (not move
entries for deletion later)?

> --b.

Bruce Fields Aug. 1, 2019, 7:36 p.m. UTC | #13

On Thu, Aug 01, 2019 at 02:24:04PM -0400, Olga Kornievskaia wrote:
> i was just looking at close_lru and delegation_lru but I guess that's
> not a list of delegation or open stateids but rather some complex of
> not deleting the stateid right away but moving it to nfs4_ol_stateid
> and the list on the nfsd_net. Are you looking for something similar
> for the copy_notify state or can I just keep a global list of the
> nfs4_client and add and delete of that (not move to the delete later)?

A global list seems like it should work if the locking's OK.
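
As a rough illustration of what such a list and a periodic sweep might
look like: the names below (cpn_entry, cpn_list, cpn_add, cpn_sweep) are
hypothetical, and locking is deliberately left out, so this is a
single-threaded sketch of the bookkeeping only. Real code would have to
hold whatever lock protects the list while adding, removing, and
sweeping.

```c
#include <stdlib.h>
#include <time.h>

/* Hypothetical list node for one copy_notify state. */
struct cpn_entry {
	struct cpn_entry *next;
	time_t last_read;	/* last time a READ used this state */
};

/* Hypothetical global list head. */
struct cpn_list {
	struct cpn_entry *head;
};

/* Add a new entry at the head of the list; returns NULL on allocation failure. */
static struct cpn_entry *cpn_add(struct cpn_list *l, time_t now)
{
	struct cpn_entry *e = malloc(sizeof(*e));

	if (!e)
		return NULL;
	e->last_read = now;
	e->next = l->head;
	l->head = e;
	return e;
}

/* Unlink and free every entry idle for more than lease_secs; returns the count freed. */
static int cpn_sweep(struct cpn_list *l, time_t now, time_t lease_secs)
{
	struct cpn_entry **pp = &l->head;
	int freed = 0;

	while (*pp) {
		struct cpn_entry *e = *pp;

		if (now - e->last_read > lease_secs) {
			*pp = e->next;	/* unlink before freeing */
			free(e);
			freed++;
		} else {
			pp = &e->next;
		}
	}
	return freed;
}
```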

--b.

Olga Kornievskaia Aug. 7, 2019, 4:02 p.m. UTC | #14

On Thu, Aug 1, 2019 at 3:36 PM J. Bruce Fields <bfields@redhat.com> wrote:
>
> On Thu, Aug 01, 2019 at 02:24:04PM -0400, Olga Kornievskaia wrote:
> > i was just looking at close_lru and delegation_lru but I guess that's
> > not a list of delegation or open stateids but rather some complex of
> > not deleting the stateid right away but moving it to nfs4_ol_stateid
> > and the list on the nfsd_net. Are you looking for something similar
> > for the copy_notify state or can I just keep a global list of the
> > nfs4_client and add and delete of that (not move to the delete later)?
>
> A global list seems like it should work if the locking's OK.

I'm having issues taking a reference on a parent stateid and being
able to clean it up. Let me try to explain.

Since I take a reference on the stateid, during what would have
been the last put (due to, say, a close operation) the stateid isn't
released. Now that stateid is sticking around. I personally would have
liked what would have been a close and release of the stateid to also
release the copy notify state(s) (which was being done before, but
having a reference makes it hard). I want to count the number of copy
notify states, and if removing those num_copies references plus the
normal one would bring the count to 0, then decrement by num_copies
(and the normal -1); but if it's not the last reference then it
shouldn't be decremented.

Now say no fancy logic happens on close, so we have these stateids left
over. What to do on unmount? It will error with err_client_busy since
there are non-zero copy notify states, and only after a lease period
will it release the resources (when the close of the file should have
removed any copy notify state)?

Question: would it be acceptable to do something like this on freeing
of the parent stateid?

@@ -896,8 +931,12 @@ static void block_delegations(struct knfsd_fh *fh)
        might_lock(&clp->cl_lock);

        if (!refcount_dec_and_lock(&s->sc_count, &clp->cl_lock)) {
-               wake_up_all(&close_wq);
-               return;
+               if (!refcount_sub_and_test_checked(s->sc_cp_list_size,
+                               &s->sc_count)) {
+                       refcount_add_checked(s->sc_cp_list_size, &s->sc_count);
+                       wake_up_all(&close_wq);
+                       return;
+               }
        }
        idr_remove(&clp->cl_stateids, s->sc_stateid.si_opaque.so_id);
        spin_unlock(&clp->cl_lock);

and then free the copy notify stateids associated with the stateid.

The laundromat would still be checking the copy_notify stateids for
anything that hasn't been active for a while (but not closed).
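
To make the counting in the hunk above easier to follow, here is a
single-threaded model of the arithmetic only. The names (stid_model,
stid_put) are hypothetical, plain integers stand in for refcount_t, and
the model ignores atomicity and cl_lock entirely, so it says nothing
about whether the proposed hunk is safe under concurrency.

```c
#include <stdbool.h>

/* Hypothetical model of a parent stateid's reference counting. */
struct stid_model {
	unsigned int count;	/* all references, one of them per copy_notify child */
	unsigned int ncopies;	/* number of copy_notify children */
};

/*
 * Drop one reference.  Returns true when the caller should go on to free
 * the stateid (and its copy_notify children), false otherwise.
 */
static bool stid_put(struct stid_model *s)
{
	if (--s->count == 0)
		return true;		/* ordinary last put */

	/* Would the count reach zero if the children's references were dropped? */
	s->count -= s->ncopies;
	if (s->count == 0)
		return true;		/* only children were holding it */

	s->count += s->ncopies;		/* someone else still holds it: undo */
	return false;
}
```

With one ordinary holder and two copy_notify children (count 3), the put
returns true; with two ordinary holders and two children (count 4), it
returns false and leaves the count at 3.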

>
> --b.

J. Bruce Fields Aug. 7, 2019, 4:08 p.m. UTC | #15

On Wed, Aug 07, 2019 at 12:02:40PM -0400, Olga Kornievskaia wrote:
> On Thu, Aug 1, 2019 at 3:36 PM J. Bruce Fields <bfields@redhat.com> wrote:
> >
> > On Thu, Aug 01, 2019 at 02:24:04PM -0400, Olga Kornievskaia wrote:
> > > i was just looking at close_lru and delegation_lru but I guess that's
> > > not a list of delegation or open stateids but rather some complex of
> > > not deleting the stateid right away but moving it to nfs4_ol_stateid
> > > and the list on the nfsd_net. Are you looking for something similar
> > > for the copy_notify state or can I just keep a global list of the
> > > nfs4_client and add and delete of that (not move to the delete later)?
> >
> > A global list seems like it should work if the locking's OK.
> 
> I'm having issues taking a reference on a parent stateid and being
> able to clean it. Let me try to explain.

With other stateid parent relationships I believe what we do is: instead
of the child taking a reference on the parent, we ensure that the child
is destroyed, and that nobody can be holding a pointer to it, before we
destroy the parent.
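
A small sketch of that ordering, under stated assumptions: the types and
helpers below (parent_state, child_state, parent_create, child_create,
parent_destroy) are hypothetical and only show the shape of "children
hold a plain pointer, and the parent tears down every child before it
goes away". It does not show how to guarantee that nobody else still
holds a pointer to a child, which in real code is a locking question.

```c
#include <stdlib.h>

struct parent_state;

/* Child keeps a borrowed pointer to its parent; no reference is taken. */
struct child_state {
	struct child_state *next;
	struct parent_state *parent;
};

struct parent_state {
	struct child_state *children;	/* singly linked list of children */
};

static struct parent_state *parent_create(void)
{
	return calloc(1, sizeof(struct parent_state));
}

/* Allocate a child and link it onto the parent's list. */
static struct child_state *child_create(struct parent_state *p)
{
	struct child_state *c = malloc(sizeof(*c));

	if (!c)
		return NULL;
	c->parent = p;
	c->next = p->children;
	p->children = c;
	return c;
}

/* Free every child first, then the parent; returns the number of children freed. */
static int parent_destroy(struct parent_state *p)
{
	int n = 0;

	while (p->children) {
		struct child_state *c = p->children;

		p->children = c->next;
		free(c);
		n++;
	}
	free(p);
	return n;
}
```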

--b.

> Since I take a reference on the stateid, then during what would have
> been the last put (due to say a close operation), stateid isn't
> released. Now that stateid is sticking around. I personally would have
> liked on what would have been a close and release of the stateid to
> release the copy notify state(s) (which was being done before but
> having a reference makes it hard? i want to count number of copy
> notify states and if then somehow if the num_copies-1 is going to make
> it 0, then decrement by num_copies (and the normal -1) but if it's not
> the last reference then it shouldn't be decremented.
> 
> Now say no fancy logic happens on close so we have these stateids left
> over . What to do on unmount? It will error with err_client_busy since
> there are non-zero copy notify states and only after a lease period it
> will release the resources (when the close of the file should have
> removed any copy notify state)?
> 
> Question: would it be acceptable to do something like this on freeing
> of the parent stateid?
> 
> @@ -896,8 +931,12 @@ static void block_delegations(struct knfsd_fh *fh)
>         might_lock(&clp->cl_lock);
> 
>         if (!refcount_dec_and_lock(&s->sc_count, &clp->cl_lock)) {
> -               wake_up_all(&close_wq);
> -               return;
> +               if (!refcount_sub_and_test_checked(s->sc_cp_list_size,
> +                               &s->sc_count)) {
> +                       refcount_add_checked(s->sc_cp_list_size, &s->sc_count);
> +                       wake_up_all(&close_wq);
> +                       return;
> +               }
>         }
>         idr_remove(&clp->cl_stateids, s->sc_stateid.si_opaque.so_id);
>         spin_unlock(&clp->cl_lock);
> 
> then free the copy notify stateids associated with stateid.
> 
> Laundromat would still be checking the copy_notify stateids for
> anything that's been not active for a while (but not closed).
> 
> >
> > --b.

Olga Kornievskaia Aug. 7, 2019, 4:42 p.m. UTC | #16

On Wed, Aug 7, 2019 at 12:09 PM J. Bruce Fields <bfields@fieldses.org> wrote:
>
> On Wed, Aug 07, 2019 at 12:02:40PM -0400, Olga Kornievskaia wrote:
> > On Thu, Aug 1, 2019 at 3:36 PM J. Bruce Fields <bfields@redhat.com> wrote:
> > >
> > > On Thu, Aug 01, 2019 at 02:24:04PM -0400, Olga Kornievskaia wrote:
> > > > i was just looking at close_lru and delegation_lru but I guess that's
> > > > not a list of delegation or open stateids but rather some complex of
> > > > not deleting the stateid right away but moving it to nfs4_ol_stateid
> > > > and the list on the nfsd_net. Are you looking for something similar
> > > > for the copy_notify state or can I just keep a global list of the
> > > > nfs4_client and add and delete of that (not move to the delete later)?
> > >
> > > A global list seems like it should work if the locking's OK.
> >
> > I'm having issues taking a reference on a parent stateid and being
> > able to clean it. Let me try to explain.
>
> With other stateid parent relationships I believe what we do is: instead
> of the child taking a reference on the parent, we ensure that the child
> is destroyed, and that nobody can be holding a pointer to it, before we
> destroy the parent.

I don't think we can get away with not taking a reference on the
parent. When a READ comes with the copy_notify stateid, it's used to
look up the parent state (because nfs4_preprocess_stateid_op(), which
checks the validity of the stateid for a given operation, needs to
check the validity of that parent stateid). Otherwise, we'd have to
special-case READ in nfs4_preprocess_stateid_op() so that, when it is
called from READ and finds a copy_notify stateid, it forgoes the other
checks. Do you want me to do that instead of what I proposed below?

>
> --b.
>
> > Since I take a reference on the stateid, then during what would have
> > been the last put (due to say a close operation), stateid isn't
> > released. Now that stateid is sticking around. I personally would have
> > liked on what would have been a close and release of the stateid to
> > release the copy notify state(s) (which was being done before but
> > having a reference makes it hard? i want to count number of copy
> > notify states and if then somehow if the num_copies-1 is going to make
> > it 0, then decrement by num_copies (and the normal -1) but if it's not
> > the last reference then it shouldn't be decremented.
> >
> > Now say no fancy logic happens on close so we have these stateids left
> > over . What to do on unmount? It will error with err_client_busy since
> > there are non-zero copy notify states and only after a lease period it
> > will release the resources (when the close of the file should have
> > removed any copy notify state)?
> >
> > Question: would it be acceptable to do something like this on freeing
> > of the parent stateid?
> >
> > @@ -896,8 +931,12 @@ static void block_delegations(struct knfsd_fh *fh)
> >         might_lock(&clp->cl_lock);
> >
> >         if (!refcount_dec_and_lock(&s->sc_count, &clp->cl_lock)) {
> > -               wake_up_all(&close_wq);
> > -               return;
> > +               if (!refcount_sub_and_test_checked(s->sc_cp_list_size,
> > +                               &s->sc_count)) {
> > +                       refcount_add_checked(s->sc_cp_list_size, &s->sc_count);
> > +                       wake_up_all(&close_wq);
> > +                       return;
> > +               }
> >         }
> >         idr_remove(&clp->cl_stateids, s->sc_stateid.si_opaque.so_id);
> >         spin_unlock(&clp->cl_lock);
> >
> > then free the copy notify stateids associated with stateid.
> >
> > Laundromat would still be checking the copy_notify stateids for
> > anything that's been not active for a while (but not closed).
> >
> > >
> > > --b.

J. Bruce Fields Aug. 8, 2019, 11:25 a.m. UTC | #17

On Wed, Aug 07, 2019 at 12:42:08PM -0400, Olga Kornievskaia wrote:
> On Wed, Aug 7, 2019 at 12:09 PM J. Bruce Fields <bfields@fieldses.org> wrote:
> >
> > On Wed, Aug 07, 2019 at 12:02:40PM -0400, Olga Kornievskaia wrote:
> > > On Thu, Aug 1, 2019 at 3:36 PM J. Bruce Fields <bfields@redhat.com> wrote:
> > > >
> > > > On Thu, Aug 01, 2019 at 02:24:04PM -0400, Olga Kornievskaia wrote:
> > > > > i was just looking at close_lru and delegation_lru but I guess that's
> > > > > not a list of delegation or open stateids but rather some complex of
> > > > > not deleting the stateid right away but moving it to nfs4_ol_stateid
> > > > > and the list on the nfsd_net. Are you looking for something similar
> > > > > for the copy_notify state or can I just keep a global list of the
> > > > > nfs4_client and add and delete of that (not move to the delete later)?
> > > >
> > > > A global list seems like it should work if the locking's OK.
> > >
> > > I'm having issues taking a reference on a parent stateid and being
> > > able to clean it. Let me try to explain.
> >
> > With other stateid parent relationships I believe what we do is: instead
> > of the child taking a reference on the parent, we ensure that the child
> > is destroyed, and that nobody can be holding a pointer to it, before we
> > destroy the parent.
> 
> I don't think we can get away from not taking a reference on the
> parent. When a READ comes with the copy_notify stateid, it's used to
> lookup the parent state because the nfs4_preprocess_stateid_op() that
> checks the validity of the stateid for a given operation needs to
> check validity of that parent stateid). Otherwise, we'd have to
> special case the READ calling nfs4_preprocess_stateid_op() and special
> call that function to when called from READ and finding a copy_notify
> stateid will forego the other checks. Do you want me to that instead
> of what I proposed below?

Um, honestly I'm not sure I understand your code below yet.  I'll take
another look....

> > > Since I take a reference on the stateid, during what would have
> > > been the last put (due to, say, a close operation) the stateid isn't
> > > released. Now that stateid is sticking around. I personally would
> > > have liked what would have been the close and release of the stateid
> > > to also release the copy notify state(s)

That's OK with me as long as it works.  Did I complain about it?  The
only real requirement is that we've got *some* way to assure that we
aren't going to find a copy_notify stateid and try to follow it to its
parent, after the parent's been freed.
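
To make that requirement concrete, here is a minimal userspace sketch
of the "destroy the child before the parent" ordering. It is not the
nfsd code: struct cpn_state, struct parent_stid, cpn_table and all of
the helper names are hypothetical, the refcount is a plain int, and
locking is left out (in the kernel every table access would have to
happen under a single lock).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_CPN 8   /* hypothetical size of the global lookup table */

struct parent_stid;

/* Hypothetical stand-in for a copy_notify state. */
struct cpn_state {
	int id;                      /* lookup key, stand-in for the stateid */
	struct parent_stid *parent;  /* plain pointer, no reference held */
	struct cpn_state *next;      /* link in the parent's child list */
};

/* Hypothetical stand-in for the parent (open/lock/delegation) stateid. */
struct parent_stid {
	int refcount;
	struct cpn_state *children;
};

/* Global table of children; a real version needs a lock around it. */
static struct cpn_state *cpn_table[MAX_CPN];

static struct parent_stid *parent_alloc(void)
{
	struct parent_stid *p = calloc(1, sizeof(*p));

	if (p)
		p->refcount = 1;
	return p;
}

/* Create a child, link it to its parent and publish it in the table. */
static struct cpn_state *cpn_add(struct parent_stid *p, int id)
{
	struct cpn_state *c;

	if (id < 0 || id >= MAX_CPN || cpn_table[id])
		return NULL;
	c = calloc(1, sizeof(*c));
	if (!c)
		return NULL;
	c->id = id;
	c->parent = p;
	c->next = p->children;
	p->children = c;
	cpn_table[id] = c;
	return c;
}

/* What a READ carrying a copy_notify stateid would do: child -> parent. */
static struct parent_stid *cpn_lookup_parent(int id)
{
	if (id < 0 || id >= MAX_CPN || !cpn_table[id])
		return NULL;
	return cpn_table[id]->parent;
}

/* Drop one reference; on the last put, unhash and free children first. */
static void parent_put(struct parent_stid *p)
{
	assert(p->refcount > 0);
	if (--p->refcount > 0)
		return;
	while (p->children) {
		struct cpn_state *c = p->children;

		p->children = c->next;
		cpn_table[c->id] = NULL;  /* no lookup can find it any more */
		free(c);
	}
	free(p);
}

/* Returns 0 if no child is findable once the parent is gone. */
int cpn_demo(void)
{
	struct parent_stid *p = parent_alloc();

	if (!p || !cpn_add(p, 1) || !cpn_add(p, 2))
		return -1;
	if (cpn_lookup_parent(1) != p || cpn_lookup_parent(2) != p)
		return -1;
	parent_put(p);  /* e.g. a CLOSE dropping the last reference */
	return (cpn_lookup_parent(1) || cpn_lookup_parent(2)) ? -1 : 0;
}
```

The children never pin the parent here, so the last put behaves as it
would without any copy_notify state. With real concurrency the lookup
would also have to hold the lock, or take its own reference on the
parent under that lock, for as long as it uses the returned pointer.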

--b.

> > > (which was being done before, but having a reference makes it
> > > hard). I want to count the number of copy notify states and, if
> > > subtracting num_copies after the normal -1 would bring the count to
> > > 0, decrement by num_copies as well; but if it's not the last
> > > reference then it shouldn't be decremented.
> > >
> > > Now say no fancy logic happens on close, so we have these stateids
> > > left over. What to do on unmount? It will error with err_client_busy
> > > since there are non-zero copy notify states, and only after a lease
> > > period will it release the resources (when the close of the file
> > > should have removed any copy notify state)?
> > >
> > > Question: would it be acceptable to do something like this on freeing
> > > of the parent stateid?
> > >
> > > @@ -896,8 +931,12 @@ static void block_delegations(struct knfsd_fh *fh)
> > >         might_lock(&clp->cl_lock);
> > >
> > >         if (!refcount_dec_and_lock(&s->sc_count, &clp->cl_lock)) {
> > > -               wake_up_all(&close_wq);
> > > -               return;
> > > +               if (!refcount_sub_and_test_checked(s->sc_cp_list_size,
> > > +                               &s->sc_count)) {
> > > +                       refcount_add_checked(s->sc_cp_list_size, &s->sc_count);
> > > +                       wake_up_all(&close_wq);
> > > +                       return;
> > > +               }
> > >         }
> > >         idr_remove(&clp->cl_stateids, s->sc_stateid.si_opaque.so_id);
> > >         spin_unlock(&clp->cl_lock);
> > >
> > > then free the copy notify stateids associated with stateid.
> > >
> > > The laundromat would still be checking the copy_notify stateids for
> > > anything that has not been active for a while (but not closed).
> > >
> > >
> > >
> > >
> > >
> > > >
> > > > --b.
