mbox series

[00/14] Delegation bugfixes

Message ID 20191023235600.10880-1-trond.myklebust@hammerspace.com (mailing list archive)
Headers show
Series Delegation bugfixes | expand

Message

Trond Myklebust Oct. 23, 2019, 11:55 p.m. UTC
The following patchset fixes up a number of issues with delegations,
but mainly attempts to fix a race condition between open and
delegreturn, where the open and the delegreturn get re-ordered so
that the delegreturn ends up revoking the delegation that was returned
by the open.
The root cause is that in certain circumstances, we may currently end
up freeing the delegation from delegreturn, so when we later receive
the reply to the open, we've lost track of the fact that the seqid
predates the one that was returned.

This patchset fixes that case by ensuring that we always keep track
of the last delegation stateid that was returned for any given inode.
That way, if we later see a delegation stateid with the same opaque
field, but an older seqid, we know we cannot trust it, and so we
ask to replay the OPEN compound.

Trond Myklebust (14):
  NFSv4: Don't allow a cached open with a revoked delegation
  NFSv4: Fix delegation handling in update_open_stateid()
  NFSv4: nfs4_callback_getattr() should ignore revoked delegations
  NFSv4: Delegation recalls should not find revoked delegations
  NFSv4: fail nfs4_refresh_delegation_stateid() when the delegation was
    revoked
  NFS: Rename nfs_inode_return_delegation_noreclaim()
  NFSv4: Don't remove the delegation from the super_list more than once
  NFSv4: Hold the delegation spinlock when updating the seqid
  NFSv4: Clear the NFS_DELEGATION_REVOKED flag in
    nfs_update_inplace_delegation()
  NFSv4: Update the stateid seqid in nfs_revoke_delegation()
  NFSv4: Revoke the delegation on success in nfs4_delegreturn_done()
  NFSv4: Ignore requests to return the delegation if it was revoked
  NFSv4: Don't reclaim delegations that have been returned or revoked
  NFSv4: Fix races between open and delegreturn

 fs/nfs/callback_proc.c |   2 +-
 fs/nfs/delegation.c    | 109 +++++++++++++++++++++++++++++------------
 fs/nfs/delegation.h    |   4 +-
 fs/nfs/nfs4proc.c      |  13 ++---
 fs/nfs/nfs4super.c     |   4 +-
 5 files changed, 88 insertions(+), 44 deletions(-)

Comments

Olga Kornievskaia Oct. 31, 2019, 3:27 p.m. UTC | #1
Hi Trond,

This patch set produces the following in my testing. Basically what I
see the client is prevented from using a delegation at all.

After I induce a race of DELEGRETURN/OPEN
--- the racing OPEN gets a delegation (it returns the same seqid and
other as the delegation being returned) but the client doesn't use it.
--- the following (next) OPEN that also gets a delegation immediately
has the client returning the given delegation.

Disclaimer: in my testing the racing DELEGRETURN doesn't fail with
OLD_STATEID, NetApp returns OK.

On Thu, Oct 24, 2019 at 6:56 AM Trond Myklebust <trondmy@gmail.com> wrote:
>
> The following patchset fixes up a number of issues with delegations,
> but mainly attempts to fix a race condition between open and
> delegreturn, where the open and the delegreturn get re-ordered so
> that the delegreturn ends up revoking the delegation that was returned
> by the open.
> The root cause is that in certain circumstances, we may currently end
> up freeing the delegation from delegreturn, so when we later receive
> the reply to the open, we've lost track of the fact that the seqid
> predates the one that was returned.
>
> This patchset fixes that case by ensuring that we always keep track
> of the last delegation stateid that was returned for any given inode.
> That way, if we later see a delegation stateid with the same opaque
> field, but an older seqid, we know we cannot trust it, and so we
> ask to replay the OPEN compound.
>
> Trond Myklebust (14):
>   NFSv4: Don't allow a cached open with a revoked delegation
>   NFSv4: Fix delegation handling in update_open_stateid()
>   NFSv4: nfs4_callback_getattr() should ignore revoked delegations
>   NFSv4: Delegation recalls should not find revoked delegations
>   NFSv4: fail nfs4_refresh_delegation_stateid() when the delegation was
>     revoked
>   NFS: Rename nfs_inode_return_delegation_noreclaim()
>   NFSv4: Don't remove the delegation from the super_list more than once
>   NFSv4: Hold the delegation spinlock when updating the seqid
>   NFSv4: Clear the NFS_DELEGATION_REVOKED flag in
>     nfs_update_inplace_delegation()
>   NFSv4: Update the stateid seqid in nfs_revoke_delegation()
>   NFSv4: Revoke the delegation on success in nfs4_delegreturn_done()
>   NFSv4: Ignore requests to return the delegation if it was revoked
>   NFSv4: Don't reclaim delegations that have been returned or revoked
>   NFSv4: Fix races between open and delegreturn
>
>  fs/nfs/callback_proc.c |   2 +-
>  fs/nfs/delegation.c    | 109 +++++++++++++++++++++++++++++------------
>  fs/nfs/delegation.h    |   4 +-
>  fs/nfs/nfs4proc.c      |  13 ++---
>  fs/nfs/nfs4super.c     |   4 +-
>  5 files changed, 88 insertions(+), 44 deletions(-)
>
> --
> 2.21.0
>
Olga Kornievskaia Oct. 31, 2019, 3:49 p.m. UTC | #2
On Thu, Oct 31, 2019 at 11:27 AM Olga Kornievskaia <aglo@umich.edu> wrote:
>
> Hi Trond,
>
> This patch set produces the following in my testing. Basically what I
> see the client is prevented from using a delegation at all.
>
> After I induce a race of DELEGRETURN/OPEN
> --- the racing OPEN gets a delegation (it returns the same seqid and
> other as the delegation being returned) but the client doesn't use it.
> --- the following (next) OPEN that also gets a delegation immediately
> has the client returning the given delegation.
>
> Disclaimer: in my testing the racing DELEGRETURN doesn't fail with
> OLD_STATEID, NetApp returns OK.

Testing the same against Linux. It prevents the client from using
future delegation stateid. On the induced DELEGRETURN/OPEN race, the
linux server doesn't give a new read delegation. The following open
gets a read delegation and returns it right away.


> On Thu, Oct 24, 2019 at 6:56 AM Trond Myklebust <trondmy@gmail.com> wrote:
> >
> > The following patchset fixes up a number of issues with delegations,
> > but mainly attempts to fix a race condition between open and
> > delegreturn, where the open and the delegreturn get re-ordered so
> > that the delegreturn ends up revoking the delegation that was returned
> > by the open.
> > The root cause is that in certain circumstances, we may currently end
> > up freeing the delegation from delegreturn, so when we later receive
> > the reply to the open, we've lost track of the fact that the seqid
> > predates the one that was returned.
> >
> > This patchset fixes that case by ensuring that we always keep track
> > of the last delegation stateid that was returned for any given inode.
> > That way, if we later see a delegation stateid with the same opaque
> > field, but an older seqid, we know we cannot trust it, and so we
> > ask to replay the OPEN compound.
> >
> > Trond Myklebust (14):
> >   NFSv4: Don't allow a cached open with a revoked delegation
> >   NFSv4: Fix delegation handling in update_open_stateid()
> >   NFSv4: nfs4_callback_getattr() should ignore revoked delegations
> >   NFSv4: Delegation recalls should not find revoked delegations
> >   NFSv4: fail nfs4_refresh_delegation_stateid() when the delegation was
> >     revoked
> >   NFS: Rename nfs_inode_return_delegation_noreclaim()
> >   NFSv4: Don't remove the delegation from the super_list more than once
> >   NFSv4: Hold the delegation spinlock when updating the seqid
> >   NFSv4: Clear the NFS_DELEGATION_REVOKED flag in
> >     nfs_update_inplace_delegation()
> >   NFSv4: Update the stateid seqid in nfs_revoke_delegation()
> >   NFSv4: Revoke the delegation on success in nfs4_delegreturn_done()
> >   NFSv4: Ignore requests to return the delegation if it was revoked
> >   NFSv4: Don't reclaim delegations that have been returned or revoked
> >   NFSv4: Fix races between open and delegreturn
> >
> >  fs/nfs/callback_proc.c |   2 +-
> >  fs/nfs/delegation.c    | 109 +++++++++++++++++++++++++++++------------
> >  fs/nfs/delegation.h    |   4 +-
> >  fs/nfs/nfs4proc.c      |  13 ++---
> >  fs/nfs/nfs4super.c     |   4 +-
> >  5 files changed, 88 insertions(+), 44 deletions(-)
> >
> > --
> > 2.21.0
> >
Olga Kornievskaia Oct. 31, 2019, 4:15 p.m. UTC | #3
Hi Trond,

Now that I recall the reason the Ontap server gives out the same
delegation was to deal with the linux client behaviour who sends an
open even though it holds a delegation (according to the server's
view). So what server does is it gives out the same delegation. This
patch series changes the semantics. Can you describe what you expect
the server is supposed to do in this case?

On Thu, Oct 31, 2019 at 11:49 AM Olga Kornievskaia <aglo@umich.edu> wrote:
>
> On Thu, Oct 31, 2019 at 11:27 AM Olga Kornievskaia <aglo@umich.edu> wrote:
> >
> > Hi Trond,
> >
> > This patch set produces the following in my testing. Basically what I
> > see the client is prevented from using a delegation at all.
> >
> > After I induce a race of DELEGRETURN/OPEN
> > --- the racing OPEN gets a delegation (it returns the same seqid and
> > other as the delegation being returned) but the client doesn't use it.
> > --- the following (next) OPEN that also gets a delegation immediately
> > has the client returning the given delegation.
> >
> > Disclaimer: in my testing the racing DELEGRETURN doesn't fail with
> > OLD_STATEID, NetApp returns OK.
>
> Testing the same against Linux. It prevents the client from using
> future delegation stateid. On the induced DELEGRETURN/OPEN race, the
> linux server doesn't give a new read delegation. The following open
> gets a read delegation and returns it right away.
>
>
> > On Thu, Oct 24, 2019 at 6:56 AM Trond Myklebust <trondmy@gmail.com> wrote:
> > >
> > > The following patchset fixes up a number of issues with delegations,
> > > but mainly attempts to fix a race condition between open and
> > > delegreturn, where the open and the delegreturn get re-ordered so
> > > that the delegreturn ends up revoking the delegation that was returned
> > > by the open.
> > > The root cause is that in certain circumstances, we may currently end
> > > up freeing the delegation from delegreturn, so when we later receive
> > > the reply to the open, we've lost track of the fact that the seqid
> > > predates the one that was returned.
> > >
> > > This patchset fixes that case by ensuring that we always keep track
> > > of the last delegation stateid that was returned for any given inode.
> > > That way, if we later see a delegation stateid with the same opaque
> > > field, but an older seqid, we know we cannot trust it, and so we
> > > ask to replay the OPEN compound.
> > >
> > > Trond Myklebust (14):
> > >   NFSv4: Don't allow a cached open with a revoked delegation
> > >   NFSv4: Fix delegation handling in update_open_stateid()
> > >   NFSv4: nfs4_callback_getattr() should ignore revoked delegations
> > >   NFSv4: Delegation recalls should not find revoked delegations
> > >   NFSv4: fail nfs4_refresh_delegation_stateid() when the delegation was
> > >     revoked
> > >   NFS: Rename nfs_inode_return_delegation_noreclaim()
> > >   NFSv4: Don't remove the delegation from the super_list more than once
> > >   NFSv4: Hold the delegation spinlock when updating the seqid
> > >   NFSv4: Clear the NFS_DELEGATION_REVOKED flag in
> > >     nfs_update_inplace_delegation()
> > >   NFSv4: Update the stateid seqid in nfs_revoke_delegation()
> > >   NFSv4: Revoke the delegation on success in nfs4_delegreturn_done()
> > >   NFSv4: Ignore requests to return the delegation if it was revoked
> > >   NFSv4: Don't reclaim delegations that have been returned or revoked
> > >   NFSv4: Fix races between open and delegreturn
> > >
> > >  fs/nfs/callback_proc.c |   2 +-
> > >  fs/nfs/delegation.c    | 109 +++++++++++++++++++++++++++++------------
> > >  fs/nfs/delegation.h    |   4 +-
> > >  fs/nfs/nfs4proc.c      |  13 ++---
> > >  fs/nfs/nfs4super.c     |   4 +-
> > >  5 files changed, 88 insertions(+), 44 deletions(-)
> > >
> > > --
> > > 2.21.0
> > >
Trond Myklebust Oct. 31, 2019, 10:54 p.m. UTC | #4
Hi Olga

On Thu, 2019-10-31 at 12:15 -0400, Olga Kornievskaia wrote:
> Hi Trond,
> 
> Now that I recall the reason the Ontap server gives out the same
> delegation was to deal with the linux client behaviour who sends an
> open even though it holds a delegation (according to the server's
> view). So what server does is it gives out the same delegation. This
> patch series changes the semantics. Can you describe what you expect
> the server is supposed to do in this case?

If the client sends out a second OPEN for the same file (presumably
because an application request caused the file to be opened again
before the client could process the reply to the first open), then I
think it should be OK for the server to send the delegation stateid
again. If the new open caused a state change (for instance the
delegation was upgraded from READ to WRITE) then the seqid for the
delegation stateid should be bumped. If there was not state change, it
really should be up to the server whether or not it bumps the seqid.

However on the client side, we want to be able to consider that if the
2 open replies return the exact same delegation stateid, then we
consider the second reply to be a no-op as far as our update of the
internal state is concerned. If the seqid was bumped, the client needs
to update its internal state.

So all this patch series is doing, is setting up rules to allow us to
enforce that case, and in particular to ensure that _if_ we send a
DELEGRETURN before we process the reply to the second open, then we
recognise that a successful DELEGRETURN means we no longer hold
delegation state, no matter what the contents of the reply to the
second open tells us.

> On Thu, Oct 31, 2019 at 11:49 AM Olga Kornievskaia <aglo@umich.edu>
> wrote:
> > On Thu, Oct 31, 2019 at 11:27 AM Olga Kornievskaia <aglo@umich.edu>
> > wrote:
> > > Hi Trond,
> > > 
> > > This patch set produces the following in my testing. Basically
> > > what I
> > > see the client is prevented from using a delegation at all.
> > > 
> > > After I induce a race of DELEGRETURN/OPEN
> > > --- the racing OPEN gets a delegation (it returns the same seqid
> > > and
> > > other as the delegation being returned) but the client doesn't
> > > use it.
> > > --- the following (next) OPEN that also gets a delegation
> > > immediately
> > > has the client returning the given delegation.
> > > 
> > > Disclaimer: in my testing the racing DELEGRETURN doesn't fail
> > > with
> > > OLD_STATEID, NetApp returns OK.
> > 
> > Testing the same against Linux. It prevents the client from using
> > future delegation stateid. On the induced DELEGRETURN/OPEN race,
> > the
> > linux server doesn't give a new read delegation. The following open
> > gets a read delegation and returns it right away.
> > 
> > 
> > > On Thu, Oct 24, 2019 at 6:56 AM Trond Myklebust <
> > > trondmy@gmail.com> wrote:
> > > > The following patchset fixes up a number of issues with
> > > > delegations,
> > > > but mainly attempts to fix a race condition between open and
> > > > delegreturn, where the open and the delegreturn get re-ordered
> > > > so
> > > > that the delegreturn ends up revoking the delegation that was
> > > > returned
> > > > by the open.
> > > > The root cause is that in certain circumstances, we may
> > > > currently end
> > > > up freeing the delegation from delegreturn, so when we later
> > > > receive
> > > > the reply to the open, we've lost track of the fact that the
> > > > seqid
> > > > predates the one that was returned.
> > > > 
> > > > This patchset fixes that case by ensuring that we always keep
> > > > track
> > > > of the last delegation stateid that was returned for any given
> > > > inode.
> > > > That way, if we later see a delegation stateid with the same
> > > > opaque
> > > > field, but an older seqid, we know we cannot trust it, and so
> > > > we
> > > > ask to replay the OPEN compound.
> > > > 
> > > > Trond Myklebust (14):
> > > >   NFSv4: Don't allow a cached open with a revoked delegation
> > > >   NFSv4: Fix delegation handling in update_open_stateid()
> > > >   NFSv4: nfs4_callback_getattr() should ignore revoked
> > > > delegations
> > > >   NFSv4: Delegation recalls should not find revoked delegations
> > > >   NFSv4: fail nfs4_refresh_delegation_stateid() when the
> > > > delegation was
> > > >     revoked
> > > >   NFS: Rename nfs_inode_return_delegation_noreclaim()
> > > >   NFSv4: Don't remove the delegation from the super_list more
> > > > than once
> > > >   NFSv4: Hold the delegation spinlock when updating the seqid
> > > >   NFSv4: Clear the NFS_DELEGATION_REVOKED flag in
> > > >     nfs_update_inplace_delegation()
> > > >   NFSv4: Update the stateid seqid in nfs_revoke_delegation()
> > > >   NFSv4: Revoke the delegation on success in
> > > > nfs4_delegreturn_done()
> > > >   NFSv4: Ignore requests to return the delegation if it was
> > > > revoked
> > > >   NFSv4: Don't reclaim delegations that have been returned or
> > > > revoked
> > > >   NFSv4: Fix races between open and delegreturn
> > > > 
> > > >  fs/nfs/callback_proc.c |   2 +-
> > > >  fs/nfs/delegation.c    | 109 +++++++++++++++++++++++++++++--
> > > > ----------
> > > >  fs/nfs/delegation.h    |   4 +-
> > > >  fs/nfs/nfs4proc.c      |  13 ++---
> > > >  fs/nfs/nfs4super.c     |   4 +-
> > > >  5 files changed, 88 insertions(+), 44 deletions(-)
> > > > 
> > > > --
> > > > 2.21.0
> > > >