diff mbox series

NFS: only invalidate dentrys that are clearly invalid.

Message ID 87361aovm3.fsf@notabene.neil.brown.name (mailing list archive)
State New, archived
Headers show
Series NFS: only invalidate dentrys that are clearly invalid. | expand

Commit Message

NeilBrown Nov. 16, 2020, 2:59 a.m. UTC
Prior to commit 5ceb9d7fdaaf ("NFS: Refactor nfs_lookup_revalidate()")
and error from nfs_lookup_verify_inode() other than -ESTALE would result
in nfs_lookup_revalidate() returning that error code (-ESTALE is mapped
to zero).
Since that commit, all errors result in zero being returned.

When nfs_lookup_revalidate() returns zero, the dentry is invalidated
and, significantly, if the dentry is a directory that is mounted on,
that mountpoint is lost.

If you:
 - mount an NFS filesystem which contains a directory
 - mount something (e.g. tmpfs) on that directory
 - use iptables (or scissors) to block traffic to the server
 - ls -l the-mounted-on-directory
 - interrupt the 'ls -l'
you will find that the directory has been unmounted.

This can be fixed by returning the actual error code from
nfs_lookup_verify_inode() rather then zero (except for -ESTALE).

Fixes: 5ceb9d7fdaaf ("NFS: Refactor nfs_lookup_revalidate()")
Signed-off-by: NeilBrown <neilb@suse.de>
---
 fs/nfs/dir.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

Comments

Trond Myklebust Nov. 16, 2020, 4:27 a.m. UTC | #1
On Mon, 2020-11-16 at 13:59 +1100, NeilBrown wrote:
> 
> Prior to commit 5ceb9d7fdaaf ("NFS: Refactor
> nfs_lookup_revalidate()")
> and error from nfs_lookup_verify_inode() other than -ESTALE would
> result
> in nfs_lookup_revalidate() returning that error code (-ESTALE is
> mapped
> to zero).
> Since that commit, all errors result in zero being returned.
> 
> When nfs_lookup_revalidate() returns zero, the dentry is invalidated
> and, significantly, if the dentry is a directory that is mounted on,
> that mountpoint is lost.
> 
> If you:
>  - mount an NFS filesystem which contains a directory
>  - mount something (e.g. tmpfs) on that directory
>  - use iptables (or scissors) to block traffic to the server
>  - ls -l the-mounted-on-directory
>  - interrupt the 'ls -l'
> you will find that the directory has been unmounted.
> 
> This can be fixed by returning the actual error code from
> nfs_lookup_verify_inode() rather then zero (except for -ESTALE).
> 
> Fixes: 5ceb9d7fdaaf ("NFS: Refactor nfs_lookup_revalidate()")
> Signed-off-by: NeilBrown <neilb@suse.de>
> ---
>  fs/nfs/dir.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
> index cb52db9a0cfb..d24acf556e9e 100644
> --- a/fs/nfs/dir.c
> +++ b/fs/nfs/dir.c
> @@ -1350,7 +1350,7 @@ nfs_do_lookup_revalidate(struct inode *dir,
> struct dentry *dentry,
>                          unsigned int flags)
>  {
>         struct inode *inode;
> -       int error;
> +       int error = 0;
>  
>         nfs_inc_stats(dir, NFSIOS_DENTRYREVALIDATE);
>         inode = d_inode(dentry);
> @@ -1372,8 +1372,10 @@ nfs_do_lookup_revalidate(struct inode *dir,
> struct dentry *dentry,
>             nfs_check_verifier(dir, dentry, flags & LOOKUP_RCU)) {
>                 error = nfs_lookup_verify_inode(inode, flags);
>                 if (error) {
> -                       if (error == -ESTALE)
> +                       if (error == -ESTALE) {
>                                 nfs_zap_caches(dir);
> +                               error = 0;
> +                       }
>                         goto out_bad;
>                 }
>                 nfs_advise_use_readdirplus(dir);
> @@ -1395,7 +1397,7 @@ nfs_do_lookup_revalidate(struct inode *dir,
> struct dentry *dentry,
>  out_bad:
>         if (flags & LOOKUP_RCU)
>                 return -ECHILD;
> -       return nfs_lookup_revalidate_done(dir, dentry, inode, 0);
> +       return nfs_lookup_revalidate_done(dir, dentry, inode, error);

Which errors do we actually need to return here? As far as I can tell,
the only errors that nfs_lookup_verify_inode() is supposed to return is
ENOMEM, ESTALE, ECHILD, and possibly EIO or ETiMEDOUT.

Why would it be better to return those errors rather than just a 0 when
we need to invalidate the inode, particularly since we already have a
special case in nfs_lookup_revalidate_done() when the dentry is root?

>  }
>  
>  static int
NeilBrown Nov. 16, 2020, 4:43 a.m. UTC | #2
On Mon, Nov 16 2020, Trond Myklebust wrote:

> On Mon, 2020-11-16 at 13:59 +1100, NeilBrown wrote:
>> 
>> Prior to commit 5ceb9d7fdaaf ("NFS: Refactor
>> nfs_lookup_revalidate()")
>> and error from nfs_lookup_verify_inode() other than -ESTALE would
>> result
>> in nfs_lookup_revalidate() returning that error code (-ESTALE is
>> mapped
>> to zero).
>> Since that commit, all errors result in zero being returned.
>> 
>> When nfs_lookup_revalidate() returns zero, the dentry is invalidated
>> and, significantly, if the dentry is a directory that is mounted on,
>> that mountpoint is lost.
>> 
>> If you:
>>  - mount an NFS filesystem which contains a directory
>>  - mount something (e.g. tmpfs) on that directory
>>  - use iptables (or scissors) to block traffic to the server
>>  - ls -l the-mounted-on-directory
>>  - interrupt the 'ls -l'
>> you will find that the directory has been unmounted.
>> 
>> This can be fixed by returning the actual error code from
>> nfs_lookup_verify_inode() rather then zero (except for -ESTALE).
>> 
>> Fixes: 5ceb9d7fdaaf ("NFS: Refactor nfs_lookup_revalidate()")
>> Signed-off-by: NeilBrown <neilb@suse.de>
>> ---
>>  fs/nfs/dir.c | 8 +++++---
>>  1 file changed, 5 insertions(+), 3 deletions(-)
>> 
>> diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
>> index cb52db9a0cfb..d24acf556e9e 100644
>> --- a/fs/nfs/dir.c
>> +++ b/fs/nfs/dir.c
>> @@ -1350,7 +1350,7 @@ nfs_do_lookup_revalidate(struct inode *dir,
>> struct dentry *dentry,
>>                          unsigned int flags)
>>  {
>>         struct inode *inode;
>> -       int error;
>> +       int error = 0;
>>  
>>         nfs_inc_stats(dir, NFSIOS_DENTRYREVALIDATE);
>>         inode = d_inode(dentry);
>> @@ -1372,8 +1372,10 @@ nfs_do_lookup_revalidate(struct inode *dir,
>> struct dentry *dentry,
>>             nfs_check_verifier(dir, dentry, flags & LOOKUP_RCU)) {
>>                 error = nfs_lookup_verify_inode(inode, flags);
>>                 if (error) {
>> -                       if (error == -ESTALE)
>> +                       if (error == -ESTALE) {
>>                                 nfs_zap_caches(dir);
>> +                               error = 0;
>> +                       }
>>                         goto out_bad;
>>                 }
>>                 nfs_advise_use_readdirplus(dir);
>> @@ -1395,7 +1397,7 @@ nfs_do_lookup_revalidate(struct inode *dir,
>> struct dentry *dentry,
>>  out_bad:
>>         if (flags & LOOKUP_RCU)
>>                 return -ECHILD;
>> -       return nfs_lookup_revalidate_done(dir, dentry, inode, 0);
>> +       return nfs_lookup_revalidate_done(dir, dentry, inode, error);
>
> Which errors do we actually need to return here? As far as I can tell,
> the only errors that nfs_lookup_verify_inode() is supposed to return is
> ENOMEM, ESTALE, ECHILD, and possibly EIO or ETiMEDOUT.
>
> Why would it be better to return those errors rather than just a 0 when
> we need to invalidate the inode, particularly since we already have a
> special case in nfs_lookup_revalidate_done() when the dentry is root?

ERESTARTSYS is the error that easily causes problems.

Returning 0 causes d_invalidate() to be called which is quite heavy
handed in mountpoints.
So it is only reasonable to return 0 when we have unambiguous
confirmation from the server that the object no longer exists.  ESTALE
is unambiguous. EIO might be unambiguous.  ERESTARTSYS, ENOMEM,
ETIMEDOUT are transient and don't justify d_invalidate() being called.

(BTW, Commit cc89684c9a26 ("NFS: only invalidate dentrys that are clearly invalid.")
 fixed much the same bug 3 years ago).
 
Thanks,
NeilBrown


>
>>  }
>>  
>>  static int
>
> -- 
> Trond Myklebust
> Linux NFS client maintainer, Hammerspace
> trond.myklebust@hammerspace.com
Trond Myklebust Nov. 16, 2020, 4:50 a.m. UTC | #3
On Mon, 2020-11-16 at 15:43 +1100, NeilBrown wrote:
> On Mon, Nov 16 2020, Trond Myklebust wrote:
> 
> > On Mon, 2020-11-16 at 13:59 +1100, NeilBrown wrote:
> > > 
> > > Prior to commit 5ceb9d7fdaaf ("NFS: Refactor
> > > nfs_lookup_revalidate()")
> > > and error from nfs_lookup_verify_inode() other than -ESTALE would
> > > result
> > > in nfs_lookup_revalidate() returning that error code (-ESTALE is
> > > mapped
> > > to zero).
> > > Since that commit, all errors result in zero being returned.
> > > 
> > > When nfs_lookup_revalidate() returns zero, the dentry is
> > > invalidated
> > > and, significantly, if the dentry is a directory that is mounted
> > > on,
> > > that mountpoint is lost.
> > > 
> > > If you:
> > >  - mount an NFS filesystem which contains a directory
> > >  - mount something (e.g. tmpfs) on that directory
> > >  - use iptables (or scissors) to block traffic to the server
> > >  - ls -l the-mounted-on-directory
> > >  - interrupt the 'ls -l'
> > > you will find that the directory has been unmounted.
> > > 
> > > This can be fixed by returning the actual error code from
> > > nfs_lookup_verify_inode() rather then zero (except for -ESTALE).
> > > 
> > > Fixes: 5ceb9d7fdaaf ("NFS: Refactor nfs_lookup_revalidate()")
> > > Signed-off-by: NeilBrown <neilb@suse.de>
> > > ---
> > >  fs/nfs/dir.c | 8 +++++---
> > >  1 file changed, 5 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
> > > index cb52db9a0cfb..d24acf556e9e 100644
> > > --- a/fs/nfs/dir.c
> > > +++ b/fs/nfs/dir.c
> > > @@ -1350,7 +1350,7 @@ nfs_do_lookup_revalidate(struct inode *dir,
> > > struct dentry *dentry,
> > >                          unsigned int flags)
> > >  {
> > >         struct inode *inode;
> > > -       int error;
> > > +       int error = 0;
> > >  
> > >         nfs_inc_stats(dir, NFSIOS_DENTRYREVALIDATE);
> > >         inode = d_inode(dentry);
> > > @@ -1372,8 +1372,10 @@ nfs_do_lookup_revalidate(struct inode
> > > *dir,
> > > struct dentry *dentry,
> > >             nfs_check_verifier(dir, dentry, flags & LOOKUP_RCU))
> > > {
> > >                 error = nfs_lookup_verify_inode(inode, flags);
> > >                 if (error) {
> > > -                       if (error == -ESTALE)
> > > +                       if (error == -ESTALE) {
> > >                                 nfs_zap_caches(dir);
> > > +                               error = 0;
> > > +                       }
> > >                         goto out_bad;
> > >                 }
> > >                 nfs_advise_use_readdirplus(dir);
> > > @@ -1395,7 +1397,7 @@ nfs_do_lookup_revalidate(struct inode *dir,
> > > struct dentry *dentry,
> > >  out_bad:
> > >         if (flags & LOOKUP_RCU)
> > >                 return -ECHILD;
> > > -       return nfs_lookup_revalidate_done(dir, dentry, inode, 0);
> > > +       return nfs_lookup_revalidate_done(dir, dentry, inode,
> > > error);
> > 
> > Which errors do we actually need to return here? As far as I can
> > tell,
> > the only errors that nfs_lookup_verify_inode() is supposed to
> > return is
> > ENOMEM, ESTALE, ECHILD, and possibly EIO or ETiMEDOUT.
> > 
> > Why would it be better to return those errors rather than just a 0
> > when
> > we need to invalidate the inode, particularly since we already have
> > a
> > special case in nfs_lookup_revalidate_done() when the dentry is
> > root?
> 
> ERESTARTSYS is the error that easily causes problems.
> 
> Returning 0 causes d_invalidate() to be called which is quite heavy
> handed in mountpoints.

My point is that it shouldn't get returned for mountpoints. See
nfs_lookup_revalidate_done().

> So it is only reasonable to return 0 when we have unambiguous
> confirmation from the server that the object no longer exists. 
> ESTALE
> is unambiguous. EIO might be unambiguous.  ERESTARTSYS, ENOMEM,
> ETIMEDOUT are transient and don't justify d_invalidate() being
> called.
> 
> (BTW, Commit cc89684c9a26 ("NFS: only invalidate dentrys that are
> clearly invalid.")
>  fixed much the same bug 3 years ago).
>  
> Thanks,
> NeilBrown
> 
> 
> > 
> > >  }
> > >  
> > >  static int
> > 
> > -- 
> > Trond Myklebust
> > Linux NFS client maintainer, Hammerspace
> > trond.myklebust@hammerspace.com
NeilBrown Nov. 16, 2020, 5 a.m. UTC | #4
On Mon, Nov 16 2020, Trond Myklebust wrote:

> On Mon, 2020-11-16 at 15:43 +1100, NeilBrown wrote:
>> On Mon, Nov 16 2020, Trond Myklebust wrote:
>> 
>> > On Mon, 2020-11-16 at 13:59 +1100, NeilBrown wrote:
>> > > 
>> > > Prior to commit 5ceb9d7fdaaf ("NFS: Refactor
>> > > nfs_lookup_revalidate()")
>> > > and error from nfs_lookup_verify_inode() other than -ESTALE would
>> > > result
>> > > in nfs_lookup_revalidate() returning that error code (-ESTALE is
>> > > mapped
>> > > to zero).
>> > > Since that commit, all errors result in zero being returned.
>> > > 
>> > > When nfs_lookup_revalidate() returns zero, the dentry is
>> > > invalidated
>> > > and, significantly, if the dentry is a directory that is mounted
>> > > on,
>> > > that mountpoint is lost.
>> > > 
>> > > If you:
>> > >  - mount an NFS filesystem which contains a directory
>> > >  - mount something (e.g. tmpfs) on that directory
>> > >  - use iptables (or scissors) to block traffic to the server
>> > >  - ls -l the-mounted-on-directory
>> > >  - interrupt the 'ls -l'
>> > > you will find that the directory has been unmounted.
>> > > 
>> > > This can be fixed by returning the actual error code from
>> > > nfs_lookup_verify_inode() rather then zero (except for -ESTALE).
>> > > 
>> > > Fixes: 5ceb9d7fdaaf ("NFS: Refactor nfs_lookup_revalidate()")
>> > > Signed-off-by: NeilBrown <neilb@suse.de>
>> > > ---
>> > >  fs/nfs/dir.c | 8 +++++---
>> > >  1 file changed, 5 insertions(+), 3 deletions(-)
>> > > 
>> > > diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
>> > > index cb52db9a0cfb..d24acf556e9e 100644
>> > > --- a/fs/nfs/dir.c
>> > > +++ b/fs/nfs/dir.c
>> > > @@ -1350,7 +1350,7 @@ nfs_do_lookup_revalidate(struct inode *dir,
>> > > struct dentry *dentry,
>> > >                          unsigned int flags)
>> > >  {
>> > >         struct inode *inode;
>> > > -       int error;
>> > > +       int error = 0;
>> > >  
>> > >         nfs_inc_stats(dir, NFSIOS_DENTRYREVALIDATE);
>> > >         inode = d_inode(dentry);
>> > > @@ -1372,8 +1372,10 @@ nfs_do_lookup_revalidate(struct inode
>> > > *dir,
>> > > struct dentry *dentry,
>> > >             nfs_check_verifier(dir, dentry, flags & LOOKUP_RCU))
>> > > {
>> > >                 error = nfs_lookup_verify_inode(inode, flags);
>> > >                 if (error) {
>> > > -                       if (error == -ESTALE)
>> > > +                       if (error == -ESTALE) {
>> > >                                 nfs_zap_caches(dir);
>> > > +                               error = 0;
>> > > +                       }
>> > >                         goto out_bad;
>> > >                 }
>> > >                 nfs_advise_use_readdirplus(dir);
>> > > @@ -1395,7 +1397,7 @@ nfs_do_lookup_revalidate(struct inode *dir,
>> > > struct dentry *dentry,
>> > >  out_bad:
>> > >         if (flags & LOOKUP_RCU)
>> > >                 return -ECHILD;
>> > > -       return nfs_lookup_revalidate_done(dir, dentry, inode, 0);
>> > > +       return nfs_lookup_revalidate_done(dir, dentry, inode,
>> > > error);
>> > 
>> > Which errors do we actually need to return here? As far as I can
>> > tell,
>> > the only errors that nfs_lookup_verify_inode() is supposed to
>> > return is
>> > ENOMEM, ESTALE, ECHILD, and possibly EIO or ETiMEDOUT.
>> > 
>> > Why would it be better to return those errors rather than just a 0
>> > when
>> > we need to invalidate the inode, particularly since we already have
>> > a
>> > special case in nfs_lookup_revalidate_done() when the dentry is
>> > root?
>> 
>> ERESTARTSYS is the error that easily causes problems.
>> 
>> Returning 0 causes d_invalidate() to be called which is quite heavy
>> handed in mountpoints.
>
> My point is that it shouldn't get returned for mountpoints. See
> nfs_lookup_revalidate_done().

nfs_lookup_revalidate_done() only checks IS_ROOT(), and while many
mountpoints are IS_ROOT(), not all are (--bind easily makes others).

But that isn't even really relevant here.  The dentry being revalidated
is the underlying directory - that something else is mounted on.
step_into() which follows mount points is called in walk_component()
*after* lookup_fast or lookup_slow which will have revalidated the
dentry.

NeilBrown


>
>> So it is only reasonable to return 0 when we have unambiguous
>> confirmation from the server that the object no longer exists. 
>> ESTALE
>> is unambiguous. EIO might be unambiguous.  ERESTARTSYS, ENOMEM,
>> ETIMEDOUT are transient and don't justify d_invalidate() being
>> called.
>> 
>> (BTW, Commit cc89684c9a26 ("NFS: only invalidate dentrys that are
>> clearly invalid.")
>>  fixed much the same bug 3 years ago).
>>  
>> Thanks,
>> NeilBrown
>> 
>> 
>> > 
>> > >  }
>> > >  
>> > >  static int
>> > 
>> > -- 
>> > Trond Myklebust
>> > Linux NFS client maintainer, Hammerspace
>> > trond.myklebust@hammerspace.com
>
> -- 
> Trond Myklebust
> CTO, Hammerspace Inc
> 4984 El Camino Real, Suite 208
> Los Altos, CA 94022
> ​
> www.hammer.space
Trond Myklebust Nov. 16, 2020, 5:07 a.m. UTC | #5
On Mon, 2020-11-16 at 16:00 +1100, NeilBrown wrote:
> On Mon, Nov 16 2020, Trond Myklebust wrote:
> 
> > On Mon, 2020-11-16 at 15:43 +1100, NeilBrown wrote:
> > > On Mon, Nov 16 2020, Trond Myklebust wrote:
> > > 
> > > > On Mon, 2020-11-16 at 13:59 +1100, NeilBrown wrote:
> > > > > 
> > > > > Prior to commit 5ceb9d7fdaaf ("NFS: Refactor
> > > > > nfs_lookup_revalidate()")
> > > > > and error from nfs_lookup_verify_inode() other than -ESTALE
> > > > > would
> > > > > result
> > > > > in nfs_lookup_revalidate() returning that error code (-ESTALE
> > > > > is
> > > > > mapped
> > > > > to zero).
> > > > > Since that commit, all errors result in zero being returned.
> > > > > 
> > > > > When nfs_lookup_revalidate() returns zero, the dentry is
> > > > > invalidated
> > > > > and, significantly, if the dentry is a directory that is
> > > > > mounted
> > > > > on,
> > > > > that mountpoint is lost.
> > > > > 
> > > > > If you:
> > > > >  - mount an NFS filesystem which contains a directory
> > > > >  - mount something (e.g. tmpfs) on that directory
> > > > >  - use iptables (or scissors) to block traffic to the server
> > > > >  - ls -l the-mounted-on-directory
> > > > >  - interrupt the 'ls -l'
> > > > > you will find that the directory has been unmounted.
> > > > > 
> > > > > This can be fixed by returning the actual error code from
> > > > > nfs_lookup_verify_inode() rather then zero (except for -
> > > > > ESTALE).
> > > > > 
> > > > > Fixes: 5ceb9d7fdaaf ("NFS: Refactor nfs_lookup_revalidate()")
> > > > > Signed-off-by: NeilBrown <neilb@suse.de>
> > > > > ---
> > > > >  fs/nfs/dir.c | 8 +++++---
> > > > >  1 file changed, 5 insertions(+), 3 deletions(-)
> > > > > 
> > > > > diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
> > > > > index cb52db9a0cfb..d24acf556e9e 100644
> > > > > --- a/fs/nfs/dir.c
> > > > > +++ b/fs/nfs/dir.c
> > > > > @@ -1350,7 +1350,7 @@ nfs_do_lookup_revalidate(struct inode
> > > > > *dir,
> > > > > struct dentry *dentry,
> > > > >                          unsigned int flags)
> > > > >  {
> > > > >         struct inode *inode;
> > > > > -       int error;
> > > > > +       int error = 0;
> > > > >  
> > > > >         nfs_inc_stats(dir, NFSIOS_DENTRYREVALIDATE);
> > > > >         inode = d_inode(dentry);
> > > > > @@ -1372,8 +1372,10 @@ nfs_do_lookup_revalidate(struct inode
> > > > > *dir,
> > > > > struct dentry *dentry,
> > > > >             nfs_check_verifier(dir, dentry, flags &
> > > > > LOOKUP_RCU))
> > > > > {
> > > > >                 error = nfs_lookup_verify_inode(inode,
> > > > > flags);
> > > > >                 if (error) {
> > > > > -                       if (error == -ESTALE)
> > > > > +                       if (error == -ESTALE) {
> > > > >                                 nfs_zap_caches(dir);
> > > > > +                               error = 0;
> > > > > +                       }
> > > > >                         goto out_bad;
> > > > >                 }
> > > > >                 nfs_advise_use_readdirplus(dir);
> > > > > @@ -1395,7 +1397,7 @@ nfs_do_lookup_revalidate(struct inode
> > > > > *dir,
> > > > > struct dentry *dentry,
> > > > >  out_bad:
> > > > >         if (flags & LOOKUP_RCU)
> > > > >                 return -ECHILD;
> > > > > -       return nfs_lookup_revalidate_done(dir, dentry, inode,
> > > > > 0);
> > > > > +       return nfs_lookup_revalidate_done(dir, dentry, inode,
> > > > > error);
> > > > 
> > > > Which errors do we actually need to return here? As far as I
> > > > can
> > > > tell,
> > > > the only errors that nfs_lookup_verify_inode() is supposed to
> > > > return is
> > > > ENOMEM, ESTALE, ECHILD, and possibly EIO or ETiMEDOUT.
> > > > 
> > > > Why would it be better to return those errors rather than just
> > > > a 0
> > > > when
> > > > we need to invalidate the inode, particularly since we already
> > > > have
> > > > a
> > > > special case in nfs_lookup_revalidate_done() when the dentry is
> > > > root?
> > > 
> > > ERESTARTSYS is the error that easily causes problems.
> > > 
> > > Returning 0 causes d_invalidate() to be called which is quite
> > > heavy
> > > handed in mountpoints.
> > 
> > My point is that it shouldn't get returned for mountpoints. See
> > nfs_lookup_revalidate_done().
> 
> nfs_lookup_revalidate_done() only checks IS_ROOT(), and while many
> mountpoints are IS_ROOT(), not all are (--bind easily makes others).
> 
> But that isn't even really relevant here.  The dentry being
> revalidated
> is the underlying directory - that something else is mounted on.
> step_into() which follows mount points is called in walk_component()
> *after* lookup_fast or lookup_slow which will have revalidated the
> dentry.

So then why is it not sufficient to just add a check for
d_mountpoint()? This is a revalidation, not a new lookup.

> 
> NeilBrown
> 
> 
> > 
> > > So it is only reasonable to return 0 when we have unambiguous
> > > confirmation from the server that the object no longer exists. 
> > > ESTALE
> > > is unambiguous. EIO might be unambiguous.  ERESTARTSYS, ENOMEM,
> > > ETIMEDOUT are transient and don't justify d_invalidate() being
> > > called.
> > > 
> > > (BTW, Commit cc89684c9a26 ("NFS: only invalidate dentrys that are
> > > clearly invalid.")
> > >  fixed much the same bug 3 years ago).
> > >  
> > > Thanks,
> > > NeilBrown
> > > 
> > > 
> > > > 
> > > > >  }
> > > > >  
> > > > >  static int
> > > > 
> > > > -- 
> > > > Trond Myklebust
> > > > Linux NFS client maintainer, Hammerspace
> > > > trond.myklebust@hammerspace.com
> > 
> > -- 
> > Trond Myklebust
> > CTO, Hammerspace Inc
> > 4984 El Camino Real, Suite 208
> > Los Altos, CA 94022
> > ​
> > www.hammer.space
NeilBrown Nov. 16, 2020, 5:12 a.m. UTC | #6
On Mon, Nov 16 2020, Trond Myklebust wrote:

> On Mon, 2020-11-16 at 16:00 +1100, NeilBrown wrote:
>> On Mon, Nov 16 2020, Trond Myklebust wrote:
>> 
>> > On Mon, 2020-11-16 at 15:43 +1100, NeilBrown wrote:
>> > > On Mon, Nov 16 2020, Trond Myklebust wrote:
>> > > 
>> > > > On Mon, 2020-11-16 at 13:59 +1100, NeilBrown wrote:
>> > > > > 
>> > > > > Prior to commit 5ceb9d7fdaaf ("NFS: Refactor
>> > > > > nfs_lookup_revalidate()")
>> > > > > and error from nfs_lookup_verify_inode() other than -ESTALE
>> > > > > would
>> > > > > result
>> > > > > in nfs_lookup_revalidate() returning that error code (-ESTALE
>> > > > > is
>> > > > > mapped
>> > > > > to zero).
>> > > > > Since that commit, all errors result in zero being returned.
>> > > > > 
>> > > > > When nfs_lookup_revalidate() returns zero, the dentry is
>> > > > > invalidated
>> > > > > and, significantly, if the dentry is a directory that is
>> > > > > mounted
>> > > > > on,
>> > > > > that mountpoint is lost.
>> > > > > 
>> > > > > If you:
>> > > > >  - mount an NFS filesystem which contains a directory
>> > > > >  - mount something (e.g. tmpfs) on that directory
>> > > > >  - use iptables (or scissors) to block traffic to the server
>> > > > >  - ls -l the-mounted-on-directory
>> > > > >  - interrupt the 'ls -l'
>> > > > > you will find that the directory has been unmounted.
>> > > > > 
>> > > > > This can be fixed by returning the actual error code from
>> > > > > nfs_lookup_verify_inode() rather then zero (except for -
>> > > > > ESTALE).
>> > > > > 
>> > > > > Fixes: 5ceb9d7fdaaf ("NFS: Refactor nfs_lookup_revalidate()")
>> > > > > Signed-off-by: NeilBrown <neilb@suse.de>
>> > > > > ---
>> > > > >  fs/nfs/dir.c | 8 +++++---
>> > > > >  1 file changed, 5 insertions(+), 3 deletions(-)
>> > > > > 
>> > > > > diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
>> > > > > index cb52db9a0cfb..d24acf556e9e 100644
>> > > > > --- a/fs/nfs/dir.c
>> > > > > +++ b/fs/nfs/dir.c
>> > > > > @@ -1350,7 +1350,7 @@ nfs_do_lookup_revalidate(struct inode
>> > > > > *dir,
>> > > > > struct dentry *dentry,
>> > > > >                          unsigned int flags)
>> > > > >  {
>> > > > >         struct inode *inode;
>> > > > > -       int error;
>> > > > > +       int error = 0;
>> > > > >  
>> > > > >         nfs_inc_stats(dir, NFSIOS_DENTRYREVALIDATE);
>> > > > >         inode = d_inode(dentry);
>> > > > > @@ -1372,8 +1372,10 @@ nfs_do_lookup_revalidate(struct inode
>> > > > > *dir,
>> > > > > struct dentry *dentry,
>> > > > >             nfs_check_verifier(dir, dentry, flags &
>> > > > > LOOKUP_RCU))
>> > > > > {
>> > > > >                 error = nfs_lookup_verify_inode(inode,
>> > > > > flags);
>> > > > >                 if (error) {
>> > > > > -                       if (error == -ESTALE)
>> > > > > +                       if (error == -ESTALE) {
>> > > > >                                 nfs_zap_caches(dir);
>> > > > > +                               error = 0;
>> > > > > +                       }
>> > > > >                         goto out_bad;
>> > > > >                 }
>> > > > >                 nfs_advise_use_readdirplus(dir);
>> > > > > @@ -1395,7 +1397,7 @@ nfs_do_lookup_revalidate(struct inode
>> > > > > *dir,
>> > > > > struct dentry *dentry,
>> > > > >  out_bad:
>> > > > >         if (flags & LOOKUP_RCU)
>> > > > >                 return -ECHILD;
>> > > > > -       return nfs_lookup_revalidate_done(dir, dentry, inode,
>> > > > > 0);
>> > > > > +       return nfs_lookup_revalidate_done(dir, dentry, inode,
>> > > > > error);
>> > > > 
>> > > > Which errors do we actually need to return here? As far as I
>> > > > can
>> > > > tell,
>> > > > the only errors that nfs_lookup_verify_inode() is supposed to
>> > > > return is
>> > > > ENOMEM, ESTALE, ECHILD, and possibly EIO or ETiMEDOUT.
>> > > > 
>> > > > Why would it be better to return those errors rather than just
>> > > > a 0
>> > > > when
>> > > > we need to invalidate the inode, particularly since we already
>> > > > have
>> > > > a
>> > > > special case in nfs_lookup_revalidate_done() when the dentry is
>> > > > root?
>> > > 
>> > > ERESTARTSYS is the error that easily causes problems.
>> > > 
>> > > Returning 0 causes d_invalidate() to be called which is quite
>> > > heavy
>> > > handed in mountpoints.
>> > 
>> > My point is that it shouldn't get returned for mountpoints. See
>> > nfs_lookup_revalidate_done().
>> 
>> nfs_lookup_revalidate_done() only checks IS_ROOT(), and while many
>> mountpoints are IS_ROOT(), not all are (--bind easily makes others).
>> 
>> But that isn't even really relevant here.  The dentry being
>> revalidated
>> is the underlying directory - that something else is mounted on.
>> step_into() which follows mount points is called in walk_component()
>> *after* lookup_fast or lookup_slow which will have revalidated the
>> dentry.
>
> So then why is it not sufficient to just add a check for
> d_mountpoint()? This is a revalidation, not a new lookup.
>

I guess you could do that.
But why would you want to call d_invalidate() just because a signal was
received, or a memory allocation failed?

NeilBrown


>> 
>> NeilBrown
>> 
>> 
>> > 
>> > > So it is only reasonable to return 0 when we have unambiguous
>> > > confirmation from the server that the object no longer exists. 
>> > > ESTALE
>> > > is unambiguous. EIO might be unambiguous.  ERESTARTSYS, ENOMEM,
>> > > ETIMEDOUT are transient and don't justify d_invalidate() being
>> > > called.
>> > > 
>> > > (BTW, Commit cc89684c9a26 ("NFS: only invalidate dentrys that are
>> > > clearly invalid.")
>> > >  fixed much the same bug 3 years ago).
>> > >  
>> > > Thanks,
>> > > NeilBrown
>> > > 
>> > > 
>> > > > 
>> > > > >  }
>> > > > >  
>> > > > >  static int
>> > > > 
>> > > > -- 
>> > > > Trond Myklebust
>> > > > Linux NFS client maintainer, Hammerspace
>> > > > trond.myklebust@hammerspace.com
>> > 
>> > -- 
>> > Trond Myklebust
>> > CTO, Hammerspace Inc
>> > 4984 El Camino Real, Suite 208
>> > Los Altos, CA 94022
>> > ​
>> > www.hammer.space
>
> -- 
> Trond Myklebust
> Linux NFS client maintainer, Hammerspace
> trond.myklebust@hammerspace.com
Trond Myklebust Nov. 16, 2020, 5:32 a.m. UTC | #7
On Mon, 2020-11-16 at 16:12 +1100, NeilBrown wrote:
> On Mon, Nov 16 2020, Trond Myklebust wrote:
> 
> > On Mon, 2020-11-16 at 16:00 +1100, NeilBrown wrote:
> > > On Mon, Nov 16 2020, Trond Myklebust wrote:
> > > 
> > > > On Mon, 2020-11-16 at 15:43 +1100, NeilBrown wrote:
> > > > > On Mon, Nov 16 2020, Trond Myklebust wrote:
> > > > > 
> > > > > > On Mon, 2020-11-16 at 13:59 +1100, NeilBrown wrote:
> > > > > > > 
> > > > > > > Prior to commit 5ceb9d7fdaaf ("NFS: Refactor
> > > > > > > nfs_lookup_revalidate()")
> > > > > > > and error from nfs_lookup_verify_inode() other than -
> > > > > > > ESTALE
> > > > > > > would
> > > > > > > result
> > > > > > > in nfs_lookup_revalidate() returning that error code (-
> > > > > > > ESTALE
> > > > > > > is
> > > > > > > mapped
> > > > > > > to zero).
> > > > > > > Since that commit, all errors result in zero being
> > > > > > > returned.
> > > > > > > 
> > > > > > > When nfs_lookup_revalidate() returns zero, the dentry is
> > > > > > > invalidated
> > > > > > > and, significantly, if the dentry is a directory that is
> > > > > > > mounted
> > > > > > > on,
> > > > > > > that mountpoint is lost.
> > > > > > > 
> > > > > > > If you:
> > > > > > >  - mount an NFS filesystem which contains a directory
> > > > > > >  - mount something (e.g. tmpfs) on that directory
> > > > > > >  - use iptables (or scissors) to block traffic to the
> > > > > > > server
> > > > > > >  - ls -l the-mounted-on-directory
> > > > > > >  - interrupt the 'ls -l'
> > > > > > > you will find that the directory has been unmounted.
> > > > > > > 
> > > > > > > This can be fixed by returning the actual error code from
> > > > > > > nfs_lookup_verify_inode() rather then zero (except for -
> > > > > > > ESTALE).
> > > > > > > 
> > > > > > > Fixes: 5ceb9d7fdaaf ("NFS: Refactor
> > > > > > > nfs_lookup_revalidate()")
> > > > > > > Signed-off-by: NeilBrown <neilb@suse.de>
> > > > > > > ---
> > > > > > >  fs/nfs/dir.c | 8 +++++---
> > > > > > >  1 file changed, 5 insertions(+), 3 deletions(-)
> > > > > > > 
> > > > > > > diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
> > > > > > > index cb52db9a0cfb..d24acf556e9e 100644
> > > > > > > --- a/fs/nfs/dir.c
> > > > > > > +++ b/fs/nfs/dir.c
> > > > > > > @@ -1350,7 +1350,7 @@ nfs_do_lookup_revalidate(struct
> > > > > > > inode
> > > > > > > *dir,
> > > > > > > struct dentry *dentry,
> > > > > > >                          unsigned int flags)
> > > > > > >  {
> > > > > > >         struct inode *inode;
> > > > > > > -       int error;
> > > > > > > +       int error = 0;
> > > > > > >  
> > > > > > >         nfs_inc_stats(dir, NFSIOS_DENTRYREVALIDATE);
> > > > > > >         inode = d_inode(dentry);
> > > > > > > @@ -1372,8 +1372,10 @@ nfs_do_lookup_revalidate(struct
> > > > > > > inode
> > > > > > > *dir,
> > > > > > > struct dentry *dentry,
> > > > > > >             nfs_check_verifier(dir, dentry, flags &
> > > > > > > LOOKUP_RCU))
> > > > > > > {
> > > > > > >                 error = nfs_lookup_verify_inode(inode,
> > > > > > > flags);
> > > > > > >                 if (error) {
> > > > > > > -                       if (error == -ESTALE)
> > > > > > > +                       if (error == -ESTALE) {
> > > > > > >                                 nfs_zap_caches(dir);
> > > > > > > +                               error = 0;
> > > > > > > +                       }
> > > > > > >                         goto out_bad;
> > > > > > >                 }
> > > > > > >                 nfs_advise_use_readdirplus(dir);
> > > > > > > @@ -1395,7 +1397,7 @@ nfs_do_lookup_revalidate(struct
> > > > > > > inode
> > > > > > > *dir,
> > > > > > > struct dentry *dentry,
> > > > > > >  out_bad:
> > > > > > >         if (flags & LOOKUP_RCU)
> > > > > > >                 return -ECHILD;
> > > > > > > -       return nfs_lookup_revalidate_done(dir, dentry,
> > > > > > > inode,
> > > > > > > 0);
> > > > > > > +       return nfs_lookup_revalidate_done(dir, dentry,
> > > > > > > inode,
> > > > > > > error);
> > > > > > 
> > > > > > Which errors do we actually need to return here? As far as
> > > > > > I
> > > > > > can
> > > > > > tell,
> > > > > > the only errors that nfs_lookup_verify_inode() is supposed
> > > > > > to
> > > > > > return is
> > > > > > ENOMEM, ESTALE, ECHILD, and possibly EIO or ETiMEDOUT.
> > > > > > 
> > > > > > Why would it be better to return those errors rather than
> > > > > > just
> > > > > > a 0
> > > > > > when
> > > > > > we need to invalidate the inode, particularly since we
> > > > > > already
> > > > > > have
> > > > > > a
> > > > > > special case in nfs_lookup_revalidate_done() when the
> > > > > > dentry is
> > > > > > root?
> > > > > 
> > > > > ERESTARTSYS is the error that easily causes problems.
> > > > > 
> > > > > Returning 0 causes d_invalidate() to be called which is quite
> > > > > heavy
> > > > > handed in mountpoints.
> > > > 
> > > > My point is that it shouldn't get returned for mountpoints. See
> > > > nfs_lookup_revalidate_done().
> > > 
> > > nfs_lookup_revalidate_done() only checks IS_ROOT(), and while
> > > many
> > > mountpoints are IS_ROOT(), not all are (--bind easily makes
> > > others).
> > > 
> > > But that isn't even really relevant here.  The dentry being
> > > revalidated
> > > is the underlying directory - that something else is mounted on.
> > > step_into() which follows mount points is called in
> > > walk_component()
> > > *after* lookup_fast or lookup_slow which will have revalidated
> > > the
> > > dentry.
> > 
> > So then why is it not sufficient to just add a check for
> > d_mountpoint()? This is a revalidation, not a new lookup.
> > 
> 
> I guess you could do that.
> But why would you want to call d_invalidate() just because a signal
> was
> received, or a memory allocation failed?

Why would I care about the error return from nfs_lookup_verify_inode()?
This is a revalidation, and so sometimes the error returned is not
transient, but is persistent (e.g. EIO/ETIMEDOUT if the server is
down). In those cases, I still want to be able to do things like
unmount the filesystem.

> 
> NeilBrown
> 
> 
> > > 
> > > NeilBrown
> > > 
> > > 
> > > > 
> > > > > So it is only reasonable to return 0 when we have unambiguous
> > > > > confirmation from the server that the object no longer
> > > > > exists. 
> > > > > ESTALE
> > > > > is unambiguous. EIO might be unambiguous.  ERESTARTSYS,
> > > > > ENOMEM,
> > > > > ETIMEDOUT are transient and don't justify d_invalidate()
> > > > > being
> > > > > called.
> > > > > 
> > > > > (BTW, Commit cc89684c9a26 ("NFS: only invalidate dentrys that
> > > > > are
> > > > > clearly invalid.")
> > > > >  fixed much the same bug 3 years ago).
> > > > >  
> > > > > Thanks,
> > > > > NeilBrown
> > > > > 
> > > > > 
> > > > > > 
> > > > > > >  }
> > > > > > >  
> > > > > > >  static int
> > > > > > 
> > > > > > -- 
> > > > > > Trond Myklebust
> > > > > > Linux NFS client maintainer, Hammerspace
> > > > > > trond.myklebust@hammerspace.com
> > > > 
> > > > -- 
> > > > Trond Myklebust
> > > > CTO, Hammerspace Inc
> > > > 4984 El Camino Real, Suite 208
> > > > Los Altos, CA 94022
> > > > ​
> > > > www.hammer.space
> > 
> > -- 
> > Trond Myklebust
> > Linux NFS client maintainer, Hammerspace
> > trond.myklebust@hammerspace.com
NeilBrown Nov. 16, 2020, 6:08 a.m. UTC | #8
On Mon, Nov 16 2020, Trond Myklebust wrote:

> On Mon, 2020-11-16 at 16:12 +1100, NeilBrown wrote:
>> On Mon, Nov 16 2020, Trond Myklebust wrote:
>> 
>> > On Mon, 2020-11-16 at 16:00 +1100, NeilBrown wrote:
>> > > On Mon, Nov 16 2020, Trond Myklebust wrote:
>> > > 
>> > > > On Mon, 2020-11-16 at 15:43 +1100, NeilBrown wrote:
>> > > > > On Mon, Nov 16 2020, Trond Myklebust wrote:
>> > > > > 
>> > > > > > On Mon, 2020-11-16 at 13:59 +1100, NeilBrown wrote:
>> > > > > > > 
>> > > > > > > Prior to commit 5ceb9d7fdaaf ("NFS: Refactor
>> > > > > > > nfs_lookup_revalidate()")
>> > > > > > > and error from nfs_lookup_verify_inode() other than -
>> > > > > > > ESTALE
>> > > > > > > would
>> > > > > > > result
>> > > > > > > in nfs_lookup_revalidate() returning that error code (-
>> > > > > > > ESTALE
>> > > > > > > is
>> > > > > > > mapped
>> > > > > > > to zero).
>> > > > > > > Since that commit, all errors result in zero being
>> > > > > > > returned.
>> > > > > > > 
>> > > > > > > When nfs_lookup_revalidate() returns zero, the dentry is
>> > > > > > > invalidated
>> > > > > > > and, significantly, if the dentry is a directory that is
>> > > > > > > mounted
>> > > > > > > on,
>> > > > > > > that mountpoint is lost.
>> > > > > > > 
>> > > > > > > If you:
>> > > > > > >  - mount an NFS filesystem which contains a directory
>> > > > > > >  - mount something (e.g. tmpfs) on that directory
>> > > > > > >  - use iptables (or scissors) to block traffic to the
>> > > > > > > server
>> > > > > > >  - ls -l the-mounted-on-directory
>> > > > > > >  - interrupt the 'ls -l'
>> > > > > > > you will find that the directory has been unmounted.
>> > > > > > > 
>> > > > > > > This can be fixed by returning the actual error code from
>> > > > > > > nfs_lookup_verify_inode() rather then zero (except for -
>> > > > > > > ESTALE).
>> > > > > > > 
>> > > > > > > Fixes: 5ceb9d7fdaaf ("NFS: Refactor
>> > > > > > > nfs_lookup_revalidate()")
>> > > > > > > Signed-off-by: NeilBrown <neilb@suse.de>
>> > > > > > > ---
>> > > > > > >  fs/nfs/dir.c | 8 +++++---
>> > > > > > >  1 file changed, 5 insertions(+), 3 deletions(-)
>> > > > > > > 
>> > > > > > > diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
>> > > > > > > index cb52db9a0cfb..d24acf556e9e 100644
>> > > > > > > --- a/fs/nfs/dir.c
>> > > > > > > +++ b/fs/nfs/dir.c
>> > > > > > > @@ -1350,7 +1350,7 @@ nfs_do_lookup_revalidate(struct
>> > > > > > > inode
>> > > > > > > *dir,
>> > > > > > > struct dentry *dentry,
>> > > > > > >                          unsigned int flags)
>> > > > > > >  {
>> > > > > > >         struct inode *inode;
>> > > > > > > -       int error;
>> > > > > > > +       int error = 0;
>> > > > > > >  
>> > > > > > >         nfs_inc_stats(dir, NFSIOS_DENTRYREVALIDATE);
>> > > > > > >         inode = d_inode(dentry);
>> > > > > > > @@ -1372,8 +1372,10 @@ nfs_do_lookup_revalidate(struct
>> > > > > > > inode
>> > > > > > > *dir,
>> > > > > > > struct dentry *dentry,
>> > > > > > >             nfs_check_verifier(dir, dentry, flags &
>> > > > > > > LOOKUP_RCU))
>> > > > > > > {
>> > > > > > >                 error = nfs_lookup_verify_inode(inode,
>> > > > > > > flags);
>> > > > > > >                 if (error) {
>> > > > > > > -                       if (error == -ESTALE)
>> > > > > > > +                       if (error == -ESTALE) {
>> > > > > > >                                 nfs_zap_caches(dir);
>> > > > > > > +                               error = 0;
>> > > > > > > +                       }
>> > > > > > >                         goto out_bad;
>> > > > > > >                 }
>> > > > > > >                 nfs_advise_use_readdirplus(dir);
>> > > > > > > @@ -1395,7 +1397,7 @@ nfs_do_lookup_revalidate(struct
>> > > > > > > inode
>> > > > > > > *dir,
>> > > > > > > struct dentry *dentry,
>> > > > > > >  out_bad:
>> > > > > > >         if (flags & LOOKUP_RCU)
>> > > > > > >                 return -ECHILD;
>> > > > > > > -       return nfs_lookup_revalidate_done(dir, dentry,
>> > > > > > > inode,
>> > > > > > > 0);
>> > > > > > > +       return nfs_lookup_revalidate_done(dir, dentry,
>> > > > > > > inode,
>> > > > > > > error);
>> > > > > > 
>> > > > > > Which errors do we actually need to return here? As far as
>> > > > > > I
>> > > > > > can
>> > > > > > tell,
>> > > > > > the only errors that nfs_lookup_verify_inode() is supposed
>> > > > > > to
>> > > > > > return is
>> > > > > > ENOMEM, ESTALE, ECHILD, and possibly EIO or ETiMEDOUT.
>> > > > > > 
>> > > > > > Why would it be better to return those errors rather than
>> > > > > > just
>> > > > > > a 0
>> > > > > > when
>> > > > > > we need to invalidate the inode, particularly since we
>> > > > > > already
>> > > > > > have
>> > > > > > a
>> > > > > > special case in nfs_lookup_revalidate_done() when the
>> > > > > > dentry is
>> > > > > > root?
>> > > > > 
>> > > > > ERESTARTSYS is the error that easily causes problems.
>> > > > > 
>> > > > > Returning 0 causes d_invalidate() to be called which is quite
>> > > > > heavy
>> > > > > handed in mountpoints.
>> > > > 
>> > > > My point is that it shouldn't get returned for mountpoints. See
>> > > > nfs_lookup_revalidate_done().
>> > > 
>> > > nfs_lookup_revalidate_done() only checks IS_ROOT(), and while
>> > > many
>> > > mountpoints are IS_ROOT(), not all are (--bind easily makes
>> > > others).
>> > > 
>> > > But that isn't even really relevant here.  The dentry being
>> > > revalidated
>> > > is the underlying directory - that something else is mounted on.
>> > > step_into() which follows mount points is called in
>> > > walk_component()
>> > > *after* lookup_fast or lookup_slow which will have revalidated
>> > > the
>> > > dentry.
>> > 
>> > So then why is it not sufficient to just add a check for
>> > d_mountpoint()? This is a revalidation, not a new lookup.
>> > 
>> 
>> I guess you could do that.
>> But why would you want to call d_invalidate() just because a signal
>> was
>> received, or a memory allocation failed?
>
> Why would I care about the error return from nfs_lookup_verify_inode()?

?? Why wouldn't you?  What am I missing?

> This is a revalidation, and so sometimes the error returned is not
> transient, but is persistent (e.g. EIO/ETIMEDOUT if the server is
> down). In those cases, I still want to be able to do things like
> unmount the filesystem.

Why do you think that a server being down is a persistent error?
Servers come back up.

I don't have a particular opinion on how EIO and ETIMEDOUT should be
handled as I don't have a clear idea of what underlying issue produces
them.

I do know that ESTALE means the directory doesn't exist any more, so the
dentry should be invalidated, whether it is mounted on or not.
I do know that ERESTARTSYS means that a fatal signal was received by the
current process, so it is inappropriate to invalidate the dentry.

So I'm certain that we need to be able to handle different error codes
differently.  Maybe EIO should be treated the same as ESTALE.  Probably
ENOMEM should be handled like ERESTARTSYS....

NeilBrown


>
>> 
>> NeilBrown
>> 
>> 
>> > > 
>> > > NeilBrown
>> > > 
>> > > 
>> > > > 
>> > > > > So it is only reasonable to return 0 when we have unambiguous
>> > > > > confirmation from the server that the object no longer
>> > > > > exists. 
>> > > > > ESTALE
>> > > > > is unambiguous. EIO might be unambiguous.  ERESTARTSYS,
>> > > > > ENOMEM,
>> > > > > ETIMEDOUT are transient and don't justify d_invalidate()
>> > > > > being
>> > > > > called.
>> > > > > 
>> > > > > (BTW, Commit cc89684c9a26 ("NFS: only invalidate dentrys that
>> > > > > are
>> > > > > clearly invalid.")
>> > > > >  fixed much the same bug 3 years ago).
>> > > > >  
>> > > > > Thanks,
>> > > > > NeilBrown
>> > > > > 
>> > > > > 
>> > > > > > 
>> > > > > > >  }
>> > > > > > >  
>> > > > > > >  static int
>> > > > > > 
>> > > > > > -- 
>> > > > > > Trond Myklebust
>> > > > > > Linux NFS client maintainer, Hammerspace
>> > > > > > trond.myklebust@hammerspace.com
>> > > > 
>> > > > -- 
>> > > > Trond Myklebust
>> > > > CTO, Hammerspace Inc
>> > > > 4984 El Camino Real, Suite 208
>> > > > Los Altos, CA 94022
>> > > > ​
>> > > > www.hammer.space
>> > 
>> > -- 
>> > Trond Myklebust
>> > Linux NFS client maintainer, Hammerspace
>> > trond.myklebust@hammerspace.com
>
> -- 
> Trond Myklebust
> CTO, Hammerspace Inc
> 4984 El Camino Real, Suite 208
> Los Altos, CA 94022
> ​
> www.hammer.space
diff mbox series

Patch

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index cb52db9a0cfb..d24acf556e9e 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -1350,7 +1350,7 @@  nfs_do_lookup_revalidate(struct inode *dir, struct dentry *dentry,
 			 unsigned int flags)
 {
 	struct inode *inode;
-	int error;
+	int error = 0;
 
 	nfs_inc_stats(dir, NFSIOS_DENTRYREVALIDATE);
 	inode = d_inode(dentry);
@@ -1372,8 +1372,10 @@  nfs_do_lookup_revalidate(struct inode *dir, struct dentry *dentry,
 	    nfs_check_verifier(dir, dentry, flags & LOOKUP_RCU)) {
 		error = nfs_lookup_verify_inode(inode, flags);
 		if (error) {
-			if (error == -ESTALE)
+			if (error == -ESTALE) {
 				nfs_zap_caches(dir);
+				error = 0;
+			}
 			goto out_bad;
 		}
 		nfs_advise_use_readdirplus(dir);
@@ -1395,7 +1397,7 @@  nfs_do_lookup_revalidate(struct inode *dir, struct dentry *dentry,
 out_bad:
 	if (flags & LOOKUP_RCU)
 		return -ECHILD;
-	return nfs_lookup_revalidate_done(dir, dentry, inode, 0);
+	return nfs_lookup_revalidate_done(dir, dentry, inode, error);
 }
 
 static int