diff mbox

nfs: don't allow nfs_find_actor to match inodes of the wrong type

Message ID 1362013834-29600-1-git-send-email-jlayton@redhat.com (mailing list archive)
State New, archived
Headers show

Commit Message

Jeff Layton Feb. 28, 2013, 1:10 a.m. UTC
Benny Halevy reported the following oops when testing RHEL6:

<7>nfs_update_inode: inode 892950 mode changed, 0040755 to 0100644
<1>BUG: unable to handle kernel NULL pointer dereference at (null)
<1>IP: [<ffffffffa02a52c5>] nfs_closedir+0x15/0x30 [nfs]
<4>PGD 81448a067 PUD 831632067 PMD 0
<4>Oops: 0000 [#1] SMP
<4>last sysfs file: /sys/kernel/mm/redhat_transparent_hugepage/enabled
<4>CPU 6
<4>Modules linked in: fuse bonding 8021q garp ebtable_nat ebtables be2iscsi iscsi_boot_sysfs bnx2i cnic uio cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi softdog bridge stp llc xt_physdev ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 xt_multiport iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 dm_round_robin dm_multipath objlayoutdriver2(U) nfs(U) lockd fscache auth_rpcgss nfs_acl sunrpc vhost_net macvtap macvlan tun kvm_intel kvm be2net igb dca ptp pps_core microcode serio_raw sg iTCO_wdt iTCO_vendor_support i7core_edac edac_core shpchp ext4 mbcache jbd2 sd_mod crc_t10dif ahci dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
<4>
<4>Pid: 6332, comm: dd Not tainted 2.6.32-358.el6.x86_64 #1 HP ProLiant DL170e G6  /ProLiant DL170e G6
<4>RIP: 0010:[<ffffffffa02a52c5>]  [<ffffffffa02a52c5>] nfs_closedir+0x15/0x30 [nfs]
<4>RSP: 0018:ffff88081458bb98  EFLAGS: 00010292
<4>RAX: ffffffffa02a52b0 RBX: 0000000000000000 RCX: 0000000000000003
<4>RDX: ffffffffa02e45a0 RSI: ffff88081440b300 RDI: ffff88082d5f5760
<4>RBP: ffff88081458bba8 R08: 0000000000000000 R09: 0000000000000000
<4>R10: 0000000000000772 R11: 0000000000400004 R12: 0000000040000008
<4>R13: ffff88082d5f5760 R14: ffff88082d6e8800 R15: ffff88082f12d780
<4>FS:  00007f728f37e700(0000) GS:ffff8800456c0000(0000) knlGS:0000000000000000
<4>CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
<4>CR2: 0000000000000000 CR3: 0000000831279000 CR4: 00000000000007e0
<4>DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4>DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
<4>Process dd (pid: 6332, threadinfo ffff88081458a000, task ffff88082fa0e040)
<4>Stack:
<4> 0000000040000008 ffff88081440b300 ffff88081458bbf8 ffffffff81182745
<4><d> ffff88082d5f5760 ffff88082d6e8800 ffff88081458bbf8 ffffffffffffffea
<4><d> ffff88082f12d780 ffff88082d6e8800 ffffffffa02a50a0 ffff88082d5f5760
<4>Call Trace:
<4> [<ffffffff81182745>] __fput+0xf5/0x210
<4> [<ffffffffa02a50a0>] ? do_open+0x0/0x20 [nfs]
<4> [<ffffffff81182885>] fput+0x25/0x30
<4> [<ffffffff8117e23e>] __dentry_open+0x27e/0x360
<4> [<ffffffff811c397a>] ? inotify_d_instantiate+0x2a/0x60
<4> [<ffffffff8117e4b9>] lookup_instantiate_filp+0x69/0x90
<4> [<ffffffffa02a6679>] nfs_intent_set_file+0x59/0x90 [nfs]
<4> [<ffffffffa02a686b>] nfs_atomic_lookup+0x1bb/0x310 [nfs]
<4> [<ffffffff8118e0c2>] __lookup_hash+0x102/0x160
<4> [<ffffffff81225052>] ? selinux_inode_permission+0x72/0xb0
<4> [<ffffffff8118e76a>] lookup_hash+0x3a/0x50
<4> [<ffffffff81192a4b>] do_filp_open+0x2eb/0xdd0
<4> [<ffffffff8104757c>] ? __do_page_fault+0x1ec/0x480
<4> [<ffffffff8119f562>] ? alloc_fd+0x92/0x160
<4> [<ffffffff8117de79>] do_sys_open+0x69/0x140
<4> [<ffffffff811811f6>] ? sys_lseek+0x66/0x80
<4> [<ffffffff8117df90>] sys_open+0x20/0x30
<4> [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
<4>Code: 65 48 8b 04 25 c8 cb 00 00 83 a8 44 e0 ff ff 01 5b 41 5c c9 c3 90 55 48 89 e5 53 48 83 ec 08 0f 1f 44 00 00 48 8b 9e a0 00 00 00 <48> 8b 3b e8 13 0c f7 ff 48 89 df e8 ab 3d ec e0 48 83 c4 08 31
<1>RIP  [<ffffffffa02a52c5>] nfs_closedir+0x15/0x30 [nfs]
<4> RSP <ffff88081458bb98>
<4>CR2: 0000000000000000

I think this is ultimately due to a bug on the server. The client had
previously found a directory dentry. It then later tried to do an atomic
open on a new (regular file) dentry. The attributes it got back had the
same filehandle as the previously found directory inode. It then tried
to put the filp because it failed the aops tests for O_DIRECT opens, and
oopsed here because the ctx was still NULL.

Obviously the root cause here is a server issue, but we can take steps
to mitigate this on the client. When nfs_fhget is called, we always know
what type of inode it is. In the event that there's a broken or
malicious server on the other end of the wire, the client can end up
crashing because the wrong ops are set on it.

Have nfs_find_actor check that the inode type is correct after checking
the fileid. The fileid check should rarely ever match, so it should only
rarely ever get to this check. In the case where we have a broken
server, we may see two different inodes with the same i_ino, but the
client should be able to cope with them without crashing.

This should fix the oops reported here:

    https://bugzilla.redhat.com/show_bug.cgi?id=913660

Reported-by: Benny Halevy <bhalevy@tonian.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
---
 fs/nfs/inode.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Quentin Barnes July 31, 2013, 6:36 p.m. UTC | #1
I was reviewing patches I was integrating into my NFS source base
and this one popped up (commit f6488c9b).  In reviewing it, it would
fix the problem, however, I think it's not the best fix for it.
Catching it in nfs_find_actor() is after the horse has left the
barn so to speak.

I think the problem is in nfs_fhget() when constructing the new
inode (inside the if (inode->i_state & I_NEW) {...}) and should be
addressed there.

The nfs module already has checks to ensure that the expression
"((S_IFMT & inode->i_mode) == (S_IFMT & fattr->mode))" is true
in nfs_update_inode() and nfs_check_inode_attributes().  I think
problem Benny hit was that the equivalent check is missing in
nfs_fhget() when constructing a new inode.  The same check that's
in those other two functions needs to be added to nfs_fhget()'s "if
(inode->i_state & I_NEW) {...}" conditional block to prevent the
problem of an inconsistent fattr/inode state from occurring in the
first place.

I can come up with a patch if you'd like, but I wanted to make
sure my assertions were valid first.  Is removing a check from
nfs_find_actor() and adding the check to nfs_fhget() to prevent the
inconsistency in the first place a better approach?

Quentin

On Wed, Feb 27, 2013 at 7:10 PM, Jeff Layton <jlayton@redhat.com> wrote:
> Benny Halevy reported the following oops when testing RHEL6:
[...]
> This should fix the oops reported here:
>
>     https://bugzilla.redhat.com/show_bug.cgi?id=913660
>
> Reported-by: Benny Halevy <bhalevy@tonian.com>
> Signed-off-by: Jeff Layton <jlayton@redhat.com>
> ---
>  fs/nfs/inode.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
> index 6acc73c..f52c99f 100644
> --- a/fs/nfs/inode.c
> +++ b/fs/nfs/inode.c
> @@ -237,6 +237,8 @@ nfs_find_actor(struct inode *inode, void *opaque)
>
>         if (NFS_FILEID(inode) != fattr->fileid)
>                 return 0;
> +       if ((S_IFMT & inode->i_mode) != (S_IFMT & fattr->mode))
> +               return 0;
>         if (nfs_compare_fh(NFS_FH(inode), fh))
>                 return 0;
>         if (is_bad_inode(inode) || NFS_STALE(inode))
> --
> 1.7.11.7
>
> --
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jeff Layton July 31, 2013, 6:46 p.m. UTC | #2
On Wed, 31 Jul 2013 13:36:21 -0500
Quentin Barnes <qbarnes@gmail.com> wrote:

> I was reviewing patches I was integrating into my NFS source base
> and this one popped up (commit f6488c9b).  In reviewing it, it would
> fix the problem, however, I think it's not the best fix for it.
> Catching it in nfs_find_actor() is after the horse has left the
> barn so to speak.
> 
> I think the problem is in nfs_fhget() when constructing the new
> inode (inside the if (inode->i_state & I_NEW) {...}) and should be
> addressed there.
> 
> The nfs module already has checks to ensure that the expression
> "((S_IFMT & inode->i_mode) == (S_IFMT & fattr->mode))" is true
> in nfs_update_inode() and nfs_check_inode_attributes().  I think
> problem Benny hit was that the equivalent check is missing in
> nfs_fhget() when constructing a new inode.  The same check that's
> in those other two functions needs to be added to nfs_fhget()'s "if
> (inode->i_state & I_NEW) {...}" conditional block to prevent the
> problem of an inconsistent fattr/inode state from occurring in the
> first place.
> 
> I can come up with a patch if you'd like, but I wanted to make
> sure my assertions were valid first.  Is removing a check from
> nfs_find_actor() and adding the check to nfs_fhget() to prevent the
> inconsistency in the first place a better approach?
> 
> Quentin
> 
> On Wed, Feb 27, 2013 at 7:10 PM, Jeff Layton <jlayton@redhat.com> wrote:
> > Benny Halevy reported the following oops when testing RHEL6:
> [...]
> > This should fix the oops reported here:
> >
> >     https://bugzilla.redhat.com/show_bug.cgi?id=913660
> >
> > Reported-by: Benny Halevy <bhalevy@tonian.com>
> > Signed-off-by: Jeff Layton <jlayton@redhat.com>
> > ---
> >  fs/nfs/inode.c | 2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
> > index 6acc73c..f52c99f 100644
> > --- a/fs/nfs/inode.c
> > +++ b/fs/nfs/inode.c
> > @@ -237,6 +237,8 @@ nfs_find_actor(struct inode *inode, void *opaque)
> >
> >         if (NFS_FILEID(inode) != fattr->fileid)
> >                 return 0;
> > +       if ((S_IFMT & inode->i_mode) != (S_IFMT & fattr->mode))
> > +               return 0;
> >         if (nfs_compare_fh(NFS_FH(inode), fh))
> >                 return 0;
> >         if (is_bad_inode(inode) || NFS_STALE(inode))
> > --
> > 1.7.11.7
> >
> > --


I don't think I agree. The whole problem was that we were allowing
iget5_locked to match an existing inode even though (S_IFMT & mode) was
different.

Why would it be preferable to use the same struct inode even though
it's clearly a different one on the server? inodes generally don't
change their type after they've been created.

That said, patches speak louder than words so if you have one to
propose I can certainly take a look. Maybe it'll make sense to me
then...
Quentin Barnes July 31, 2013, 8:26 p.m. UTC | #3
On Wed, Jul 31, 2013 at 1:46 PM, Jeff Layton <jlayton@redhat.com> wrote:
> On Wed, 31 Jul 2013 13:36:21 -0500
> Quentin Barnes <qbarnes@gmail.com> wrote:
>
>> I was reviewing patches I was integrating into my NFS source base
>> and this one popped up (commit f6488c9b).  In reviewing it, it would
>> fix the problem, however, I think it's not the best fix for it.
>> Catching it in nfs_find_actor() is after the horse has left the
>> barn so to speak.
>>
>> I think the problem is in nfs_fhget() when constructing the new
>> inode (inside the if (inode->i_state & I_NEW) {...}) and should be
>> addressed there.
>>
>> The nfs module already has checks to ensure that the expression
>> "((S_IFMT & inode->i_mode) == (S_IFMT & fattr->mode))" is true
>> in nfs_update_inode() and nfs_check_inode_attributes().  I think
>> problem Benny hit was that the equivalent check is missing in
>> nfs_fhget() when constructing a new inode.  The same check that's
>> in those other two functions needs to be added to nfs_fhget()'s "if
>> (inode->i_state & I_NEW) {...}" conditional block to prevent the
>> problem of an inconsistent fattr/inode state from occurring in the
>> first place.
>>
>> I can come up with a patch if you'd like, but I wanted to make
>> sure my assertions were valid first.  Is removing a check from
>> nfs_find_actor() and adding the check to nfs_fhget() to prevent the
>> inconsistency in the first place a better approach?
>>
>> Quentin
>
> I don't think I agree. The whole problem was that we were allowing
> iget5_locked to match an existing inode even though (S_IFMT & mode) was
> different.

I assert that when an inconsistent state happens you don't want
nfs_find_actor() to be the one responsible for noticing it
(actually, ignoring the problem).  You want the existing consistency
and error recovery paths to handle it instead.

So one or two things are going on.  The existing error recovery
paths are inadequate or that there's a hole in the fattr/inode
consistency checking (or possibly both).  Your patch covers over
this underlying bug rather than addressing it and which also lets
the old inode not get flagged as invalid.  I would rather see the
real problem fixed and the old inode get flagged rather than just
ignoring it and creating another new inode.

Before your patch, if because of a server bug reusing a file
handle, nfs_find_actor(), would match and iget5_locked() would
return an inode that matched on its file handle but that didn't
necessarily match the fattr info.

If that inode returned by iget5_locked() was not I_NEW (the else
case), nfs_fhget() would invoke nfs_refresh_inode(), which would
call nfs_refresh_inode_locked().  It calls either nfs_update_inode()
or nfs_check_inode_attributes().  Both of those functions already
check for fattr/inode S_IFMT consistency catching bad states.

> Why would it be preferable to use the same struct inode even though
> it's clearly a different one on the server? inodes generally don't
> change their type after they've been created.

It's not preferable to have the same inode in a different state.
What you don't want is nfs_find_actor() as a side-effect to let the
problem go unnoticed.  You want let the normal, existing error
handling paths deal with it, or if they're not fully dealing with
it, fix them.

Since the I_NEW conditional false path (the else case) in nfs_fhget()
should catch the problem, I had thought that the problem must lay
with the I_NEW conditional true path.  Now I don't think so.

I see now that nfs_check_inode_attributes() is just returning an
error when the fattr and inode states are found to be inconsistent,
but is taking no action on the bad inode's state (i.e. flagging it
as invalid).  And upon return, nfs_fhget() is not doing anything with
the bad return state passed up to nfs_refresh_inode().  Now I'm
thinking that this is problematic hole because it is ignoring the
error state and that let Benny's panic happen.

It doesn't seem to me that it should be nfs_check_inode_attributes()
job to flag the inode as invalid.  That seems outside its role.  I
would think it should be nfs_refresh_inode_locked()'s job, but I'm
still reviewing the code.

Thoughts?

> That said, patches speak louder than words so if you have one to
> propose I can certainly take a look. Maybe it'll make sense to me
> then...

As you can see, I'm still trying to fully piece together the
underlying cause cause of the original bug.

The panic was against RHEL6.4.  Do you want a patch against that
code base or against the linux git?

> Jeff Layton <jlayton@redhat.com>
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Trond Myklebust July 31, 2013, 8:32 p.m. UTC | #4
T24gV2VkLCAyMDEzLTA3LTMxIGF0IDE1OjI2IC0wNTAwLCBRdWVudGluIEJhcm5lcyB3cm90ZToN
Cj4gT24gV2VkLCBKdWwgMzEsIDIwMTMgYXQgMTo0NiBQTSwgSmVmZiBMYXl0b24gPGpsYXl0b25A
cmVkaGF0LmNvbT4gd3JvdGU6DQo+ID4gT24gV2VkLCAzMSBKdWwgMjAxMyAxMzozNjoyMSAtMDUw
MA0KPiA+IFF1ZW50aW4gQmFybmVzIDxxYmFybmVzQGdtYWlsLmNvbT4gd3JvdGU6DQo+ID4NCj4g
Pj4gSSB3YXMgcmV2aWV3aW5nIHBhdGNoZXMgSSB3YXMgaW50ZWdyYXRpbmcgaW50byBteSBORlMg
c291cmNlIGJhc2UNCj4gPj4gYW5kIHRoaXMgb25lIHBvcHBlZCB1cCAoY29tbWl0IGY2NDg4Yzli
KS4gIEluIHJldmlld2luZyBpdCwgaXQgd291bGQNCj4gPj4gZml4IHRoZSBwcm9ibGVtLCBob3dl
dmVyLCBJIHRoaW5rIGl0J3Mgbm90IHRoZSBiZXN0IGZpeCBmb3IgaXQuDQo+ID4+IENhdGNoaW5n
IGl0IGluIG5mc19maW5kX2FjdG9yKCkgaXMgYWZ0ZXIgdGhlIGhvcnNlIGhhcyBsZWZ0IHRoZQ0K
PiA+PiBiYXJuIHNvIHRvIHNwZWFrLg0KPiA+Pg0KPiA+PiBJIHRoaW5rIHRoZSBwcm9ibGVtIGlz
IGluIG5mc19maGdldCgpIHdoZW4gY29uc3RydWN0aW5nIHRoZSBuZXcNCj4gPj4gaW5vZGUgKGlu
c2lkZSB0aGUgaWYgKGlub2RlLT5pX3N0YXRlICYgSV9ORVcpIHsuLi59KSBhbmQgc2hvdWxkIGJl
DQo+ID4+IGFkZHJlc3NlZCB0aGVyZS4NCj4gPj4NCj4gPj4gVGhlIG5mcyBtb2R1bGUgYWxyZWFk
eSBoYXMgY2hlY2tzIHRvIGVuc3VyZSB0aGF0IHRoZSBleHByZXNzaW9uDQo+ID4+ICIoKFNfSUZN
VCAmIGlub2RlLT5pX21vZGUpID09IChTX0lGTVQgJiBmYXR0ci0+bW9kZSkpIiBpcyB0cnVlDQo+
ID4+IGluIG5mc191cGRhdGVfaW5vZGUoKSBhbmQgbmZzX2NoZWNrX2lub2RlX2F0dHJpYnV0ZXMo
KS4gIEkgdGhpbmsNCj4gPj4gcHJvYmxlbSBCZW5ueSBoaXQgd2FzIHRoYXQgdGhlIGVxdWl2YWxl
bnQgY2hlY2sgaXMgbWlzc2luZyBpbg0KPiA+PiBuZnNfZmhnZXQoKSB3aGVuIGNvbnN0cnVjdGlu
ZyBhIG5ldyBpbm9kZS4gIFRoZSBzYW1lIGNoZWNrIHRoYXQncw0KPiA+PiBpbiB0aG9zZSBvdGhl
ciB0d28gZnVuY3Rpb25zIG5lZWRzIHRvIGJlIGFkZGVkIHRvIG5mc19maGdldCgpJ3MgImlmDQo+
ID4+IChpbm9kZS0+aV9zdGF0ZSAmIElfTkVXKSB7Li4ufSIgY29uZGl0aW9uYWwgYmxvY2sgdG8g
cHJldmVudCB0aGUNCj4gPj4gcHJvYmxlbSBvZiBhbiBpbmNvbnNpc3RlbnQgZmF0dHIvaW5vZGUg
c3RhdGUgZnJvbSBvY2N1cnJpbmcgaW4gdGhlDQo+ID4+IGZpcnN0IHBsYWNlLg0KPiA+Pg0KPiA+
PiBJIGNhbiBjb21lIHVwIHdpdGggYSBwYXRjaCBpZiB5b3UnZCBsaWtlLCBidXQgSSB3YW50ZWQg
dG8gbWFrZQ0KPiA+PiBzdXJlIG15IGFzc2VydGlvbnMgd2VyZSB2YWxpZCBmaXJzdC4gIElzIHJl
bW92aW5nIGEgY2hlY2sgZnJvbQ0KPiA+PiBuZnNfZmluZF9hY3RvcigpIGFuZCBhZGRpbmcgdGhl
IGNoZWNrIHRvIG5mc19maGdldCgpIHRvIHByZXZlbnQgdGhlDQo+ID4+IGluY29uc2lzdGVuY3kg
aW4gdGhlIGZpcnN0IHBsYWNlIGEgYmV0dGVyIGFwcHJvYWNoPw0KPiA+Pg0KPiA+PiBRdWVudGlu
DQo+ID4NCj4gPiBJIGRvbid0IHRoaW5rIEkgYWdyZWUuIFRoZSB3aG9sZSBwcm9ibGVtIHdhcyB0
aGF0IHdlIHdlcmUgYWxsb3dpbmcNCj4gPiBpZ2V0NV9sb2NrZWQgdG8gbWF0Y2ggYW4gZXhpc3Rp
bmcgaW5vZGUgZXZlbiB0aG91Z2ggKFNfSUZNVCAmIG1vZGUpIHdhcw0KPiA+IGRpZmZlcmVudC4N
Cj4gDQo+IEkgYXNzZXJ0IHRoYXQgd2hlbiBhbiBpbmNvbnNpc3RlbnQgc3RhdGUgaGFwcGVucyB5
b3UgZG9uJ3Qgd2FudA0KPiBuZnNfZmluZF9hY3RvcigpIHRvIGJlIHRoZSBvbmUgcmVzcG9uc2li
bGUgZm9yIG5vdGljaW5nIGl0DQo+IChhY3R1YWxseSwgaWdub3JpbmcgdGhlIHByb2JsZW0pLiAg
WW91IHdhbnQgdGhlIGV4aXN0aW5nIGNvbnNpc3RlbmN5DQo+IGFuZCBlcnJvciByZWNvdmVyeSBw
YXRocyB0byBoYW5kbGUgaXQgaW5zdGVhZC4NCj4gDQo+IFNvIG9uZSBvciB0d28gdGhpbmdzIGFy
ZSBnb2luZyBvbi4gIFRoZSBleGlzdGluZyBlcnJvciByZWNvdmVyeQ0KPiBwYXRocyBhcmUgaW5h
ZGVxdWF0ZSBvciB0aGF0IHRoZXJlJ3MgYSBob2xlIGluIHRoZSBmYXR0ci9pbm9kZQ0KPiBjb25z
aXN0ZW5jeSBjaGVja2luZyAob3IgcG9zc2libHkgYm90aCkuICBZb3VyIHBhdGNoIGNvdmVycyBv
dmVyDQo+IHRoaXMgdW5kZXJseWluZyBidWcgcmF0aGVyIHRoYW4gYWRkcmVzc2luZyBpdCBhbmQg
d2hpY2ggYWxzbyBsZXRzDQo+IHRoZSBvbGQgaW5vZGUgbm90IGdldCBmbGFnZ2VkIGFzIGludmFs
aWQuICBJIHdvdWxkIHJhdGhlciBzZWUgdGhlDQo+IHJlYWwgcHJvYmxlbSBmaXhlZCBhbmQgdGhl
IG9sZCBpbm9kZSBnZXQgZmxhZ2dlZCByYXRoZXIgdGhhbiBqdXN0DQo+IGlnbm9yaW5nIGl0IGFu
ZCBjcmVhdGluZyBhbm90aGVyIG5ldyBpbm9kZS4NCj4gDQo+IEJlZm9yZSB5b3VyIHBhdGNoLCBp
ZiBiZWNhdXNlIG9mIGEgc2VydmVyIGJ1ZyByZXVzaW5nIGEgZmlsZQ0KPiBoYW5kbGUsIG5mc19m
aW5kX2FjdG9yKCksIHdvdWxkIG1hdGNoIGFuZCBpZ2V0NV9sb2NrZWQoKSB3b3VsZA0KPiByZXR1
cm4gYW4gaW5vZGUgdGhhdCBtYXRjaGVkIG9uIGl0cyBmaWxlIGhhbmRsZSBidXQgdGhhdCBkaWRu
J3QNCj4gbmVjZXNzYXJpbHkgbWF0Y2ggdGhlIGZhdHRyIGluZm8uDQo+IA0KPiBJZiB0aGF0IGlu
b2RlIHJldHVybmVkIGJ5IGlnZXQ1X2xvY2tlZCgpIHdhcyBub3QgSV9ORVcgKHRoZSBlbHNlDQo+
IGNhc2UpLCBuZnNfZmhnZXQoKSB3b3VsZCBpbnZva2UgbmZzX3JlZnJlc2hfaW5vZGUoKSwgd2hp
Y2ggd291bGQNCj4gY2FsbCBuZnNfcmVmcmVzaF9pbm9kZV9sb2NrZWQoKS4gIEl0IGNhbGxzIGVp
dGhlciBuZnNfdXBkYXRlX2lub2RlKCkNCj4gb3IgbmZzX2NoZWNrX2lub2RlX2F0dHJpYnV0ZXMo
KS4gIEJvdGggb2YgdGhvc2UgZnVuY3Rpb25zIGFscmVhZHkNCj4gY2hlY2sgZm9yIGZhdHRyL2lu
b2RlIFNfSUZNVCBjb25zaXN0ZW5jeSBjYXRjaGluZyBiYWQgc3RhdGVzLg0KPiANCj4gPiBXaHkg
d291bGQgaXQgYmUgcHJlZmVyYWJsZSB0byB1c2UgdGhlIHNhbWUgc3RydWN0IGlub2RlIGV2ZW4g
dGhvdWdoDQo+ID4gaXQncyBjbGVhcmx5IGEgZGlmZmVyZW50IG9uZSBvbiB0aGUgc2VydmVyPyBp
bm9kZXMgZ2VuZXJhbGx5IGRvbid0DQo+ID4gY2hhbmdlIHRoZWlyIHR5cGUgYWZ0ZXIgdGhleSd2
ZSBiZWVuIGNyZWF0ZWQuDQo+IA0KPiBJdCdzIG5vdCBwcmVmZXJhYmxlIHRvIGhhdmUgdGhlIHNh
bWUgaW5vZGUgaW4gYSBkaWZmZXJlbnQgc3RhdGUuDQo+IFdoYXQgeW91IGRvbid0IHdhbnQgaXMg
bmZzX2ZpbmRfYWN0b3IoKSBhcyBhIHNpZGUtZWZmZWN0IHRvIGxldCB0aGUNCj4gcHJvYmxlbSBn
byB1bm5vdGljZWQuICBZb3Ugd2FudCBsZXQgdGhlIG5vcm1hbCwgZXhpc3RpbmcgZXJyb3INCj4g
aGFuZGxpbmcgcGF0aHMgZGVhbCB3aXRoIGl0LCBvciBpZiB0aGV5J3JlIG5vdCBmdWxseSBkZWFs
aW5nIHdpdGgNCj4gaXQsIGZpeCB0aGVtLg0KPiANCj4gU2luY2UgdGhlIElfTkVXIGNvbmRpdGlv
bmFsIGZhbHNlIHBhdGggKHRoZSBlbHNlIGNhc2UpIGluIG5mc19maGdldCgpDQo+IHNob3VsZCBj
YXRjaCB0aGUgcHJvYmxlbSwgSSBoYWQgdGhvdWdodCB0aGF0IHRoZSBwcm9ibGVtIG11c3QgbGF5
DQo+IHdpdGggdGhlIElfTkVXIGNvbmRpdGlvbmFsIHRydWUgcGF0aC4gIE5vdyBJIGRvbid0IHRo
aW5rIHNvLg0KPiANCj4gSSBzZWUgbm93IHRoYXQgbmZzX2NoZWNrX2lub2RlX2F0dHJpYnV0ZXMo
KSBpcyBqdXN0IHJldHVybmluZyBhbg0KPiBlcnJvciB3aGVuIHRoZSBmYXR0ciBhbmQgaW5vZGUg
c3RhdGVzIGFyZSBmb3VuZCB0byBiZSBpbmNvbnNpc3RlbnQsDQo+IGJ1dCBpcyB0YWtpbmcgbm8g
YWN0aW9uIG9uIHRoZSBiYWQgaW5vZGUncyBzdGF0ZSAoaS5lLiBmbGFnZ2luZyBpdA0KPiBhcyBp
bnZhbGlkKS4gIEFuZCB1cG9uIHJldHVybiwgbmZzX2ZoZ2V0KCkgaXMgbm90IGRvaW5nIGFueXRo
aW5nIHdpdGgNCj4gdGhlIGJhZCByZXR1cm4gc3RhdGUgcGFzc2VkIHVwIHRvIG5mc19yZWZyZXNo
X2lub2RlKCkuICBOb3cgSSdtDQo+IHRoaW5raW5nIHRoYXQgdGhpcyBpcyBwcm9ibGVtYXRpYyBo
b2xlIGJlY2F1c2UgaXQgaXMgaWdub3JpbmcgdGhlDQo+IGVycm9yIHN0YXRlIGFuZCB0aGF0IGxl
dCBCZW5ueSdzIHBhbmljIGhhcHBlbi4NCj4gDQo+IEl0IGRvZXNuJ3Qgc2VlbSB0byBtZSB0aGF0
IGl0IHNob3VsZCBiZSBuZnNfY2hlY2tfaW5vZGVfYXR0cmlidXRlcygpDQo+IGpvYiB0byBmbGFn
IHRoZSBpbm9kZSBhcyBpbnZhbGlkLiAgVGhhdCBzZWVtcyBvdXRzaWRlIGl0cyByb2xlLiAgSQ0K
PiB3b3VsZCB0aGluayBpdCBzaG91bGQgYmUgbmZzX3JlZnJlc2hfaW5vZGVfbG9ja2VkKCkncyBq
b2IsIGJ1dCBJJ20NCj4gc3RpbGwgcmV2aWV3aW5nIHRoZSBjb2RlLg0KPiANCj4gVGhvdWdodHM/
DQoNClF1aXRlIGZyYW5rbHksIGFsbCBJIGNhcmUgYWJvdXQgaW4gYSBzaXR1YXRpb24gbGlrZSB0
aGlzIGlzIHRoYXQgdGhlDQpjbGllbnQgZG9lc24ndCBPb3BzLg0KDQpJZiBhIHNlcnZlciBpcyBy
ZWFsbHkgdGhpcyB1dHRlcmx5IGJyb2tlbiwgdGhlbiB0aGVyZSBpcyBubyB3YXkgd2UgY2FuDQpm
aXggaXQgb24gdGhlIGNsaWVudCwgYW5kIHdlJ3JlIG5vdCBldmVuIGdvaW5nIHRvIHRyeS4NCg0K
LS0gDQpUcm9uZCBNeWtsZWJ1c3QNCkxpbnV4IE5GUyBjbGllbnQgbWFpbnRhaW5lcg0KDQpOZXRB
cHANClRyb25kLk15a2xlYnVzdEBuZXRhcHAuY29tDQp3d3cubmV0YXBwLmNvbQ0K
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jeff Layton July 31, 2013, 8:38 p.m. UTC | #5
On Wed, 31 Jul 2013 20:32:24 +0000
"Myklebust, Trond" <Trond.Myklebust@netapp.com> wrote:

> On Wed, 2013-07-31 at 15:26 -0500, Quentin Barnes wrote:
> > On Wed, Jul 31, 2013 at 1:46 PM, Jeff Layton <jlayton@redhat.com> wrote:
> > > On Wed, 31 Jul 2013 13:36:21 -0500
> > > Quentin Barnes <qbarnes@gmail.com> wrote:
> > >
> > >> I was reviewing patches I was integrating into my NFS source base
> > >> and this one popped up (commit f6488c9b).  In reviewing it, it would
> > >> fix the problem, however, I think it's not the best fix for it.
> > >> Catching it in nfs_find_actor() is after the horse has left the
> > >> barn so to speak.
> > >>
> > >> I think the problem is in nfs_fhget() when constructing the new
> > >> inode (inside the if (inode->i_state & I_NEW) {...}) and should be
> > >> addressed there.
> > >>
> > >> The nfs module already has checks to ensure that the expression
> > >> "((S_IFMT & inode->i_mode) == (S_IFMT & fattr->mode))" is true
> > >> in nfs_update_inode() and nfs_check_inode_attributes().  I think
> > >> problem Benny hit was that the equivalent check is missing in
> > >> nfs_fhget() when constructing a new inode.  The same check that's
> > >> in those other two functions needs to be added to nfs_fhget()'s "if
> > >> (inode->i_state & I_NEW) {...}" conditional block to prevent the
> > >> problem of an inconsistent fattr/inode state from occurring in the
> > >> first place.
> > >>
> > >> I can come up with a patch if you'd like, but I wanted to make
> > >> sure my assertions were valid first.  Is removing a check from
> > >> nfs_find_actor() and adding the check to nfs_fhget() to prevent the
> > >> inconsistency in the first place a better approach?
> > >>
> > >> Quentin
> > >
> > > I don't think I agree. The whole problem was that we were allowing
> > > iget5_locked to match an existing inode even though (S_IFMT & mode) was
> > > different.
> > 
> > I assert that when an inconsistent state happens you don't want
> > nfs_find_actor() to be the one responsible for noticing it
> > (actually, ignoring the problem).  You want the existing consistency
> > and error recovery paths to handle it instead.
> > 
> > So one or two things are going on.  The existing error recovery
> > paths are inadequate or that there's a hole in the fattr/inode
> > consistency checking (or possibly both).  Your patch covers over
> > this underlying bug rather than addressing it and which also lets
> > the old inode not get flagged as invalid.  I would rather see the
> > real problem fixed and the old inode get flagged rather than just
> > ignoring it and creating another new inode.
> > 
> > Before your patch, if because of a server bug reusing a file
> > handle, nfs_find_actor(), would match and iget5_locked() would
> > return an inode that matched on its file handle but that didn't
> > necessarily match the fattr info.
> > 
> > If that inode returned by iget5_locked() was not I_NEW (the else
> > case), nfs_fhget() would invoke nfs_refresh_inode(), which would
> > call nfs_refresh_inode_locked().  It calls either nfs_update_inode()
> > or nfs_check_inode_attributes().  Both of those functions already
> > check for fattr/inode S_IFMT consistency catching bad states.
> > 
> > > Why would it be preferable to use the same struct inode even though
> > > it's clearly a different one on the server? inodes generally don't
> > > change their type after they've been created.
> > 
> > It's not preferable to have the same inode in a different state.
> > What you don't want is nfs_find_actor() as a side-effect to let the
> > problem go unnoticed.  You want let the normal, existing error
> > handling paths deal with it, or if they're not fully dealing with
> > it, fix them.
> > 
> > Since the I_NEW conditional false path (the else case) in nfs_fhget()
> > should catch the problem, I had thought that the problem must lay
> > with the I_NEW conditional true path.  Now I don't think so.
> > 
> > I see now that nfs_check_inode_attributes() is just returning an
> > error when the fattr and inode states are found to be inconsistent,
> > but is taking no action on the bad inode's state (i.e. flagging it
> > as invalid).  And upon return, nfs_fhget() is not doing anything with
> > the bad return state passed up to nfs_refresh_inode().  Now I'm
> > thinking that this is problematic hole because it is ignoring the
> > error state and that let Benny's panic happen.
> > 
> > It doesn't seem to me that it should be nfs_check_inode_attributes()
> > job to flag the inode as invalid.  That seems outside its role.  I
> > would think it should be nfs_refresh_inode_locked()'s job, but I'm
> > still reviewing the code.
> > 
> > Thoughts?
> 
> Quite frankly, all I care about in a situation like this is that the
> client doesn't Oops.
> 
> If a server is really this utterly broken, then there is no way we can
> fix it on the client, and we're not even going to try.
> 

Agreed. For the record, this problem came about due to a bug in the
server against which Benny was testing. It's not a situation that
should ever occur under normal conditions.
diff mbox

Patch

diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index 6acc73c..f52c99f 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -237,6 +237,8 @@  nfs_find_actor(struct inode *inode, void *opaque)
 
 	if (NFS_FILEID(inode) != fattr->fileid)
 		return 0;
+	if ((S_IFMT & inode->i_mode) != (S_IFMT & fattr->mode))
+		return 0;
 	if (nfs_compare_fh(NFS_FH(inode), fh))
 		return 0;
 	if (is_bad_inode(inode) || NFS_STALE(inode))