Message ID | 20130731230051.GB31859@gmail.com (mailing list archive)
State | New, archived
On Wed, 2013-07-31 at 18:00 -0500, Quentin Barnes wrote:
> > Quite frankly, all I care about in a situation like this is that the
> > client doesn't Oops.
>
> Certainly, and his patch does do that, but it's also pointing out
> there's another bug running around.  And once you fix that bug, the
> original patch is no longer needed.
>
> > If a server is really this utterly broken, then there is no way we can
> > fix it on the client, and we're not even going to try.
>
> Of course.  But you also don't want to unnecessarily leave the
> client with an invalid inode that's not properly flagged and
> possibly leave another bug unfixed.

Why is the invalid inode not being flagged a problem that needs to be
solved?
How does the patch that you've proposed change matters for the end user
or application?

--
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 31 Jul 2013 18:00:51 -0500
Quentin Barnes <qbarnes@gmail.com> wrote:

> > Quite frankly, all I care about in a situation like this is that the
> > client doesn't Oops.
>
> Certainly, and his patch does do that, but it's also pointing out
> there's another bug running around.  And once you fix that bug, the
> original patch is no longer needed.
>
> > If a server is really this utterly broken, then there is no way we can
> > fix it on the client, and we're not even going to try.
>
> Of course.  But you also don't want to unnecessarily leave the
> client with an invalid inode that's not properly flagged and
> possibly leave another bug unfixed.
>
> Here is an example patch that I feel better addresses the problem:
>
>
> commit 2d6b411eea04ae4398707b584b8d9e552606aaf7
> Author: Quentin Barnes <qbarnes@yahoo-inc.com>
> Date:   Wed Jul 31 17:50:35 2013 -0500
>
>     Have nfs_refresh_inode_locked() ensure that it doesn't return without
>     flagging invalid inodes (ones that don't match its fattr type).
>
>     nfs_refresh_inode() already does this, so we need to check the return
>     status from nfs_check_inode_attributes() before returning from
>     nfs_refresh_inode_locked().
>
>     Once this hole is plugged, there will be no invalid inode references
>     returned by nfs_fhget(), so no need to check in nfs_find_actor().
>
> diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
> index af6e806..d2263a5 100644
> --- a/fs/nfs/inode.c
> +++ b/fs/nfs/inode.c
> @@ -244,8 +244,6 @@ nfs_find_actor(struct inode *inode, void *opaque)
>
>  	if (NFS_FILEID(inode) != fattr->fileid)
>  		return 0;
> -	if ((S_IFMT & inode->i_mode) != (S_IFMT & fattr->mode))
> -		return 0;
>  	if (nfs_compare_fh(NFS_FH(inode), fh))
>  		return 0;
>  	if (is_bad_inode(inode) || NFS_STALE(inode))
> @@ -1269,9 +1267,16 @@ static int nfs_inode_attrs_need_update(const struct inode *inode, const struct n
>
>  static int nfs_refresh_inode_locked(struct inode *inode, struct nfs_fattr *fattr)
>  {
> +	int status;
> +
>  	if (nfs_inode_attrs_need_update(inode, fattr))
>  		return nfs_update_inode(inode, fattr);
> -	return nfs_check_inode_attributes(inode, fattr);
> +
> +	status = nfs_check_inode_attributes(inode, fattr);
> +	if (status)
> +		nfs_invalidate_inode(inode);
> +
> +	return status;
>  }
>
>  /**

I don't think that's going to fix the problem.

The issue is that the i_fops are set when the inode is in I_NEW state,
and are never changed.  An inode is only I_NEW when it's first created.
In this case, we did an atomic open against a filename and got back
attributes that trick the code into matching the new dentry up with an
inode that we previously found to be a directory.

The patch above doesn't address that.  It just adds an extra cache
invalidation, which won't help this case.  The ops are still set the
same way, and the teardown of the atomic_open will still end up hitting
the panic.
On Wed, Jul 31, 2013 at 7:46 PM, Jeff Layton <jlayton@redhat.com> wrote:
> On Wed, 31 Jul 2013 18:00:51 -0500
> Quentin Barnes <qbarnes@gmail.com> wrote:
>
>> > Quite frankly, all I care about in a situation like this is that the
>> > client doesn't Oops.
>>
>> Certainly, and his patch does do that, but it's also pointing out
>> there's another bug running around.  And once you fix that bug, the
>> original patch is no longer needed.
>>
>> > If a server is really this utterly broken, then there is no way we can
>> > fix it on the client, and we're not even going to try.
>>
>> Of course.  But you also don't want to unnecessarily leave the
>> client with an invalid inode that's not properly flagged and
>> possibly leave another bug unfixed.
>>
>> Here is an example patch that I feel better addresses the problem:
>>
>>
>> commit 2d6b411eea04ae4398707b584b8d9e552606aaf7
>> Author: Quentin Barnes <qbarnes@yahoo-inc.com>
>> Date:   Wed Jul 31 17:50:35 2013 -0500
>>
>>     Have nfs_refresh_inode_locked() ensure that it doesn't return without
>>     flagging invalid inodes (ones that don't match its fattr type).
>>
>>     nfs_refresh_inode() already does this, so we need to check the return
>>     status from nfs_check_inode_attributes() before returning from
>>     nfs_refresh_inode_locked().
>>
>>     Once this hole is plugged, there will be no invalid inode references
>>     returned by nfs_fhget(), so no need to check in nfs_find_actor().
>>
>> diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
>> index af6e806..d2263a5 100644
>> --- a/fs/nfs/inode.c
>> +++ b/fs/nfs/inode.c
>> @@ -244,8 +244,6 @@ nfs_find_actor(struct inode *inode, void *opaque)
>>
>>  	if (NFS_FILEID(inode) != fattr->fileid)
>>  		return 0;
>> -	if ((S_IFMT & inode->i_mode) != (S_IFMT & fattr->mode))
>> -		return 0;
>>  	if (nfs_compare_fh(NFS_FH(inode), fh))
>>  		return 0;
>>  	if (is_bad_inode(inode) || NFS_STALE(inode))
>> @@ -1269,9 +1267,16 @@ static int nfs_inode_attrs_need_update(const struct inode *inode, const struct n
>>
>>  static int nfs_refresh_inode_locked(struct inode *inode, struct nfs_fattr *fattr)
>>  {
>> +	int status;
>> +
>>  	if (nfs_inode_attrs_need_update(inode, fattr))
>>  		return nfs_update_inode(inode, fattr);
>> -	return nfs_check_inode_attributes(inode, fattr);
>> +
>> +	status = nfs_check_inode_attributes(inode, fattr);
>> +	if (status)
>> +		nfs_invalidate_inode(inode);
>> +
>> +	return status;
>>  }
>>
>>  /**
>
> I don't think that's going to fix the problem.
>
> The issue is that the i_fops are set when the inode is in I_NEW state,
> and are never changed. An inode is only I_NEW when it's first created.
> In this case, we did an atomic open against a filename and got back
> attributes that trick the code into matching the new dentry up with an
> inode that we previously found to be a directory.

Oh, yes!  The original reporter mentioned an atomic open.  We should
look at it from that NFSv4 perspective.

(I know I tend to inadvertently focus too much on the NFSv3 paths
since I know those best.  Sigh.)

> The patch above doesn't address that.

I disagree.

Let's take a look at it from the NFSv4 angle --

The NFSv4 atomic open call path is:
nfs4_atomic_open()->nfs4_do_open()->_nfs4_do_open()

In _nfs4_do_open(), it calls nfs4_do_setattr(), which fills in an
fattr structure and then calls nfs_post_op_update_inode() with it to
update the inode.  Then we call nfs_post_op_update_inode()->
nfs_post_op_update_inode_locked()->nfs_refresh_inode_locked().

So, yes, I think even in Benny's case, if we fix
nfs_refresh_inode_locked() to invalidate an inode when
nfs_check_inode_attributes() catches that an NFS server has pulled a
switcheroo (dir to file with the same fh), the approach above would
have caught it when it happened.

> It just adds an extra cache
> invalidation, which won't help this case. The ops are still set the
> same way and the teardown of the atomic_open will still end up hitting
> the panic.

It would help me if you could explain how flagging the inode as
invalid doesn't help here.  If that's the case, then other bugs
might be running around.

> --
> Jeff Layton <jlayton@redhat.com>
On Wed, Jul 31, 2013 at 6:47 PM, Myklebust, Trond
<Trond.Myklebust@netapp.com> wrote:
> On Wed, 2013-07-31 at 18:00 -0500, Quentin Barnes wrote:
>> > Quite frankly, all I care about in a situation like this is that the
>> > client doesn't Oops.
>>
>> Certainly, and his patch does do that, but it's also pointing out
>> there's another bug running around.  And once you fix that bug, the
>> original patch is no longer needed.
>>
>> > If a server is really this utterly broken, then there is no way we can
>> > fix it on the client, and we're not even going to try.
>>
>> Of course.  But you also don't want to unnecessarily leave the
>> client with an invalid inode that's not properly flagged and
>> possibly leave another bug unfixed.
>
> Why is the invalid inode not being flagged a problem that needs to be
> solved?
> How does the patch that you've proposed change matters for the end user
> or application?

Ah, good questions!  Let's see if I can address them to your
satisfaction.

> Why is the invalid inode not being flagged a problem that needs to be
> solved?

It gives nfs_refresh_inode_locked() a consistent behavior.
nfs_refresh_inode_locked() has two paths, depending on whether
nfs_inode_attrs_need_update() happens to be true or not, calling
either nfs_update_inode() or nfs_check_inode_attributes().

If nfs_refresh_inode_locked() happens to go down the nfs_update_inode()
path [which could simply be due to nothing other than timing], the
problem of an NFS server reusing a file handle would be caught, the
inode flagged, and the error bubbled up to the caller.  However,
without the proposed change, if the path taken is to
nfs_check_inode_attributes(), the error it caught is silently ignored
(which I assert led to Benny's reported panic).

> How does the patch that you've proposed change matters for the end user
> or application?

With the original patch, user space could get back either a failure or
a success from a syscall on the file, depending on how or when the nfs
subsystem noticed that the NFS server had reused a file handle.  With
the new patch, user space will consistently always get a failure.

> --
> Trond Myklebust
> Linux NFS client maintainer
>
> NetApp
> Trond.Myklebust@netapp.com
> www.netapp.com
On Fri, 2 Aug 2013 12:58:51 -0500
Quentin Barnes <qbarnes@gmail.com> wrote:

> On Wed, Jul 31, 2013 at 7:46 PM, Jeff Layton <jlayton@redhat.com> wrote:
> > On Wed, 31 Jul 2013 18:00:51 -0500
> > Quentin Barnes <qbarnes@gmail.com> wrote:
> >
> >> > Quite frankly, all I care about in a situation like this is that the
> >> > client doesn't Oops.
> >>
> >> Certainly, and his patch does do that, but it's also pointing out
> >> there's another bug running around.  And once you fix that bug, the
> >> original patch is no longer needed.
> >>
> >> > If a server is really this utterly broken, then there is no way we can
> >> > fix it on the client, and we're not even going to try.
> >>
> >> Of course.  But you also don't want to unnecessarily leave the
> >> client with an invalid inode that's not properly flagged and
> >> possibly leave another bug unfixed.
> >>
> >> Here is an example patch that I feel better addresses the problem:
> >>
> >>
> >> commit 2d6b411eea04ae4398707b584b8d9e552606aaf7
> >> Author: Quentin Barnes <qbarnes@yahoo-inc.com>
> >> Date:   Wed Jul 31 17:50:35 2013 -0500
> >>
> >>     Have nfs_refresh_inode_locked() ensure that it doesn't return without
> >>     flagging invalid inodes (ones that don't match its fattr type).
> >>
> >>     nfs_refresh_inode() already does this, so we need to check the return
> >>     status from nfs_check_inode_attributes() before returning from
> >>     nfs_refresh_inode_locked().
> >>
> >>     Once this hole is plugged, there will be no invalid inode references
> >>     returned by nfs_fhget(), so no need to check in nfs_find_actor().
> >>
> >> diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
> >> index af6e806..d2263a5 100644
> >> --- a/fs/nfs/inode.c
> >> +++ b/fs/nfs/inode.c
> >> @@ -244,8 +244,6 @@ nfs_find_actor(struct inode *inode, void *opaque)
> >>
> >>  	if (NFS_FILEID(inode) != fattr->fileid)
> >>  		return 0;
> >> -	if ((S_IFMT & inode->i_mode) != (S_IFMT & fattr->mode))
> >> -		return 0;
> >>  	if (nfs_compare_fh(NFS_FH(inode), fh))
> >>  		return 0;
> >>  	if (is_bad_inode(inode) || NFS_STALE(inode))
> >> @@ -1269,9 +1267,16 @@ static int nfs_inode_attrs_need_update(const struct inode *inode, const struct n
> >>
> >>  static int nfs_refresh_inode_locked(struct inode *inode, struct nfs_fattr *fattr)
> >>  {
> >> +	int status;
> >> +
> >>  	if (nfs_inode_attrs_need_update(inode, fattr))
> >>  		return nfs_update_inode(inode, fattr);
> >> -	return nfs_check_inode_attributes(inode, fattr);
> >> +
> >> +	status = nfs_check_inode_attributes(inode, fattr);
> >> +	if (status)
> >> +		nfs_invalidate_inode(inode);
> >> +
> >> +	return status;
> >>  }
> >>
> >>  /**
> >
> > I don't think that's going to fix the problem.
> >
> > The issue is that the i_fops are set when the inode is in I_NEW state,
> > and are never changed. An inode is only I_NEW when it's first created.
> > In this case, we did an atomic open against a filename and got back
> > attributes that trick the code into matching the new dentry up with an
> > inode that we previously found to be a directory.
>
> Oh, yes!  The original reporter mentioned an atomic open.  We should
> look at it from that NFSv4 perspective.
>
> (I know I tend to inadvertently focus too much on the NFSv3 paths
> since I know those best.  Sigh.)
>
> > The patch above doesn't address that.
>
> I disagree.
>
> Let's take a look at it from the NFSv4 angle --
>
> The NFSv4 atomic open call path is:
> nfs4_atomic_open()->nfs4_do_open()->_nfs4_do_open()
>
> In _nfs4_do_open(), it calls nfs4_do_setattr(), which fills in an
> fattr structure and then calls nfs_post_op_update_inode() with it to
> update the inode.  Then we call nfs_post_op_update_inode()->
> nfs_post_op_update_inode_locked()->nfs_refresh_inode_locked().
>
> So, yes, I think even in Benny's case, if we fix
> nfs_refresh_inode_locked() to invalidate an inode when
> nfs_check_inode_attributes() catches that an NFS server has pulled a
> switcheroo (dir to file with the same fh), the approach above would
> have caught it when it happened.
>

No, it wouldn't have.

First, nfs4_do_setattr only gets called on an O_EXCL open.  The bug is
still triggerable even without that flag set in the open() call.

Second, opening a directory is different from opening a file.  They
create different open contexts, and tearing those down at close time
requires a different set of operations.  If you use the *same* inode in
this situation, then you end up with the wrong set of operations for
tearing down the open context.

That's what causes the panic.  So, marking the inode as stale makes no
difference whatsoever because we still have to tear down the context,
and that gets done using the fops associated with the inode.

> > It just adds an extra cache
> > invalidation, which won't help this case. The ops are still set the
> > same way and the teardown of the atomic_open will still end up hitting
> > the panic.
>
> It would help me if you could explain how flagging the inode as
> invalid doesn't help here.  If that's the case, then other bugs
> might be running around.
>

That just flags the inode as stale and marks its attributes as invalid.
The inode still has the same operations.  Aside from that, I'm not
convinced that marking that inode as stale is correct at all.

A stale inode has a pretty well-defined meaning in NFS.  It means that
the server doesn't know what this filehandle refers to.  That doesn't
seem to be the case here.  We (likely) have a filehandle that got
recycled too quickly, or something along those lines.  That's indicative
of a broken server, but there's nothing to suggest that the server will
return NFSERR_STALE on it in the future.

If you're serious about "fixing" this problem then I suggest:

1) describing what the problem is.  We have a broken server here, so all
bets are off in working with it.  Why should we care whether it's more
"correct" to do what you suggest?  What breakage can we avoid by doing
this differently?

2) considering rolling up a pynfs testcase or something that simulates
such a broken server, so you can verify that the original bug is still
fixed with the patch you propose.
On Fri, Aug 02, 2013 at 02:29:24PM -0400, Jeff Layton wrote:
> On Fri, 2 Aug 2013 12:58:51 -0500
> Quentin Barnes <qbarnes@gmail.com> wrote:
> > On Wed, Jul 31, 2013 at 7:46 PM, Jeff Layton <jlayton@redhat.com> wrote:
> > > The patch above doesn't address that.
> >
> > I disagree.
> >
> > Let's take a look at it from the NFSv4 angle --
> >
> > The NFSv4 atomic open call path is:
> > nfs4_atomic_open()->nfs4_do_open()->_nfs4_do_open()
> >
> > In _nfs4_do_open(), it calls nfs4_do_setattr(), which fills in an
> > fattr structure and then calls nfs_post_op_update_inode() with it to
> > update the inode.  Then we call nfs_post_op_update_inode()->
> > nfs_post_op_update_inode_locked()->nfs_refresh_inode_locked().
> >
> > So, yes, I think even in Benny's case, if we fix
> > nfs_refresh_inode_locked() to invalidate an inode when
> > nfs_check_inode_attributes() catches that an NFS server has pulled a
> > switcheroo (dir to file with the same fh), the approach above would
> > have caught it when it happened.
>
> No, it wouldn't have.
>
> First, nfs4_do_setattr only gets called on an O_EXCL open. That bug is
> still triggerable even without that flag set in the open() call.

I'll have to look at that path.  Thanks for the additional details on
the conditions triggering the problem.

> Second, opening a directory is different from opening a file. They
> create different open contexts and tearing those down at close time
> requires a different set of operations.

Of course.

> If you use the *same* inode in
> this situation, then you end up with the wrong set of operations for
> tearing down the open context.

Yes, I see your point.  I'm not seeing that so far.  I see it only
looking at the inode's parent directory or the superblock, but I'm
still reading.

> That's what causes the panic. So, marking the inode as stale makes no
> difference whatsoever because we still have to tear down the context,
> and that gets done using the fops associated with the inode.

I know what you mean.  I'm thinking it won't get that far, but I know
I'm still trying to get familiar with the NFSv4 paths.

> > > It just adds an extra cache
> > > invalidation, which won't help this case. The ops are still set the
> > > same way and the teardown of the atomic_open will still end up hitting
> > > the panic.
> >
> > It would help me if you could explain how flagging the inode as
> > invalid doesn't help here.  If that's the case, then other bugs
> > might be running around.
>
> That just flags the inode as stale and marks its attributes as invalid.
> The inode still has the same operations. Aside from that, I'm not
> convinced that marking that inode as stale is correct at all.

Marking the inode as stale _may not_ actually be correct, as you
assert.  However, that's exactly how this error is already being
handled when encountered by nfs_update_inode().  Do you consider
nfs_update_inode() inappropriate in this regard as well, or is there a
separate reason why how it handles the error is valid?

> A stale inode has a pretty well-defined meaning in NFS. It means that
> the server doesn't know what this filehandle refers to. That doesn't
> seem to be the case here. We (likely) have a filehandle that got
> recycled too quickly, or something along those lines. That's indicative
> of a broken server, but there's nothing to suggest that the server will
> return NFSERR_STALE on it in the future.

I would think that, of the NFS errors available for this situation,
ESTALE would fit about the best.  Even though the file handle was
reused by the server, the server no longer has the original file that
went with that filehandle, hence that reference is stale.

In nfs_check_inode_attributes(), for this error (reusing a FH), it
returns EIO (instead of ESTALE).  Given those two, I think the latter
fits this weird situation a little better.

> If you're serious about "fixing" this problem then I suggest:
>
> 1) describing what the problem is. We have a broken server here so all
> bets are off in working with it. Why should we care whether it's more
> "correct" to do what you suggest? What breakage can we avoid by doing
> this differently?

I have been somewhat surprised by your comment.  I can certainly
understand, appreciate, and agree with the skepticism of different
approaches; however, if a better approach is available and proven, it
should almost always be desired (unless, for some odd reason, say, it
triggers a massive rewrite).

(As to why I originally flagged this patch, I don't want to trigger a
tangent here, but if you're interested, I can send you an email.)

I do have a question for you about my example patch, though.  If you
completely ignore the portion modifying nfs_find_actor() and only
focus on the modification to nfs_refresh_inode_locked(), do you think
that change fixes an existing problem of how the error return status
from nfs_check_inode_attributes() is currently unchecked (when called
from nfs_fhget())?

> 2) consider rolling up a pynfs testcase or something that simulates
> such a broken server, so you can verify that the original bug is still
> fixed with the patch you propose.

Oh, yes, before a patch such as mine is accepted, I would definitely
have to prove that it still fixes the original panic with hard
evidence.  No argument there.  I would expect nothing less.

To emulate the bug, I was considering writing a systemtap script to
monitor the NFSv4 traffic and then, for a known file handle belonging
to a directory, forge the return of a later call to describe a regular
file.  Would you expect that this approach (with some tweaking as
needed to get the protocol sequence right) would be able to trigger
the original panic on the kernel version it was reported on?

To write the code to trigger the sequence of NFSv4 operations, I would
expect to do, say, a stat() on a directory to trigger a LOOKUP to load
the dentry and inode, then do an open(...,O_CREAT|O_DIRECT) to trigger
an OPEN, intercepting its return information and forging it to look
like a regular file.  Is that the correct sequence?  What variations
should also be tested?

> --
> Jeff Layton <jlayton@redhat.com>

Quentin
diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index af6e806..d2263a5 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -244,8 +244,6 @@ nfs_find_actor(struct inode *inode, void *opaque)
 
 	if (NFS_FILEID(inode) != fattr->fileid)
 		return 0;
-	if ((S_IFMT & inode->i_mode) != (S_IFMT & fattr->mode))
-		return 0;
 	if (nfs_compare_fh(NFS_FH(inode), fh))
 		return 0;
 	if (is_bad_inode(inode) || NFS_STALE(inode))
@@ -1269,9 +1267,16 @@ static int nfs_inode_attrs_need_update(const struct inode *inode, const struct n
 
 static int nfs_refresh_inode_locked(struct inode *inode, struct nfs_fattr *fattr)
 {
+	int status;
+
 	if (nfs_inode_attrs_need_update(inode, fattr))
 		return nfs_update_inode(inode, fattr);
-	return nfs_check_inode_attributes(inode, fattr);
+
+	status = nfs_check_inode_attributes(inode, fattr);
+	if (status)
+		nfs_invalidate_inode(inode);
+
+	return status;
 }
 
 /**