Message ID | 599EE56B-46DD-411B-805D-11C2FB5E30A4@redhat.com (mailing list archive)
---|---
State | New, archived
On 4 Nov 2016, at 12:02, Benjamin Coddington wrote:

> Hi Trond,
>
> On 22 Sep 2016, at 13:39, Trond Myklebust wrote:
>
>> Right now, we're only running TEST/FREE_STATEID on the locks if
>> the open stateid recovery succeeds. The protocol requires us to
>> always do so.
>> The fix would be to move the call to TEST/FREE_STATEID and do it
>> before we attempt open recovery.
>>
>> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
>> ---
>> fs/nfs/nfs4proc.c | 92 +++++++++++++++++++++++++++++++------------------------
>> 1 file changed, 52 insertions(+), 40 deletions(-)
>>
>> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
>> index 3c1b8cb7dd95..33ca6d768bd2 100644
>> --- a/fs/nfs/nfs4proc.c
>> +++ b/fs/nfs/nfs4proc.c
>> @@ -2486,6 +2486,45 @@ static void nfs41_check_delegation_stateid(struct nfs4_state *state)
>>  }
>>
>>  /**
>> + * nfs41_check_expired_locks - possibly free a lock stateid
>> + *
>> + * @state: NFSv4 state for an inode
>> + *
>> + * Returns NFS_OK if recovery for this stateid is now finished.
>> + * Otherwise a negative NFS4ERR value is returned.
>> + */
>> +static int nfs41_check_expired_locks(struct nfs4_state *state)
>> +{
>> +	int status, ret = NFS_OK;
>> +	struct nfs4_lock_state *lsp;
>> +	struct nfs_server *server = NFS_SERVER(state->inode);
>> +
>> +	if (!test_bit(LK_STATE_IN_USE, &state->flags))
>> +		goto out;
>> +	list_for_each_entry(lsp, &state->lock_states, ls_locks) {
>> +		if (test_bit(NFS_LOCK_INITIALIZED, &lsp->ls_flags)) {
>
> I bisected a crash to this patch (commit
> c5896fc8622d57b31e1e98545d67d7089019e478).
> I thought the problem was that this patch moved this path out from under
> the nfsi->rwsem in nfs4_reclaim_locks(), so it ends up with a freed
> nfs4_lock_state here.
>
> I can reproduce this with generic/089. Any ideas?

Hit this on v4.9-rc4 this morning. This probably needs to take the
state_lock before traversing the lock_states list. I guess we've never hit
this before because the old path would serialize things somehow - maybe via
taking flc_lock in nfs4_reclaim_locks().. I'll test that fix.

Ben
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On 7 Nov 2016, at 8:09, Benjamin Coddington wrote:

> On 4 Nov 2016, at 12:02, Benjamin Coddington wrote:
>
>> Hi Trond,
>>
>> On 22 Sep 2016, at 13:39, Trond Myklebust wrote:
>>
>>> [...]
>>
>> I bisected a crash to this patch (commit
>> c5896fc8622d57b31e1e98545d67d7089019e478).
>> I thought the problem was that this patch moved this path out from under
>> the nfsi->rwsem in nfs4_reclaim_locks(), so it ends up with a freed
>> nfs4_lock_state here.
>>
>> I can reproduce this with generic/089. Any ideas?
>
> Hit this on v4.9-rc4 this morning. This probably needs to take the
> state_lock before traversing the lock_states list. I guess we've never hit
> this before because the old path would serialize things somehow - maybe via
> taking flc_lock in nfs4_reclaim_locks().. I'll test that fix.

Well, that's no good either as it gets stuck in a NFS4ERR_OLD_STATEID loop
in recovery since we'd want to retry in that case, but taking the state_lock
means we won't use the new stateid. So maybe we need both the state_lock to
protect the list and the rwsem to stop new locks from being sent. I'll try
that now.

Ben
On 7 Nov 2016, at 8:45, Benjamin Coddington wrote:

> On 7 Nov 2016, at 8:09, Benjamin Coddington wrote:
>
>> On 4 Nov 2016, at 12:02, Benjamin Coddington wrote:
>>
>>> [...]
>>
>> Hit this on v4.9-rc4 this morning. This probably needs to take the
>> state_lock before traversing the lock_states list. I guess we've never hit
>> this before because the old path would serialize things somehow - maybe via
>> taking flc_lock in nfs4_reclaim_locks().. I'll test that fix.
>
> Well, that's no good either as it gets stuck in a NFS4ERR_OLD_STATEID loop
> in recovery since we'd want to retry in that case, but taking the state_lock
> means we won't use the new stateid. So maybe we need both the state_lock to
> protect the list and the rwsem to stop new locks from being sent. I'll try
> that now.

That one got much further, but eventually soft-locked up on the state_lock
when what looks like the state manager needed to have a TEST_STATEID wait on
another lock to complete.

The other question here is why are we doing recovery so much? It seems like
we're sending FREE_STATEID unnecessarily on successful DELEGRETURN and
LOCKU, but that shouldn't be triggering state recovery..
On 7 Nov 2016, at 9:59, Trond Myklebust wrote:

> On Nov 7, 2016, at 09:50, Benjamin Coddington <bcodding@redhat.com> wrote:
>
>> [...]
>>
>> The other question here is why are we doing recovery so much? It seems like
>> we're sending FREE_STATEID unnecessarily on successful DELEGRETURN and
>> LOCKU, but that shouldn't be triggering state recovery..
>
> FREE_STATEID is required by the protocol after LOCKU, so that’s
> intentional.

I thought it wasn't required if we do CLOSE, but I checked again and that
wasn't what I was seeing. I am seeing LOCKU, FREE_STATEID, CLOSE, so it is
correct.

> It isn’t needed after DELEGRETURN, so I’m not sure why that is
> happening.

I think it's just falling through the case statement in
nfs4_delegreturn_done(). I'll send a patch for that.

I think the fix here is to manually increment ls_count for the current lock
state, and expect that the lock_states list can be modified while we walk
it. I'll send a patch for that too if it runs through testing OK.

Ben
Hi Ben,

On 11/08/2016 10:10 AM, Benjamin Coddington wrote:

> On 7 Nov 2016, at 9:59, Trond Myklebust wrote:
>
>> [...]
>
> I think the fix here is to manually increment ls_count for the current lock
> state, and expect that the lock_states list can be modified while we walk
> it. I'll send a patch for that too if it runs through testing OK.

Do you have an estimate for when this patch will be ready? I want to
include it in my next bugfix pull request for 4.9.

Thanks,
Anna
Hi Anna,

On 10 Nov 2016, at 10:01, Anna Schumaker wrote:

> Do you have an estimate for when this patch will be ready? I want to
> include it in my next bugfix pull request for 4.9.

I haven't posted because I am still trying to get to the bottom of another
problem where the client gets stuck in a loop sending the same stateid over
and over on NFS4ERR_OLD_STATEID. I want to make sure this problem isn't
caused by this fix -- which I don't think it is, but I'd rather make sure.
If I don't make any progress on this problem by the end of today, I'll post
what I have.

Read on if interested in this new problem:

It looks like racing opens with the same openowner can be returned out of
order by the server, so the client sees a stateid seqid of 2 before 1. Then
a LOCK sent with seqid 1 is endlessly retried if sent while doing recovery.
It's hard to tell if I was able to capture all the moving parts to describe
this problem, though, as it takes a very long time for me to reproduce and
the packet captures were dropping frames. I'm working on manually
reproducing it now.

Ben
On Thu, 2016-11-10 at 10:58 -0500, Benjamin Coddington wrote:

> Hi Anna,
>
> On 10 Nov 2016, at 10:01, Anna Schumaker wrote:
>
>> Do you have an estimate for when this patch will be ready? I want to
>> include it in my next bugfix pull request for 4.9.
>
> I haven't posted because I am still trying to get to the bottom of another
> problem where the client gets stuck in a loop sending the same stateid over
> and over on NFS4ERR_OLD_STATEID. I want to make sure this problem isn't
> caused by this fix -- which I don't think it is, but I'd rather make sure.
> If I don't make any progress on this problem by the end of today, I'll post
> what I have.
>
> Read on if interested in this new problem:
>
> It looks like racing opens with the same openowner can be returned out of
> order by the server, so the client sees stateid seqid of 2 before 1.
> Then a LOCK sent with seqid 1 is endlessly retried if sent while doing
> recovery.

Why is it doing that? nfs4_lock_prepare() should be copying the stateid
from the nfs4_state structure on each iteration.

Cheers
  Trond
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index b5290fd..2f51200 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -2559,9 +2559,13 @@ static int nfs41_check_open_stateid(struct nfs4_state *state)
 static int nfs41_open_expired(struct nfs4_state_owner *sp, struct nfs4_state *state)
 {
 	int status;
+	struct inode *inode = state->inode;
+	struct nfs_inode *nfsi = NFS_I(inode);
 
 	nfs41_check_delegation_stateid(state);
+	down_write(&nfsi->rwsem);
 	status = nfs41_check_expired_locks(state);
+	up_write(&nfsi->rwsem);
 	if (status != NFS_OK)