CLOSE/OPEN race
diff mbox

Message ID 8153C306-D6B6-442B-B565-BC6BAEE9993F@redhat.com
State New
Headers show

Commit Message

Benjamin Coddington Nov. 14, 2016, 2:53 p.m. UTC
On 13 Nov 2016, at 9:47, Trond Myklebust wrote:

> On Sat, 2016-11-12 at 21:56 -0500, Jeff Layton wrote:
>> On Sat, 2016-11-12 at 16:16 -0500, Jeff Layton wrote:
>>>
>>> On Sat, 2016-11-12 at 13:03 -0500, Benjamin Coddington wrote:
>>>>
>>>>
>>>> On 12 Nov 2016, at 11:52, Jeff Layton wrote:
>>>>
>>>>>
>>>>>
>>>>> On Sat, 2016-11-12 at 10:31 -0500, Benjamin Coddington wrote:
>>>>>>
>>>>>>
>>>>>> On 12 Nov 2016, at 7:54, Jeff Layton wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Sat, 2016-11-12 at 06:08 -0500, Benjamin Coddington
>>>>>>> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> I've been seeing the following on a modified version of
>>>>>>>> generic/089
>>>>>>>> that gets the client stuck sending LOCK with
>>>>>>>> NFS4ERR_OLD_STATEID.
>>>>>>>>
>>>>>>>> 1. Client has open stateid A, sends a CLOSE
>>>>>>>> 2. Client sends OPEN with same owner
>>>>>>>> 3. Client sends another OPEN with same owner
>>>>>>>> 4. Client gets a reply to OPEN in 3, stateid is B.2
>>>>>>>> (stateid B
>>>>>>>> sequence 2)
>>>>>>>> 5. Client does LOCK,LOCKU,FREE_STATEID from B.2
>>>>>>>> 6. Client gets a reply to CLOSE in 1
>>>>>>>> 7. Client gets reply to OPEN in 2, stateid is B.1
>>>>>>>> 8. Client sends LOCK with B.1 - OLD_STATEID, now stuck in
>>>>>>>> a loop
>>>>>>>>
>>>>>>>> The CLOSE response in 6 causes us to clear
>>>>>>>> NFS_OPEN_STATE, so that
>>>>>>>> the OPEN
>>>>>>>> response in 7 is able to update the open_stateid even
>>>>>>>> though it has a
>>>>>>>> lower
>>>>>>>> sequence number.
>>>>>>>>
>>>>>>>> I think this case could be handled by never updating the
>>>>>>>> open_stateid
>>>>>>>> if the
>>>>>>>> stateids match but the sequence number of the new state
>>>>>>>> is less than
>>>>>>>> the
>>>>>>>> current open_state.
>>>>>>>>
>>>>>>>
>>>>>>> What kernel is this on?
>>>>>>
>>>>>> On v4.9-rc2 with a couple fixups.  Without them, I can't test
>>>>>> long
>>>>>> enough to
>>>>>> reproduce this race.  I don't think any of those are involved
>>>>>> in this
>>>>>> problem, though.
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Yes, that seems wrong. The client should be picking B.2 for
>>>>>>> the open
>>>>>>> stateid to use. I think that decision of whether to take a
>>>>>>> seqid is
>>>>>>> made
>>>>>>> in nfs_need_update_open_stateid. The logic in there looks
>>>>>>> correct to
>>>>>>> me
>>>>>>> at first glance though.
>>>>>>
>>>>>> nfs_need_update_open_stateid() will return true if
>>>>>> NFS_OPEN_STATE is
>>>>>> unset.
>>>>>> That's the precondition set up by steps 1-6.  Perhaps it
>>>>>> should not
>>>>>> update
>>>>>> the stateid if they match but the sequence number is less,
>>>>>> and still set
>>>>>> NFS_OPEN_STATE once more.  That will fix _this_ case.  Are
>>>>>> there other
>>>>>> cases
>>>>>> where that would be a problem?
>>>>>>
>>>>>> Ben
>>>>>
>>>>> That seems wrong.
>>>>
>>>> I'm not sure what you mean: what seems wrong?
>>>>
>>>
>>> Sorry, it seems wrong that the client would issue the LOCK with B.1
>>> there.
>>>
>>>>
>>>>>
>>>>>
>>>>> The only close was sent in step 1, and that was for a
>>>>> completely different stateid (A rather than B). It seems likely
>>>>> that
>>>>> that is where the bug is.
>>>>
>>>> I'm still not sure what point you're trying to make..
>>>>
>>>> Even though the close was sent in step 1, the response wasn't
>>>> processed
>>>> until step 6..
>>>
>>> Not really a point per-se, I was just saying where I think the bug
>>> might
>>> be...
>>>
>>> When you issue a CLOSE, you issue it vs. a particular stateid
>>> (stateid
>>> "A" in this case). Once the open stateid has been superseded by
>>> "B", the
>>> closing of "A" should have no effect.
>>>
>>> Perhaps nfs_clear_open_stateid needs to check and see whether the
>>> open
>>> stateid has been superseded before doing its thing?
>>>
>>
>> Ok, I see something that might be a problem in this call in
>> nfs4_close_done:
>>
>>        nfs_clear_open_stateid(state, &calldata->arg.stateid,
>>                         res_stateid, 
>> calldata->arg.fmode);
>>
>> Note that we pass two nfs4_stateids to this call. The first is the
>> stateid that got sent in the CLOSE call, and the second is the
>> stateid
>> that came back in the CLOSE response.
>>
>> RFC5661 and RFC7530 both indicate that the stateid in a CLOSE
>> response
>> should be ignored.
>>
>> So, I think a patch like this may be in order. As to whether it will
>> fix this bug, I sort of doubt it, but it might not hurt to test it
>> out?
>>
>> ----------------------8<--------------------------
>>
>> [RFC PATCH] nfs: properly ignore the stateid in a CLOSE response
>>
>> Signed-off-by: Jeff Layton 
>> ---
>>  fs/nfs/nfs4proc.c | 14 +++-----------
>>  1 file changed, 3 insertions(+), 11 deletions(-)
>>
>> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
>> index 7897826d7c51..58413bd0aae2 100644
>> --- a/fs/nfs/nfs4proc.c
>> +++ b/fs/nfs/nfs4proc.c
>> @@ -1451,7 +1451,6 @@ static void
>> nfs_resync_open_stateid_locked(struct nfs4_state *state)
>>  }
>>  
>>  static void nfs_clear_open_stateid_locked(struct nfs4_state *state,
>> -		nfs4_stateid *arg_stateid,
>>  		nfs4_stateid *stateid, fmode_t fmode)
>>  {
>>  	clear_bit(NFS_O_RDWR_STATE, &state->flags);
>> @@ -1467,12 +1466,8 @@ static void
>> nfs_clear_open_stateid_locked(struct nfs4_state *state,
>>  		clear_bit(NFS_O_WRONLY_STATE, &state->flags);
>>  		clear_bit(NFS_OPEN_STATE, &state->flags);
>>  	}
>> -	if (stateid == NULL)
>> -		return;
>>  	/* Handle races with OPEN */
>> -	if (!nfs4_stateid_match_other(arg_stateid, &state-
>>> open_stateid) ||
>> -	    (nfs4_stateid_match_other(stateid, &state->open_stateid)
>> &&
>> -	    !nfs4_stateid_is_newer(stateid, &state->open_stateid)))
>> {
>> +	if (!nfs4_stateid_match_other(stateid, &state-
>>> open_stateid)) {
>
> No. I think what we should be doing here is
>
> 1) if (nfs4_stateid_match_other(arg_stateid, &state->open_stateid) 
> then

You must mean (!nfs4_stateid_match_other(arg_stateid, 
&state->open_stateid)

> just ignore the result and return immediately, because it applies to a
> completely different stateid.
>
> 2) if (nfs4_stateid_match_other(stateid, &state->open_stateid)
> && !nfs4_stateid_is_newer(stateid, &state->open_stateid))) then 
> resync,
> because this was likely an OPEN_DOWNGRADE that has raced with one or
> more OPEN calls.

OK, but these do not help the originally reported race because at the 
time
that the CLOSE response handling does a resync - I presume
nfs_resync_open_stateid_locked() - all the state counters are zero, 
which
bypasses resetting NFS_OPEN_STATE, which, if unset, allows any stateid 
to
update the open_stateid.

I need something like:

         if (!nfs4_stateid_match_other(stateid, &state->open_stateid)) {
                 nfs4_stateid_copy(freeme, &state->open_stateid);
                 nfs_test_and_clear_all_open_stateid(state);

I've got that in testing, and I'll let it run for a day or so.

Another way to simplify CLOSE/OPEN races might be to never reuse an
openowner if we send a close.  Any opens after a close would have to use 
a
new mutated openowner, so the state always proceeds from open to close 
and
never back again.  Any reason not to do that?

Ben
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Trond Myklebust Nov. 14, 2016, 4:29 p.m. UTC | #1
On Mon, 2016-11-14 at 09:53 -0500, Benjamin Coddington wrote:
> On 13 Nov 2016, at 9:47, Trond Myklebust wrote:

> 

> > 

> > On Sat, 2016-11-12 at 21:56 -0500, Jeff Layton wrote:

> > > 

> > > On Sat, 2016-11-12 at 16:16 -0500, Jeff Layton wrote:

> > > > 

> > > > 

> > > > On Sat, 2016-11-12 at 13:03 -0500, Benjamin Coddington wrote:

> > > > > 

> > > > > 

> > > > > 

> > > > > On 12 Nov 2016, at 11:52, Jeff Layton wrote:

> > > > > 

> > > > > > 

> > > > > > 

> > > > > > 

> > > > > > On Sat, 2016-11-12 at 10:31 -0500, Benjamin Coddington

> > > > > > wrote:

> > > > > > > 

> > > > > > > 

> > > > > > > 

> > > > > > > On 12 Nov 2016, at 7:54, Jeff Layton wrote:

> > > > > > > 

> > > > > > > > 

> > > > > > > > 

> > > > > > > > 

> > > > > > > > 

> > > > > > > > On Sat, 2016-11-12 at 06:08 -0500, Benjamin Coddington

> > > > > > > > wrote:

> > > > > > > > > 

> > > > > > > > > 

> > > > > > > > > 

> > > > > > > > > 

> > > > > > > > > I've been seeing the following on a modified version

> > > > > > > > > of

> > > > > > > > > generic/089

> > > > > > > > > that gets the client stuck sending LOCK with

> > > > > > > > > NFS4ERR_OLD_STATEID.

> > > > > > > > > 

> > > > > > > > > 1. Client has open stateid A, sends a CLOSE

> > > > > > > > > 2. Client sends OPEN with same owner

> > > > > > > > > 3. Client sends another OPEN with same owner

> > > > > > > > > 4. Client gets a reply to OPEN in 3, stateid is B.2

> > > > > > > > > (stateid B

> > > > > > > > > sequence 2)

> > > > > > > > > 5. Client does LOCK,LOCKU,FREE_STATEID from B.2

> > > > > > > > > 6. Client gets a reply to CLOSE in 1

> > > > > > > > > 7. Client gets reply to OPEN in 2, stateid is B.1

> > > > > > > > > 8. Client sends LOCK with B.1 - OLD_STATEID, now

> > > > > > > > > stuck in

> > > > > > > > > a loop

> > > > > > > > > 

> > > > > > > > > The CLOSE response in 6 causes us to clear

> > > > > > > > > NFS_OPEN_STATE, so that

> > > > > > > > > the OPEN

> > > > > > > > > response in 7 is able to update the open_stateid even

> > > > > > > > > though it has a

> > > > > > > > > lower

> > > > > > > > > sequence number.

> > > > > > > > > 

> > > > > > > > > I think this case could be handled by never updating

> > > > > > > > > the

> > > > > > > > > open_stateid

> > > > > > > > > if the

> > > > > > > > > stateids match but the sequence number of the new

> > > > > > > > > state

> > > > > > > > > is less than

> > > > > > > > > the

> > > > > > > > > current open_state.

> > > > > > > > > 

> > > > > > > > 

> > > > > > > > What kernel is this on?

> > > > > > > 

> > > > > > > On v4.9-rc2 with a couple fixups.  Without them, I can't

> > > > > > > test

> > > > > > > long

> > > > > > > enough to

> > > > > > > reproduce this race.  I don't think any of those are

> > > > > > > involved

> > > > > > > in this

> > > > > > > problem, though.

> > > > > > > 

> > > > > > > > 

> > > > > > > > 

> > > > > > > > 

> > > > > > > > 

> > > > > > > > Yes, that seems wrong. The client should be picking B.2

> > > > > > > > for

> > > > > > > > the open

> > > > > > > > stateid to use. I think that decision of whether to

> > > > > > > > take a

> > > > > > > > seqid is

> > > > > > > > made

> > > > > > > > in nfs_need_update_open_stateid. The logic in there

> > > > > > > > looks

> > > > > > > > correct to

> > > > > > > > me

> > > > > > > > at first glance though.

> > > > > > > 

> > > > > > > nfs_need_update_open_stateid() will return true if

> > > > > > > NFS_OPEN_STATE is

> > > > > > > unset.

> > > > > > > That's the precondition set up by steps 1-6.  Perhaps it

> > > > > > > should not

> > > > > > > update

> > > > > > > the stateid if they match but the sequence number is

> > > > > > > less,

> > > > > > > and still set

> > > > > > > NFS_OPEN_STATE once more.  That will fix _this_

> > > > > > > case.  Are

> > > > > > > there other

> > > > > > > cases

> > > > > > > where that would be a problem?

> > > > > > > 

> > > > > > > Ben

> > > > > > 

> > > > > > That seems wrong.

> > > > > 

> > > > > I'm not sure what you mean: what seems wrong?

> > > > > 

> > > > 

> > > > Sorry, it seems wrong that the client would issue the LOCK with

> > > > B.1

> > > > there.

> > > > 

> > > > > 

> > > > > 

> > > > > > 

> > > > > > 

> > > > > > 

> > > > > > The only close was sent in step 1, and that was for a

> > > > > > completely different stateid (A rather than B). It seems

> > > > > > likely

> > > > > > that

> > > > > > that is where the bug is.

> > > > > 

> > > > > I'm still not sure what point you're trying to make..

> > > > > 

> > > > > Even though the close was sent in step 1, the response wasn't

> > > > > processed

> > > > > until step 6..

> > > > 

> > > > Not really a point per-se, I was just saying where I think the

> > > > bug

> > > > might

> > > > be...

> > > > 

> > > > When you issue a CLOSE, you issue it vs. a particular stateid

> > > > (stateid

> > > > "A" in this case). Once the open stateid has been superseded by

> > > > "B", the

> > > > closing of "A" should have no effect.

> > > > 

> > > > Perhaps nfs_clear_open_stateid needs to check and see whether

> > > > the

> > > > open

> > > > stateid has been superseded before doing its thing?

> > > > 

> > > 

> > > Ok, I see something that might be a problem in this call in

> > > nfs4_close_done:

> > > 

> > >        nfs_clear_open_stateid(state, &calldata->arg.stateid,

> > >                         res_stateid, 

> > > calldata->arg.fmode);

> > > 

> > > Note that we pass two nfs4_stateids to this call. The first is

> > > the

> > > stateid that got sent in the CLOSE call, and the second is the

> > > stateid

> > > that came back in the CLOSE response.

> > > 

> > > RFC5661 and RFC7530 both indicate that the stateid in a CLOSE

> > > response

> > > should be ignored.

> > > 

> > > So, I think a patch like this may be in order. As to whether it

> > > will

> > > fix this bug, I sort of doubt it, but it might not hurt to test

> > > it

> > > out?

> > > 

> > > ----------------------8<--------------------------

> > > 

> > > [RFC PATCH] nfs: properly ignore the stateid in a CLOSE response

> > > 

> > > Signed-off-by: Jeff Layton 

> > > ---

> > >  fs/nfs/nfs4proc.c | 14 +++-----------

> > >  1 file changed, 3 insertions(+), 11 deletions(-)

> > > 

> > > diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c

> > > index 7897826d7c51..58413bd0aae2 100644

> > > --- a/fs/nfs/nfs4proc.c

> > > +++ b/fs/nfs/nfs4proc.c

> > > @@ -1451,7 +1451,6 @@ static void

> > > nfs_resync_open_stateid_locked(struct nfs4_state *state)

> > >  }

> > >  

> > >  static void nfs_clear_open_stateid_locked(struct nfs4_state

> > > *state,

> > > -		nfs4_stateid *arg_stateid,

> > >  		nfs4_stateid *stateid, fmode_t fmode)

> > >  {

> > >  	clear_bit(NFS_O_RDWR_STATE, &state->flags);

> > > @@ -1467,12 +1466,8 @@ static void

> > > nfs_clear_open_stateid_locked(struct nfs4_state *state,

> > >  		clear_bit(NFS_O_WRONLY_STATE, &state->flags);

> > >  		clear_bit(NFS_OPEN_STATE, &state->flags);

> > >  	}

> > > -	if (stateid == NULL)

> > > -		return;

> > >  	/* Handle races with OPEN */

> > > -	if (!nfs4_stateid_match_other(arg_stateid, &state-

> > > > 

> > > > open_stateid) ||

> > > -	    (nfs4_stateid_match_other(stateid, &state-

> > > >open_stateid)

> > > &&

> > > -	    !nfs4_stateid_is_newer(stateid, &state-

> > > >open_stateid)))

> > > {

> > > +	if (!nfs4_stateid_match_other(stateid, &state-

> > > > 

> > > > open_stateid)) {

> > 

> > No. I think what we should be doing here is

> > 

> > 1) if (nfs4_stateid_match_other(arg_stateid, &state->open_stateid) 

> > then

> 

> You must mean (!nfs4_stateid_match_other(arg_stateid, 

> &state->open_stateid)


Sorry. Yes.

> > 

> > just ignore the result and return immediately, because it applies

> > to a

> > completely different stateid.

> > 

> > 2) if (nfs4_stateid_match_other(stateid, &state->open_stateid)

> > && !nfs4_stateid_is_newer(stateid, &state->open_stateid))) then 

> > resync,

> > because this was likely an OPEN_DOWNGRADE that has raced with one

> > or

> > more OPEN calls.

> 

> OK, but these do not help the originally reported race because at

> the 

> time

> that the CLOSE response handling does a resync - I presume

> nfs_resync_open_stateid_locked() - all the state counters are zero, 

> which

> bypasses resetting NFS_OPEN_STATE, which, if unset, allows any

> stateid 

> to

> update the open_stateid.



Please see the 2 patches I just posted. They should fix the problem.
Benjamin Coddington Nov. 14, 2016, 6:40 p.m. UTC | #2
On 14 Nov 2016, at 11:29, Trond Myklebust wrote:
>
> Please see the 2 patches I just posted. They should fix the 
> problem.N‹§²æìr¸›yúèšØb²X¬¶Ç§vØ^–)Þº{.nÇ+‰·¥Š{±û"žØ^n‡r¡ö¦zˁëh™¨è­Ú&¢ø®G«éh®(­éšŽŠÝ¢j"ú¶m§ÿïêäz¹Þ–Šàþf£¢·hšˆ§~ˆmš

Seems like they do.  Thanks!  I'll leave them spinning in that test
overnight to really give them a workout.

Ben

Patch
diff mbox

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index a230764b7d07..c9aa166c45aa 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -1425,8 +1425,13 @@  static void 
nfs_test_and_clear_all_open_stateid(struct nfs4_state *state)
  static bool nfs_need_update_open_stateid(struct nfs4_state *state,
                 const nfs4_stateid *stateid, nfs4_stateid *freeme)
  {
-       if (test_and_set_bit(NFS_OPEN_STATE, &state->flags) == 0)
+       if (test_and_set_bit(NFS_OPEN_STATE, &state->flags) == 0) {
+               if (nfs4_stateid_match_other(stateid, 
&state->open_stateid) &&
+                       !nfs4_stateid_is_newer(stateid, 
&state->open_stateid))
+                       return false;
                 return true;
+       }
+