ocfs2/dlm: return DLM_CANCELGRANT if the lock is on granted list and the operation is canceled

Message ID 98f0e80c-9c13-dbb6-047c-b40e100082b1@huawei.com
State New

Commit Message

w00426999 Dec. 3, 2018, 12:20 p.m. UTC
In dlm_move_lockres_to_recovery_list(), if the lock
is on the granted queue and cancel_pending is set,
the code hits a BUG_ON. I think this BUG_ON is
meaningless, so this patch removes it. A scenario
that triggers it is given below.

At the beginning, Node 1 is the master and has NL lock,
Node 2 has PR lock, Node 3 has PR lock too.

Node 1          Node 2          Node 3
             want to get EX lock.

                             want to get EX lock.

Node 3's PR lock
blocks Node 2's EX
request, so send
Node 3 a BAST.

                             receive BAST from
                             Node 1. downconvert
                             thread begin to
                             cancel PR to EX conversion.
                             In dlmunlock_common function,
                             downconvert thread has set
                             lock->cancel_pending,
                             but did not enter
                             dlm_send_remote_unlock_request
                             function.

             Node 2 dies because
             its host is powered down.

In the recovery process,
clean up the locks
related to Node 2,
then finish Node 3's
PR to EX request and
send Node 3 an AST.

                             receive AST from Node 1.
                             change lock level to EX,
                             move lock to granted list.

Node 1 dies because
the host is powered down.

                             In dlm_move_lockres_to_recovery_list,
                             the lock is on the
                             granted queue and cancel_pending
                             is set: BUG_ON triggers.

But even after removing this BUG_ON, the process
can hit a second BUG in ocfs2_unlock_ast().
Here is a scenario that triggers the second BUG
in ocfs2_unlock_ast:

At the beginning, Node 1 is the master and has NL lock,
Node 2 has PR lock, Node 3 has PR lock too.

Node 1          Node 2          Node 3
             want to get EX lock.

                             want to get EX lock.

Node 3's PR lock
blocks Node 2's EX
request, so send
Node 3 a BAST.

                             receive BAST from
                             Node 1. downconvert
                             thread begin to
                             cancel PR to EX conversion.
                             In dlmunlock_common function,
                             downconvert thread has released
                             lock->spinlock and res->spinlock,
                             but did not enter
                             dlm_send_remote_unlock_request
                             function.

             Node 2 dies because
             its host is powered down.

In the recovery process,
clean up the locks
related to Node 2,
then finish Node 3's
PR to EX request and
send Node 3 an AST.

                             receive AST from Node 1.
                             change lock level to EX,
                             move lock to granted list,
                             set lockres->l_unlock_action
                             as OCFS2_UNLOCK_INVALID
                             in ocfs2_locking_ast function.

Node 1 dies because
its host is powered down.

                             Node 3 realizes that Node 1
                             is dead and removes Node 1
                             from domain_map. The downconvert
                             thread gets DLM_NORMAL from
                             dlm_send_remote_unlock_request
                             and sets *call_ast to 1.
                             Then the downconvert thread hits
                             the BUG in ocfs2_unlock_ast().

To avoid the second BUG, dlmunlock_common() should
return DLM_CANCELGRANT if the lock is on the granted
list and the operation is canceled.
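The core of the fix can be modeled outside the kernel as a plain C sketch. This is a simplified stand-in for the kernel's structures and for `dlm_lock_on_list()`/`dlmunlock_common()`, not the actual ocfs2 code:

```c
#include <assert.h>	/* for the self-checks when exercising this sketch */
#include <stddef.h>

/* Simplified stand-ins for the kernel's dlm types. */
enum dlm_status { DLM_NORMAL, DLM_CANCELGRANT };
#define LKM_CANCEL 0x1

struct lock {
	struct lock *next;	/* toy singly linked queue, unlike the kernel's list_head */
	int id;
};

struct lockres {
	struct lock *granted;		/* head of the granted queue */
	struct lock *converting;	/* head of the converting queue */
};

/* Models dlm_lock_on_list(): is this lock on the given queue? */
static int lock_on_list(const struct lock *head, const struct lock *lock)
{
	for (const struct lock *l = head; l; l = l->next)
		if (l == lock)
			return 1;
	return 0;
}

/*
 * Models the branch the patch adds to dlmunlock_common(): a cancel that
 * finds the lock already on the granted queue (because the AST won the
 * race) reports DLM_CANCELGRANT instead of proceeding as a normal unlock.
 */
static enum dlm_status cancel_status(const struct lockres *res,
				     const struct lock *lock, int flags)
{
	if ((flags & LKM_CANCEL) && lock_on_list(res->granted, lock))
		return DLM_CANCELGRANT;
	return DLM_NORMAL;
}
```

With DLM_CANCELGRANT reported at this point, the caller is never asked to fire the unlock AST for a lock that was in fact granted, which is what the second BUG scenario above trips over.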

Signed-off-by: Jian Wang <wangjian161@huawei.com>
Reviewed-by: Yiwen Jiang <jiangyiwen@huawei.com>
---
  fs/ocfs2/dlm/dlmrecovery.c | 1 -
  fs/ocfs2/dlm/dlmunlock.c   | 5 +++++
  2 files changed, 5 insertions(+), 1 deletion(-)

Comments

Changwei Ge Dec. 5, 2018, 1:49 a.m. UTC | #1
Hi Jian,

I suppose you can't just remove the BUG_ON() check.
If you remove it, the code below violates the original logic:

'''
2141                                 dlm_commit_pending_cancel(res, lock);
'''

More importantly, a *locking cancel* must be issued against an in-progress *locking conversion*,
so it makes sense to check that this lock is on the converting list.

So I have to NACK this patch.

Thanks,
Changwei


On 2018/12/3 20:23, wangjian wrote:
> In the dlm_move_lockres_to_recovery_list function, if the lock
> is in the granted queue and cancel_pending is set, it will
> encounter a BUG. I think this is a meaningless BUG,
> so be prepared to remove it. A scenario that causes
> this BUG will be given below.
> 
> At the beginning, Node 1 is the master and has NL lock,
> Node 2 has PR lock, Node 3 has PR lock too.
> 
> Node 1          Node 2          Node 3
>              want to get EX lock.
> 
>                              want to get EX lock.
> 
> Node 3 hinder
> Node 2 to get
> EX lock, send
> Node 3 a BAST.
> 
>                              receive BAST from
>                              Node 1. downconvert
>                              thread begin to
>                              cancel PR to EX conversion.
>                              In dlmunlock_common function,
>                              downconvert thread has set
>                              lock->cancel_pending,
>                              but did not enter
>                              dlm_send_remote_unlock_request
>                              function.
> 
>              Node2 dies because
>              the host is powered down.
> 
> In recovery process,
> clean the lock that
> related to Node2.
> then finish Node 3
> PR to EX request.
> give Node 3 a AST.
> 
>                              receive AST from Node 1.
>                              change lock level to EX,
>                              move lock to granted list.
> 
> Node1 dies because
> the host is powered down.
> 
>                              In dlm_move_lockres_to_recovery_list
>                              function. the lock is in the
>                              granted queue and cancel_pending
>                              is set. BUG_ON.
> 
> But after clearing this BUG, process will encounter
> the second BUG in the ocfs2_unlock_ast function.
> Here is a scenario that will cause the second BUG
> in ocfs2_unlock_ast as follows:
> 
> At the beginning, Node 1 is the master and has NL lock,
> Node 2 has PR lock, Node 3 has PR lock too.
> 
> Node 1          Node 2          Node 3
>              want to get EX lock.
> 
>                              want to get EX lock.
> 
> Node 3 hinder
> Node 2 to get
> EX lock, send
> Node 3 a BAST.
> 
>                              receive BAST from
>                              Node 1. downconvert
>                              thread begin to
>                              cancel PR to EX conversion.
>                              In dlmunlock_common function,
>                              downconvert thread has released
>                              lock->spinlock and res->spinlock,
>                              but did not enter
>                              dlm_send_remote_unlock_request
>                              function.
> 
>              Node2 dies because
>              the host is powered down.
> 
> In recovery process,
> clean the lock that
> related to Node2.
> then finish Node 3
> PR to EX request.
> give Node 3 a AST.
> 
>                              receive AST from Node 1.
>                              change lock level to EX,
>                              move lock to granted list,
>                              set lockres->l_unlock_action
>                              as OCFS2_UNLOCK_INVALID
>                              in ocfs2_locking_ast function.
> 
> Node2 dies because
> the host is powered down.
> 
>                              Node 3 realize that Node 1
>                              is dead, remove Node 1 from
>                              domain_map. downconvert thread
>                              get DLM_NORMAL from
>                              dlm_send_remote_unlock_request
>                              function and set *call_ast as 1.
>                              Then downconvert thread meet
>                              BUG in ocfs2_unlock_ast function.
> 
> To avoid meet the second BUG, function dlmunlock_common shuold
> return DLM_CANCELGRANT if the lock is on granted list and
> the operation is canceled.
> 
> Signed-off-by: Jian Wang<wangjian161@huawei.com>
> Reviewed-by: Yiwen Jiang<jiangyiwen@huawei.com>
> ---
>   fs/ocfs2/dlm/dlmrecovery.c | 1 -
>   fs/ocfs2/dlm/dlmunlock.c   | 5 +++++
>   2 files changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm/dlmrecovery.c
> index 802636d..7489652 100644
> --- a/fs/ocfs2/dlm/dlmrecovery.c
> +++ b/fs/ocfs2/dlm/dlmrecovery.c
> @@ -2134,7 +2134,6 @@ void dlm_move_lockres_to_recovery_list(struct dlm_ctxt *dlm,
>   				 * if this had completed successfully
>   				 * before sending this lock state to the
>   				 * new master */
> -				BUG_ON(i != DLM_CONVERTING_LIST);
>   				mlog(0, "node died with cancel pending "
>   				     "on %.*s. move back to granted list.\n",
>   				     res->lockname.len, res->lockname.name);
> diff --git a/fs/ocfs2/dlm/dlmunlock.c b/fs/ocfs2/dlm/dlmunlock.c
> index 63d701c..505bb6c 100644
> --- a/fs/ocfs2/dlm/dlmunlock.c
> +++ b/fs/ocfs2/dlm/dlmunlock.c
> @@ -183,6 +183,11 @@ static enum dlm_status dlmunlock_common(struct dlm_ctxt *dlm,
>   							flags, owner);
>   		spin_lock(&res->spinlock);
>   		spin_lock(&lock->spinlock);
> +
> +		if ((flags & LKM_CANCEL) &&
> +				dlm_lock_on_list(&res->granted, lock))
> +			status = DLM_CANCELGRANT;
> +
>   		/* if the master told us the lock was already granted,
>   		 * let the ast handle all of these actions */
>   		if (status == DLM_CANCELGRANT) {
> -- 
> 1.8.3.1
>
w00426999 Dec. 6, 2018, 12:05 p.m. UTC | #2
Hi Changwei,

I don't fully agree with your point of view. In my opinion, the lock cancellation process and the lock
conversion process are asynchronous. We can't guarantee that the lock is still on the converting list
while the conversion is being canceled; otherwise this BUG would never happen.
So I think this BUG_ON is meaningless.

Thanks,
Jian

On 12/5/2018 9:49 AM, Changwei Ge wrote:
> Hi Jian,
>
> I suppose you can't just remove the BUG_ON() check.
> If you remove it, below code violates the original logic.
>
> '''
> 2141                                 dlm_commit_pending_cancel(res, lock);
> '''
>
> What's more important is *locking cancel* must be against a *locking conversion* progress.
> So it makes sense to check if this lock is on converting list.
>
> So I have to NACK to this patch.
>
> Thanks,
> Changwei
>
>
> On 2018/12/3 20:23, wangjian wrote:
>> In the dlm_move_lockres_to_recovery_list function, if the lock
>> is in the granted queue and cancel_pending is set, it will
>> encounter a BUG. I think this is a meaningless BUG,
>> so be prepared to remove it. A scenario that causes
>> this BUG will be given below.
>>
>> At the beginning, Node 1 is the master and has NL lock,
>> Node 2 has PR lock, Node 3 has PR lock too.
>>
>> Node 1          Node 2          Node 3
>>               want to get EX lock.
>>
>>                               want to get EX lock.
>>
>> Node 3 hinder
>> Node 2 to get
>> EX lock, send
>> Node 3 a BAST.
>>
>>                               receive BAST from
>>                               Node 1. downconvert
>>                               thread begin to
>>                               cancel PR to EX conversion.
>>                               In dlmunlock_common function,
>>                               downconvert thread has set
>>                               lock->cancel_pending,
>>                               but did not enter
>>                               dlm_send_remote_unlock_request
>>                               function.
>>
>>               Node2 dies because
>>               the host is powered down.
>>
>> In recovery process,
>> clean the lock that
>> related to Node2.
>> then finish Node 3
>> PR to EX request.
>> give Node 3 a AST.
>>
>>                               receive AST from Node 1.
>>                               change lock level to EX,
>>                               move lock to granted list.
>>
>> Node1 dies because
>> the host is powered down.
>>
>>                               In dlm_move_lockres_to_recovery_list
>>                               function. the lock is in the
>>                               granted queue and cancel_pending
>>                               is set. BUG_ON.
>>
>> But after clearing this BUG, process will encounter
>> the second BUG in the ocfs2_unlock_ast function.
>> Here is a scenario that will cause the second BUG
>> in ocfs2_unlock_ast as follows:
>>
>> At the beginning, Node 1 is the master and has NL lock,
>> Node 2 has PR lock, Node 3 has PR lock too.
>>
>> Node 1          Node 2          Node 3
>>               want to get EX lock.
>>
>>                               want to get EX lock.
>>
>> Node 3 hinder
>> Node 2 to get
>> EX lock, send
>> Node 3 a BAST.
>>
>>                               receive BAST from
>>                               Node 1. downconvert
>>                               thread begin to
>>                               cancel PR to EX conversion.
>>                               In dlmunlock_common function,
>>                               downconvert thread has released
>>                               lock->spinlock and res->spinlock,
>>                               but did not enter
>>                               dlm_send_remote_unlock_request
>>                               function.
>>
>>               Node2 dies because
>>               the host is powered down.
>>
>> In recovery process,
>> clean the lock that
>> related to Node2.
>> then finish Node 3
>> PR to EX request.
>> give Node 3 a AST.
>>
>>                               receive AST from Node 1.
>>                               change lock level to EX,
>>                               move lock to granted list,
>>                               set lockres->l_unlock_action
>>                               as OCFS2_UNLOCK_INVALID
>>                               in ocfs2_locking_ast function.
>>
>> Node2 dies because
>> the host is powered down.
>>
>>                               Node 3 realize that Node 1
>>                               is dead, remove Node 1 from
>>                               domain_map. downconvert thread
>>                               get DLM_NORMAL from
>>                               dlm_send_remote_unlock_request
>>                               function and set *call_ast as 1.
>>                               Then downconvert thread meet
>>                               BUG in ocfs2_unlock_ast function.
>>
>> To avoid meet the second BUG, function dlmunlock_common shuold
>> return DLM_CANCELGRANT if the lock is on granted list and
>> the operation is canceled.
>>
>> Signed-off-by: Jian Wang<wangjian161@huawei.com>
>> Reviewed-by: Yiwen Jiang<jiangyiwen@huawei.com>
>> ---
>>    fs/ocfs2/dlm/dlmrecovery.c | 1 -
>>    fs/ocfs2/dlm/dlmunlock.c   | 5 +++++
>>    2 files changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm/dlmrecovery.c
>> index 802636d..7489652 100644
>> --- a/fs/ocfs2/dlm/dlmrecovery.c
>> +++ b/fs/ocfs2/dlm/dlmrecovery.c
>> @@ -2134,7 +2134,6 @@ void dlm_move_lockres_to_recovery_list(struct dlm_ctxt *dlm,
>>    				 * if this had completed successfully
>>    				 * before sending this lock state to the
>>    				 * new master */
>> -				BUG_ON(i != DLM_CONVERTING_LIST);
>>    				mlog(0, "node died with cancel pending "
>>    				     "on %.*s. move back to granted list.\n",
>>    				     res->lockname.len, res->lockname.name);
>> diff --git a/fs/ocfs2/dlm/dlmunlock.c b/fs/ocfs2/dlm/dlmunlock.c
>> index 63d701c..505bb6c 100644
>> --- a/fs/ocfs2/dlm/dlmunlock.c
>> +++ b/fs/ocfs2/dlm/dlmunlock.c
>> @@ -183,6 +183,11 @@ static enum dlm_status dlmunlock_common(struct dlm_ctxt *dlm,
>>    							flags, owner);
>>    		spin_lock(&res->spinlock);
>>    		spin_lock(&lock->spinlock);
>> +
>> +		if ((flags & LKM_CANCEL) &&
>> +				dlm_lock_on_list(&res->granted, lock))
>> +			status = DLM_CANCELGRANT;
>> +
>>    		/* if the master told us the lock was already granted,
>>    		 * let the ast handle all of these actions */
>>    		if (status == DLM_CANCELGRANT) {
>> -- 
>> 1.8.3.1
>>
> .
>
Changwei Ge Dec. 7, 2018, 3:12 a.m. UTC | #3
Hi Jian,

I suppose that the situation you described truly exists,
but the way you fix the issue is not one I favor.

If you remove the BUG check, why do you still call dlm_commit_pending_cancel() to
move the lock back to granted, no matter whether it is on the converting list or not?

So how about keeping the BUG check in dlm_move_lockres_to_recovery_list()?
If the locking _ast_ comes back very fast, just check ::cancel_pending in dlm_proxy_ast_handler() and clear it.
Then, with the logic checking whether the lock is on the granted list (in dlmunlock_common(), more or less like your current method),
we can easily tell whether the cancellation succeeded.

That complies with the original dlm design, which I think is better and easier for maintainers to understand.
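The alternative proposed here (clear ::cancel_pending when the granting AST arrives, so recovery's invariant keeps holding) can be sketched in a toy user-space model. This is a hypothetical model of the proposal, not existing kernel code; all names are stand-ins:

```c
#include <assert.h>	/* for the self-checks when exercising this sketch */

/* Toy lock state: which queue the lock is on, and whether a cancel is pending. */
struct lock {
	int cancel_pending;
	int on_granted;		/* 1: granted queue, 0: converting queue */
};

/*
 * Models the proposed step in dlm_proxy_ast_handler(): when the locking
 * AST grants the conversion while a cancel is still pending, the handler
 * clears cancel_pending, because the cancel has lost the race.
 */
static void proxy_ast_grant(struct lock *lock)
{
	lock->on_granted = 1;
	lock->cancel_pending = 0;
}

/*
 * Models the invariant that BUG_ON(i != DLM_CONVERTING_LIST) asserts in
 * dlm_move_lockres_to_recovery_list(): a cancel may only be pending while
 * the lock is still converting.
 */
static int recovery_invariant_holds(const struct lock *lock)
{
	return !(lock->cancel_pending && lock->on_granted);
}
```

Under this model the BUG_ON can stay, because the AST path restores the invariant before recovery ever inspects the lock.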

Thanks,
Changwei

On 2018/12/6 20:06, wangjian wrote:
> Hi Changwei,
> 
> I don't fully agree with your point of view. In my opinion, the lock cancellation process and the lock
> conversion process are asynchronous. We can't guarantee that the lock must be in the convert list
> during the lock conversion process, otherwise this BUG will not happen.
> So, I think this is a meaningless BUG.
> 
> Thanks,
> Jian
> 
> On 12/5/2018 9:49 AM, Changwei Ge wrote:
>> Hi Jian,
>>
>> I suppose you can't just remove the BUG_ON() check.
>> If you remove it, below code violates the original logic.
>>
>> '''
>> 2141                                 dlm_commit_pending_cancel(res, lock);
>> '''
>>
>> What's more important is *locking cancel* must be against a *locking conversion* progress.
>> So it makes sense to check if this lock is on converting list.
>>
>> So I have to NACK to this patch.
>>
>> Thanks,
>> Changwei
>>
>>
>> On 2018/12/3 20:23, wangjian wrote:
>>> In the dlm_move_lockres_to_recovery_list function, if the lock
>>> is in the granted queue and cancel_pending is set, it will
>>> encounter a BUG. I think this is a meaningless BUG,
>>> so be prepared to remove it. A scenario that causes
>>> this BUG will be given below.
>>>
>>> At the beginning, Node 1 is the master and has NL lock,
>>> Node 2 has PR lock, Node 3 has PR lock too.
>>>
>>> Node 1          Node 2          Node 3
>>>               want to get EX lock.
>>>
>>>                               want to get EX lock.
>>>
>>> Node 3 hinder
>>> Node 2 to get
>>> EX lock, send
>>> Node 3 a BAST.
>>>
>>>                               receive BAST from
>>>                               Node 1. downconvert
>>>                               thread begin to
>>>                               cancel PR to EX conversion.
>>>                               In dlmunlock_common function,
>>>                               downconvert thread has set
>>>                               lock->cancel_pending,
>>>                               but did not enter
>>>                               dlm_send_remote_unlock_request
>>>                               function.
>>>
>>>               Node2 dies because
>>>               the host is powered down.
>>>
>>> In recovery process,
>>> clean the lock that
>>> related to Node2.
>>> then finish Node 3
>>> PR to EX request.
>>> give Node 3 a AST.
>>>
>>>                               receive AST from Node 1.
>>>                               change lock level to EX,
>>>                               move lock to granted list.
>>>
>>> Node1 dies because
>>> the host is powered down.
>>>
>>>                               In dlm_move_lockres_to_recovery_list
>>>                               function. the lock is in the
>>>                               granted queue and cancel_pending
>>>                               is set. BUG_ON.
>>>
>>> But after clearing this BUG, process will encounter
>>> the second BUG in the ocfs2_unlock_ast function.
>>> Here is a scenario that will cause the second BUG
>>> in ocfs2_unlock_ast as follows:
>>>
>>> At the beginning, Node 1 is the master and has NL lock,
>>> Node 2 has PR lock, Node 3 has PR lock too.
>>>
>>> Node 1          Node 2          Node 3
>>>               want to get EX lock.
>>>
>>>                               want to get EX lock.
>>>
>>> Node 3 hinder
>>> Node 2 to get
>>> EX lock, send
>>> Node 3 a BAST.
>>>
>>>                               receive BAST from
>>>                               Node 1. downconvert
>>>                               thread begin to
>>>                               cancel PR to EX conversion.
>>>                               In dlmunlock_common function,
>>>                               downconvert thread has released
>>>                               lock->spinlock and res->spinlock,
>>>                               but did not enter
>>>                               dlm_send_remote_unlock_request
>>>                               function.
>>>
>>>               Node2 dies because
>>>               the host is powered down.
>>>
>>> In recovery process,
>>> clean the lock that
>>> related to Node2.
>>> then finish Node 3
>>> PR to EX request.
>>> give Node 3 a AST.
>>>
>>>                               receive AST from Node 1.
>>>                               change lock level to EX,
>>>                               move lock to granted list,
>>>                               set lockres->l_unlock_action
>>>                               as OCFS2_UNLOCK_INVALID
>>>                               in ocfs2_locking_ast function.
>>>
>>> Node2 dies because
>>> the host is powered down.
>>>
>>>                               Node 3 realize that Node 1
>>>                               is dead, remove Node 1 from
>>>                               domain_map. downconvert thread
>>>                               get DLM_NORMAL from
>>>                               dlm_send_remote_unlock_request
>>>                               function and set *call_ast as 1.
>>>                               Then downconvert thread meet
>>>                               BUG in ocfs2_unlock_ast function.
>>>
>>> To avoid meet the second BUG, function dlmunlock_common shuold
>>> return DLM_CANCELGRANT if the lock is on granted list and
>>> the operation is canceled.
>>>
>>> Signed-off-by: Jian Wang<wangjian161@huawei.com>
>>> Reviewed-by: Yiwen Jiang<jiangyiwen@huawei.com>
>>> ---
>>>    fs/ocfs2/dlm/dlmrecovery.c | 1 -
>>>    fs/ocfs2/dlm/dlmunlock.c   | 5 +++++
>>>    2 files changed, 5 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm/dlmrecovery.c
>>> index 802636d..7489652 100644
>>> --- a/fs/ocfs2/dlm/dlmrecovery.c
>>> +++ b/fs/ocfs2/dlm/dlmrecovery.c
>>> @@ -2134,7 +2134,6 @@ void dlm_move_lockres_to_recovery_list(struct dlm_ctxt *dlm,
>>>    				 * if this had completed successfully
>>>    				 * before sending this lock state to the
>>>    				 * new master */
>>> -				BUG_ON(i != DLM_CONVERTING_LIST);
>>>    				mlog(0, "node died with cancel pending "
>>>    				     "on %.*s. move back to granted list.\n",
>>>    				     res->lockname.len, res->lockname.name);
>>> diff --git a/fs/ocfs2/dlm/dlmunlock.c b/fs/ocfs2/dlm/dlmunlock.c
>>> index 63d701c..505bb6c 100644
>>> --- a/fs/ocfs2/dlm/dlmunlock.c
>>> +++ b/fs/ocfs2/dlm/dlmunlock.c
>>> @@ -183,6 +183,11 @@ static enum dlm_status dlmunlock_common(struct dlm_ctxt *dlm,
>>>    							flags, owner);
>>>    		spin_lock(&res->spinlock);
>>>    		spin_lock(&lock->spinlock);
>>> +
>>> +		if ((flags & LKM_CANCEL) &&
>>> +				dlm_lock_on_list(&res->granted, lock))
>>> +			status = DLM_CANCELGRANT;
>>> +
>>>    		/* if the master told us the lock was already granted,
>>>    		 * let the ast handle all of these actions */
>>>    		if (status == DLM_CANCELGRANT) {
>>> -- 
>>> 1.8.3.1
>>>
>> .
>>
w00426999 Dec. 8, 2018, 10:05 a.m. UTC | #4
Hi Changwei,

I understand your idea, but we should be aware that the cancel-convert process and
the other processes (the AST delivery process, the recovery process) are asynchronous.
For example, according to your idea, we would check whether the lock is on the grant queue
before calling the dlm_send_remote_unlock_request function, in the dlm_proxy_ast_handler function,
and then decide whether to clear cancel_pending. But if the AST has not arrived at that point,
the check passes and cancel_pending is not cleared; if the AST then arrives immediately
afterwards, we still hit the bug. I personally think that for asynchronous processes we cannot
guarantee the execution speed of each process. All we can do is avoid the state that trips the BUG.

As for your question ("If you remove the BUG check why you still call dlm_commit_pending_cancel()
to move the lock back to grant on matter it's on converting list or not?"):
I think we should first check whether the lock is on the grant queue
(at this point dlm->spinlock and res->spinlock are already held), and then decide whether to call
the dlm_commit_pending_cancel function.

Thanks,
Jian

On 12/7/2018 11:12 AM, Changwei Ge wrote:
> Hi Jian,
>
> I suppose that the situation you described truly exists.
> But the way you fix the issue is not in my favor.
>
> If you remove the BUG check why you still call dlm_commit_pending_cancel() to
> move the lock back to grant on matter it's on converting list or not?
>
> So how about keeping the BUG check in dlm_move_lockres_to_recovery_list().
> If the locking _ast_ comes back very fast just check ::cancel_pending in dlm_proxy_ast_handler() and clear it.
> Then with the logic checking if the lock is on grant list (in dlmunlock_common() more less like your current method)
> or not we can easily tell if the cancellation succeeds or not.
>
> That complies the original dlm design, which I think is better and easier for maintainers to understand.
>
> Thanks,
> Changwei
>
> On 2018/12/6 20:06, wangjian wrote:
>> Hi Changwei,
>>
>> I don't fully agree with your point of view. In my opinion, the lock cancellation process and the lock
>> conversion process are asynchronous. We can't guarantee that the lock must be in the convert list
>> during the lock conversion process, otherwise this BUG will not happen.
>> So, I think this is a meaningless BUG.
>>
>> Thanks,
>> Jian
>>
>> On 12/5/2018 9:49 AM, Changwei Ge wrote:
>>> Hi Jian,
>>>
>>> I suppose you can't just remove the BUG_ON() check.
>>> If you remove it, below code violates the original logic.
>>>
>>> '''
>>> 2141                                 dlm_commit_pending_cancel(res, lock);
>>> '''
>>>
>>> What's more important is *locking cancel* must be against a *locking conversion* progress.
>>> So it makes sense to check if this lock is on converting list.
>>>
>>> So I have to NACK to this patch.
>>>
>>> Thanks,
>>> Changwei
>>>
>>>
>>> On 2018/12/3 20:23, wangjian wrote:
>>>> In the dlm_move_lockres_to_recovery_list function, if the lock
>>>> is in the granted queue and cancel_pending is set, it will
>>>> encounter a BUG. I think this is a meaningless BUG,
>>>> so be prepared to remove it. A scenario that causes
>>>> this BUG will be given below.
>>>>
>>>> At the beginning, Node 1 is the master and has NL lock,
>>>> Node 2 has PR lock, Node 3 has PR lock too.
>>>>
>>>> Node 1          Node 2          Node 3
>>>>                want to get EX lock.
>>>>
>>>>                                want to get EX lock.
>>>>
>>>> Node 3 hinder
>>>> Node 2 to get
>>>> EX lock, send
>>>> Node 3 a BAST.
>>>>
>>>>                                receive BAST from
>>>>                                Node 1. downconvert
>>>>                                thread begin to
>>>>                                cancel PR to EX conversion.
>>>>                                In dlmunlock_common function,
>>>>                                downconvert thread has set
>>>>                                lock->cancel_pending,
>>>>                                but did not enter
>>>>                                dlm_send_remote_unlock_request
>>>>                                function.
>>>>
>>>>                Node2 dies because
>>>>                the host is powered down.
>>>>
>>>> In recovery process,
>>>> clean the lock that
>>>> related to Node2.
>>>> then finish Node 3
>>>> PR to EX request.
>>>> give Node 3 a AST.
>>>>
>>>>                                receive AST from Node 1.
>>>>                                change lock level to EX,
>>>>                                move lock to granted list.
>>>>
>>>> Node1 dies because
>>>> the host is powered down.
>>>>
>>>>                                In dlm_move_lockres_to_recovery_list
>>>>                                function. the lock is in the
>>>>                                granted queue and cancel_pending
>>>>                                is set. BUG_ON.
>>>>
>>>> But after clearing this BUG, process will encounter
>>>> the second BUG in the ocfs2_unlock_ast function.
>>>> Here is a scenario that will cause the second BUG
>>>> in ocfs2_unlock_ast as follows:
>>>>
>>>> At the beginning, Node 1 is the master and has NL lock,
>>>> Node 2 has PR lock, Node 3 has PR lock too.
>>>>
>>>> Node 1          Node 2          Node 3
>>>>                want to get EX lock.
>>>>
>>>>                                want to get EX lock.
>>>>
>>>> Node 3 hinder
>>>> Node 2 to get
>>>> EX lock, send
>>>> Node 3 a BAST.
>>>>
>>>>                                receive BAST from
>>>>                                Node 1. downconvert
>>>>                                thread begin to
>>>>                                cancel PR to EX conversion.
>>>>                                In dlmunlock_common function,
>>>>                                downconvert thread has released
>>>>                                lock->spinlock and res->spinlock,
>>>>                                but did not enter
>>>>                                dlm_send_remote_unlock_request
>>>>                                function.
>>>>
>>>>                Node2 dies because
>>>>                the host is powered down.
>>>>
>>>> In recovery process,
>>>> clean the lock that
>>>> related to Node2.
>>>> then finish Node 3
>>>> PR to EX request.
>>>> give Node 3 a AST.
>>>>
>>>>                                receive AST from Node 1.
>>>>                                change lock level to EX,
>>>>                                move lock to granted list,
>>>>                                set lockres->l_unlock_action
>>>>                                as OCFS2_UNLOCK_INVALID
>>>>                                in ocfs2_locking_ast function.
>>>>
>>>> Node2 dies because
>>>> the host is powered down.
>>>>
>>>>                                Node 3 realize that Node 1
>>>>                                is dead, remove Node 1 from
>>>>                                domain_map. downconvert thread
>>>>                                get DLM_NORMAL from
>>>>                                dlm_send_remote_unlock_request
>>>>                                function and set *call_ast as 1.
>>>>                                Then downconvert thread meet
>>>>                                BUG in ocfs2_unlock_ast function.
>>>>
>>>> To avoid meet the second BUG, function dlmunlock_common shuold
>>>> return DLM_CANCELGRANT if the lock is on granted list and
>>>> the operation is canceled.
>>>>
>>>> Signed-off-by: Jian Wang<wangjian161@huawei.com>
>>>> Reviewed-by: Yiwen Jiang<jiangyiwen@huawei.com>
>>>> ---
>>>>     fs/ocfs2/dlm/dlmrecovery.c | 1 -
>>>>     fs/ocfs2/dlm/dlmunlock.c   | 5 +++++
>>>>     2 files changed, 5 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm/dlmrecovery.c
>>>> index 802636d..7489652 100644
>>>> --- a/fs/ocfs2/dlm/dlmrecovery.c
>>>> +++ b/fs/ocfs2/dlm/dlmrecovery.c
>>>> @@ -2134,7 +2134,6 @@ void dlm_move_lockres_to_recovery_list(struct dlm_ctxt *dlm,
>>>>     				 * if this had completed successfully
>>>>     				 * before sending this lock state to the
>>>>     				 * new master */
>>>> -				BUG_ON(i != DLM_CONVERTING_LIST);
>>>>     				mlog(0, "node died with cancel pending "
>>>>     				     "on %.*s. move back to granted list.\n",
>>>>     				     res->lockname.len, res->lockname.name);
>>>> diff --git a/fs/ocfs2/dlm/dlmunlock.c b/fs/ocfs2/dlm/dlmunlock.c
>>>> index 63d701c..505bb6c 100644
>>>> --- a/fs/ocfs2/dlm/dlmunlock.c
>>>> +++ b/fs/ocfs2/dlm/dlmunlock.c
>>>> @@ -183,6 +183,11 @@ static enum dlm_status dlmunlock_common(struct dlm_ctxt *dlm,
>>>>     							flags, owner);
>>>>     		spin_lock(&res->spinlock);
>>>>     		spin_lock(&lock->spinlock);
>>>> +
>>>> +		if ((flags & LKM_CANCEL) &&
>>>> +				dlm_lock_on_list(&res->granted, lock))
>>>> +			status = DLM_CANCELGRANT;
>>>> +
>>>>     		/* if the master told us the lock was already granted,
>>>>     		 * let the ast handle all of these actions */
>>>>     		if (status == DLM_CANCELGRANT) {
>>>> -- 
>>>> 1.8.3.1
>>>>
>>> .
>>>
> .
>
piaojun Feb. 14, 2019, 9:04 a.m. UTC | #5
Hi Changwei,

The problem can't be solved completely by clearing ::cancel_pending in
dlm_proxy_ast_handler, as the AST can arrive at any time just before
::cancel_pending is set. If there is no other better solution,
could we accept this patch? This bug is very harmful.

Thanks,
Jun

On 2018/12/8 18:05, wangjian wrote:
> Hi Changwei,
> 
> I understand your idea. But we should be aware that the cancel_convert process and
> other processes (accepting the AST process, the recovery process) are asynchronous.
> For example, according to your idea, check if the lock is in the grant queue before
> calling the dlm_send_remote_unlock_request function in the dlm_proxy_ast_handler function.
> Then decide whether to clear cancel_pending. But if the AST does not come at this time,
> the check passes and cancel_pendig will not be cleared. Then AST immediately came over again,
> which also led to a bug. I personally think that for asynchronous processes we can't guarantee
> the speed of execution of each process. All we can do is to avoid the BUG scene.
> 
> As for the question you said ("If you remove the BUG check why you still call dlm_commit_pending_cancel()
> to move the lock back to grant on matter it's on converting list or not?").
> I think we should first check if the lock is in the grant queue
> (at this time, dlm->spinlock and res->spinlock have been added), then decide whether to call
> dlm_commit_pending_cancel function. 
> 
> Thanks,
> Jian
> 
> On 12/7/2018 11:12 AM, Changwei Ge wrote:
>> Hi Jian,
>>
>> I suppose that the situation you described truly exists.
>> But the way you fix the issue is not in my favor.
>>
>> If you remove the BUG check why you still call dlm_commit_pending_cancel() to
>> move the lock back to grant on matter it's on converting list or not?
>>
>> So how about keeping the BUG check in dlm_move_lockres_to_recovery_list().
>> If the locking _ast_ comes back very fast just check ::cancel_pending in dlm_proxy_ast_handler() and clear it.
>> Then with the logic checking if the lock is on grant list (in dlmunlock_common() more less like your current method)
>> or not we can easily tell if the cancellation succeeds or not.
>>
>> That complies the original dlm design, which I think is better and easier for maintainers to understand.
>>
>> Thanks,
>> Changwei
>>
>> On 2018/12/6 20:06, wangjian wrote:
>>> Hi Changwei,
>>>
>>> I don't fully agree with your point of view. In my opinion, the lock cancellation process and the lock
>>> conversion process are asynchronous. We can't guarantee that the lock must be in the convert list
>>> during the lock conversion process, otherwise this BUG will not happen.
>>> So, I think this is a meaningless BUG.
>>>
>>> Thanks,
>>> Jian
>>>
>>> On 12/5/2018 9:49 AM, Changwei Ge wrote:
>>>> Hi Jian,
>>>>
>>>> I suppose you can't just remove the BUG_ON() check.
>>>> If you remove it, below code violates the original logic.
>>>>
>>>> '''
>>>> 2141                                 dlm_commit_pending_cancel(res, lock);
>>>> '''
>>>>
>>>> What's more important is *locking cancel* must be against a *locking conversion* progress.
>>>> So it makes sense to check if this lock is on converting list.
>>>>
>>>> So I have to NACK to this patch.
>>>>
>>>> Thanks,
>>>> Changwei
>>>>
>>>>
>>>> On 2018/12/3 20:23, wangjian wrote:
>>>>> In the dlm_move_lockres_to_recovery_list function, if the lock
>>>>> is in the granted queue and cancel_pending is set, it will
>>>>> encounter a BUG. I think this is a meaningless BUG,
>>>>> so be prepared to remove it. A scenario that causes
>>>>> this BUG will be given below.
>>>>>
>>>>> At the beginning, Node 1 is the master and has NL lock,
>>>>> Node 2 has PR lock, Node 3 has PR lock too.
>>>>>
>>>>> Node 1          Node 2          Node 3
>>>>>               want to get EX lock.
>>>>>
>>>>>                               want to get EX lock.
>>>>>
>>>>> Node 3 hinder
>>>>> Node 2 to get
>>>>> EX lock, send
>>>>> Node 3 a BAST.
>>>>>
>>>>>                               receive BAST from
>>>>>                               Node 1. downconvert
>>>>>                               thread begin to
>>>>>                               cancel PR to EX conversion.
>>>>>                               In dlmunlock_common function,
>>>>>                               downconvert thread has set
>>>>>                               lock->cancel_pending,
>>>>>                               but did not enter
>>>>>                               dlm_send_remote_unlock_request
>>>>>                               function.
>>>>>
>>>>>               Node2 dies because
>>>>>               the host is powered down.
>>>>>
>>>>> In recovery process,
>>>>> clean the lock that
>>>>> related to Node2.
>>>>> then finish Node 3
>>>>> PR to EX request.
>>>>> give Node 3 a AST.
>>>>>
>>>>>                               receive AST from Node 1.
>>>>>                               change lock level to EX,
>>>>>                               move lock to granted list.
>>>>>
>>>>> Node1 dies because
>>>>> the host is powered down.
>>>>>
>>>>>                               In dlm_move_lockres_to_recovery_list
>>>>>                               function. the lock is in the
>>>>>                               granted queue and cancel_pending
>>>>>                               is set. BUG_ON.
>>>>>
>>>>> But after clearing this BUG, process will encounter
>>>>> the second BUG in the ocfs2_unlock_ast function.
>>>>> Here is a scenario that will cause the second BUG
>>>>> in ocfs2_unlock_ast as follows:
>>>>>
>>>>> At the beginning, Node 1 is the master and has NL lock,
>>>>> Node 2 has PR lock, Node 3 has PR lock too.
>>>>>
>>>>> Node 1          Node 2          Node 3
>>>>>               want to get EX lock.
>>>>>
>>>>>                               want to get EX lock.
>>>>>
>>>>> Node 3 hinder
>>>>> Node 2 to get
>>>>> EX lock, send
>>>>> Node 3 a BAST.
>>>>>
>>>>>                               receive BAST from
>>>>>                               Node 1. downconvert
>>>>>                               thread begin to
>>>>>                               cancel PR to EX conversion.
>>>>>                               In dlmunlock_common function,
>>>>>                               downconvert thread has released
>>>>>                               lock->spinlock and res->spinlock,
>>>>>                               but did not enter
>>>>>                               dlm_send_remote_unlock_request
>>>>>                               function.
>>>>>
>>>>>               Node2 dies because
>>>>>               the host is powered down.
>>>>>
>>>>> In recovery process,
>>>>> clean the lock that
>>>>> related to Node2.
>>>>> then finish Node 3
>>>>> PR to EX request.
>>>>> give Node 3 a AST.
>>>>>
>>>>>                               receive AST from Node 1.
>>>>>                               change lock level to EX,
>>>>>                               move lock to granted list,
>>>>>                               set lockres->l_unlock_action
>>>>>                               as OCFS2_UNLOCK_INVALID
>>>>>                               in ocfs2_locking_ast function.
>>>>>
>>>>> Node2 dies because
>>>>> the host is powered down.
>>>>>
>>>>>                               Node 3 realize that Node 1
>>>>>                               is dead, remove Node 1 from
>>>>>                               domain_map. downconvert thread
>>>>>                               get DLM_NORMAL from
>>>>>                               dlm_send_remote_unlock_request
>>>>>                               function and set *call_ast as 1.
>>>>>                               Then downconvert thread meet
>>>>>                               BUG in ocfs2_unlock_ast function.
>>>>>
>>>>> To avoid meet the second BUG, function dlmunlock_common shuold
>>>>> return DLM_CANCELGRANT if the lock is on granted list and
>>>>> the operation is canceled.
>>>>>
>>>>> Signed-off-by: Jian Wang<wangjian161@huawei.com>
>>>>> Reviewed-by: Yiwen Jiang<jiangyiwen@huawei.com>
>>>>> ---
>>>>>    fs/ocfs2/dlm/dlmrecovery.c | 1 -
>>>>>    fs/ocfs2/dlm/dlmunlock.c   | 5 +++++
>>>>>    2 files changed, 5 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm/dlmrecovery.c
>>>>> index 802636d..7489652 100644
>>>>> --- a/fs/ocfs2/dlm/dlmrecovery.c
>>>>> +++ b/fs/ocfs2/dlm/dlmrecovery.c
>>>>> @@ -2134,7 +2134,6 @@ void dlm_move_lockres_to_recovery_list(struct dlm_ctxt *dlm,
>>>>>    				 * if this had completed successfully
>>>>>    				 * before sending this lock state to the
>>>>>    				 * new master */
>>>>> -				BUG_ON(i != DLM_CONVERTING_LIST);
>>>>>    				mlog(0, "node died with cancel pending "
>>>>>    				     "on %.*s. move back to granted list.\n",
>>>>>    				     res->lockname.len, res->lockname.name);
>>>>> diff --git a/fs/ocfs2/dlm/dlmunlock.c b/fs/ocfs2/dlm/dlmunlock.c
>>>>> index 63d701c..505bb6c 100644
>>>>> --- a/fs/ocfs2/dlm/dlmunlock.c
>>>>> +++ b/fs/ocfs2/dlm/dlmunlock.c
>>>>> @@ -183,6 +183,11 @@ static enum dlm_status dlmunlock_common(struct dlm_ctxt *dlm,
>>>>>    							flags, owner);
>>>>>    		spin_lock(&res->spinlock);
>>>>>    		spin_lock(&lock->spinlock);
>>>>> +
>>>>> +		if ((flags & LKM_CANCEL) &&
>>>>> +				dlm_lock_on_list(&res->granted, lock))
>>>>> +			status = DLM_CANCELGRANT;
>>>>> +
>>>>>    		/* if the master told us the lock was already granted,
>>>>>    		 * let the ast handle all of these actions */
>>>>>    		if (status == DLM_CANCELGRANT) {
>>>>> -- 
>>>>> 1.8.3.1
>>>>>
>>>> .
>>>>
>> .
>>
> 
> 
> _______________________________________________
> Ocfs2-devel mailing list
> Ocfs2-devel@oss.oracle.com
> https://oss.oracle.com/mailman/listinfo/ocfs2-devel
>
Changwei Ge Feb. 14, 2019, 9:59 a.m. UTC | #6
Hi Jun,

On 2019/2/14 17:09, piaojun wrote:
> Hi Changwei,
> 
> The problem can't be solved completely if clear ::cancel_pending in
> dlm_proxy_ast_handler, as AST will come at anytime just before
> ::cancel_pendig is set. If there is not any other better solutions,
> could we accept this patch? This bug is very harmful.

Um, I am recalling the problem this patch is trying to address...
If I miss something, please let me know.

I read the patch again; it seems that NODE3 finds a lock on its _grant list_ with ::cancel_pending set, which makes dlm insane.
Actually, I don't like patches that address problems by killing BUG checks, since those checks keep the whole software consistent with its original design.
With those BUG checks, we can add features and fix bugs in an elegant way.

Since the lock is *already granted*, why not just check ::cancel_pending and, if it is set, clear it and keep the lock on the
grant list, since cancellation obviously can't work anymore? And I think we don't even have to worry about dlm_send_remote_unlock_request(),
since message sending will fail anyway; the function will return NORMAL as NODE1 was *dead* already.

  
I admit that the reported problem truly exists and should be solved, but the method is not in my favor. :-)
I have several reasons for your reference:
1) My biggest concern is that we should not remove the BUG check ( BUG_ON(i != DLM_CONVERTING_LIST) ),
    because doing so might violate the original ocfs2/dlm design and make future maintenance harder.
2) As for the other patch from Jian, it can't use the LVB to boost the performance of that single locking procedure.

I am providing a draft code for you and Jian's reference.

root@ubuntu16:/home/chge/linux[master]# git diff
diff --git a/fs/ocfs2/dlm/dlmast.c b/fs/ocfs2/dlm/dlmast.c
index 39831fc..812843b 100644
--- a/fs/ocfs2/dlm/dlmast.c
+++ b/fs/ocfs2/dlm/dlmast.c
@@ -372,8 +372,11 @@ int dlm_proxy_ast_handler(struct o2net_msg *msg, u32 len, void *data,
         head = &res->converting;
         lock = NULL;
         list_for_each_entry(lock, head, list) {
-               if (lock->ml.cookie == cookie)
+               if (lock->ml.cookie == cookie) {
+                       if (lock->cancel_pending)
+                               lock->cancel_pending = 0;
                         goto do_ast;
+               }
         }

Thanks,
Changwei


> 
> Thanks,
> Jun
> 
> On 2018/12/8 18:05, wangjian wrote:
>> Hi Changwei,
>>
>> I understand your idea. But we should be aware that the cancel_convert process and
>> other processes (accepting the AST process, the recovery process) are asynchronous.
>> For example, according to your idea, check if the lock is in the grant queue before
>> calling the dlm_send_remote_unlock_request function in the dlm_proxy_ast_handler function.
>> Then decide whether to clear cancel_pending. But if the AST does not come at this time,
>> the check passes and cancel_pendig will not be cleared. Then AST immediately came over again,
>> which also led to a bug. I personally think that for asynchronous processes we can't guarantee
>> the speed of execution of each process. All we can do is to avoid the BUG scene.
>>
>> As for the question you said ("If you remove the BUG check why you still call dlm_commit_pending_cancel()
>> to move the lock back to grant on matter it's on converting list or not?").
>> I think we should first check if the lock is in the grant queue
>> (at this time, dlm->spinlock and res->spinlock have been added), then decide whether to call
>> dlm_commit_pending_cancel function.
>>
>> Thanks,
>> Jian
>>
>> On 12/7/2018 11:12 AM, Changwei Ge wrote:
>>> Hi Jian,
>>>
>>> I suppose that the situation you described truly exists.
>>> But the way you fix the issue is not in my favor.
>>>
>>> If you remove the BUG check why you still call dlm_commit_pending_cancel() to
>>> move the lock back to grant on matter it's on converting list or not?
>>>
>>> So how about keeping the BUG check in dlm_move_lockres_to_recovery_list().
>>> If the locking _ast_ comes back very fast just check ::cancel_pending in dlm_proxy_ast_handler() and clear it.
>>> Then with the logic checking if the lock is on grant list (in dlmunlock_common() more less like your current method)
>>> or not we can easily tell if the cancellation succeeds or not.
>>>
>>> That complies the original dlm design, which I think is better and easier for maintainers to understand.
>>>
>>> Thanks,
>>> Changwei
>>>
>>> On 2018/12/6 20:06, wangjian wrote:
>>>> Hi Changwei,
>>>>
>>>> I don't fully agree with your point of view. In my opinion, the lock cancellation process and the lock
>>>> conversion process are asynchronous. We can't guarantee that the lock must be in the convert list
>>>> during the lock conversion process, otherwise this BUG will not happen.
>>>> So, I think this is a meaningless BUG.
>>>>
>>>> Thanks,
>>>> Jian
>>>>
>>>> On 12/5/2018 9:49 AM, Changwei Ge wrote:
>>>>> Hi Jian,
>>>>>
>>>>> I suppose you can't just remove the BUG_ON() check.
>>>>> If you remove it, the code below violates the original logic.
>>>>>
>>>>> '''
>>>>> 2141                                 dlm_commit_pending_cancel(res, lock);
>>>>> '''
>>>>>
>>>>> What's more important, a *locking cancel* must be issued against a *locking conversion* in progress.
>>>>> So it makes sense to check whether this lock is on the converting list.
>>>>>
>>>>> So I have to NACK this patch.
>>>>>
>>>>> Thanks,
>>>>> Changwei
>>>>>
>>>>>
>>>>> On 2018/12/3 20:23, wangjian wrote:
>>>>>> In the dlm_move_lockres_to_recovery_list function, if the lock
>>>>>> is on the granted queue and cancel_pending is set, it will
>>>>>> hit a BUG. I think this is a meaningless BUG,
>>>>>> so I intend to remove it. A scenario that causes
>>>>>> this BUG is given below.
>>>>>>
>>>>>> At the beginning, Node 1 is the master and has NL lock,
>>>>>> Node 2 has PR lock, Node 3 has PR lock too.
>>>>>>
>>>>>> Node 1          Node 2          Node 3
>>>>>>                want to get EX lock.
>>>>>>
>>>>>>                                want to get EX lock.
>>>>>>
>>>>>> Node 3 hinders
>>>>>> Node 2 from getting
>>>>>> the EX lock; send
>>>>>> Node 3 a BAST.
>>>>>>
>>>>>>                                receive BAST from
>>>>>>                                Node 1. downconvert
>>>>>>                                thread begin to
>>>>>>                                cancel PR to EX conversion.
>>>>>>                                In dlmunlock_common function,
>>>>>>                                downconvert thread has set
>>>>>>                                lock->cancel_pending,
>>>>>>                                but did not enter
>>>>>>                                dlm_send_remote_unlock_request
>>>>>>                                function.
>>>>>>
>>>>>>                Node2 dies because
>>>>>>                the host is powered down.
>>>>>>
>>>>>> In the recovery process,
>>>>>> clean up the locks
>>>>>> related to Node2,
>>>>>> then finish Node 3's
>>>>>> PR to EX request and
>>>>>> give Node 3 an AST.
>>>>>>
>>>>>>                                receive AST from Node 1.
>>>>>>                                change lock level to EX,
>>>>>>                                move lock to granted list.
>>>>>>
>>>>>> Node1 dies because
>>>>>> the host is powered down.
>>>>>>
>>>>>>                                In the dlm_move_lockres_to_recovery_list
>>>>>>                                function, the lock is on the
>>>>>>                                granted queue and cancel_pending
>>>>>>                                is set. BUG_ON.
>>>>>>
>>>>>> But even after removing this BUG, the process will encounter
>>>>>> a second BUG in the ocfs2_unlock_ast function.
>>>>>> A scenario that causes the second BUG
>>>>>> in ocfs2_unlock_ast is as follows:
>>>>>>
>>>>>> At the beginning, Node 1 is the master and has NL lock,
>>>>>> Node 2 has PR lock, Node 3 has PR lock too.
>>>>>>
>>>>>> Node 1          Node 2          Node 3
>>>>>>                want to get EX lock.
>>>>>>
>>>>>>                                want to get EX lock.
>>>>>>
>>>>>> Node 3 hinders
>>>>>> Node 2 from getting
>>>>>> the EX lock; send
>>>>>> Node 3 a BAST.
>>>>>>
>>>>>>                                receive BAST from
>>>>>>                                Node 1. downconvert
>>>>>>                                thread begin to
>>>>>>                                cancel PR to EX conversion.
>>>>>>                                In dlmunlock_common function,
>>>>>>                                downconvert thread has released
>>>>>>                                lock->spinlock and res->spinlock,
>>>>>>                                but did not enter
>>>>>>                                dlm_send_remote_unlock_request
>>>>>>                                function.
>>>>>>
>>>>>>                Node2 dies because
>>>>>>                the host is powered down.
>>>>>>
>>>>>> In the recovery process,
>>>>>> clean up the locks
>>>>>> related to Node2,
>>>>>> then finish Node 3's
>>>>>> PR to EX request and
>>>>>> give Node 3 an AST.
>>>>>>
>>>>>>                                receive AST from Node 1.
>>>>>>                                change lock level to EX,
>>>>>>                                move lock to granted list,
>>>>>>                                set lockres->l_unlock_action
>>>>>>                                as OCFS2_UNLOCK_INVALID
>>>>>>                                in ocfs2_locking_ast function.
>>>>>>
>>>>>> Node1 dies because
>>>>>> the host is powered down.
>>>>>>
>>>>>>                                Node 3 realizes that Node 1
>>>>>>                                is dead, removes Node 1 from
>>>>>>                                domain_map. The downconvert thread
>>>>>>                                gets DLM_NORMAL from the
>>>>>>                                dlm_send_remote_unlock_request
>>>>>>                                function and sets *call_ast to 1.
>>>>>>                                Then the downconvert thread hits
>>>>>>                                the BUG in ocfs2_unlock_ast.
>>>>>>
>>>>>> To avoid hitting the second BUG, the dlmunlock_common function should
>>>>>> return DLM_CANCELGRANT if the lock is on the granted list and
>>>>>> the operation is canceled.
>>>>>>
>>>>>> Signed-off-by: Jian Wang<wangjian161@huawei.com>
>>>>>> Reviewed-by: Yiwen Jiang<jiangyiwen@huawei.com>
>>>>>> ---
>>>>>>     fs/ocfs2/dlm/dlmrecovery.c | 1 -
>>>>>>     fs/ocfs2/dlm/dlmunlock.c   | 5 +++++
>>>>>>     2 files changed, 5 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm/dlmrecovery.c
>>>>>> index 802636d..7489652 100644
>>>>>> --- a/fs/ocfs2/dlm/dlmrecovery.c
>>>>>> +++ b/fs/ocfs2/dlm/dlmrecovery.c
>>>>>> @@ -2134,7 +2134,6 @@ void dlm_move_lockres_to_recovery_list(struct dlm_ctxt *dlm,
>>>>>>     				 * if this had completed successfully
>>>>>>     				 * before sending this lock state to the
>>>>>>     				 * new master */
>>>>>> -				BUG_ON(i != DLM_CONVERTING_LIST);
>>>>>>     				mlog(0, "node died with cancel pending "
>>>>>>     				     "on %.*s. move back to granted list.\n",
>>>>>>     				     res->lockname.len, res->lockname.name);
>>>>>> diff --git a/fs/ocfs2/dlm/dlmunlock.c b/fs/ocfs2/dlm/dlmunlock.c
>>>>>> index 63d701c..505bb6c 100644
>>>>>> --- a/fs/ocfs2/dlm/dlmunlock.c
>>>>>> +++ b/fs/ocfs2/dlm/dlmunlock.c
>>>>>> @@ -183,6 +183,11 @@ static enum dlm_status dlmunlock_common(struct dlm_ctxt *dlm,
>>>>>>     							flags, owner);
>>>>>>     		spin_lock(&res->spinlock);
>>>>>>     		spin_lock(&lock->spinlock);
>>>>>> +
>>>>>> +		if ((flags & LKM_CANCEL) &&
>>>>>> +				dlm_lock_on_list(&res->granted, lock))
>>>>>> +			status = DLM_CANCELGRANT;
>>>>>> +
>>>>>>     		/* if the master told us the lock was already granted,
>>>>>>     		 * let the ast handle all of these actions */
>>>>>>     		if (status == DLM_CANCELGRANT) {
>>>>>> -- 
>>>>>> 1.8.3.1
>>>>>>
>>>>> .
>>>>>
>>> .
>>>
>>
>>
>> _______________________________________________
>> Ocfs2-devel mailing list
>> Ocfs2-devel@oss.oracle.com
>> https://oss.oracle.com/mailman/listinfo/ocfs2-devel
>>
>
Changwei Ge Feb. 14, 2019, 10:13 a.m. UTC | #7
On 2019/2/14 17:09, piaojun wrote:
> Hi Changwei,
> 
> The problem can't be solved completely by clearing ::cancel_pending in
> dlm_proxy_ast_handler, as the AST can arrive at any time just before

So how about also adding a check before setting ::cancel_pending in dlmunlock_common(), i.e. before invoking dlm_send_remote_unlock_request()?
If the lock is already on the grant list, just return DLM_CANCELGRANT.

Then a further reference code might look like:

root@ubuntu16:/home/chge/linux[master]# git diff
diff --git a/fs/ocfs2/dlm/dlmast.c b/fs/ocfs2/dlm/dlmast.c
index 39831fc..812843b 100644
--- a/fs/ocfs2/dlm/dlmast.c
+++ b/fs/ocfs2/dlm/dlmast.c
@@ -372,8 +372,11 @@ int dlm_proxy_ast_handler(struct o2net_msg *msg, u32 len, void *data,
         head = &res->converting;
         lock = NULL;
         list_for_each_entry(lock, head, list) {
-               if (lock->ml.cookie == cookie)
+               if (lock->ml.cookie == cookie) {
+                       if (lock->cancel_pending)
+                               lock->cancel_pending = 0;
                         goto do_ast;
+               }
         }

         /* if not on convert, try blocked for ast, granted for bast */
diff --git a/fs/ocfs2/dlm/dlmunlock.c b/fs/ocfs2/dlm/dlmunlock.c
index c8e9b70..b4728b5 100644
--- a/fs/ocfs2/dlm/dlmunlock.c
+++ b/fs/ocfs2/dlm/dlmunlock.c
@@ -174,9 +174,14 @@ static enum dlm_status dlmunlock_common(struct dlm_ctxt *dlm,
         if (!master_node) {
                 owner = res->owner;
                 /* drop locks and send message */
-               if (flags & LKM_CANCEL)
+               if (flags & LKM_CANCEL) {
+                       if (dlm_lock_on_list(&res->granted, lock)) {
+                               status = DLM_CANCELGRANT;
+                               goto leave;
+                       }
+
                         lock->cancel_pending = 1;
-               else
+               } else
                         lock->unlock_pending = 1;
                 spin_unlock(&lock->spinlock);
                 spin_unlock(&res->spinlock);



Thanks,
Changwei

> ::cancel_pending is set. If there is no better solution,
> could we accept this patch? This bug is very harmful.
> 
> Thanks,
> Jun
> 
> On 2018/12/8 18:05, wangjian wrote:
>> Hi Changwei,
>>
>> I understand your idea. But we should be aware that the cancel_convert process and
>> other processes (accepting the AST process, the recovery process) are asynchronous.
>> For example, according to your idea, check if the lock is in the grant queue before
>> calling the dlm_send_remote_unlock_request function in the dlm_proxy_ast_handler function.
>> Then decide whether to clear cancel_pending. But if the AST has not arrived at that point,
>> the check passes and cancel_pending will not be cleared. If the AST then arrives immediately
>> afterwards, we hit the bug anyway. I personally think that for asynchronous processes we can't
>> guarantee the execution speed of each process. All we can do is avoid the BUG scenario.
>> [...]
>
piaojun Feb. 14, 2019, 10:25 a.m. UTC | #8
Hi Changwei,

On 2019/2/14 17:59, Changwei Ge wrote:
> Hi Jun,
> 
> On 2019/2/14 17:09, piaojun wrote:
>> Hi Changwei,
>>
>> The problem can't be solved completely by clearing ::cancel_pending in
>> dlm_proxy_ast_handler, as the AST can arrive at any time just before
>> ::cancel_pending is set. If there is no better solution,
>> could we accept this patch? This bug is very harmful.
> 
> Um, I am recalling the problem this patch is trying to address...
> If I miss something, please let me know.
> 
> I read the patch again; it seems that NODE3 finds a lock on its _grant list_ with ::cancel_pending set, which makes dlm insane.
> Actually, I don't like patches that address problems by killing BUG checks, as those checks keep the whole software consistent with its original design.
> With those BUG checks, we can add features and fix bugs in an elegant way.

I also agree this.

> 
> As the lock is *already granted*, why not just check ::cancel_pending and, if it is set, clear it and then move the lock to the
> grant list, since cancellation obviously can't work anymore. And I think we don't even have to worry about dlm_send_remote_unlock_request(),
> since message sending will fail anyway, and the function will return DLM_NORMAL as NODE1 was already *dead*.
> 
> 
> I admit that the problem reported truly exists and should be solved but the method is not in my favor. :-)
> I have several reasons for your reference:
> 1) My biggest concern is that we should not remove the BUG check ( BUG_ON(i != DLM_CONVERTING_LIST) ),
>     because it might violate the original ocfs2/dlm design, making subsequent maintenance harder.
> 2) As for Jian's other patch, it can't use the LVB to boost performance of that single locking procedure.
> 
> I am providing a draft code for you and Jian's reference.
> 
> root@ubuntu16:/home/chge/linux[master]# git diff
> diff --git a/fs/ocfs2/dlm/dlmast.c b/fs/ocfs2/dlm/dlmast.c
> index 39831fc..812843b 100644
> --- a/fs/ocfs2/dlm/dlmast.c
> +++ b/fs/ocfs2/dlm/dlmast.c
> @@ -372,8 +372,11 @@ int dlm_proxy_ast_handler(struct o2net_msg *msg, u32 len, void *data,
>          head = &res->converting;
>          lock = NULL;
>          list_for_each_entry(lock, head, list) {
> -               if (lock->ml.cookie == cookie)
> +               if (lock->ml.cookie == cookie) {
> +                       if (lock->cancel_pending)
> +                               lock->cancel_pending = 0;

I'm afraid there is a case where lock->cancel_pending is not yet set in
dlmunlock_common, so this problem still exists.

>                          goto do_ast;
> +               }
>          }
> 
> Thanks,
> Changwei
> 
> 
>>
>> Thanks,
>> Jun
>>
>> On 2018/12/8 18:05, wangjian wrote:
>>> [...]
> .
>
Changwei Ge Feb. 14, 2019, 10:28 a.m. UTC | #9
On 2019/2/14 18:25, piaojun wrote:
> Hi Changwei,
> 
> On 2019/2/14 17:59, Changwei Ge wrote:
>> Hi Jun,
>>
>> On 2019/2/14 17:09, piaojun wrote:
>>> Hi Changwei,
>>>
>>> The problem can't be solved completely by clearing ::cancel_pending in
>>> dlm_proxy_ast_handler, as the AST can come at any time just before
>>> ::cancel_pending is set. If there is no better solution,
>>> could we accept this patch? This bug is very harmful.
>>
>> Um, I am recalling the problem this patch is trying to address...
>> If I miss something, please let me know.
>>
>> I read the patch again; it seems that NODE3 finds a lock on its _grant list_ with ::cancel_pending set, which makes dlm insane.
>> Actually, I don't like patches that address problems by killing BUG checks, since those checks keep the whole software consistent with its original design.
>> With those BUG checks, we can add features and fix bugs in an elegant way.
> 
> I also agree this.
> 
>>
>> As the lock is *already granted*, why not just check ::cancel_pending and, if it is set, clear it and then move the lock to the
>> grant list, since cancellation obviously can't work anymore. And I think we don't even have to worry about dlm_send_remote_unlock_request(),
>> since message sending will fail anyway, but the function will return DLM_NORMAL as NODE1 was *dead* already.
>>
>>
>> I admit that the problem reported truly exists and should be solved but the method is not in my favor. :-)
>> I have several reasons for your reference:
>> 1) My biggest concern is that we should not remove the BUG check ( BUG_ON(i != DLM_CONVERTING_LIST) ),
>>      because removing it might violate the original ocfs2/dlm design, making future maintenance work harder.
>> 2) As for the other patch from Jian, it can't use the LVB to boost the performance of that single locking procedure.
>>
>> I am providing a draft code for you and Jian's reference.
>>
>> root@ubuntu16:/home/chge/linux[master]# git diff
>> diff --git a/fs/ocfs2/dlm/dlmast.c b/fs/ocfs2/dlm/dlmast.c
>> index 39831fc..812843b 100644
>> --- a/fs/ocfs2/dlm/dlmast.c
>> +++ b/fs/ocfs2/dlm/dlmast.c
>> @@ -372,8 +372,11 @@ int dlm_proxy_ast_handler(struct o2net_msg *msg, u32 len, void *data,
>>           head = &res->converting;
>>           lock = NULL;
>>           list_for_each_entry(lock, head, list) {
>> -               if (lock->ml.cookie == cookie)
>> +               if (lock->ml.cookie == cookie) {
>> +                       if (lock->cancel_pending)
>> +                               lock->cancel_pending = 0;
> 
> I'm afraid there is a case where lock->cancel_pending is not yet set in
> dlmunlock_common, so this problem still exists.

Please refer to my other mail, which might close the race you are concerned about.

> 
>>                           goto do_ast;
>> +               }
>>           }
>>
>> Thanks,
>> Changwei
>>
>>
>>>
>>> Thanks,
>>> Jun
>>>
>>> On 2018/12/8 18:05, wangjian wrote:
>>>> Hi Changwei,
>>>>
>>>> I understand your idea. But we should be aware that the cancel_convert process and
>>>> the other processes (the AST-receiving process, the recovery process) are asynchronous.
>>>> For example, according to your idea, we check whether the lock is in the grant queue
>>>> before calling the dlm_send_remote_unlock_request function, and clear cancel_pending in
>>>> the dlm_proxy_ast_handler function. But if the AST has not come at that point,
>>>> the check passes and cancel_pending will not be cleared. Then the AST immediately comes
>>>> over, which again leads to a bug. I personally think that for asynchronous processes we
>>>> can't guarantee the execution speed of each process. All we can do is avoid the BUG scene.
>>>>
>>>> As for the question you said ("If you remove the BUG check why you still call dlm_commit_pending_cancel()
>>>> to move the lock back to grant on matter it's on converting list or not?").
>>>> I think we should first check if the lock is in the grant queue
>>>> (at this time, dlm->spinlock and res->spinlock have been added), then decide whether to call
>>>> dlm_commit_pending_cancel function.
>>>>
>>>> Thanks,
>>>> Jian
>>>>
>>>> On 12/7/2018 11:12 AM, Changwei Ge wrote:
>>>>> Hi Jian,
>>>>>
>>>>> I suppose that the situation you described truly exists.
>>>>> But the way you fix the issue is not in my favor.
>>>>>
>>>>> If you remove the BUG check, why do you still call dlm_commit_pending_cancel() to
>>>>> move the lock back to grant no matter whether it's on the converting list or not?
>>>>>
>>>>> So how about keeping the BUG check in dlm_move_lockres_to_recovery_list()?
>>>>> If the locking _ast_ comes back very fast, just check ::cancel_pending in dlm_proxy_ast_handler() and clear it.
>>>>> Then, with the logic checking whether the lock is on the grant list or not (in dlmunlock_common(), more or less like your current method),
>>>>> we can easily tell whether the cancellation succeeded or not.
>>>>>
>>>>> That complies with the original dlm design, which I think is better and easier for maintainers to understand.
>>>>>
>>>>> Thanks,
>>>>> Changwei
>>>>>
>>>>> On 2018/12/6 20:06, wangjian wrote:
>>>>>> Hi Changwei,
>>>>>>
>>>>>> I don't fully agree with your point of view. In my opinion, the lock cancellation process and the lock
>>>>>> conversion process are asynchronous. We can't guarantee that the lock must be in the convert list
>>>>>> during the lock conversion process, otherwise this BUG will not happen.
>>>>>> So, I think this is a meaningless BUG.
>>>>>>
>>>>>> Thanks,
>>>>>> Jian
>>>>>>
>>>>>> On 12/5/2018 9:49 AM, Changwei Ge wrote:
>>>>>>> Hi Jian,
>>>>>>>
>>>>>>> I suppose you can't just remove the BUG_ON() check.
>>>>>>> If you remove it, below code violates the original logic.
>>>>>>>
>>>>>>> '''
>>>>>>> 2141                                 dlm_commit_pending_cancel(res, lock);
>>>>>>> '''
>>>>>>>
>>>>>>> What's more important is that a *locking cancel* must be issued against a *locking conversion* in progress.
>>>>>>> So it makes sense to check whether this lock is on the converting list.
>>>>>>>
>>>>>>> So I have to NACK this patch.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Changwei
>>>>>>>
>>>>>>>
>>>>>>> On 2018/12/3 20:23, wangjian wrote:
>>>>>>>> In the dlm_move_lockres_to_recovery_list function, if the lock
>>>>>>>> is in the granted queue and cancel_pending is set, it will
>>>>>>>> encounter a BUG. I think this is a meaningless BUG,
>>>>>>>> so be prepared to remove it. A scenario that causes
>>>>>>>> this BUG will be given below.
>>>>>>>>
>>>>>>>> At the beginning, Node 1 is the master and has NL lock,
>>>>>>>> Node 2 has PR lock, Node 3 has PR lock too.
>>>>>>>>
>>>>>>>> Node 1          Node 2          Node 3
>>>>>>>>                 want to get EX lock.
>>>>>>>>
>>>>>>>>                                 want to get EX lock.
>>>>>>>>
>>>>>>>> Node 3 hinder
>>>>>>>> Node 2 to get
>>>>>>>> EX lock, send
>>>>>>>> Node 3 a BAST.
>>>>>>>>
>>>>>>>>                                 receive BAST from
>>>>>>>>                                 Node 1. downconvert
>>>>>>>>                                 thread begin to
>>>>>>>>                                 cancel PR to EX conversion.
>>>>>>>>                                 In dlmunlock_common function,
>>>>>>>>                                 downconvert thread has set
>>>>>>>>                                 lock->cancel_pending,
>>>>>>>>                                 but did not enter
>>>>>>>>                                 dlm_send_remote_unlock_request
>>>>>>>>                                 function.
>>>>>>>>
>>>>>>>>                 Node2 dies because
>>>>>>>>                 the host is powered down.
>>>>>>>>
>>>>>>>> In recovery process,
>>>>>>>> clean the lock that
>>>>>>>> related to Node2.
>>>>>>>> then finish Node 3
>>>>>>>> PR to EX request.
>>>>>>>> give Node 3 a AST.
>>>>>>>>
>>>>>>>>                                 receive AST from Node 1.
>>>>>>>>                                 change lock level to EX,
>>>>>>>>                                 move lock to granted list.
>>>>>>>>
>>>>>>>> Node1 dies because
>>>>>>>> the host is powered down.
>>>>>>>>
>>>>>>>>                                 In dlm_move_lockres_to_recovery_list
>>>>>>>>                                 function. the lock is in the
>>>>>>>>                                 granted queue and cancel_pending
>>>>>>>>                                 is set. BUG_ON.
>>>>>>>>
>>>>>>>> But after clearing this BUG, process will encounter
>>>>>>>> the second BUG in the ocfs2_unlock_ast function.
>>>>>>>> Here is a scenario that will cause the second BUG
>>>>>>>> in ocfs2_unlock_ast as follows:
>>>>>>>>
>>>>>>>> At the beginning, Node 1 is the master and has NL lock,
>>>>>>>> Node 2 has PR lock, Node 3 has PR lock too.
>>>>>>>>
>>>>>>>> Node 1          Node 2          Node 3
>>>>>>>>                 want to get EX lock.
>>>>>>>>
>>>>>>>>                                 want to get EX lock.
>>>>>>>>
>>>>>>>> Node 3 hinder
>>>>>>>> Node 2 to get
>>>>>>>> EX lock, send
>>>>>>>> Node 3 a BAST.
>>>>>>>>
>>>>>>>>                                 receive BAST from
>>>>>>>>                                 Node 1. downconvert
>>>>>>>>                                 thread begin to
>>>>>>>>                                 cancel PR to EX conversion.
>>>>>>>>                                 In dlmunlock_common function,
>>>>>>>>                                 downconvert thread has released
>>>>>>>>                                 lock->spinlock and res->spinlock,
>>>>>>>>                                 but did not enter
>>>>>>>>                                 dlm_send_remote_unlock_request
>>>>>>>>                                 function.
>>>>>>>>
>>>>>>>>                 Node2 dies because
>>>>>>>>                 the host is powered down.
>>>>>>>>
>>>>>>>> In recovery process,
>>>>>>>> clean the lock that
>>>>>>>> related to Node2.
>>>>>>>> then finish Node 3
>>>>>>>> PR to EX request.
>>>>>>>> give Node 3 a AST.
>>>>>>>>
>>>>>>>>                                 receive AST from Node 1.
>>>>>>>>                                 change lock level to EX,
>>>>>>>>                                 move lock to granted list,
>>>>>>>>                                 set lockres->l_unlock_action
>>>>>>>>                                 as OCFS2_UNLOCK_INVALID
>>>>>>>>                                 in ocfs2_locking_ast function.
>>>>>>>>
>>>>>>>> Node1 dies because
>>>>>>>> the host is powered down.
>>>>>>>>
>>>>>>>>                                 Node 3 realize that Node 1
>>>>>>>>                                 is dead, remove Node 1 from
>>>>>>>>                                 domain_map. downconvert thread
>>>>>>>>                                 get DLM_NORMAL from
>>>>>>>>                                 dlm_send_remote_unlock_request
>>>>>>>>                                 function and set *call_ast as 1.
>>>>>>>>                                 Then downconvert thread meet
>>>>>>>>                                 BUG in ocfs2_unlock_ast function.
>>>>>>>>
>>>>>>>> To avoid hitting the second BUG, the dlmunlock_common function should
>>>>>>>> return DLM_CANCELGRANT if the lock is on the granted list and
>>>>>>>> the operation is canceled.
>>>>>>>>
>>>>>>>> Signed-off-by: Jian Wang<wangjian161@huawei.com>
>>>>>>>> Reviewed-by: Yiwen Jiang<jiangyiwen@huawei.com>
>>>>>>>> ---
>>>>>>>>      fs/ocfs2/dlm/dlmrecovery.c | 1 -
>>>>>>>>      fs/ocfs2/dlm/dlmunlock.c   | 5 +++++
>>>>>>>>      2 files changed, 5 insertions(+), 1 deletion(-)
>>>>>>>>
>>>>>>>> diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm/dlmrecovery.c
>>>>>>>> index 802636d..7489652 100644
>>>>>>>> --- a/fs/ocfs2/dlm/dlmrecovery.c
>>>>>>>> +++ b/fs/ocfs2/dlm/dlmrecovery.c
>>>>>>>> @@ -2134,7 +2134,6 @@ void dlm_move_lockres_to_recovery_list(struct dlm_ctxt *dlm,
>>>>>>>>      				 * if this had completed successfully
>>>>>>>>      				 * before sending this lock state to the
>>>>>>>>      				 * new master */
>>>>>>>> -				BUG_ON(i != DLM_CONVERTING_LIST);
>>>>>>>>      				mlog(0, "node died with cancel pending "
>>>>>>>>      				     "on %.*s. move back to granted list.\n",
>>>>>>>>      				     res->lockname.len, res->lockname.name);
>>>>>>>> diff --git a/fs/ocfs2/dlm/dlmunlock.c b/fs/ocfs2/dlm/dlmunlock.c
>>>>>>>> index 63d701c..505bb6c 100644
>>>>>>>> --- a/fs/ocfs2/dlm/dlmunlock.c
>>>>>>>> +++ b/fs/ocfs2/dlm/dlmunlock.c
>>>>>>>> @@ -183,6 +183,11 @@ static enum dlm_status dlmunlock_common(struct dlm_ctxt *dlm,
>>>>>>>>      							flags, owner);
>>>>>>>>      		spin_lock(&res->spinlock);
>>>>>>>>      		spin_lock(&lock->spinlock);
>>>>>>>> +
>>>>>>>> +		if ((flags & LKM_CANCEL) &&
>>>>>>>> +				dlm_lock_on_list(&res->granted, lock))
>>>>>>>> +			status = DLM_CANCELGRANT;
>>>>>>>> +
>>>>>>>>      		/* if the master told us the lock was already granted,
>>>>>>>>      		 * let the ast handle all of these actions */
>>>>>>>>      		if (status == DLM_CANCELGRANT) {
>>>>>>>> -- 
>>>>>>>> 1.8.3.1
>>>>>>>>
>>>>>>> .
>>>>>>>
>>>>> .
>>>>>
>>>>
>>>>
>>>>
>>>
>> .
>>
>
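[Editor's note: Changwei's two-part suggestion in the message above — have the granting AST consume any pending cancel, and have the cancel path return DLM_CANCELGRANT when the lock is already granted — can be sketched as a user-space toy model. All names below are simplified stand-ins, not the real dlm structures or API.]

```c
#include <assert.h>

enum dlm_status { DLM_NORMAL, DLM_CANCELGRANT };
enum toy_list { TOY_CONVERTING, TOY_GRANTED };

struct toy_lock {
	enum toy_list on_list;  /* which queue the lock currently sits on */
	int cancel_pending;     /* set while a conversion cancel is in flight */
};

/* Sketch of the dlmunlock_common() side: if the AST already moved the
 * lock to the granted list, the conversion can no longer be canceled,
 * so report DLM_CANCELGRANT instead of starting a remote cancel. */
static enum dlm_status toy_cancel_convert(struct toy_lock *lock)
{
	if (lock->on_list == TOY_GRANTED)
		return DLM_CANCELGRANT; /* too late: already granted */
	lock->cancel_pending = 1;       /* otherwise proceed with the cancel */
	return DLM_NORMAL;
}

/* Sketch of the dlm_proxy_ast_handler() side: granting the lock
 * consumes any pending cancel, mirroring the draft hunk above. */
static void toy_proxy_ast_grant(struct toy_lock *lock)
{
	lock->on_list = TOY_GRANTED;
	lock->cancel_pending = 0;
}
```

Under this model a lock can never sit on the granted list with cancel_pending still set, which is the invariant the BUG_ON in dlm_move_lockres_to_recovery_list() asserts.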
piaojun Feb. 15, 2019, 6:50 a.m. UTC | #10
Hi Changwei,

On 2019/2/14 18:13, Changwei Ge wrote:
> On 2019/2/14 17:09, piaojun wrote:
>> Hi Changwei,
>>
>> The problem can't be solved completely if clear ::cancel_pending in
>> dlm_proxy_ast_handler, as AST will come at anytime just before
> 
> So how about also adding a check here, before setting ::cancel_pending in dlmunlock_common() and before invoking dlm_send_remote_unlock_request()?
> If the lock is already on the grant list, just return DLM_CANCELGRANT.
> 
> Then a further reference code might look like:
> 
> root@ubuntu16:/home/chge/linux[master]# git diff
> diff --git a/fs/ocfs2/dlm/dlmast.c b/fs/ocfs2/dlm/dlmast.c
> index 39831fc..812843b 100644
> --- a/fs/ocfs2/dlm/dlmast.c
> +++ b/fs/ocfs2/dlm/dlmast.c
> @@ -372,8 +372,11 @@ int dlm_proxy_ast_handler(struct o2net_msg *msg, u32 len, void *data,
>          head = &res->converting;
>          lock = NULL;
>          list_for_each_entry(lock, head, list) {
> -               if (lock->ml.cookie == cookie)
> +               if (lock->ml.cookie == cookie) {
> +                       if (lock->cancel_pending)
> +                               lock->cancel_pending = 0;
>                          goto do_ast;
> +               }
>          }
> 
>          /* if not on convert, try blocked for ast, granted for bast */
> diff --git a/fs/ocfs2/dlm/dlmunlock.c b/fs/ocfs2/dlm/dlmunlock.c
> index c8e9b70..b4728b5 100644
> --- a/fs/ocfs2/dlm/dlmunlock.c
> +++ b/fs/ocfs2/dlm/dlmunlock.c
> @@ -174,9 +174,14 @@ static enum dlm_status dlmunlock_common(struct dlm_ctxt *dlm,
>          if (!master_node) {
>                  owner = res->owner;
>                  /* drop locks and send message */
> -               if (flags & LKM_CANCEL)
> +               if (flags & LKM_CANCEL) {
> +                       if (dlm_lock_on_list(&res->granted, lock)) {
> +                               status = DLM_CANCELGRANT;
> +                               goto leave;

If the master dies and the lockres is then moved to the granted list in
dlm_move_lockres_to_recovery_list, OCFS2_LOCK_BUSY is not cleared.
This will cause a hang in ocfs2_drop_lock, because the unlock ast
won't be run when DLM_CANCELGRANT is set. So I think we need to
distinguish all the cases of moving a lock to the granted list.

> +                       }
> +
>                          lock->cancel_pending = 1;
> -               else
> +               } else
>                          lock->unlock_pending = 1;
>                  spin_unlock(&lock->spinlock);
>                  spin_unlock(&res->spinlock);
> 
> 
> 
> Thanks,
> Changwei
> 
>> ::cancel_pendig is set. If there is not any other better solutions,
>> could we accept this patch? This bug is very harmful.
>>
>> Thanks,
>> Jun
>>
>> On 2018/12/8 18:05, wangjian wrote:
>>> Hi Changwei,
>>>
>>> I understand your idea. But we should be aware that the cancel_convert process and
>>> other processes (accepting the AST process, the recovery process) are asynchronous.
>>> For example, according to your idea, check if the lock is in the grant queue before
>>> calling the dlm_send_remote_unlock_request function in the dlm_proxy_ast_handler function.
>>> Then decide whether to clear cancel_pending. But if the AST does not come at this time,
>>> the check passes and cancel_pendig will not be cleared. Then AST immediately came over again,
>>> which also led to a bug. I personally think that for asynchronous processes we can't guarantee
>>> the speed of execution of each process. All we can do is to avoid the BUG scene.
>>>
>>> As for the question you said ("If you remove the BUG check why you still call dlm_commit_pending_cancel()
>>> to move the lock back to grant on matter it's on converting list or not?").
>>> I think we should first check if the lock is in the grant queue
>>> (at this time, dlm->spinlock and res->spinlock have been added), then decide whether to call
>>> dlm_commit_pending_cancel function.
>>>
>>> Thanks,
>>> Jian
>>>
>>> On 12/7/2018 11:12 AM, Changwei Ge wrote:
>>>> Hi Jian,
>>>>
>>>> I suppose that the situation you described truly exists.
>>>> But the way you fix the issue is not in my favor.
>>>>
>>>> If you remove the BUG check why you still call dlm_commit_pending_cancel() to
>>>> move the lock back to grant on matter it's on converting list or not?
>>>>
>>>> So how about keeping the BUG check in dlm_move_lockres_to_recovery_list().
>>>> If the locking _ast_ comes back very fast just check ::cancel_pending in dlm_proxy_ast_handler() and clear it.
>>>> Then with the logic checking if the lock is on grant list (in dlmunlock_common() more less like your current method)
>>>> or not we can easily tell if the cancellation succeeds or not.
>>>>
>>>> That complies the original dlm design, which I think is better and easier for maintainers to understand.
>>>>
>>>> Thanks,
>>>> Changwei
>>>>
>>>> On 2018/12/6 20:06, wangjian wrote:
>>>>> Hi Changwei,
>>>>>
>>>>> I don't fully agree with your point of view. In my opinion, the lock cancellation process and the lock
>>>>> conversion process are asynchronous. We can't guarantee that the lock must be in the convert list
>>>>> during the lock conversion process, otherwise this BUG will not happen.
>>>>> So, I think this is a meaningless BUG.
>>>>>
>>>>> Thanks,
>>>>> Jian
>>>>>
>>>>> On 12/5/2018 9:49 AM, Changwei Ge wrote:
>>>>>> Hi Jian,
>>>>>>
>>>>>> I suppose you can't just remove the BUG_ON() check.
>>>>>> If you remove it, below code violates the original logic.
>>>>>>
>>>>>> '''
>>>>>> 2141                                 dlm_commit_pending_cancel(res, lock);
>>>>>> '''
>>>>>>
>>>>>> What's more important is *locking cancel* must be against a *locking conversion* progress.
>>>>>> So it makes sense to check if this lock is on converting list.
>>>>>>
>>>>>> So I have to NACK to this patch.
>>>>>>
>>>>>> Thanks,
>>>>>> Changwei
>>>>>>
>>>>>>
>>>>>> On 2018/12/3 20:23, wangjian wrote:
>>>>>>> In the dlm_move_lockres_to_recovery_list function, if the lock
>>>>>>> is in the granted queue and cancel_pending is set, it will
>>>>>>> encounter a BUG. I think this is a meaningless BUG,
>>>>>>> so be prepared to remove it. A scenario that causes
>>>>>>> this BUG will be given below.
>>>>>>>
>>>>>>> At the beginning, Node 1 is the master and has NL lock,
>>>>>>> Node 2 has PR lock, Node 3 has PR lock too.
>>>>>>>
>>>>>>> Node 1          Node 2          Node 3
>>>>>>>                want to get EX lock.
>>>>>>>
>>>>>>>                                want to get EX lock.
>>>>>>>
>>>>>>> Node 3 hinder
>>>>>>> Node 2 to get
>>>>>>> EX lock, send
>>>>>>> Node 3 a BAST.
>>>>>>>
>>>>>>>                                receive BAST from
>>>>>>>                                Node 1. downconvert
>>>>>>>                                thread begin to
>>>>>>>                                cancel PR to EX conversion.
>>>>>>>                                In dlmunlock_common function,
>>>>>>>                                downconvert thread has set
>>>>>>>                                lock->cancel_pending,
>>>>>>>                                but did not enter
>>>>>>>                                dlm_send_remote_unlock_request
>>>>>>>                                function.
>>>>>>>
>>>>>>>                Node2 dies because
>>>>>>>                the host is powered down.
>>>>>>>
>>>>>>> In recovery process,
>>>>>>> clean the lock that
>>>>>>> related to Node2.
>>>>>>> then finish Node 3
>>>>>>> PR to EX request.
>>>>>>> give Node 3 a AST.
>>>>>>>
>>>>>>>                                receive AST from Node 1.
>>>>>>>                                change lock level to EX,
>>>>>>>                                move lock to granted list.
>>>>>>>
>>>>>>> Node1 dies because
>>>>>>> the host is powered down.
>>>>>>>
>>>>>>>                                In dlm_move_lockres_to_recovery_list
>>>>>>>                                function. the lock is in the
>>>>>>>                                granted queue and cancel_pending
>>>>>>>                                is set. BUG_ON.
>>>>>>>
>>>>>>> But after clearing this BUG, process will encounter
>>>>>>> the second BUG in the ocfs2_unlock_ast function.
>>>>>>> Here is a scenario that will cause the second BUG
>>>>>>> in ocfs2_unlock_ast as follows:
>>>>>>>
>>>>>>> At the beginning, Node 1 is the master and has NL lock,
>>>>>>> Node 2 has PR lock, Node 3 has PR lock too.
>>>>>>>
>>>>>>> Node 1          Node 2          Node 3
>>>>>>>                want to get EX lock.
>>>>>>>
>>>>>>>                                want to get EX lock.
>>>>>>>
>>>>>>> Node 3 hinder
>>>>>>> Node 2 to get
>>>>>>> EX lock, send
>>>>>>> Node 3 a BAST.
>>>>>>>
>>>>>>>                                receive BAST from
>>>>>>>                                Node 1. downconvert
>>>>>>>                                thread begin to
>>>>>>>                                cancel PR to EX conversion.
>>>>>>>                                In dlmunlock_common function,
>>>>>>>                                downconvert thread has released
>>>>>>>                                lock->spinlock and res->spinlock,
>>>>>>>                                but did not enter
>>>>>>>                                dlm_send_remote_unlock_request
>>>>>>>                                function.
>>>>>>>
>>>>>>>                Node2 dies because
>>>>>>>                the host is powered down.
>>>>>>>
>>>>>>> In recovery process,
>>>>>>> clean the lock that
>>>>>>> related to Node2.
>>>>>>> then finish Node 3
>>>>>>> PR to EX request.
>>>>>>> give Node 3 a AST.
>>>>>>>
>>>>>>>                                receive AST from Node 1.
>>>>>>>                                change lock level to EX,
>>>>>>>                                move lock to granted list,
>>>>>>>                                set lockres->l_unlock_action
>>>>>>>                                as OCFS2_UNLOCK_INVALID
>>>>>>>                                in ocfs2_locking_ast function.
>>>>>>>
>>>>>>> Node2 dies because
>>>>>>> the host is powered down.
>>>>>>>
>>>>>>>                                Node 3 realize that Node 1
>>>>>>>                                is dead, remove Node 1 from
>>>>>>>                                domain_map. downconvert thread
>>>>>>>                                get DLM_NORMAL from
>>>>>>>                                dlm_send_remote_unlock_request
>>>>>>>                                function and set *call_ast as 1.
>>>>>>>                                Then downconvert thread meet
>>>>>>>                                BUG in ocfs2_unlock_ast function.
>>>>>>>
>>>>>>> To avoid meet the second BUG, function dlmunlock_common shuold
>>>>>>> return DLM_CANCELGRANT if the lock is on granted list and
>>>>>>> the operation is canceled.
>>>>>>>
>>>>>>> Signed-off-by: Jian Wang<wangjian161@huawei.com>
>>>>>>> Reviewed-by: Yiwen Jiang<jiangyiwen@huawei.com>
>>>>>>> ---
>>>>>>>     fs/ocfs2/dlm/dlmrecovery.c | 1 -
>>>>>>>     fs/ocfs2/dlm/dlmunlock.c   | 5 +++++
>>>>>>>     2 files changed, 5 insertions(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm/dlmrecovery.c
>>>>>>> index 802636d..7489652 100644
>>>>>>> --- a/fs/ocfs2/dlm/dlmrecovery.c
>>>>>>> +++ b/fs/ocfs2/dlm/dlmrecovery.c
>>>>>>> @@ -2134,7 +2134,6 @@ void dlm_move_lockres_to_recovery_list(struct dlm_ctxt *dlm,
>>>>>>>     				 * if this had completed successfully
>>>>>>>     				 * before sending this lock state to the
>>>>>>>     				 * new master */
>>>>>>> -				BUG_ON(i != DLM_CONVERTING_LIST);
>>>>>>>     				mlog(0, "node died with cancel pending "
>>>>>>>     				     "on %.*s. move back to granted list.\n",
>>>>>>>     				     res->lockname.len, res->lockname.name);
>>>>>>> diff --git a/fs/ocfs2/dlm/dlmunlock.c b/fs/ocfs2/dlm/dlmunlock.c
>>>>>>> index 63d701c..505bb6c 100644
>>>>>>> --- a/fs/ocfs2/dlm/dlmunlock.c
>>>>>>> +++ b/fs/ocfs2/dlm/dlmunlock.c
>>>>>>> @@ -183,6 +183,11 @@ static enum dlm_status dlmunlock_common(struct dlm_ctxt *dlm,
>>>>>>>     							flags, owner);
>>>>>>>     		spin_lock(&res->spinlock);
>>>>>>>     		spin_lock(&lock->spinlock);
>>>>>>> +
>>>>>>> +		if ((flags & LKM_CANCEL) &&
>>>>>>> +				dlm_lock_on_list(&res->granted, lock))
>>>>>>> +			status = DLM_CANCELGRANT;
>>>>>>> +
>>>>>>>     		/* if the master told us the lock was already granted,
>>>>>>>     		 * let the ast handle all of these actions */
>>>>>>>     		if (status == DLM_CANCELGRANT) {
>>>>>>> -- 
>>>>>>> 1.8.3.1
>>>>>>>
>>>>>> .
>>>>>>
>>>> .
>>>>
>>>
>>>
>>>
>>
> .
>
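[Editor's note: the BUSY-flag concern raised in the message above comes down to who completes the request. The locking AST is the path that clears the busy flag, so a cancel that returns DLM_CANCELGRANT must not fire the unlock AST for the same request, or the request would be completed twice. A toy model, with illustrative names rather than the real ocfs2 API:]

```c
#include <assert.h>

enum dlm_status { DLM_NORMAL, DLM_CANCELGRANT };

struct toy_lockres {
	int busy;        /* stands in for the OCFS2_LOCK_BUSY flag */
	int completions; /* how many times the request was completed via an AST */
};

/* The locking AST completes the conversion and clears the busy flag,
 * as ocfs2_locking_ast() does in the real code. */
static void toy_locking_ast(struct toy_lockres *res)
{
	res->busy = 0;
	res->completions++;
}

/* The cancel path: on DLM_CANCELGRANT the locking AST has already run
 * (or will run) for this request, so firing the unlock AST as well
 * would complete the same request twice. */
static void toy_cancel_done(struct toy_lockres *res, enum dlm_status status)
{
	if (status == DLM_CANCELGRANT)
		return; /* the locking AST handles everything */
	res->busy = 0;
	res->completions++;
}
```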
Changwei Ge Feb. 15, 2019, 7:06 a.m. UTC | #11
Hi Jun,

On 2019/2/15 14:51, piaojun wrote:
> Hi Changwei,
> 
> On 2019/2/14 18:13, Changwei Ge wrote:
>> On 2019/2/14 17:09, piaojun wrote:
>>> Hi Changwei,
>>>
>>> The problem can't be solved completely if clear ::cancel_pending in
>>> dlm_proxy_ast_handler, as AST will come at anytime just before
>>
>> So how about also add check here bere setting ::cancel_pending in dlmunlock_common() before invoking dlm_send_remote_unlock_request().
>> If already on grant list just return DLM_CANCELGRANT
>>
>> Then a further reference code might look like:
>>
>> root@ubuntu16:/home/chge/linux[master]# git diff
>> diff --git a/fs/ocfs2/dlm/dlmast.c b/fs/ocfs2/dlm/dlmast.c
>> index 39831fc..812843b 100644
>> --- a/fs/ocfs2/dlm/dlmast.c
>> +++ b/fs/ocfs2/dlm/dlmast.c
>> @@ -372,8 +372,11 @@ int dlm_proxy_ast_handler(struct o2net_msg *msg, u32 len, void *data,
>>           head = &res->converting;
>>           lock = NULL;
>>           list_for_each_entry(lock, head, list) {
>> -               if (lock->ml.cookie == cookie)
>> +               if (lock->ml.cookie == cookie) {
>> +                       if (lock->cancel_pending)
>> +                               lock->cancel_pending = 0;
>>                           goto do_ast;
>> +               }
>>           }
>>
>>           /* if not on convert, try blocked for ast, granted for bast */
>> diff --git a/fs/ocfs2/dlm/dlmunlock.c b/fs/ocfs2/dlm/dlmunlock.c
>> index c8e9b70..b4728b5 100644
>> --- a/fs/ocfs2/dlm/dlmunlock.c
>> +++ b/fs/ocfs2/dlm/dlmunlock.c
>> @@ -174,9 +174,14 @@ static enum dlm_status dlmunlock_common(struct dlm_ctxt *dlm,
>>           if (!master_node) {
>>                   owner = res->owner;
>>                   /* drop locks and send message */
>> -               if (flags & LKM_CANCEL)
>> +               if (flags & LKM_CANCEL) {
>> +                       if (dlm_lock_on_list(&res->granted, lock)) {
>> +                               status = DLM_CANCELGRANT;
>> +                               goto leave;
> 
> If master dead and then lockres is moved to granted list in
> dlm_move_lockres_to_recovery_list, the OCFS2_LOCK_BUSY is not cleared.

OCFS2_LOCK_BUSY should be cleared in ocfs2_locking_ast(), since the previous locking AST has come back (moving the lock to the grant list).
That's why we return DLM_CANCELGRANT to the caller, to avoid calling the AST again. Otherwise ast() would be called twice, which is obviously a bug.

> And this will cause stuck problem when ocfs2_drop_lock. The reason is
> that unlock ast won't be done when DLM_CANCELGRANT is set. So I think

With the above elaboration, you don't have to worry about the hang issue anymore.

Thanks,
Changwei

> we need distinguish all the cases of moving lockres to grant list.
> 
>> +                       }
>> +
>>                           lock->cancel_pending = 1;
>> -               else
>> +               } else
>>                           lock->unlock_pending = 1;
>>                   spin_unlock(&lock->spinlock);
>>                   spin_unlock(&res->spinlock);
>>
>>
>>
>> Thanks,
>> Changwei
>>
>>> ::cancel_pendig is set. If there is not any other better solutions,
>>> could we accept this patch? This bug is very harmful.
>>>
>>> Thanks,
>>> Jun
>>>
>>> On 2018/12/8 18:05, wangjian wrote:
>>>> Hi Changwei,
>>>>
>>>> I understand your idea. But we should be aware that the cancel_convert process and
>>>> other processes (accepting the AST process, the recovery process) are asynchronous.
>>>> For example, according to your idea, check if the lock is in the grant queue before
>>>> calling the dlm_send_remote_unlock_request function in the dlm_proxy_ast_handler function.
>>>> Then decide whether to clear cancel_pending. But if the AST does not come at this time,
>>>> the check passes and cancel_pendig will not be cleared. Then AST immediately came over again,
>>>> which also led to a bug. I personally think that for asynchronous processes we can't guarantee
>>>> the speed of execution of each process. All we can do is to avoid the BUG scene.
>>>>
>>>> As for the question you said ("If you remove the BUG check why you still call dlm_commit_pending_cancel()
>>>> to move the lock back to grant on matter it's on converting list or not?").
>>>> I think we should first check if the lock is in the grant queue
>>>> (at this time, dlm->spinlock and res->spinlock have been added), then decide whether to call
>>>> dlm_commit_pending_cancel function.
>>>>
>>>> Thanks,
>>>> Jian
>>>>
>>>> On 12/7/2018 11:12 AM, Changwei Ge wrote:
>>>>> Hi Jian,
>>>>>
>>>>> I suppose that the situation you described truly exists.
>>>>> But the way you fix the issue is not in my favor.
>>>>>
>>>>> If you remove the BUG check why you still call dlm_commit_pending_cancel() to
>>>>> move the lock back to grant on matter it's on converting list or not?
>>>>>
>>>>> So how about keeping the BUG check in dlm_move_lockres_to_recovery_list().
>>>>> If the locking _ast_ comes back very fast just check ::cancel_pending in dlm_proxy_ast_handler() and clear it.
>>>>> Then with the logic checking if the lock is on grant list (in dlmunlock_common() more less like your current method)
>>>>> or not we can easily tell if the cancellation succeeds or not.
>>>>>
>>>>> That complies the original dlm design, which I think is better and easier for maintainers to understand.
>>>>>
>>>>> Thanks,
>>>>> Changwei
>>>>>
>>>>> On 2018/12/6 20:06, wangjian wrote:
>>>>>> Hi Changwei,
>>>>>>
>>>>>> I don't fully agree with your point of view. In my opinion, the lock cancellation process and the lock
>>>>>> conversion process are asynchronous. We can't guarantee that the lock must be in the convert list
>>>>>> during the lock conversion process, otherwise this BUG will not happen.
>>>>>> So, I think this is a meaningless BUG.
>>>>>>
>>>>>> Thanks,
>>>>>> Jian
>>>>>>
>>>>>> On 12/5/2018 9:49 AM, Changwei Ge wrote:
>>>>>>> Hi Jian,
>>>>>>>
>>>>>>> I suppose you can't just remove the BUG_ON() check.
>>>>>>> If you remove it, below code violates the original logic.
>>>>>>>
>>>>>>> '''
>>>>>>> 2141                                 dlm_commit_pending_cancel(res, lock);
>>>>>>> '''
>>>>>>>
>>>>>>> What's more important is *locking cancel* must be against a *locking conversion* progress.
>>>>>>> So it makes sense to check if this lock is on converting list.
>>>>>>>
>>>>>>> So I have to NACK to this patch.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Changwei
>>>>>>>
>>>>>>>
>>>>>>> On 2018/12/3 20:23, wangjian wrote:
>>>>>>>> In the dlm_move_lockres_to_recovery_list function, if the lock
>>>>>>>> is in the granted queue and cancel_pending is set, it will
>>>>>>>> encounter a BUG. I think this is a meaningless BUG,
>>>>>>>> so be prepared to remove it. A scenario that causes
>>>>>>>> this BUG will be given below.
>>>>>>>>
>>>>>>>> At the beginning, Node 1 is the master and has NL lock,
>>>>>>>> Node 2 has PR lock, Node 3 has PR lock too.
>>>>>>>>
>>>>>>>> Node 1          Node 2          Node 3
>>>>>>>>                 want to get EX lock.
>>>>>>>>
>>>>>>>>                                 want to get EX lock.
>>>>>>>>
>>>>>>>> Node 3 hinders
>>>>>>>> Node 2 from getting
>>>>>>>> EX lock, send
>>>>>>>> Node 3 a BAST.
>>>>>>>>
>>>>>>>>                                 receive BAST from
>>>>>>>>                                 Node 1. downconvert
>>>>>>>>                                 thread begin to
>>>>>>>>                                 cancel PR to EX conversion.
>>>>>>>>                                 In dlmunlock_common function,
>>>>>>>>                                 downconvert thread has set
>>>>>>>>                                 lock->cancel_pending,
>>>>>>>>                                 but did not enter
>>>>>>>>                                 dlm_send_remote_unlock_request
>>>>>>>>                                 function.
>>>>>>>>
>>>>>>>>                 Node2 dies because
>>>>>>>>                 the host is powered down.
>>>>>>>>
>>>>>>>> In recovery process,
>>>>>>>> clean the lock that
>>>>>>>> related to Node2.
>>>>>>>> then finish Node 3
>>>>>>>> PR to EX request.
>>>>>>>> give Node 3 an AST.
>>>>>>>>
>>>>>>>>                                 receive AST from Node 1.
>>>>>>>>                                 change lock level to EX,
>>>>>>>>                                 move lock to granted list.
>>>>>>>>
>>>>>>>> Node1 dies because
>>>>>>>> the host is powered down.
>>>>>>>>
>>>>>>>>                                 In dlm_move_lockres_to_recovery_list
>>>>>>>>                                 function. the lock is in the
>>>>>>>>                                 granted queue and cancel_pending
>>>>>>>>                                 is set. BUG_ON.
>>>>>>>>
>>>>>>>> But after clearing this BUG, process will encounter
>>>>>>>> the second BUG in the ocfs2_unlock_ast function.
>>>>>>>> Here is a scenario that will cause the second BUG
>>>>>>>> in ocfs2_unlock_ast as follows:
>>>>>>>>
>>>>>>>> At the beginning, Node 1 is the master and has NL lock,
>>>>>>>> Node 2 has PR lock, Node 3 has PR lock too.
>>>>>>>>
>>>>>>>> Node 1          Node 2          Node 3
>>>>>>>>                 want to get EX lock.
>>>>>>>>
>>>>>>>>                                 want to get EX lock.
>>>>>>>>
>>>>>>>> Node 3 hinders
>>>>>>>> Node 2 from getting
>>>>>>>> EX lock, send
>>>>>>>> Node 3 a BAST.
>>>>>>>>
>>>>>>>>                                 receive BAST from
>>>>>>>>                                 Node 1. downconvert
>>>>>>>>                                 thread begin to
>>>>>>>>                                 cancel PR to EX conversion.
>>>>>>>>                                 In dlmunlock_common function,
>>>>>>>>                                 downconvert thread has released
>>>>>>>>                                 lock->spinlock and res->spinlock,
>>>>>>>>                                 but did not enter
>>>>>>>>                                 dlm_send_remote_unlock_request
>>>>>>>>                                 function.
>>>>>>>>
>>>>>>>>                 Node2 dies because
>>>>>>>>                 the host is powered down.
>>>>>>>>
>>>>>>>> In recovery process,
>>>>>>>> clean the lock that
>>>>>>>> related to Node2.
>>>>>>>> then finish Node 3
>>>>>>>> PR to EX request.
>>>>>>>> give Node 3 an AST.
>>>>>>>>
>>>>>>>>                                 receive AST from Node 1.
>>>>>>>>                                 change lock level to EX,
>>>>>>>>                                 move lock to granted list,
>>>>>>>>                                 set lockres->l_unlock_action
>>>>>>>>                                 as OCFS2_UNLOCK_INVALID
>>>>>>>>                                 in ocfs2_locking_ast function.
>>>>>>>>
>>>>>>>> Node 1 dies because
>>>>>>>> the host is powered down.
>>>>>>>>
>>>>>>>>                                 Node 3 realize that Node 1
>>>>>>>>                                 is dead, remove Node 1 from
>>>>>>>>                                 domain_map. downconvert thread
>>>>>>>>                                 get DLM_NORMAL from
>>>>>>>>                                 dlm_send_remote_unlock_request
>>>>>>>>                                 function and set *call_ast as 1.
>>>>>>>>                                 Then downconvert thread meet
>>>>>>>>                                 BUG in ocfs2_unlock_ast function.
>>>>>>>>
>>>>>>>> To avoid hitting the second BUG, function dlmunlock_common should
>>>>>>>> return DLM_CANCELGRANT if the lock is on granted list and
>>>>>>>> the operation is canceled.
>>>>>>>>
>>>>>>>> Signed-off-by: Jian Wang<wangjian161@huawei.com>
>>>>>>>> Reviewed-by: Yiwen Jiang<jiangyiwen@huawei.com>
>>>>>>>> ---
>>>>>>>>      fs/ocfs2/dlm/dlmrecovery.c | 1 -
>>>>>>>>      fs/ocfs2/dlm/dlmunlock.c   | 5 +++++
>>>>>>>>      2 files changed, 5 insertions(+), 1 deletion(-)
>>>>>>>>
>>>>>>>> diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm/dlmrecovery.c
>>>>>>>> index 802636d..7489652 100644
>>>>>>>> --- a/fs/ocfs2/dlm/dlmrecovery.c
>>>>>>>> +++ b/fs/ocfs2/dlm/dlmrecovery.c
>>>>>>>> @@ -2134,7 +2134,6 @@ void dlm_move_lockres_to_recovery_list(struct dlm_ctxt *dlm,
>>>>>>>>      				 * if this had completed successfully
>>>>>>>>      				 * before sending this lock state to the
>>>>>>>>      				 * new master */
>>>>>>>> -				BUG_ON(i != DLM_CONVERTING_LIST);
>>>>>>>>      				mlog(0, "node died with cancel pending "
>>>>>>>>      				     "on %.*s. move back to granted list.\n",
>>>>>>>>      				     res->lockname.len, res->lockname.name);
>>>>>>>> diff --git a/fs/ocfs2/dlm/dlmunlock.c b/fs/ocfs2/dlm/dlmunlock.c
>>>>>>>> index 63d701c..505bb6c 100644
>>>>>>>> --- a/fs/ocfs2/dlm/dlmunlock.c
>>>>>>>> +++ b/fs/ocfs2/dlm/dlmunlock.c
>>>>>>>> @@ -183,6 +183,11 @@ static enum dlm_status dlmunlock_common(struct dlm_ctxt *dlm,
>>>>>>>>      							flags, owner);
>>>>>>>>      		spin_lock(&res->spinlock);
>>>>>>>>      		spin_lock(&lock->spinlock);
>>>>>>>> +
>>>>>>>> +		if ((flags & LKM_CANCEL) &&
>>>>>>>> +				dlm_lock_on_list(&res->granted, lock))
>>>>>>>> +			status = DLM_CANCELGRANT;
>>>>>>>> +
>>>>>>>>      		/* if the master told us the lock was already granted,
>>>>>>>>      		 * let the ast handle all of these actions */
>>>>>>>>      		if (status == DLM_CANCELGRANT) {
>>>>>>>> -- 
>>>>>>>> 1.8.3.1
>>>>>>>>
>>>>>>> .
>>>>>>>
>>>>> .
>>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> Ocfs2-devel mailing list
>>>> Ocfs2-devel@oss.oracle.com
>>>> https://oss.oracle.com/mailman/listinfo/ocfs2-devel
>>>>
>>>
>> .
>>
>
piaojun Feb. 15, 2019, 7:35 a.m. UTC | #12
Hi Changwei,

On 2019/2/15 15:06, Changwei Ge wrote:
> Hi Jun,
> 
> On 2019/2/15 14:51, piaojun wrote:
>> Hi Changwei,
>>
>> On 2019/2/14 18:13, Changwei Ge wrote:
>>> On 2019/2/14 17:09, piaojun wrote:
>>>> Hi Changwei,
>>>>
>>>> The problem can't be solved completely by clearing ::cancel_pending in
>>>> dlm_proxy_ast_handler, as the AST can come at any time just before
>>>
>>> So how about also adding a check here before setting ::cancel_pending in dlmunlock_common(), before invoking dlm_send_remote_unlock_request()?
>>> If the lock is already on the granted list, just return DLM_CANCELGRANT.
>>>
>>> Then a further reference code might look like:
>>>
>>> root@ubuntu16:/home/chge/linux[master]# git diff
>>> diff --git a/fs/ocfs2/dlm/dlmast.c b/fs/ocfs2/dlm/dlmast.c
>>> index 39831fc..812843b 100644
>>> --- a/fs/ocfs2/dlm/dlmast.c
>>> +++ b/fs/ocfs2/dlm/dlmast.c
>>> @@ -372,8 +372,11 @@ int dlm_proxy_ast_handler(struct o2net_msg *msg, u32 len, void *data,
>>>           head = &res->converting;
>>>           lock = NULL;
>>>           list_for_each_entry(lock, head, list) {
>>> -               if (lock->ml.cookie == cookie)
>>> +               if (lock->ml.cookie == cookie) {
>>> +                       if (lock->cancel_pending)
>>> +                               lock->cancel_pending = 0;
>>>                           goto do_ast;
>>> +               }
>>>           }
>>>
>>>           /* if not on convert, try blocked for ast, granted for bast */
>>> diff --git a/fs/ocfs2/dlm/dlmunlock.c b/fs/ocfs2/dlm/dlmunlock.c
>>> index c8e9b70..b4728b5 100644
>>> --- a/fs/ocfs2/dlm/dlmunlock.c
>>> +++ b/fs/ocfs2/dlm/dlmunlock.c
>>> @@ -174,9 +174,14 @@ static enum dlm_status dlmunlock_common(struct dlm_ctxt *dlm,
>>>           if (!master_node) {
>>>                   owner = res->owner;
>>>                   /* drop locks and send message */
>>> -               if (flags & LKM_CANCEL)
>>> +               if (flags & LKM_CANCEL) {
>>> +                       if (dlm_lock_on_list(&res->granted, lock)) {
>>> +                               status = DLM_CANCELGRANT;
>>> +                               goto leave;
>>
>> If the master dies and the lockres is then moved to the granted list in
>> dlm_move_lockres_to_recovery_list, OCFS2_LOCK_BUSY is not cleared.
> 
> OCFS2_LOCK_BUSY should be cleared in ocfs2_locking_ast(), since the previous locking AST has come back (moving the lock to the granted list).
> That's why we return DLM_CANCELGRANT to the caller to avoid calling the AST; otherwise ast() would be called twice, which is obviously a BUG.

I mean the master is already dead, so the AST won't come and
OCFS2_LOCK_BUSY is not cleared. There are two cases in which a lockres is
moved to the granted list:
1. The AST comes back and OCFS2_LOCK_BUSY is cleared.
2. The AST does not come back and OCFS2_LOCK_BUSY remains; the recovery
   process moves the lock to the granted list. In this case, we need to
   do the AST for it.

Perhaps we need to do the AST again for it.

> 
>> And this will cause a hang in ocfs2_drop_lock. The reason is
>> that the unlock AST won't be done when DLM_CANCELGRANT is set. So I think
> 
> With the above elaboration, you don't have to worry about the hang issue anymore.
> 
> Thanks,
> Changwei
> 
>> we need to distinguish all the cases of moving a lockres to the granted list.
>>
>>> +                       }
>>> +
>>>                           lock->cancel_pending = 1;
>>> -               else
>>> +               } else
>>>                           lock->unlock_pending = 1;
>>>                   spin_unlock(&lock->spinlock);
>>>                   spin_unlock(&res->spinlock);
>>>
>>>
>>>
>>> Thanks,
>>> Changwei
>>>
>>>> ::cancel_pending is set. If there isn't any better solution,
>>>> could we accept this patch? This bug is very harmful.
>>>>
>>>> Thanks,
>>>> Jun
>>>>
>>>> On 2018/12/8 18:05, wangjian wrote:
>>>>> Hi Changwei,
>>>>>
>>>>> I understand your idea. But we should be aware that the cancel_convert process and
>>>>> other processes (accepting the AST process, the recovery process) are asynchronous.
>>>>> For example, according to your idea, check if the lock is in the grant queue before
>>>>> calling the dlm_send_remote_unlock_request function in the dlm_proxy_ast_handler function.
>>>>> Then decide whether to clear cancel_pending. But if the AST does not come at this time,
>>>>> the check passes and cancel_pending will not be cleared. Then the AST immediately comes over again,
>>>>> which also leads to a bug. I personally think that for asynchronous processes we can't guarantee
>>>>> the speed of execution of each process. All we can do is to avoid the BUG scene.
>>>>>
>>>>> As for the question you said ("If you remove the BUG check why you still call dlm_commit_pending_cancel()
>>>>> to move the lock back to grant on matter it's on converting list or not?").
>>>>> I think we should first check if the lock is in the grant queue
>>>>> (at this time, dlm->spinlock and res->spinlock are held), then decide whether to call
>>>>> dlm_commit_pending_cancel function.
>>>>>
>>>>> Thanks,
>>>>> Jian
>>>>>
>>>>> On 12/7/2018 11:12 AM, Changwei Ge wrote:
>>>>>> Hi Jian,
>>>>>>
>>>>>> I suppose that the situation you described truly exists.
>>>>>> But the way you fix the issue is not in my favor.
>>>>>>
>>>>>> If you remove the BUG check, why do you still call dlm_commit_pending_cancel() to
>>>>>> move the lock back to granted, no matter whether it's on the converting list or not?
>>>>>>
>>>>>> So how about keeping the BUG check in dlm_move_lockres_to_recovery_list().
>>>>>> If the locking _ast_ comes back very fast just check ::cancel_pending in dlm_proxy_ast_handler() and clear it.
>>>>>> Then with the logic checking if the lock is on the grant list (in dlmunlock_common(), more or less like your current method)
>>>>>> or not we can easily tell if the cancellation succeeds or not.
>>>>>>
>>>>>> That complies with the original dlm design, which I think is better and easier for maintainers to understand.
>>>>>>
>>>>>> Thanks,
>>>>>> Changwei
>>>>>>
>>>>>> On 2018/12/6 20:06, wangjian wrote:
>>>>>>> Hi Changwei,
>>>>>>>
>>>>>>> I don't fully agree with your point of view. In my opinion, the lock cancellation process and the lock
>>>>>>> conversion process are asynchronous. We can't guarantee that the lock must be in the convert list
>>>>>>> during the lock conversion process, otherwise this BUG will not happen.
>>>>>>> So, I think this is a meaningless BUG.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Jian
>>>>>>>
>>>>>>> On 12/5/2018 9:49 AM, Changwei Ge wrote:
>>>>>>>> Hi Jian,
>>>>>>>>
>>>>>>>> I suppose you can't just remove the BUG_ON() check.
>>>>>>>> If you remove it, below code violates the original logic.
>>>>>>>>
>>>>>>>> '''
>>>>>>>> 2141                                 dlm_commit_pending_cancel(res, lock);
>>>>>>>> '''
>>>>>>>>
>>>>>>>> What's more important is *locking cancel* must be against a *locking conversion* progress.
>>>>>>>> So it makes sense to check if this lock is on converting list.
>>>>>>>>
>>>>>>>> So I have to NACK this patch.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Changwei
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2018/12/3 20:23, wangjian wrote:
>>>>>>>>> In the dlm_move_lockres_to_recovery_list function, if the lock
>>>>>>>>> is in the granted queue and cancel_pending is set, it will
>>>>>>>>> encounter a BUG. I think this is a meaningless BUG,
>>>>>>>>> so be prepared to remove it. A scenario that causes
>>>>>>>>> this BUG will be given below.
>>>>>>>>>
>>>>>>>>> At the beginning, Node 1 is the master and has NL lock,
>>>>>>>>> Node 2 has PR lock, Node 3 has PR lock too.
>>>>>>>>>
>>>>>>>>> Node 1          Node 2          Node 3
>>>>>>>>>                 want to get EX lock.
>>>>>>>>>
>>>>>>>>>                                 want to get EX lock.
>>>>>>>>>
>>>>>>>>> Node 3 hinders
>>>>>>>>> Node 2 from getting
>>>>>>>>> EX lock, send
>>>>>>>>> Node 3 a BAST.
>>>>>>>>>
>>>>>>>>>                                 receive BAST from
>>>>>>>>>                                 Node 1. downconvert
>>>>>>>>>                                 thread begin to
>>>>>>>>>                                 cancel PR to EX conversion.
>>>>>>>>>                                 In dlmunlock_common function,
>>>>>>>>>                                 downconvert thread has set
>>>>>>>>>                                 lock->cancel_pending,
>>>>>>>>>                                 but did not enter
>>>>>>>>>                                 dlm_send_remote_unlock_request
>>>>>>>>>                                 function.
>>>>>>>>>
>>>>>>>>>                 Node2 dies because
>>>>>>>>>                 the host is powered down.
>>>>>>>>>
>>>>>>>>> In recovery process,
>>>>>>>>> clean the lock that
>>>>>>>>> related to Node2.
>>>>>>>>> then finish Node 3
>>>>>>>>> PR to EX request.
>>>>>>>>> give Node 3 an AST.
>>>>>>>>>
>>>>>>>>>                                 receive AST from Node 1.
>>>>>>>>>                                 change lock level to EX,
>>>>>>>>>                                 move lock to granted list.
>>>>>>>>>
>>>>>>>>> Node1 dies because
>>>>>>>>> the host is powered down.
>>>>>>>>>
>>>>>>>>>                                 In dlm_move_lockres_to_recovery_list
>>>>>>>>>                                 function. the lock is in the
>>>>>>>>>                                 granted queue and cancel_pending
>>>>>>>>>                                 is set. BUG_ON.
>>>>>>>>>
>>>>>>>>> But after clearing this BUG, process will encounter
>>>>>>>>> the second BUG in the ocfs2_unlock_ast function.
>>>>>>>>> Here is a scenario that will cause the second BUG
>>>>>>>>> in ocfs2_unlock_ast as follows:
>>>>>>>>>
>>>>>>>>> At the beginning, Node 1 is the master and has NL lock,
>>>>>>>>> Node 2 has PR lock, Node 3 has PR lock too.
>>>>>>>>>
>>>>>>>>> Node 1          Node 2          Node 3
>>>>>>>>>                 want to get EX lock.
>>>>>>>>>
>>>>>>>>>                                 want to get EX lock.
>>>>>>>>>
>>>>>>>>> Node 3 hinders
>>>>>>>>> Node 2 from getting
>>>>>>>>> EX lock, send
>>>>>>>>> Node 3 a BAST.
>>>>>>>>>
>>>>>>>>>                                 receive BAST from
>>>>>>>>>                                 Node 1. downconvert
>>>>>>>>>                                 thread begin to
>>>>>>>>>                                 cancel PR to EX conversion.
>>>>>>>>>                                 In dlmunlock_common function,
>>>>>>>>>                                 downconvert thread has released
>>>>>>>>>                                 lock->spinlock and res->spinlock,
>>>>>>>>>                                 but did not enter
>>>>>>>>>                                 dlm_send_remote_unlock_request
>>>>>>>>>                                 function.
>>>>>>>>>
>>>>>>>>>                 Node2 dies because
>>>>>>>>>                 the host is powered down.
>>>>>>>>>
>>>>>>>>> In recovery process,
>>>>>>>>> clean the lock that
>>>>>>>>> related to Node2.
>>>>>>>>> then finish Node 3
>>>>>>>>> PR to EX request.
>>>>>>>>> give Node 3 an AST.
>>>>>>>>>
>>>>>>>>>                                 receive AST from Node 1.
>>>>>>>>>                                 change lock level to EX,
>>>>>>>>>                                 move lock to granted list,
>>>>>>>>>                                 set lockres->l_unlock_action
>>>>>>>>>                                 as OCFS2_UNLOCK_INVALID
>>>>>>>>>                                 in ocfs2_locking_ast function.
>>>>>>>>>
>>>>>>>>> Node 1 dies because
>>>>>>>>> the host is powered down.
>>>>>>>>>
>>>>>>>>>                                 Node 3 realize that Node 1
>>>>>>>>>                                 is dead, remove Node 1 from
>>>>>>>>>                                 domain_map. downconvert thread
>>>>>>>>>                                 get DLM_NORMAL from
>>>>>>>>>                                 dlm_send_remote_unlock_request
>>>>>>>>>                                 function and set *call_ast as 1.
>>>>>>>>>                                 Then downconvert thread meet
>>>>>>>>>                                 BUG in ocfs2_unlock_ast function.
>>>>>>>>>
>>>>>>>>> To avoid hitting the second BUG, function dlmunlock_common should
>>>>>>>>> return DLM_CANCELGRANT if the lock is on granted list and
>>>>>>>>> the operation is canceled.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Jian Wang<wangjian161@huawei.com>
>>>>>>>>> Reviewed-by: Yiwen Jiang<jiangyiwen@huawei.com>
>>>>>>>>> ---
>>>>>>>>>      fs/ocfs2/dlm/dlmrecovery.c | 1 -
>>>>>>>>>      fs/ocfs2/dlm/dlmunlock.c   | 5 +++++
>>>>>>>>>      2 files changed, 5 insertions(+), 1 deletion(-)
>>>>>>>>>
>>>>>>>>> diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm/dlmrecovery.c
>>>>>>>>> index 802636d..7489652 100644
>>>>>>>>> --- a/fs/ocfs2/dlm/dlmrecovery.c
>>>>>>>>> +++ b/fs/ocfs2/dlm/dlmrecovery.c
>>>>>>>>> @@ -2134,7 +2134,6 @@ void dlm_move_lockres_to_recovery_list(struct dlm_ctxt *dlm,
>>>>>>>>>      				 * if this had completed successfully
>>>>>>>>>      				 * before sending this lock state to the
>>>>>>>>>      				 * new master */
>>>>>>>>> -				BUG_ON(i != DLM_CONVERTING_LIST);
>>>>>>>>>      				mlog(0, "node died with cancel pending "
>>>>>>>>>      				     "on %.*s. move back to granted list.\n",
>>>>>>>>>      				     res->lockname.len, res->lockname.name);
>>>>>>>>> diff --git a/fs/ocfs2/dlm/dlmunlock.c b/fs/ocfs2/dlm/dlmunlock.c
>>>>>>>>> index 63d701c..505bb6c 100644
>>>>>>>>> --- a/fs/ocfs2/dlm/dlmunlock.c
>>>>>>>>> +++ b/fs/ocfs2/dlm/dlmunlock.c
>>>>>>>>> @@ -183,6 +183,11 @@ static enum dlm_status dlmunlock_common(struct dlm_ctxt *dlm,
>>>>>>>>>      							flags, owner);
>>>>>>>>>      		spin_lock(&res->spinlock);
>>>>>>>>>      		spin_lock(&lock->spinlock);
>>>>>>>>> +
>>>>>>>>> +		if ((flags & LKM_CANCEL) &&
>>>>>>>>> +				dlm_lock_on_list(&res->granted, lock))
>>>>>>>>> +			status = DLM_CANCELGRANT;
>>>>>>>>> +
>>>>>>>>>      		/* if the master told us the lock was already granted,
>>>>>>>>>      		 * let the ast handle all of these actions */
>>>>>>>>>      		if (status == DLM_CANCELGRANT) {
>>>>>>>>> -- 
>>>>>>>>> 1.8.3.1
>>>>>>>>>
>>>>>>>> .
>>>>>>>>
>>>>>> .
>>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Ocfs2-devel mailing list
>>>>> Ocfs2-devel@oss.oracle.com
>>>>> https://oss.oracle.com/mailman/listinfo/ocfs2-devel
>>>>>
>>>>
>>> .
>>>
>>
> .
>
Changwei Ge Feb. 15, 2019, 7:56 a.m. UTC | #13
Hi Jun

I just read the code around unlock/cancel.

On 2019/2/15 15:35, piaojun wrote:
> Hi Changwei,
> 
> On 2019/2/15 15:06, Changwei Ge wrote:
>> Hi Jun,
>>
>> On 2019/2/15 14:51, piaojun wrote:
>>> Hi Changwei,
>>>
>>> On 2019/2/14 18:13, Changwei Ge wrote:
>>>> On 2019/2/14 17:09, piaojun wrote:
>>>>> Hi Changwei,
>>>>>
>>>>> The problem can't be solved completely by clearing ::cancel_pending in
>>>>> dlm_proxy_ast_handler, as the AST can come at any time just before
>>>>
>>>> So how about also adding a check here before setting ::cancel_pending in dlmunlock_common(), before invoking dlm_send_remote_unlock_request()?
>>>> If the lock is already on the granted list, just return DLM_CANCELGRANT.
>>>>
>>>> Then a further reference code might look like:
>>>>
>>>> root@ubuntu16:/home/chge/linux[master]# git diff
>>>> diff --git a/fs/ocfs2/dlm/dlmast.c b/fs/ocfs2/dlm/dlmast.c
>>>> index 39831fc..812843b 100644
>>>> --- a/fs/ocfs2/dlm/dlmast.c
>>>> +++ b/fs/ocfs2/dlm/dlmast.c
>>>> @@ -372,8 +372,11 @@ int dlm_proxy_ast_handler(struct o2net_msg *msg, u32 len, void *data,
>>>>            head = &res->converting;
>>>>            lock = NULL;
>>>>            list_for_each_entry(lock, head, list) {
>>>> -               if (lock->ml.cookie == cookie)
>>>> +               if (lock->ml.cookie == cookie) {
>>>> +                       if (lock->cancel_pending)
>>>> +                               lock->cancel_pending = 0;
>>>>                            goto do_ast;
>>>> +               }
>>>>            }
>>>>
>>>>            /* if not on convert, try blocked for ast, granted for bast */
>>>> diff --git a/fs/ocfs2/dlm/dlmunlock.c b/fs/ocfs2/dlm/dlmunlock.c
>>>> index c8e9b70..b4728b5 100644
>>>> --- a/fs/ocfs2/dlm/dlmunlock.c
>>>> +++ b/fs/ocfs2/dlm/dlmunlock.c
>>>> @@ -174,9 +174,14 @@ static enum dlm_status dlmunlock_common(struct dlm_ctxt *dlm,
>>>>            if (!master_node) {
>>>>                    owner = res->owner;
>>>>                    /* drop locks and send message */
>>>> -               if (flags & LKM_CANCEL)
>>>> +               if (flags & LKM_CANCEL) {
>>>> +                       if (dlm_lock_on_list(&res->granted, lock)) {
>>>> +                               status = DLM_CANCELGRANT;
>>>> +                               goto leave;

I found that the above code should be unnecessary,
as the upstream code already takes into consideration the case where the AST has come back before the cancellation.
In dlm_get_cancel_actions()
	'''
	} else if (dlm_lock_on_list(&res->granted, lock)) {
		/* too late, already granted. */
		status = DLM_CANCELGRANT;
		*actions = DLM_UNLOCK_CALL_AST;
	} else {
	'''

>>>
>>> If the master dies and the lockres is then moved to the granted list in
>>> dlm_move_lockres_to_recovery_list, OCFS2_LOCK_BUSY is not cleared.
>>
>> OCFS2_LOCK_BUSY should be cleared in ocfs2_locking_ast(), since the previous locking AST has come back (moving the lock to the granted list).
>> That's why we return DLM_CANCELGRANT to the caller to avoid calling the AST; otherwise ast() would be called twice, which is obviously a BUG.
> 
> I mean the master is already dead, so the AST won't come and
> OCFS2_LOCK_BUSY is not cleared. There are two cases in which a lockres is
> moved to the granted list:
> 1. The AST comes back and OCFS2_LOCK_BUSY is cleared.
> 2. The AST does not come back and OCFS2_LOCK_BUSY remains; the recovery
>    process moves the lock to the granted list. In this case, we need to
>    do the AST for it.

For point 2, ocfs2 can handle it, so you still don't have to worry. It's no problem.
In dlmconvert_remote()


341         if (status != DLM_NORMAL) {
342                 if (status != DLM_NOTQUEUED)
343                         dlm_error(status);
344                 dlm_revert_pending_convert(res, lock);
345         } else if (!lock->convert_pending) {
346                 mlog(0, "%s: res %.*s, owner died and lock has been moved back "
347                                 "to granted list, retry convert.\n",
348                                 dlm->name, res->lockname.len, res->lockname.name);
349                 status = DLM_RECOVERING;
350         }

Thus, dlmlock() will wait until RECOVERY is done.
And it's transparent to the ocfs2 layer; it doesn't have to be aware of what happened inside the DLM.

Thanks,
Changwei


> 
> Perhaps we need to do the AST again for it.
> 
>>
>>> And this will cause a hang in ocfs2_drop_lock. The reason is
>>> that the unlock AST won't be done when DLM_CANCELGRANT is set. So I think
>>
>> With the above elaboration, you don't have to worry about the hang issue anymore.
>>
>> Thanks,
>> Changwei
>>
>>> we need to distinguish all the cases of moving a lockres to the granted list.
>>>
>>>> +                       }
>>>> +
>>>>                            lock->cancel_pending = 1;
>>>> -               else
>>>> +               } else
>>>>                            lock->unlock_pending = 1;
>>>>                    spin_unlock(&lock->spinlock);
>>>>                    spin_unlock(&res->spinlock);
>>>>
>>>>
>>>>
>>>> Thanks,
>>>> Changwei
>>>>
>>>>> ::cancel_pending is set. If there isn't any better solution,
>>>>> could we accept this patch? This bug is very harmful.
>>>>>
>>>>> Thanks,
>>>>> Jun
>>>>>
>>>>> On 2018/12/8 18:05, wangjian wrote:
>>>>>> Hi Changwei,
>>>>>>
>>>>>> I understand your idea. But we should be aware that the cancel_convert process and
>>>>>> other processes (accepting the AST process, the recovery process) are asynchronous.
>>>>>> For example, according to your idea, check if the lock is in the grant queue before
>>>>>> calling the dlm_send_remote_unlock_request function in the dlm_proxy_ast_handler function.
>>>>>> Then decide whether to clear cancel_pending. But if the AST does not come at this time,
>>>>>> the check passes and cancel_pending will not be cleared. Then the AST immediately comes over again,
>>>>>> which also leads to a bug. I personally think that for asynchronous processes we can't guarantee
>>>>>> the speed of execution of each process. All we can do is to avoid the BUG scene.
>>>>>>
>>>>>> As for the question you said ("If you remove the BUG check why you still call dlm_commit_pending_cancel()
>>>>>> to move the lock back to grant on matter it's on converting list or not?").
>>>>>> I think we should first check if the lock is in the grant queue
>>>>>> (at this time, dlm->spinlock and res->spinlock are held), then decide whether to call
>>>>>> dlm_commit_pending_cancel function.
>>>>>>
>>>>>> Thanks,
>>>>>> Jian
>>>>>>
>>>>>> On 12/7/2018 11:12 AM, Changwei Ge wrote:
>>>>>>> Hi Jian,
>>>>>>>
>>>>>>> I suppose that the situation you described truly exists.
>>>>>>> But the way you fix the issue is not in my favor.
>>>>>>>
>>>>>>> If you remove the BUG check, why do you still call dlm_commit_pending_cancel() to
>>>>>>> move the lock back to granted, no matter whether it's on the converting list or not?
>>>>>>>
>>>>>>> So how about keeping the BUG check in dlm_move_lockres_to_recovery_list().
>>>>>>> If the locking _ast_ comes back very fast just check ::cancel_pending in dlm_proxy_ast_handler() and clear it.
>>>>>>> Then with the logic checking if the lock is on the grant list (in dlmunlock_common(), more or less like your current method)
>>>>>>> or not we can easily tell if the cancellation succeeds or not.
>>>>>>>
>>>>>>> That complies with the original dlm design, which I think is better and easier for maintainers to understand.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Changwei
>>>>>>>
>>>>>>> On 2018/12/6 20:06, wangjian wrote:
>>>>>>>> Hi Changwei,
>>>>>>>>
>>>>>>>> I don't fully agree with your point of view. In my opinion, the lock cancellation process and the lock
>>>>>>>> conversion process are asynchronous. We can't guarantee that the lock must be in the convert list
>>>>>>>> during the lock conversion process, otherwise this BUG will not happen.
>>>>>>>> So, I think this is a meaningless BUG.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Jian
>>>>>>>>
>>>>>>>> On 12/5/2018 9:49 AM, Changwei Ge wrote:
>>>>>>>>> Hi Jian,
>>>>>>>>>
>>>>>>>>> I suppose you can't just remove the BUG_ON() check.
>>>>>>>>> If you remove it, below code violates the original logic.
>>>>>>>>>
>>>>>>>>> '''
>>>>>>>>> 2141                                 dlm_commit_pending_cancel(res, lock);
>>>>>>>>> '''
>>>>>>>>>
>>>>>>>>> What's more important is *locking cancel* must be against a *locking conversion* progress.
>>>>>>>>> So it makes sense to check if this lock is on converting list.
>>>>>>>>>
>>>>>>>>> So I have to NACK this patch.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Changwei
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 2018/12/3 20:23, wangjian wrote:
>>>>>>>>>> In the dlm_move_lockres_to_recovery_list function, if the lock
>>>>>>>>>> is in the granted queue and cancel_pending is set, it will
>>>>>>>>>> encounter a BUG. I think this is a meaningless BUG,
>>>>>>>>>> so be prepared to remove it. A scenario that causes
>>>>>>>>>> this BUG will be given below.
>>>>>>>>>>
>>>>>>>>>> At the beginning, Node 1 is the master and has NL lock,
>>>>>>>>>> Node 2 has PR lock, Node 3 has PR lock too.
>>>>>>>>>>
>>>>>>>>>> Node 1          Node 2          Node 3
>>>>>>>>>>                  want to get EX lock.
>>>>>>>>>>
>>>>>>>>>>                                  want to get EX lock.
>>>>>>>>>>
>>>>>>>>>> Node 3 hinders
>>>>>>>>>> Node 2 from getting
>>>>>>>>>> EX lock, send
>>>>>>>>>> Node 3 a BAST.
>>>>>>>>>>
>>>>>>>>>>                                  receive BAST from
>>>>>>>>>>                                  Node 1. downconvert
>>>>>>>>>>                                  thread begin to
>>>>>>>>>>                                  cancel PR to EX conversion.
>>>>>>>>>>                                  In dlmunlock_common function,
>>>>>>>>>>                                  downconvert thread has set
>>>>>>>>>>                                  lock->cancel_pending,
>>>>>>>>>>                                  but did not enter
>>>>>>>>>>                                  dlm_send_remote_unlock_request
>>>>>>>>>>                                  function.
>>>>>>>>>>
>>>>>>>>>>                  Node2 dies because
>>>>>>>>>>                  the host is powered down.
>>>>>>>>>>
>>>>>>>>>> In recovery process,
>>>>>>>>>> clean the lock that
>>>>>>>>>> related to Node2.
>>>>>>>>>> then finish Node 3
>>>>>>>>>> PR to EX request.
>>>>>>>>>> give Node 3 an AST.
>>>>>>>>>>
>>>>>>>>>>                                  receive AST from Node 1.
>>>>>>>>>>                                  change lock level to EX,
>>>>>>>>>>                                  move lock to granted list.
>>>>>>>>>>
>>>>>>>>>> Node1 dies because
>>>>>>>>>> the host is powered down.
>>>>>>>>>>
>>>>>>>>>>                                  In the dlm_move_lockres_to_recovery_list
>>>>>>>>>>                                  function, the lock is in the
>>>>>>>>>>                                  granted queue and cancel_pending
>>>>>>>>>>                                  is set: BUG_ON.
>>>>>>>>>>
>>>>>>>>>> But after removing this BUG_ON, the process will
>>>>>>>>>> encounter a second BUG in the ocfs2_unlock_ast function.
>>>>>>>>>> Here is a scenario that causes the second BUG
>>>>>>>>>> in ocfs2_unlock_ast:
>>>>>>>>>>
>>>>>>>>>> At the beginning, Node 1 is the master and has NL lock,
>>>>>>>>>> Node 2 has PR lock, Node 3 has PR lock too.
>>>>>>>>>>
>>>>>>>>>> Node 1          Node 2          Node 3
>>>>>>>>>>                  want to get EX lock.
>>>>>>>>>>
>>>>>>>>>>                                  want to get EX lock.
>>>>>>>>>>
>>>>>>>>>> Node 3 hinder
>>>>>>>>>> Node 2 to get
>>>>>>>>>> EX lock, send
>>>>>>>>>> Node 3 a BAST.
>>>>>>>>>>
>>>>>>>>>>                                  receive BAST from
>>>>>>>>>>                                  Node 1. downconvert
>>>>>>>>>>                                  thread begin to
>>>>>>>>>>                                  cancel PR to EX conversion.
>>>>>>>>>>                                  In dlmunlock_common function,
>>>>>>>>>>                                  downconvert thread has released
>>>>>>>>>>                                  lock->spinlock and res->spinlock,
>>>>>>>>>>                                  but did not enter
>>>>>>>>>>                                  dlm_send_remote_unlock_request
>>>>>>>>>>                                  function.
>>>>>>>>>>
>>>>>>>>>>                  Node2 dies because
>>>>>>>>>>                  the host is powered down.
>>>>>>>>>>
>>>>>>>>>> In recovery process,
>>>>>>>>>> clean the lock that
>>>>>>>>>> related to Node2.
>>>>>>>>>> then finish Node 3
>>>>>>>>>> PR to EX request.
>>>>>>>>>> give Node 3 an AST.
>>>>>>>>>>
>>>>>>>>>>                                  receive AST from Node 1.
>>>>>>>>>>                                  change lock level to EX,
>>>>>>>>>>                                  move lock to granted list,
>>>>>>>>>>                                  set lockres->l_unlock_action
>>>>>>>>>>                                  as OCFS2_UNLOCK_INVALID
>>>>>>>>>>                                  in ocfs2_locking_ast function.
>>>>>>>>>>
>>>>>>>>>> Node1 dies because
>>>>>>>>>> the host is powered down.
>>>>>>>>>>
>>>>>>>>>>                                  Node 3 realizes that Node 1
>>>>>>>>>>                                  is dead, removes Node 1 from
>>>>>>>>>>                                  domain_map. The downconvert thread
>>>>>>>>>>                                  gets DLM_NORMAL from the
>>>>>>>>>>                                  dlm_send_remote_unlock_request
>>>>>>>>>>                                  function and sets *call_ast to 1.
>>>>>>>>>>                                  Then the downconvert thread hits
>>>>>>>>>>                                  the BUG in ocfs2_unlock_ast.
>>>>>>>>>>
>>>>>>>>>> To avoid hitting the second BUG, the dlmunlock_common function should
>>>>>>>>>> return DLM_CANCELGRANT if the lock is on the granted list and
>>>>>>>>>> the operation is canceled.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Jian Wang <wangjian161@huawei.com>
>>>>>>>>>> Reviewed-by: Yiwen Jiang <jiangyiwen@huawei.com>
>>>>>>>>>> ---
>>>>>>>>>>       fs/ocfs2/dlm/dlmrecovery.c | 1 -
>>>>>>>>>>       fs/ocfs2/dlm/dlmunlock.c   | 5 +++++
>>>>>>>>>>       2 files changed, 5 insertions(+), 1 deletion(-)
>>>>>>>>>>
>>>>>>>>>> diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm/dlmrecovery.c
>>>>>>>>>> index 802636d..7489652 100644
>>>>>>>>>> --- a/fs/ocfs2/dlm/dlmrecovery.c
>>>>>>>>>> +++ b/fs/ocfs2/dlm/dlmrecovery.c
>>>>>>>>>> @@ -2134,7 +2134,6 @@ void dlm_move_lockres_to_recovery_list(struct dlm_ctxt *dlm,
>>>>>>>>>>       				 * if this had completed successfully
>>>>>>>>>>       				 * before sending this lock state to the
>>>>>>>>>>       				 * new master */
>>>>>>>>>> -				BUG_ON(i != DLM_CONVERTING_LIST);
>>>>>>>>>>       				mlog(0, "node died with cancel pending "
>>>>>>>>>>       				     "on %.*s. move back to granted list.\n",
>>>>>>>>>>       				     res->lockname.len, res->lockname.name);
>>>>>>>>>> diff --git a/fs/ocfs2/dlm/dlmunlock.c b/fs/ocfs2/dlm/dlmunlock.c
>>>>>>>>>> index 63d701c..505bb6c 100644
>>>>>>>>>> --- a/fs/ocfs2/dlm/dlmunlock.c
>>>>>>>>>> +++ b/fs/ocfs2/dlm/dlmunlock.c
>>>>>>>>>> @@ -183,6 +183,11 @@ static enum dlm_status dlmunlock_common(struct dlm_ctxt *dlm,
>>>>>>>>>>       							flags, owner);
>>>>>>>>>>       		spin_lock(&res->spinlock);
>>>>>>>>>>       		spin_lock(&lock->spinlock);
>>>>>>>>>> +
>>>>>>>>>> +		if ((flags & LKM_CANCEL) &&
>>>>>>>>>> +				dlm_lock_on_list(&res->granted, lock))
>>>>>>>>>> +			status = DLM_CANCELGRANT;
>>>>>>>>>> +
>>>>>>>>>>       		/* if the master told us the lock was already granted,
>>>>>>>>>>       		 * let the ast handle all of these actions */
>>>>>>>>>>       		if (status == DLM_CANCELGRANT) {
>>>>>>>>>> -- 
>>>>>>>>>> 1.8.3.1
>>>>>>>>>>
>>>>>>>>> .
>>>>>>>>>
>>>>>>> .
>>>>>>>
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> Ocfs2-devel mailing list
>>>>>> Ocfs2-devel@oss.oracle.com
>>>>>> https://oss.oracle.com/mailman/listinfo/ocfs2-devel
>>>>>>
>>>>>
>>>> .
>>>>
>>>
>> .
>>
>
piaojun Feb. 15, 2019, 9:19 a.m. UTC | #14
Hi Changwei,

On 2019/2/15 15:56, Changwei Ge wrote:
> Hi Jun
> 
> I just read the code around unlock/cancel.
> 
> On 2019/2/15 15:35, piaojun wrote:
>> Hi Changwei,
>>
>> On 2019/2/15 15:06, Changwei Ge wrote:
>>> Hi Jun,
>>>
>>> On 2019/2/15 14:51, piaojun wrote:
>>>> Hi Changwei,
>>>>
>>>> On 2019/2/14 18:13, Changwei Ge wrote:
>>>>> On 2019/2/14 17:09, piaojun wrote:
>>>>>> Hi Changwei,
>>>>>>
>>>>>> The problem can't be solved completely if we clear ::cancel_pending in
>>>>>> dlm_proxy_ast_handler, as the AST can arrive at any time just before
>>>>>
>>>>> So how about also adding a check before setting ::cancel_pending in dlmunlock_common(), before invoking dlm_send_remote_unlock_request()?
>>>>> If the lock is already on the granted list, just return DLM_CANCELGRANT.
>>>>>
>>>>> Then a further reference code might look like:
>>>>>
>>>>> root@ubuntu16:/home/chge/linux[master]# git diff
>>>>> diff --git a/fs/ocfs2/dlm/dlmast.c b/fs/ocfs2/dlm/dlmast.c
>>>>> index 39831fc..812843b 100644
>>>>> --- a/fs/ocfs2/dlm/dlmast.c
>>>>> +++ b/fs/ocfs2/dlm/dlmast.c
>>>>> @@ -372,8 +372,11 @@ int dlm_proxy_ast_handler(struct o2net_msg *msg, u32 len, void *data,
>>>>>            head = &res->converting;
>>>>>            lock = NULL;
>>>>>            list_for_each_entry(lock, head, list) {
>>>>> -               if (lock->ml.cookie == cookie)
>>>>> +               if (lock->ml.cookie == cookie) {
>>>>> +                       if (lock->cancel_pending)
>>>>> +                               lock->cancel_pending = 0;
>>>>>                            goto do_ast;
>>>>> +               }
>>>>>            }
>>>>>
>>>>>            /* if not on convert, try blocked for ast, granted for bast */
>>>>> diff --git a/fs/ocfs2/dlm/dlmunlock.c b/fs/ocfs2/dlm/dlmunlock.c
>>>>> index c8e9b70..b4728b5 100644
>>>>> --- a/fs/ocfs2/dlm/dlmunlock.c
>>>>> +++ b/fs/ocfs2/dlm/dlmunlock.c
>>>>> @@ -174,9 +174,14 @@ static enum dlm_status dlmunlock_common(struct dlm_ctxt *dlm,
>>>>>            if (!master_node) {
>>>>>                    owner = res->owner;
>>>>>                    /* drop locks and send message */
>>>>> -               if (flags & LKM_CANCEL)
>>>>> +               if (flags & LKM_CANCEL) {
>>>>> +                       if (dlm_lock_on_list(&res->granted, lock)) {
>>>>> +                               status = DLM_CANCELGRANT;
>>>>> +                               goto leave;
> 
> I found that the above code should be unnecessary,
> as the upstream code already takes into consideration that the AST may have come before the cancellation.
> In dlm_get_cancel_actions()
> 	'''
> 	} else if (dlm_lock_on_list(&res->granted, lock)) {
> 		/* too late, already granted. */
> 		status = DLM_CANCELGRANT;
> 		*actions = DLM_UNLOCK_CALL_AST;
> 	} else {
> 	'''
> 
>>>>
>>>> If the master dies and the lockres is then moved to the granted list in
>>>> dlm_move_lockres_to_recovery_list, OCFS2_LOCK_BUSY is not cleared.
>>>
>>> OCFS2_LOCK_BUSY should be cleared in ocfs2_locking_ast() since the previous locking AST has come back (moving the lock to the granted list).
>>> That's why we return DLM_CANCELGRANT to the caller to avoid calling the AST. Otherwise ast() will be called twice, which is obviously a BUG.
>>
>> I mean the master is already dead and the AST won't come, so
>> OCFS2_LOCK_BUSY is not cleared. And there are two cases in which the lockres is
>> moved to the granted list:
>> 1. The AST comes back and OCFS2_LOCK_BUSY is cleared.
>> 2. The AST does not come back and OCFS2_LOCK_BUSY remains; the recovery process
>>     moves it to the granted list. In this case, we need to do the AST for it.
> 
> For point 2, ocfs2 can handle it, so you still don't have to worry; it's no problem.
> In dlmconvert_remote():

What I worry about is that OCFS2_LOCK_BUSY won't be cleared if the remote
master is dead, as OCFS2_LOCK_BUSY is cleared in two cases:
1. the remote AST comes: clear it in dlm_proxy_ast_handler.
2. the remote AST does not come because the master is dead: clear it in
   o2dlm_unlock_ast_wrapper.

For case 2, if DLM_CANCELGRANT is set, o2dlm_unlock_ast_wrapper just
returns and won't clear OCFS2_LOCK_BUSY. So in this case, I think we
should do the AST again for it.

> 
> 
> 341         if (status != DLM_NORMAL) {
> 342                 if (status != DLM_NOTQUEUED)
> 343                         dlm_error(status);
> 344                 dlm_revert_pending_convert(res, lock);
> 345         } else if (!lock->convert_pending) {
> 346                 mlog(0, "%s: res %.*s, owner died and lock has been moved back "
> 347                                 "to granted list, retry convert.\n",
> 348                                 dlm->name, res->lockname.len, res->lockname.name);
> 349                 status = DLM_RECOVERING;
> 350         }
> 
> Thus, dlmlock() will wait until RECOVERY is done.
> And it's transparent to the ocfs2 layer; it doesn't have to be aware of what happened in the DLM.

I do not think the waiting can solve this problem.

> 
> Thanks,
> Changwei
> 
> 
>>
>> Perhaps we need to do the AST again for it.
>>
>>>
>>>> And this will cause a hang in ocfs2_drop_lock. The reason is
>>>> that the unlock AST won't be done when DLM_CANCELGRANT is set. So I think
>>>
>>> With the above elaboration, you don't have to worry about the hang issue anymore.
>>>
>>> Thanks,
>>> Changwei
>>>
>>>> we need to distinguish all the cases of moving the lockres to the granted list.
>>>>
>>>>> +                       }
>>>>> +
>>>>>                            lock->cancel_pending = 1;
>>>>> -               else
>>>>> +               } else
>>>>>                            lock->unlock_pending = 1;
>>>>>                    spin_unlock(&lock->spinlock);
>>>>>                    spin_unlock(&res->spinlock);
>>>>>
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Changwei
>>>>>
>>>>>> ::cancel_pending is set. If there is no better solution,
>>>>>> could we accept this patch? This bug is very harmful.
>>>>>>
>>>>>> Thanks,
>>>>>> Jun
>>>>>>
>>>>>> On 2018/12/8 18:05, wangjian wrote:
>>>>>>> Hi Changwei,
>>>>>>>
>>>>>>> I understand your idea. But we should be aware that the cancel-convert process and
>>>>>>> the other processes (the AST-receiving process, the recovery process) are asynchronous.
>>>>>>> For example, according to your idea, check whether the lock is in the granted queue before
>>>>>>> calling the dlm_send_remote_unlock_request function in the dlm_proxy_ast_handler function,
>>>>>>> then decide whether to clear cancel_pending. But if the AST has not come at this time,
>>>>>>> the check passes and cancel_pending will not be cleared. Then the AST immediately comes over,
>>>>>>> which also leads to a BUG. I personally think that for asynchronous processes we can't guarantee
>>>>>>> the speed of execution of each process. All we can do is avoid the BUG scene.
>>>>>>>
>>>>>>> As for the question you asked ("If you remove the BUG check, why do you still call dlm_commit_pending_cancel()
>>>>>>> to move the lock back to granted, no matter whether it's on the converting list or not?"),
>>>>>>> I think we should first check whether the lock is in the granted queue
>>>>>>> (at this time, dlm->spinlock and res->spinlock have been taken), then decide whether to call
>>>>>>> the dlm_commit_pending_cancel function.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Jian
>>>>>>>
>>>>>>> On 12/7/2018 11:12 AM, Changwei Ge wrote:
>>>>>>>> Hi Jian,
>>>>>>>>
>>>>>>>> I suppose that the situation you described truly exists.
>>>>>>>> But the way you fix the issue is not in my favor.
>>>>>>>>
>>>>>>>> If you remove the BUG check, why do you still call dlm_commit_pending_cancel() to
>>>>>>>> move the lock back to granted, no matter whether it's on the converting list or not?
>>>>>>>>
>>>>>>>> So how about keeping the BUG check in dlm_move_lockres_to_recovery_list()?
>>>>>>>> If the locking _ast_ comes back very fast, just check ::cancel_pending in dlm_proxy_ast_handler() and clear it.
>>>>>>>> Then, with the logic checking whether the lock is on the granted list (in dlmunlock_common(), more or less like your current method)
>>>>>>>> or not, we can easily tell whether the cancellation succeeded.
>>>>>>>>
>>>>>>>> That complies with the original dlm design, which I think is better and easier for maintainers to understand.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Changwei
>>>>>>>>
>>>>>>>> On 2018/12/6 20:06, wangjian wrote:
>>>>>>>>> Hi Changwei,
>>>>>>>>>
>>>>>>>>> I don't fully agree with your point of view. In my opinion, the lock cancellation process and the lock
>>>>>>>>> conversion process are asynchronous. We can't guarantee that the lock is always on the converting list
>>>>>>>>> during the lock conversion process, otherwise this BUG would not happen.
>>>>>>>>> So, I think this is a meaningless BUG.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Jian
>>>>>>>>>
>>>>>>>>> On 12/5/2018 9:49 AM, Changwei Ge wrote:
>>>>>>>>>> Hi Jian,
>>>>>>>>>>
>>>>>>>>>> I suppose you can't just remove the BUG_ON() check.
>>>>>>>>>> If you remove it, below code violates the original logic.
>>>>>>>>>>
>>>>>>>>>> '''
>>>>>>>>>> 2141                                 dlm_commit_pending_cancel(res, lock);
>>>>>>>>>> '''
>>>>>>>>>>
>>>>>>>>>> What's more important is that a *locking cancel* must be against a *locking conversion* in progress.
>>>>>>>>>> So it makes sense to check whether this lock is on the converting list.
>>>>>>>>>>
>>>>>>>>>> So I have to NACK to this patch.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Changwei
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 2018/12/3 20:23, wangjian wrote:
>>>>>>>>>>> [snip: the original patch, quoted in full earlier in the thread]
>>>>>>
>>>>> .
>>>>>
>>>>
>>> .
>>>
>>
> .
>
Changwei Ge Feb. 15, 2019, 9:27 a.m. UTC | #15
On 2019/2/15 17:20, piaojun wrote:
> Hi Changwei,
> 
> On 2019/2/15 15:56, Changwei Ge wrote:
>> Hi Jun
>>
>> I just read the code around unlock/cancel.
>>
>> On 2019/2/15 15:35, piaojun wrote:
>>> Hi Changwei,
>>>
>>> On 2019/2/15 15:06, Changwei Ge wrote:
>>>> Hi Jun,
>>>>
>>>> On 2019/2/15 14:51, piaojun wrote:
>>>>> Hi Changwei,
>>>>>
>>>>> On 2019/2/14 18:13, Changwei Ge wrote:
>>>>>> On 2019/2/14 17:09, piaojun wrote:
>>>>>>> Hi Changwei,
>>>>>>>
>>>>>>> The problem can't be solved completely if we clear ::cancel_pending in
>>>>>>> dlm_proxy_ast_handler, as the AST can arrive at any time just before
>>>>>>
>>>>>> So how about also adding a check before setting ::cancel_pending in dlmunlock_common(), before invoking dlm_send_remote_unlock_request()?
>>>>>> If the lock is already on the granted list, just return DLM_CANCELGRANT.
>>>>>>
>>>>>> Then a further reference code might look like:
>>>>>>
>>>>>> root@ubuntu16:/home/chge/linux[master]# git diff
>>>>>> diff --git a/fs/ocfs2/dlm/dlmast.c b/fs/ocfs2/dlm/dlmast.c
>>>>>> index 39831fc..812843b 100644
>>>>>> --- a/fs/ocfs2/dlm/dlmast.c
>>>>>> +++ b/fs/ocfs2/dlm/dlmast.c
>>>>>> @@ -372,8 +372,11 @@ int dlm_proxy_ast_handler(struct o2net_msg *msg, u32 len, void *data,
>>>>>>             head = &res->converting;
>>>>>>             lock = NULL;
>>>>>>             list_for_each_entry(lock, head, list) {
>>>>>> -               if (lock->ml.cookie == cookie)
>>>>>> +               if (lock->ml.cookie == cookie) {
>>>>>> +                       if (lock->cancel_pending)
>>>>>> +                               lock->cancel_pending = 0;
>>>>>>                             goto do_ast;
>>>>>> +               }
>>>>>>             }
>>>>>>
>>>>>>             /* if not on convert, try blocked for ast, granted for bast */
>>>>>> diff --git a/fs/ocfs2/dlm/dlmunlock.c b/fs/ocfs2/dlm/dlmunlock.c
>>>>>> index c8e9b70..b4728b5 100644
>>>>>> --- a/fs/ocfs2/dlm/dlmunlock.c
>>>>>> +++ b/fs/ocfs2/dlm/dlmunlock.c
>>>>>> @@ -174,9 +174,14 @@ static enum dlm_status dlmunlock_common(struct dlm_ctxt *dlm,
>>>>>>             if (!master_node) {
>>>>>>                     owner = res->owner;
>>>>>>                     /* drop locks and send message */
>>>>>> -               if (flags & LKM_CANCEL)
>>>>>> +               if (flags & LKM_CANCEL) {
>>>>>> +                       if (dlm_lock_on_list(&res->granted, lock)) {
>>>>>> +                               status = DLM_CANCELGRANT;
>>>>>> +                               goto leave;
>>
>> I found that the above code should be unnecessary,
>> as the upstream code already takes into consideration that the AST may have come before the cancellation.
>> In dlm_get_cancel_actions()
>> 	'''
>> 	} else if (dlm_lock_on_list(&res->granted, lock)) {
>> 		/* too late, already granted. */
>> 		status = DLM_CANCELGRANT;
>> 		*actions = DLM_UNLOCK_CALL_AST;
>> 	} else {
>> 	'''
>>
>>>>>
>>>>> If the master dies and the lockres is then moved to the granted list in
>>>>> dlm_move_lockres_to_recovery_list, OCFS2_LOCK_BUSY is not cleared.
>>>>
>>>> OCFS2_LOCK_BUSY should be cleared in ocfs2_locking_ast() since the previous locking AST has come back (moving the lock to the granted list).
>>>> That's why we return DLM_CANCELGRANT to the caller to avoid calling the AST. Otherwise ast() will be called twice, which is obviously a BUG.
>>>
>>> I mean the master is already dead and the AST won't come, so
>>> OCFS2_LOCK_BUSY is not cleared. And there are two cases in which the lockres is
>>> moved to the granted list:
>>> 1. The AST comes back and OCFS2_LOCK_BUSY is cleared.
>>> 2. The AST does not come back and OCFS2_LOCK_BUSY remains; the recovery process
>>>      moves it to the granted list. In this case, we need to do the AST for it.
>>
>> For point 2, ocfs2 can handle it, so you still don't have to worry; it's no problem.
>> In dlmconvert_remote():
> 
> What I worry about is that OCFS2_LOCK_BUSY won't be cleared if the remote
> master is dead, as OCFS2_LOCK_BUSY is cleared in two cases:
> 1. the remote AST comes: clear it in dlm_proxy_ast_handler.
> 2. the remote AST does not come because the master is dead: clear it in
>    o2dlm_unlock_ast_wrapper.

Please don't worry about point 2.
As in my previous e-mail, after dlm recovery picks up a new master for the corresponding lock resource,
dlmlock() -> convert will retry and send the request to the new master.
Eventually, this locking (convert) will succeed and ast() will be called so the ocfs2 layer can clear OCFS2_LOCK_BUSY.

Thanks,
Changwei

> 
> For case 2, if DLM_CANCELGRANT is set, o2dlm_unlock_ast_wrapper just
> returns and won't clear OCFS2_LOCK_BUSY. So in this case, I think we
> should do the AST again for it.
> 
>>
>>
>> 341         if (status != DLM_NORMAL) {
>> 342                 if (status != DLM_NOTQUEUED)
>> 343                         dlm_error(status);
>> 344                 dlm_revert_pending_convert(res, lock);
>> 345         } else if (!lock->convert_pending) {
>> 346                 mlog(0, "%s: res %.*s, owner died and lock has been moved back "
>> 347                                 "to granted list, retry convert.\n",
>> 348                                 dlm->name, res->lockname.len, res->lockname.name);
>> 349                 status = DLM_RECOVERING;
>> 350         }
>>
>> Thus, dlmlock() will wait until RECOVERY is done.
>> And it's transparent to the ocfs2 layer; it doesn't have to be aware of what happened in the DLM.
> 
> I do not think the waiting can solve this problem.
> 
>>
>> Thanks,
>> Changwei
>>
>>
>>>
>>> Perhaps we need to do the AST again for it.
>>>
>>>>
>>>>> And this will cause a hang in ocfs2_drop_lock. The reason is
>>>>> that the unlock AST won't be done when DLM_CANCELGRANT is set. So I think
>>>>
>>>> With the above elaboration, you don't have to worry about the hang issue anymore.
>>>>
>>>> Thanks,
>>>> Changwei
>>>>
>>>>> we need to distinguish all the cases of moving the lockres to the granted list.
>>>>>
>>>>>> +                       }
>>>>>> +
>>>>>>                             lock->cancel_pending = 1;
>>>>>> -               else
>>>>>> +               } else
>>>>>>                             lock->unlock_pending = 1;
>>>>>>                     spin_unlock(&lock->spinlock);
>>>>>>                     spin_unlock(&res->spinlock);
>>>>>>
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Changwei
>>>>>>
>>>>>>> ::cancel_pending is set. If there is no better solution,
>>>>>>> could we accept this patch? This bug is very harmful.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Jun
>>>>>>>
>>>>>>> On 2018/12/8 18:05, wangjian wrote:
>>>>>>>> [snip: earlier messages in the thread, quoted in full above]
>>>>>>>>>>>>
>>>>>>>>>>>> Node1 dies because
>>>>>>>>>>>> the host is powered down.
>>>>>>>>>>>>
>>>>>>>>>>>>                                   In dlm_move_lockres_to_recovery_list
>>>>>>>>>>>>                                   function. the lock is in the
>>>>>>>>>>>>                                   granted queue and cancel_pending
>>>>>>>>>>>>                                   is set. BUG_ON.
>>>>>>>>>>>>
>>>>>>>>>>>> But after clearing this BUG, process will encounter
>>>>>>>>>>>> the second BUG in the ocfs2_unlock_ast function.
>>>>>>>>>>>> Here is a scenario that will cause the second BUG
>>>>>>>>>>>> in ocfs2_unlock_ast as follows:
>>>>>>>>>>>>
>>>>>>>>>>>> At the beginning, Node 1 is the master and has NL lock,
>>>>>>>>>>>> Node 2 has PR lock, Node 3 has PR lock too.
>>>>>>>>>>>>
>>>>>>>>>>>> Node 1          Node 2          Node 3
>>>>>>>>>>>>                   want to get EX lock.
>>>>>>>>>>>>
>>>>>>>>>>>>                                   want to get EX lock.
>>>>>>>>>>>>
>>>>>>>>>>>> Node 3 hinder
>>>>>>>>>>>> Node 2 to get
>>>>>>>>>>>> EX lock, send
>>>>>>>>>>>> Node 3 a BAST.
>>>>>>>>>>>>
>>>>>>>>>>>>                                   receive BAST from
>>>>>>>>>>>>                                   Node 1. downconvert
>>>>>>>>>>>>                                   thread begin to
>>>>>>>>>>>>                                   cancel PR to EX conversion.
>>>>>>>>>>>>                                   In dlmunlock_common function,
>>>>>>>>>>>>                                   downconvert thread has released
>>>>>>>>>>>>                                   lock->spinlock and res->spinlock,
>>>>>>>>>>>>                                   but did not enter
>>>>>>>>>>>>                                   dlm_send_remote_unlock_request
>>>>>>>>>>>>                                   function.
>>>>>>>>>>>>
>>>>>>>>>>>>                   Node2 dies because
>>>>>>>>>>>>                   the host is powered down.
>>>>>>>>>>>>
>>>>>>>>>>>> In recovery process,
>>>>>>>>>>>> clean the lock that
>>>>>>>>>>>> related to Node2.
>>>>>>>>>>>> then finish Node 3
>>>>>>>>>>>> PR to EX request.
>>>>>>>>>>>> give Node 3 a AST.
>>>>>>>>>>>>
>>>>>>>>>>>>                                   receive AST from Node 1.
>>>>>>>>>>>>                                   change lock level to EX,
>>>>>>>>>>>>                                   move lock to granted list,
>>>>>>>>>>>>                                   set lockres->l_unlock_action
>>>>>>>>>>>>                                   as OCFS2_UNLOCK_INVALID
>>>>>>>>>>>>                                   in ocfs2_locking_ast function.
>>>>>>>>>>>>
>>>>>>>>>>>> Node1 dies because
>>>>>>>>>>>> the host is powered down.
>>>>>>>>>>>>
>>>>>>>>>>>>                                   Node 3 realize that Node 1
>>>>>>>>>>>>                                   is dead, remove Node 1 from
>>>>>>>>>>>>                                   domain_map. downconvert thread
>>>>>>>>>>>>                                   get DLM_NORMAL from
>>>>>>>>>>>>                                   dlm_send_remote_unlock_request
>>>>>>>>>>>>                                   function and set *call_ast as 1.
>>>>>>>>>>>>                                   Then downconvert thread meet
>>>>>>>>>>>>                                   BUG in ocfs2_unlock_ast function.
>>>>>>>>>>>>
>>>>>>>>>>>> To avoid meeting the second BUG, function dlmunlock_common should
>>>>>>>>>>>> return DLM_CANCELGRANT if the lock is on the granted list and
>>>>>>>>>>>> the operation is canceled.
>>>>>>>>>>>>
>>>>>>>>>>>> Signed-off-by: Jian Wang<wangjian161@huawei.com>
>>>>>>>>>>>> Reviewed-by: Yiwen Jiang<jiangyiwen@huawei.com>
>>>>>>>>>>>> ---
>>>>>>>>>>>>        fs/ocfs2/dlm/dlmrecovery.c | 1 -
>>>>>>>>>>>>        fs/ocfs2/dlm/dlmunlock.c   | 5 +++++
>>>>>>>>>>>>        2 files changed, 5 insertions(+), 1 deletion(-)
>>>>>>>>>>>>
>>>>>>>>>>>> diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm/dlmrecovery.c
>>>>>>>>>>>> index 802636d..7489652 100644
>>>>>>>>>>>> --- a/fs/ocfs2/dlm/dlmrecovery.c
>>>>>>>>>>>> +++ b/fs/ocfs2/dlm/dlmrecovery.c
>>>>>>>>>>>> @@ -2134,7 +2134,6 @@ void dlm_move_lockres_to_recovery_list(struct dlm_ctxt *dlm,
>>>>>>>>>>>>        				 * if this had completed successfully
>>>>>>>>>>>>        				 * before sending this lock state to the
>>>>>>>>>>>>        				 * new master */
>>>>>>>>>>>> -				BUG_ON(i != DLM_CONVERTING_LIST);
>>>>>>>>>>>>        				mlog(0, "node died with cancel pending "
>>>>>>>>>>>>        				     "on %.*s. move back to granted list.\n",
>>>>>>>>>>>>        				     res->lockname.len, res->lockname.name);
>>>>>>>>>>>> diff --git a/fs/ocfs2/dlm/dlmunlock.c b/fs/ocfs2/dlm/dlmunlock.c
>>>>>>>>>>>> index 63d701c..505bb6c 100644
>>>>>>>>>>>> --- a/fs/ocfs2/dlm/dlmunlock.c
>>>>>>>>>>>> +++ b/fs/ocfs2/dlm/dlmunlock.c
>>>>>>>>>>>> @@ -183,6 +183,11 @@ static enum dlm_status dlmunlock_common(struct dlm_ctxt *dlm,
>>>>>>>>>>>>        							flags, owner);
>>>>>>>>>>>>        		spin_lock(&res->spinlock);
>>>>>>>>>>>>        		spin_lock(&lock->spinlock);
>>>>>>>>>>>> +
>>>>>>>>>>>> +		if ((flags & LKM_CANCEL) &&
>>>>>>>>>>>> +				dlm_lock_on_list(&res->granted, lock))
>>>>>>>>>>>> +			status = DLM_CANCELGRANT;
>>>>>>>>>>>> +
>>>>>>>>>>>>        		/* if the master told us the lock was already granted,
>>>>>>>>>>>>        		 * let the ast handle all of these actions */
>>>>>>>>>>>>        		if (status == DLM_CANCELGRANT) {
>>>>>>>>>>>> -- 
>>>>>>>>>>>> 1.8.3.1
>>>>>>>>>>>>
>>>>>>>>>>> .
>>>>>>>>>>>
>>>>>>>>> .
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Ocfs2-devel mailing list
>>>>>>>> Ocfs2-devel@oss.oracle.com
>>>>>>>> https://oss.oracle.com/mailman/listinfo/ocfs2-devel
>>>>>>>>
>>>>>>>
>>>>>> .
>>>>>>
>>>>>
>>>> .
>>>>
>>>
>> .
>>
>
piaojun Feb. 15, 2019, 9:48 a.m. UTC | #16
Hi Changwei,

On 2019/2/15 17:27, Changwei Ge wrote:
> On 2019/2/15 17:20, piaojun wrote:
>> Hi Changwei,
>>
>> On 2019/2/15 15:56, Changwei Ge wrote:
>>> Hi Jun
>>>
>>> I just read the code around unlock/cancel.
>>>
>>> On 2019/2/15 15:35, piaojun wrote:
>>>> Hi Changwei,
>>>>
>>>> On 2019/2/15 15:06, Changwei Ge wrote:
>>>>> Hi Jun,
>>>>>
>>>>> On 2019/2/15 14:51, piaojun wrote:
>>>>>> Hi Changwei,
>>>>>>
>>>>>> On 2019/2/14 18:13, Changwei Ge wrote:
>>>>>>> On 2019/2/14 17:09, piaojun wrote:
>>>>>>>> Hi Changwei,
>>>>>>>>
>>>>>>>> The problem can't be solved completely if we clear ::cancel_pending in
>>>>>>>> dlm_proxy_ast_handler, as the AST can come at any time just before
>>>>>>>
>>>>>>> So how about also adding a check here, before setting ::cancel_pending in dlmunlock_common() and before invoking dlm_send_remote_unlock_request()?
>>>>>>> If the lock is already on the grant list, just return DLM_CANCELGRANT.
>>>>>>>
>>>>>>> Then a further reference code might look like:
>>>>>>>
>>>>>>> root@ubuntu16:/home/chge/linux[master]# git diff
>>>>>>> diff --git a/fs/ocfs2/dlm/dlmast.c b/fs/ocfs2/dlm/dlmast.c
>>>>>>> index 39831fc..812843b 100644
>>>>>>> --- a/fs/ocfs2/dlm/dlmast.c
>>>>>>> +++ b/fs/ocfs2/dlm/dlmast.c
>>>>>>> @@ -372,8 +372,11 @@ int dlm_proxy_ast_handler(struct o2net_msg *msg, u32 len, void *data,
>>>>>>>             head = &res->converting;
>>>>>>>             lock = NULL;
>>>>>>>             list_for_each_entry(lock, head, list) {
>>>>>>> -               if (lock->ml.cookie == cookie)
>>>>>>> +               if (lock->ml.cookie == cookie) {
>>>>>>> +                       if (lock->cancel_pending)
>>>>>>> +                               lock->cancel_pending = 0;
>>>>>>>                             goto do_ast;
>>>>>>> +               }
>>>>>>>             }
>>>>>>>
>>>>>>>             /* if not on convert, try blocked for ast, granted for bast */
>>>>>>> diff --git a/fs/ocfs2/dlm/dlmunlock.c b/fs/ocfs2/dlm/dlmunlock.c
>>>>>>> index c8e9b70..b4728b5 100644
>>>>>>> --- a/fs/ocfs2/dlm/dlmunlock.c
>>>>>>> +++ b/fs/ocfs2/dlm/dlmunlock.c
>>>>>>> @@ -174,9 +174,14 @@ static enum dlm_status dlmunlock_common(struct dlm_ctxt *dlm,
>>>>>>>             if (!master_node) {
>>>>>>>                     owner = res->owner;
>>>>>>>                     /* drop locks and send message */
>>>>>>> -               if (flags & LKM_CANCEL)
>>>>>>> +               if (flags & LKM_CANCEL) {
>>>>>>> +                       if (dlm_lock_on_list(&res->granted, lock)) {
>>>>>>> +                               status = DLM_CANCELGRANT;
>>>>>>> +                               goto leave;
>>>
>>> I found that the above code should be unnecessary.
>>> The upstream code already takes into consideration the case where the AST comes before the cancellation.
>>> In dlm_get_cancel_actions()
>>> 	'''
>>> 	} else if (dlm_lock_on_list(&res->granted, lock)) {
>>> 		/* too late, already granted. */
>>> 		status = DLM_CANCELGRANT;
>>> 		*actions = DLM_UNLOCK_CALL_AST;
>>> 	} else {
>>> 	'''
>>>
>>>>>>
>>>>>> If the master is dead and the lockres is then moved to the granted list in
>>>>>> dlm_move_lockres_to_recovery_list, OCFS2_LOCK_BUSY is not cleared.
>>>>>
>>>>> OCFS2_LOCK_BUSY should be cleared in ocfs2_locking_ast() since the previous locking AST has come back (moving the lock to the grant list).
>>>>> That's why we return DLM_CANCELGRANT to the caller to avoid calling the AST. Otherwise ast() will be called twice, which is obviously a BUG.
>>>>
>>>> I mean the master is already dead and the ast won't come, so
>>>> OCFS2_LOCK_BUSY is not cleared. And there are two cases where the lockres is
>>>> moved to the grant list:
>>>> 1. AST comes back and OCFS2_LOCK_BUSY is cleared.
>>>> 2. AST does not come back and OCFS2_LOCK_BUSY remains; the recovery process
>>>>      moves it to the grant list. In this case, we need to do the AST for it.
>>>
>>> For point 2, ocfs2 can handle it, so you still don't have to worry. It's no problem.
>>> In dlmconvert_remote()
>>
>> What I worry about is that OCFS2_LOCK_BUSY won't be cleared if the remote
>> master is dead, as OCFS2_LOCK_BUSY is cleared in two cases:
>> 1. the remote AST comes, and it is cleared in dlm_proxy_ast_handler.
>> 2. the remote AST does not come because the master is dead, and it is cleared
>>     in o2dlm_unlock_ast_wrapper.
> 
> Please don't worry about point 2.
> As I said in my previous e-mail, after dlm recovery picks up a new master for the corresponding lock resource,
> dlmlock() -> convert will retry and send the request to the new master.
> Eventually, this locking (convert) will succeed and ast() will be called for the ocfs2 layer to clear OCFS2_LOCK_BUSY.

Perhaps you misunderstand my meaning. The AST I meant is for dlmunlock,
not for dlmlock, so it won't cause a retry of the request to the new master.

> 
> Thanks,
> Changwei
> 
>>
>> For case 2, if DLM_CANCELGRANT is set, o2dlm_unlock_ast_wrapper just
>> returns and won't clear OCFS2_LOCK_BUSY. So in this case, I think we
>> should do the AST again for it.
>>
>>>
>>>
>>> 341         if (status != DLM_NORMAL) {
>>> 342                 if (status != DLM_NOTQUEUED)
>>> 343                         dlm_error(status);
>>> 344                 dlm_revert_pending_convert(res, lock);
>>> 345         } else if (!lock->convert_pending) {
>>> 346                 mlog(0, "%s: res %.*s, owner died and lock has been moved back "
>>> 347                                 "to granted list, retry convert.\n",
>>> 348                                 dlm->name, res->lockname.len, res->lockname.name);
>>> 349                 status = DLM_RECOVERING;
>>> 350         }
>>>
>>> Thus, dlmlock() will wait until RECOVERY is done.
>>> And for the ocfs2 layer it's transparent; it doesn't have to be aware of what happened in the DLM.
>>
>> I do not think the waiting can solve this problem.
>>
>>>
>>> Thanks,
>>> Changwei
>>>
>>>
>>>>
>>>> Perhaps we need to do the AST again for it.
>>>>
>>>>>
>>>>>> And this will cause a hang problem in ocfs2_drop_lock. The reason is
>>>>>> that the unlock ast won't be done when DLM_CANCELGRANT is set. So I think
>>>>>
>>>>> With the above elaboration, you don't have to worry about the hang issue anymore.
>>>>>
>>>>> Thanks,
>>>>> Changwei
>>>>>
>>>>>> we need to distinguish all the cases of moving the lockres to the grant list.
>>>>>>
>>>>>>> +                       }
>>>>>>> +
>>>>>>>                             lock->cancel_pending = 1;
>>>>>>> -               else
>>>>>>> +               } else
>>>>>>>                             lock->unlock_pending = 1;
>>>>>>>                     spin_unlock(&lock->spinlock);
>>>>>>>                     spin_unlock(&res->spinlock);
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Changwei
>>>>>>>
>>>>>>>> ::cancel_pending is set. If there are no better solutions,
>>>>>>>> could we accept this patch? This bug is very harmful.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Jun
>>>>>>>>
>>>>>>>> On 2018/12/8 18:05, wangjian wrote:
>>>>>>>>> Hi Changwei,
>>>>>>>>>
>>>>>>>>> I understand your idea. But we should be aware that the cancel_convert process and
>>>>>>>>> other processes (accepting the AST process, the recovery process) are asynchronous.
>>>>>>>>> For example, according to your idea, check if the lock is in the grant queue before
>>>>>>>>> calling the dlm_send_remote_unlock_request function in the dlm_proxy_ast_handler function.
>>>>>>>>> Then decide whether to clear cancel_pending. But if the AST has not come at this time,
>>>>>>>>> the check passes and cancel_pending will not be cleared. Then the AST may arrive immediately afterwards,
>>>>>>>>> which also leads to a bug. I personally think that for asynchronous processes we can't guarantee
>>>>>>>>> the speed of execution of each process. All we can do is avoid the BUG scenario.
>>>>>>>>>
>>>>>>>>> As for the question you raised ("If you remove the BUG check why you still call dlm_commit_pending_cancel()
>>>>>>>>> to move the lock back to grant on matter it's on converting list or not?"),
>>>>>>>>> I think we should first check whether the lock is in the grant queue
>>>>>>>>> (at this time, dlm->spinlock and res->spinlock are already held), then decide whether to call
>>>>>>>>> the dlm_commit_pending_cancel function.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Jian
>>>>>>>>>
>>>>>>>>> On 12/7/2018 11:12 AM, Changwei Ge wrote:
>>>>>>>>>> Hi Jian,
>>>>>>>>>>
>>>>>>>>>> I suppose that the situation you described truly exists.
>>>>>>>>>> But the way you fix the issue is not in my favor.
>>>>>>>>>>
>>>>>>>>>> If you remove the BUG check, why do you still call dlm_commit_pending_cancel() to
>>>>>>>>>> move the lock back to the granted list no matter whether it's on the converting list or not?
>>>>>>>>>>
>>>>>>>>>> So how about keeping the BUG check in dlm_move_lockres_to_recovery_list().
>>>>>>>>>> If the locking _ast_ comes back very fast, just check ::cancel_pending in dlm_proxy_ast_handler() and clear it.
>>>>>>>>>> Then with the logic checking whether the lock is on the grant list (in dlmunlock_common(), more or less like your current method)
>>>>>>>>>> or not, we can easily tell whether the cancellation succeeds or not.
>>>>>>>>>>
>>>>>>>>>> That complies with the original dlm design, which I think is better and easier for maintainers to understand.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Changwei
>>>>>>>>>>
Changwei Ge Feb. 18, 2019, 9:25 a.m. UTC | #17
Hi Jun,

On 2019/2/15 17:49, piaojun wrote:
> Hi Changwei,
> 
> On 2019/2/15 17:27, Changwei Ge wrote:
>> On 2019/2/15 17:20, piaojun wrote:
>>> Hi Changwei,
>>>
>>> On 2019/2/15 15:56, Changwei Ge wrote:
>>>> Hi Jun
>>>>
>>>> I just read the code around unlock/cancel.
>>>>
>>>> On 2019/2/15 15:35, piaojun wrote:
>>>>> Hi Changwei,
>>>>>
>>>>> On 2019/2/15 15:06, Changwei Ge wrote:
>>>>>> Hi Jun,
>>>>>>
>>>>>> On 2019/2/15 14:51, piaojun wrote:
>>>>>>> Hi Changwei,
>>>>>>>
>>>>>>> On 2019/2/14 18:13, Changwei Ge wrote:
>>>>>>>> On 2019/2/14 17:09, piaojun wrote:
>>>>>>>>> Hi Changwei,
>>>>>>>>>
>>>>>>>>> The problem can't be solved completely if we clear ::cancel_pending in
>>>>>>>>> dlm_proxy_ast_handler, as the AST can come at any time just before
>>>>>>>>
>>>>>>>> So how about also adding a check here, before setting ::cancel_pending in dlmunlock_common() and before invoking dlm_send_remote_unlock_request()?
>>>>>>>> If the lock is already on the grant list, just return DLM_CANCELGRANT.
>>>>>>>>
>>>>>>>> Then a further reference code might look like:
>>>>>>>>
>>>>>>>> root@ubuntu16:/home/chge/linux[master]# git diff
>>>>>>>> diff --git a/fs/ocfs2/dlm/dlmast.c b/fs/ocfs2/dlm/dlmast.c
>>>>>>>> index 39831fc..812843b 100644
>>>>>>>> --- a/fs/ocfs2/dlm/dlmast.c
>>>>>>>> +++ b/fs/ocfs2/dlm/dlmast.c
>>>>>>>> @@ -372,8 +372,11 @@ int dlm_proxy_ast_handler(struct o2net_msg *msg, u32 len, void *data,
>>>>>>>>              head = &res->converting;
>>>>>>>>              lock = NULL;
>>>>>>>>              list_for_each_entry(lock, head, list) {
>>>>>>>> -               if (lock->ml.cookie == cookie)
>>>>>>>> +               if (lock->ml.cookie == cookie) {
>>>>>>>> +                       if (lock->cancel_pending)
>>>>>>>> +                               lock->cancel_pending = 0;
>>>>>>>>                              goto do_ast;
>>>>>>>> +               }
>>>>>>>>              }
>>>>>>>>
>>>>>>>>              /* if not on convert, try blocked for ast, granted for bast */
>>>>>>>> diff --git a/fs/ocfs2/dlm/dlmunlock.c b/fs/ocfs2/dlm/dlmunlock.c
>>>>>>>> index c8e9b70..b4728b5 100644
>>>>>>>> --- a/fs/ocfs2/dlm/dlmunlock.c
>>>>>>>> +++ b/fs/ocfs2/dlm/dlmunlock.c
>>>>>>>> @@ -174,9 +174,14 @@ static enum dlm_status dlmunlock_common(struct dlm_ctxt *dlm,
>>>>>>>>              if (!master_node) {
>>>>>>>>                      owner = res->owner;
>>>>>>>>                      /* drop locks and send message */
>>>>>>>> -               if (flags & LKM_CANCEL)
>>>>>>>> +               if (flags & LKM_CANCEL) {
>>>>>>>> +                       if (dlm_lock_on_list(&res->granted, lock)) {
>>>>>>>> +                               status = DLM_CANCELGRANT;
>>>>>>>> +                               goto leave;
>>>>
>>>> I found that the above code should be unnecessary.
>>>> The upstream code already takes into consideration the case where the AST comes before the cancellation.
>>>> In dlm_get_cancel_actions()
>>>> 	'''
>>>> 	} else if (dlm_lock_on_list(&res->granted, lock)) {
>>>> 		/* too late, already granted. */
>>>> 		status = DLM_CANCELGRANT;
>>>> 		*actions = DLM_UNLOCK_CALL_AST;
>>>> 	} else {
>>>> 	'''
>>>>
>>>>>>>
>>>>>>> If the master is dead and the lockres is then moved to the granted list in
>>>>>>> dlm_move_lockres_to_recovery_list, OCFS2_LOCK_BUSY is not cleared.
>>>>>>
>>>>>> OCFS2_LOCK_BUSY should be cleared in ocfs2_locking_ast() since the previous locking AST has come back (moving the lock to the grant list).
>>>>>> That's why we return DLM_CANCELGRANT to the caller to avoid calling the AST. Otherwise ast() will be called twice, which is obviously a BUG.
>>>>>
>>>>> I mean the master is already dead and the ast won't come, so
>>>>> OCFS2_LOCK_BUSY is not cleared. And there are two cases where the lockres is
>>>>> moved to the grant list:
>>>>> 1. AST comes back and OCFS2_LOCK_BUSY is cleared.
>>>>> 2. AST does not come back and OCFS2_LOCK_BUSY remains; the recovery process
>>>>>       moves it to the grant list. In this case, we need to do the AST for it.
>>>>
>>>> For point 2, ocfs2 can handle it, so you still don't have to worry. It's no problem.
>>>> In dlmconvert_remote()
>>>
>>> What I worry about is that OCFS2_LOCK_BUSY won't be cleared if the remote
>>> master is dead, as OCFS2_LOCK_BUSY is cleared in two cases:
>>> 1. the remote AST comes, and it is cleared in dlm_proxy_ast_handler.
>>> 2. the remote AST does not come because the master is dead, and it is cleared
>>>      in o2dlm_unlock_ast_wrapper.

If the AST doesn't manage to get back to the requesting node, why must the flag OCFS2_LOCK_BUSY be cleared in o2dlm_unlock_ast_wrapper()?

Yes, OCFS2_LOCK_BUSY can be cleared in either o2dlm_unlock_ast_wrapper() or o2dlm_lock_ast_wrapper() with the o2cb stack applied.

If we return DLM_CANCELGRANT from ocfs2/dlm, then we know that the AST has already come back, or that the master node has moved the lock to the granted list itself; thus we clear the flag OCFS2_LOCK_BUSY in o2dlm_lock_ast_wrapper().
Otherwise we know that we can stop the ongoing locking procedure, and we must clear OCFS2_LOCK_BUSY in o2dlm_unlock_ast_wrapper() (*synchronously*).

To summarize: OCFS2_LOCK_BUSY should be cleared either on locking success or on cancellation success.

And my approach already checks whether the lock is granted and returns DLM_CANCELGRANT accordingly.

>>
>> Please don't worry about point 2.
>> Like my previous e-mail, after dlm recovery picking up a new master for corresponding lock resource.
>> dlmlock() -> convert will retry and send request to new master.
>> Eventually, this locking(convert) will succeed and ast() will be called for ocfs2 layer to clear OCFS2_LOCK_BUSY.
> 
> Perhaps you misunderstand my meaning. The AST I meant is for dlmunlock,
> not for dlmlock, so it won't cause a retry of the request to the new master.

If the unlocking AST has not come back, then dlm_send_remote_unlock_request() returns DLM_NORMAL.
Because the unlocking AST is synchronous with the unlocking request, the master won't send back a separate message (of the proxy AST type).
That means the *status* returned from dlm_send_remote_unlock_request() indicates whether the AST for unlocking should be called or not.
So how could the AST for unlocking not be invoked?

You really make me puzzled. :- (((

Thanks,
Changwei

> 
>>
>> Thanks,
>> Changwei
>>
>>>
>>> For case 2, if DLM_CANCELGRANT is set, o2dlm_unlock_ast_wrapper just
>>> returns and won't clear OCFS2_LOCK_BUSY. So in this case, I think we
>>> should do AST again for it.
>>>
>>>>
>>>>
>>>> 341         if (status != DLM_NORMAL) {
>>>> 342                 if (status != DLM_NOTQUEUED)
>>>> 343                         dlm_error(status);
>>>> 344                 dlm_revert_pending_convert(res, lock);
>>>> 345         } else if (!lock->convert_pending) {
>>>> 346                 mlog(0, "%s: res %.*s, owner died and lock has been moved back "
>>>> 347                                 "to granted list, retry convert.\n",
>>>> 348                                 dlm->name, res->lockname.len, res->lockname.name);
>>>> 349                 status = DLM_RECOVERING;
>>>> 350         }
>>>>
>>>> Thus, dlmlock() will wait until RECOVERY is done.
>>>> And for ocfs2 layer, it's apparent, it doesn't have to be aware of what happened to DLM.
>>>
>>> I do not think the waiting can solve this problem.
>>>
>>>>
>>>> Thanks,
>>>> Changwei
>>>>
>>>>
>>>>>
>>>>> Perhaps we need do AST again for it.
>>>>>
>>>>>>
>>>>>>> And this will cause stuck problem when ocfs2_drop_lock. The reason is
>>>>>>> that unlock ast won't be done when DLM_CANCELGRANT is set. So I think
>>>>>>
>>>>>> With above elaboration, you don't have to worry the hang issue anymore.
>>>>>>
>>>>>> Thanks,
>>>>>> Changwei
>>>>>>
>>>>>>> we need distinguish all the cases of moving lockres to grant list.
>>>>>>>
>>>>>>>> +                       }
>>>>>>>> +
>>>>>>>>                              lock->cancel_pending = 1;
>>>>>>>> -               else
>>>>>>>> +               } else
>>>>>>>>                              lock->unlock_pending = 1;
>>>>>>>>                      spin_unlock(&lock->spinlock);
>>>>>>>>                      spin_unlock(&res->spinlock);
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Changwei
>>>>>>>>
>>>>>>>>> ::cancel_pending is set. If there is no better solution,
>>>>>>>>> could we accept this patch? This bug is very harmful.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Jun
>>>>>>>>>
>>>>>>>>> On 2018/12/8 18:05, wangjian wrote:
>>>>>>>>>> Hi Changwei,
>>>>>>>>>>
>>>>>>>>>> I understand your idea. But we should be aware that the cancel_convert process and
>>>>>>>>>> other processes (accepting the AST process, the recovery process) are asynchronous.
>>>>>>>>>> For example, according to your idea, check if the lock is in the grant queue before
>>>>>>>>>> calling the dlm_send_remote_unlock_request function in the dlm_proxy_ast_handler function.
>>>>>>>>>> Then decide whether to clear cancel_pending. But if the AST does not come at this time,
>>>>>>>>>> the check passes and cancel_pending will not be cleared. Then the AST immediately comes over again,
>>>>>>>>>> which also led to a bug. I personally think that for asynchronous processes we can't guarantee
>>>>>>>>>> the speed of execution of each process. All we can do is to avoid the BUG scene.
>>>>>>>>>>
>>>>>>>>>> As for the question you said ("If you remove the BUG check why you still call dlm_commit_pending_cancel()
>>>>>>>>>> to move the lock back to grant no matter whether it's on the converting list or not?").
>>>>>>>>>> I think we should first check if the lock is in the grant queue
>>>>>>>>>> (at this time, dlm->spinlock and res->spinlock have been added), then decide whether to call
>>>>>>>>>> dlm_commit_pending_cancel function.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Jian
>>>>>>>>>>
>>>>>>>>>> On 12/7/2018 11:12 AM, Changwei Ge wrote:
>>>>>>>>>>> Hi Jian,
>>>>>>>>>>>
>>>>>>>>>>> I suppose that the situation you described truly exists.
>>>>>>>>>>> But the way you fix the issue is not in my favor.
>>>>>>>>>>>
>>>>>>>>>>> If you remove the BUG check why you still call dlm_commit_pending_cancel() to
>>>>>>>>>>> move the lock back to grant no matter whether it's on the converting list or not?
>>>>>>>>>>>
>>>>>>>>>>> So how about keeping the BUG check in dlm_move_lockres_to_recovery_list().
>>>>>>>>>>> If the locking _ast_ comes back very fast just check ::cancel_pending in dlm_proxy_ast_handler() and clear it.
>>>>>>>>>>> Then with the logic checking if the lock is on the grant list (in dlmunlock_common(), more or less like your current method)
>>>>>>>>>>> or not we can easily tell if the cancellation succeeds or not.
>>>>>>>>>>>
>>>>>>>>>>> That complies the original dlm design, which I think is better and easier for maintainers to understand.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Changwei
>>>>>>>>>>>
>>>>>>>>>>> On 2018/12/6 20:06, wangjian wrote:
>>>>>>>>>>>> Hi Changwei,
>>>>>>>>>>>>
>>>>>>>>>>>> I don't fully agree with your point of view. In my opinion, the lock cancellation process and the lock
>>>>>>>>>>>> conversion process are asynchronous. We can't guarantee that the lock must be in the convert list
>>>>>>>>>>>> during the lock conversion process, otherwise this BUG will not happen.
>>>>>>>>>>>> So, I think this is a meaningless BUG.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Jian
>>>>>>>>>>>>
>>>>>>>>>>>> On 12/5/2018 9:49 AM, Changwei Ge wrote:
>>>>>>>>>>>>> Hi Jian,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I suppose you can't just remove the BUG_ON() check.
>>>>>>>>>>>>> If you remove it, below code violates the original logic.
>>>>>>>>>>>>>
>>>>>>>>>>>>> '''
>>>>>>>>>>>>> 2141                                 dlm_commit_pending_cancel(res, lock);
>>>>>>>>>>>>> '''
>>>>>>>>>>>>>
>>>>>>>>>>>>> What's more important is *locking cancel* must be against a *locking conversion* progress.
>>>>>>>>>>>>> So it makes sense to check if this lock is on converting list.
>>>>>>>>>>>>>
>>>>>>>>>>>>> So I have to NACK to this patch.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Changwei
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 2018/12/3 20:23, wangjian wrote:
>>>>>>>>>>>>>> In the dlm_move_lockres_to_recovery_list function, if the lock
>>>>>>>>>>>>>> is in the granted queue and cancel_pending is set, it will
>>>>>>>>>>>>>> encounter a BUG. I think this is a meaningless BUG,
>>>>>>>>>>>>>> so be prepared to remove it. A scenario that causes
>>>>>>>>>>>>>> this BUG will be given below.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> At the beginning, Node 1 is the master and has NL lock,
>>>>>>>>>>>>>> Node 2 has PR lock, Node 3 has PR lock too.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Node 1          Node 2          Node 3
>>>>>>>>>>>>>>                    want to get EX lock.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>                                    want to get EX lock.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Node 3 hinder
>>>>>>>>>>>>>> Node 2 to get
>>>>>>>>>>>>>> EX lock, send
>>>>>>>>>>>>>> Node 3 a BAST.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>                                    receive BAST from
>>>>>>>>>>>>>>                                    Node 1. downconvert
>>>>>>>>>>>>>>                                    thread begin to
>>>>>>>>>>>>>>                                    cancel PR to EX conversion.
>>>>>>>>>>>>>>                                    In dlmunlock_common function,
>>>>>>>>>>>>>>                                    downconvert thread has set
>>>>>>>>>>>>>>                                    lock->cancel_pending,
>>>>>>>>>>>>>>                                    but did not enter
>>>>>>>>>>>>>>                                    dlm_send_remote_unlock_request
>>>>>>>>>>>>>>                                    function.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>                    Node2 dies because
>>>>>>>>>>>>>>                    the host is powered down.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In recovery process,
>>>>>>>>>>>>>> clean the lock that
>>>>>>>>>>>>>> related to Node2.
>>>>>>>>>>>>>> then finish Node 3
>>>>>>>>>>>>>> PR to EX request.
>>>>>>>>>>>>>> give Node 3 an AST.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>                                    receive AST from Node 1.
>>>>>>>>>>>>>>                                    change lock level to EX,
>>>>>>>>>>>>>>                                    move lock to granted list.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Node1 dies because
>>>>>>>>>>>>>> the host is powered down.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>                                    In dlm_move_lockres_to_recovery_list
>>>>>>>>>>>>>>                                    function. the lock is in the
>>>>>>>>>>>>>>                                    granted queue and cancel_pending
>>>>>>>>>>>>>>                                    is set. BUG_ON.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> But after clearing this BUG, process will encounter
>>>>>>>>>>>>>> the second BUG in the ocfs2_unlock_ast function.
>>>>>>>>>>>>>> Here is a scenario that will cause the second BUG
>>>>>>>>>>>>>> in ocfs2_unlock_ast as follows:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> At the beginning, Node 1 is the master and has NL lock,
>>>>>>>>>>>>>> Node 2 has PR lock, Node 3 has PR lock too.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Node 1          Node 2          Node 3
>>>>>>>>>>>>>>                    want to get EX lock.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>                                    want to get EX lock.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Node 3 hinder
>>>>>>>>>>>>>> Node 2 to get
>>>>>>>>>>>>>> EX lock, send
>>>>>>>>>>>>>> Node 3 a BAST.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>                                    receive BAST from
>>>>>>>>>>>>>>                                    Node 1. downconvert
>>>>>>>>>>>>>>                                    thread begin to
>>>>>>>>>>>>>>                                    cancel PR to EX conversion.
>>>>>>>>>>>>>>                                    In dlmunlock_common function,
>>>>>>>>>>>>>>                                    downconvert thread has released
>>>>>>>>>>>>>>                                    lock->spinlock and res->spinlock,
>>>>>>>>>>>>>>                                    but did not enter
>>>>>>>>>>>>>>                                    dlm_send_remote_unlock_request
>>>>>>>>>>>>>>                                    function.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>                    Node2 dies because
>>>>>>>>>>>>>>                    the host is powered down.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In recovery process,
>>>>>>>>>>>>>> clean the lock that
>>>>>>>>>>>>>> related to Node2.
>>>>>>>>>>>>>> then finish Node 3
>>>>>>>>>>>>>> PR to EX request.
>>>>>>>>>>>>>> give Node 3 an AST.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>                                    receive AST from Node 1.
>>>>>>>>>>>>>>                                    change lock level to EX,
>>>>>>>>>>>>>>                                    move lock to granted list,
>>>>>>>>>>>>>>                                    set lockres->l_unlock_action
>>>>>>>>>>>>>>                                    as OCFS2_UNLOCK_INVALID
>>>>>>>>>>>>>>                                    in ocfs2_locking_ast function.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Node1 dies because
>>>>>>>>>>>>>> the host is powered down.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>                                    Node 3 realize that Node 1
>>>>>>>>>>>>>>                                    is dead, remove Node 1 from
>>>>>>>>>>>>>>                                    domain_map. downconvert thread
>>>>>>>>>>>>>>                                    get DLM_NORMAL from
>>>>>>>>>>>>>>                                    dlm_send_remote_unlock_request
>>>>>>>>>>>>>>                                    function and set *call_ast as 1.
>>>>>>>>>>>>>>                                    Then downconvert thread meet
>>>>>>>>>>>>>>                                    BUG in ocfs2_unlock_ast function.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> To avoid meeting the second BUG, function dlmunlock_common should
>>>>>>>>>>>>>> return DLM_CANCELGRANT if the lock is on granted list and
>>>>>>>>>>>>>> the operation is canceled.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Signed-off-by: Jian Wang<wangjian161@huawei.com>
>>>>>>>>>>>>>> Reviewed-by: Yiwen Jiang<jiangyiwen@huawei.com>
>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>         fs/ocfs2/dlm/dlmrecovery.c | 1 -
>>>>>>>>>>>>>>         fs/ocfs2/dlm/dlmunlock.c   | 5 +++++
>>>>>>>>>>>>>>         2 files changed, 5 insertions(+), 1 deletion(-)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm/dlmrecovery.c
>>>>>>>>>>>>>> index 802636d..7489652 100644
>>>>>>>>>>>>>> --- a/fs/ocfs2/dlm/dlmrecovery.c
>>>>>>>>>>>>>> +++ b/fs/ocfs2/dlm/dlmrecovery.c
>>>>>>>>>>>>>> @@ -2134,7 +2134,6 @@ void dlm_move_lockres_to_recovery_list(struct dlm_ctxt *dlm,
>>>>>>>>>>>>>>         				 * if this had completed successfully
>>>>>>>>>>>>>>         				 * before sending this lock state to the
>>>>>>>>>>>>>>         				 * new master */
>>>>>>>>>>>>>> -				BUG_ON(i != DLM_CONVERTING_LIST);
>>>>>>>>>>>>>>         				mlog(0, "node died with cancel pending "
>>>>>>>>>>>>>>         				     "on %.*s. move back to granted list.\n",
>>>>>>>>>>>>>>         				     res->lockname.len, res->lockname.name);
>>>>>>>>>>>>>> diff --git a/fs/ocfs2/dlm/dlmunlock.c b/fs/ocfs2/dlm/dlmunlock.c
>>>>>>>>>>>>>> index 63d701c..505bb6c 100644
>>>>>>>>>>>>>> --- a/fs/ocfs2/dlm/dlmunlock.c
>>>>>>>>>>>>>> +++ b/fs/ocfs2/dlm/dlmunlock.c
>>>>>>>>>>>>>> @@ -183,6 +183,11 @@ static enum dlm_status dlmunlock_common(struct dlm_ctxt *dlm,
>>>>>>>>>>>>>>         							flags, owner);
>>>>>>>>>>>>>>         		spin_lock(&res->spinlock);
>>>>>>>>>>>>>>         		spin_lock(&lock->spinlock);
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>> +		if ((flags & LKM_CANCEL) &&
>>>>>>>>>>>>>> +				dlm_lock_on_list(&res->granted, lock))
>>>>>>>>>>>>>> +			status = DLM_CANCELGRANT;
>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>         		/* if the master told us the lock was already granted,
>>>>>>>>>>>>>>         		 * let the ast handle all of these actions */
>>>>>>>>>>>>>>         		if (status == DLM_CANCELGRANT) {
>>>>>>>>>>>>>> -- 
>>>>>>>>>>>>>> 1.8.3.1
>>>>>>>>>>>>>>
>>>>>>>>>>>>> .
>>>>>>>>>>>>>
>>>>>>>>>>> .
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _______________________________________________
>>>>>>>>>> Ocfs2-devel mailing list
>>>>>>>>>> Ocfs2-devel@oss.oracle.com
>>>>>>>>>> https://oss.oracle.com/mailman/listinfo/ocfs2-devel
>>>>>>>>>>
>>>>>>>>>
>>>>>>>> .
>>>>>>>>
>>>>>>>
>>>>>> .
>>>>>>
>>>>>
>>>> .
>>>>
>>>
>> .
>>
>
piaojun Feb. 19, 2019, 12:47 a.m. UTC | #18
Hi Changwei,

On 2019/2/18 17:25, Changwei Ge wrote:
> Hi Jun,
> 
> On 2019/2/15 17:49, piaojun wrote:
>> Hi Changwei,
>>
>> On 2019/2/15 17:27, Changwei Ge wrote:
>>> On 2019/2/15 17:20, piaojun wrote:
>>>> Hi Changwei,
>>>>
>>>> On 2019/2/15 15:56, Changwei Ge wrote:
>>>>> Hi Jun
>>>>>
>>>>> I just read the code around unlock/cancel.
>>>>>
>>>>> On 2019/2/15 15:35, piaojun wrote:
>>>>>> Hi Changwei,
>>>>>>
>>>>>> On 2019/2/15 15:06, Changwei Ge wrote:
>>>>>>> Hi Jun,
>>>>>>>
>>>>>>> On 2019/2/15 14:51, piaojun wrote:
>>>>>>>> Hi Changwei,
>>>>>>>>
>>>>>>>> On 2019/2/14 18:13, Changwei Ge wrote:
>>>>>>>>> On 2019/2/14 17:09, piaojun wrote:
>>>>>>>>>> Hi Changwei,
>>>>>>>>>>
>>>>>>>>>> The problem can't be solved completely if clear ::cancel_pending in
>>>>>>>>>> dlm_proxy_ast_handler, as AST will come at anytime just before
>>>>>>>>>
>>>>>>>>> So how about also adding a check before setting ::cancel_pending in dlmunlock_common(), before invoking dlm_send_remote_unlock_request().
>>>>>>>>> If it is already on the grant list, just return DLM_CANCELGRANT.
>>>>>>>>>
>>>>>>>>> Then a further reference code might look like:
>>>>>>>>>
>>>>>>>>> root@ubuntu16:/home/chge/linux[master]# git diff
>>>>>>>>> diff --git a/fs/ocfs2/dlm/dlmast.c b/fs/ocfs2/dlm/dlmast.c
>>>>>>>>> index 39831fc..812843b 100644
>>>>>>>>> --- a/fs/ocfs2/dlm/dlmast.c
>>>>>>>>> +++ b/fs/ocfs2/dlm/dlmast.c
>>>>>>>>> @@ -372,8 +372,11 @@ int dlm_proxy_ast_handler(struct o2net_msg *msg, u32 len, void *data,
>>>>>>>>>              head = &res->converting;
>>>>>>>>>              lock = NULL;
>>>>>>>>>              list_for_each_entry(lock, head, list) {
>>>>>>>>> -               if (lock->ml.cookie == cookie)
>>>>>>>>> +               if (lock->ml.cookie == cookie) {
>>>>>>>>> +                       if (lock->cancel_pending)
>>>>>>>>> +                               lock->cancel_pending = 0;
>>>>>>>>>                              goto do_ast;
>>>>>>>>> +               }
>>>>>>>>>              }
>>>>>>>>>
>>>>>>>>>              /* if not on convert, try blocked for ast, granted for bast */
>>>>>>>>> diff --git a/fs/ocfs2/dlm/dlmunlock.c b/fs/ocfs2/dlm/dlmunlock.c
>>>>>>>>> index c8e9b70..b4728b5 100644
>>>>>>>>> --- a/fs/ocfs2/dlm/dlmunlock.c
>>>>>>>>> +++ b/fs/ocfs2/dlm/dlmunlock.c
>>>>>>>>> @@ -174,9 +174,14 @@ static enum dlm_status dlmunlock_common(struct dlm_ctxt *dlm,
>>>>>>>>>              if (!master_node) {
>>>>>>>>>                      owner = res->owner;
>>>>>>>>>                      /* drop locks and send message */
>>>>>>>>> -               if (flags & LKM_CANCEL)
>>>>>>>>> +               if (flags & LKM_CANCEL) {
>>>>>>>>> +                       if (dlm_lock_on_list(&res->granted, lock)) {
>>>>>>>>> +                               status = DLM_CANCELGRANT;
>>>>>>>>> +                               goto leave;
>>>>>
>>>>> I found that above code should be useless.
>>>>> As the upstream code already takes into consideration that the AST may have come before the cancellation.
>>>>> In dlm_get_cancel_actions()
>>>>> 	'''
>>>>> 	} else if (dlm_lock_on_list(&res->granted, lock)) {
>>>>> 		/* too late, already granted. */
>>>>> 		status = DLM_CANCELGRANT;
>>>>> 		*actions = DLM_UNLOCK_CALL_AST;
>>>>> 	} else {
>>>>> 	'''
>>>>>
>>>>>>>>
>>>>>>>> If the master is dead and the lockres is then moved to the granted list in
>>>>>>>> dlm_move_lockres_to_recovery_list, OCFS2_LOCK_BUSY is not cleared.
>>>>>>>
>>>>>>> OCFS2_LOCK_BUSY  should be cleared in ocfs2_locking_ast() since previous locking AST has come back(moving lock to grant list).
>>>>>>> That's why we return DLM_CANCELGRANT to caller to avoid calling AST. Otherwise ast() will be called twice, which is obviously a BUG.
>>>>>>
>>>>>> I mean master is already dead and ast won't come. So the
>>>>>> OCFS2_LOCK_BUSY is not cleared. And there are two cases that lockres is
>>>>>> moved to grant list:
>>>>>> 1. AST comes back and OCFS2_LOCK_BUSY is cleared.
>>>>>> 2. The AST does not come back and OCFS2_LOCK_BUSY remains; the recovery process
>>>>>>       moves it to the grant list. In this case, we need to do the AST for it.
>>>>>
>>>>> For point 2, ocfs2 can handle it, so you still don't have to worry. It's no problem.
>>>>> In dlmconvert_remote()
>>>>
>>>> What I worry about is that OCFS2_LOCK_BUSY won't be cleared if remote
>>>> master is dead, as OCFS2_LOCK_BUSY is cleared in two cases:
>>>> 1. remote AST come, clear it in dlm_proxy_ast_handler.
>>>> 2. remote AST does not come when master dead, clear it in
>>>>      o2dlm_unlock_ast_wrapper.
> 
> If the AST doesn't manage to get back to the requesting node, why must the flag OCFS2_LOCK_BUSY be cleared in o2dlm_unlock_ast_wrapper()?
> 
> Yes, OCFS2_LOCK_BUSY can be cleared in either o2dlm_unlock_ast_wrapper() or o2dlm_lock_ast_wrapper() with the o2cb stack applied.
> 
> If we return DLM_CANCELGRANT from o2dlm to the ocfs2 layer, then we know that the AST has already come back or that the master node has itself moved the lock to the granted list, so we clear the flag OCFS2_LOCK_BUSY in o2dlm_lock_ast_wrapper().
> Otherwise we know that we have stopped the ongoing locking procedure, and we must clear OCFS2_LOCK_BUSY in o2dlm_unlock_ast_wrapper() (*synchronously*).
> 
> To summarize: OCFS2_LOCK_BUSY should be cleared either on locking success or on cancellation success.
> 
> And my approach already checks whether the lock is granted and returns DLM_CANCELGRANT accordingly.
> 

OCFS2_LOCK_BUSY won't be cleared if DLM_CANCELGRANT is set in
o2dlm_unlock_ast_wrapper, and that's what I'm concerned about:

static void o2dlm_unlock_ast_wrapper(void *astarg, enum dlm_status status)
{
	struct ocfs2_dlm_lksb *lksb = astarg;
	int error = dlm_status_to_errno(status);

	/*
	 * In o2dlm, you can get both the lock_ast() for the lock being
	 * granted and the unlock_ast() for the CANCEL failing.  A
	 * successful cancel sends DLM_NORMAL here.  If the
	 * lock grant happened before the cancel arrived, you get
	 * DLM_CANCELGRANT.
	 *
	 * There's no need for the double-ast.  If we see DLM_CANCELGRANT,
	 * we just ignore it.  We expect the lock_ast() to handle the
	 * granted lock.
	 */
	if (status == DLM_CANCELGRANT)
		return;

	lksb->lksb_conn->cc_proto->lp_unlock_ast(lksb, error);
}

>>>
>>> Please don't worry about point 2.
>>> As in my previous e-mail, after dlm recovery picks up a new master for the corresponding lock resource,
>>> dlmlock() -> convert will retry and send the request to the new master.
>>> Eventually, this locking (convert) will succeed and ast() will be called for the ocfs2 layer to clear OCFS2_LOCK_BUSY.
>>
>> Perhaps you misunderstand my meaning. The AST I meant is for dlmunlock,
>> not for dlmlock, so it won't cause a retry of sending the request to the new master.
> 
> If the unlocking AST is not back, then dlm_send_remote_unlock_request() returns DLM_NORMAL.
> The unlocking AST is synchronized with the unlocking request, which means the master won't send back a separate message (proxy-AST type).
> That means the *status* returned from dlm_send_remote_unlock_request() indicates whether the AST for unlocking should be called.
> So how could the AST for unlocking fail to be invoked?

Sure, the unlocking AST will be invoked, but it does nothing (including clearing
OCFS2_LOCK_BUSY).

> 
> You really make me puzzled. :- (((
> 
> Thanks,
> Changwei
> 
>>
>>>
>>> Thanks,
>>> Changwei
>>>
>>>>
>>>> For case 2, if DLM_CANCELGRANT is set, o2dlm_unlock_ast_wrapper just
>>>> returns and won't clear OCFS2_LOCK_BUSY. So in this case, I think we
>>>> should do AST again for it.
>>>>
>>>>>
>>>>>
>>>>> 341         if (status != DLM_NORMAL) {
>>>>> 342                 if (status != DLM_NOTQUEUED)
>>>>> 343                         dlm_error(status);
>>>>> 344                 dlm_revert_pending_convert(res, lock);
>>>>> 345         } else if (!lock->convert_pending) {
>>>>> 346                 mlog(0, "%s: res %.*s, owner died and lock has been moved back "
>>>>> 347                                 "to granted list, retry convert.\n",
>>>>> 348                                 dlm->name, res->lockname.len, res->lockname.name);
>>>>> 349                 status = DLM_RECOVERING;
>>>>> 350         }
>>>>>
>>>>> Thus, dlmlock() will wait until RECOVERY is done.
>>>>> And for ocfs2 layer, it's apparent, it doesn't have to be aware of what happened to DLM.
>>>>
>>>> I do not think the waiting can solve this problem.
>>>>
>>>>>
>>>>> Thanks,
>>>>> Changwei
>>>>>
>>>>>
>>>>>>
>>>>>> Perhaps we need do AST again for it.
>>>>>>
>>>>>>>
>>>>>>>> And this will cause stuck problem when ocfs2_drop_lock. The reason is
>>>>>>>> that unlock ast won't be done when DLM_CANCELGRANT is set. So I think
>>>>>>>
>>>>>>> With above elaboration, you don't have to worry the hang issue anymore.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Changwei
>>>>>>>
>>>>>>>> we need distinguish all the cases of moving lockres to grant list.
>>>>>>>>
>>>>>>>>> +                       }
>>>>>>>>> +
>>>>>>>>>                              lock->cancel_pending = 1;
>>>>>>>>> -               else
>>>>>>>>> +               } else
>>>>>>>>>                              lock->unlock_pending = 1;
>>>>>>>>>                      spin_unlock(&lock->spinlock);
>>>>>>>>>                      spin_unlock(&res->spinlock);
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Changwei
>>>>>>>>>
>>>>>>>>>> ::cancel_pending is set. If there is no better solution,
>>>>>>>>>> could we accept this patch? This bug is very harmful.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Jun
>>>>>>>>>>
>>>>>>>>>> On 2018/12/8 18:05, wangjian wrote:
>>>>>>>>>>> Hi Changwei,
>>>>>>>>>>>
>>>>>>>>>>> I understand your idea. But we should be aware that the cancel_convert process and
>>>>>>>>>>> other processes (accepting the AST process, the recovery process) are asynchronous.
>>>>>>>>>>> For example, according to your idea, check if the lock is in the grant queue before
>>>>>>>>>>> calling the dlm_send_remote_unlock_request function in the dlm_proxy_ast_handler function.
>>>>>>>>>>> Then decide whether to clear cancel_pending. But if the AST does not come at this time,
>>>>>>>>>>> the check passes and cancel_pending will not be cleared. Then the AST immediately comes over again,
>>>>>>>>>>> which also led to a bug. I personally think that for asynchronous processes we can't guarantee
>>>>>>>>>>> the speed of execution of each process. All we can do is to avoid the BUG scene.
>>>>>>>>>>>
>>>>>>>>>>> As for the question you said ("If you remove the BUG check why you still call dlm_commit_pending_cancel()
>>>>>>>>>>> to move the lock back to grant no matter whether it's on the converting list or not?").
>>>>>>>>>>> I think we should first check if the lock is in the grant queue
>>>>>>>>>>> (at this time, dlm->spinlock and res->spinlock have been added), then decide whether to call
>>>>>>>>>>> dlm_commit_pending_cancel function.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Jian
>>>>>>>>>>>
>>>>>>>>>>> On 12/7/2018 11:12 AM, Changwei Ge wrote:
>>>>>>>>>>>> Hi Jian,
>>>>>>>>>>>>
>>>>>>>>>>>> I suppose that the situation you described truly exists.
>>>>>>>>>>>> But the way you fix the issue is not in my favor.
>>>>>>>>>>>>
>>>>>>>>>>>> If you remove the BUG check why you still call dlm_commit_pending_cancel() to
>>>>>>>>>>>> move the lock back to grant no matter whether it's on the converting list or not?
>>>>>>>>>>>>
>>>>>>>>>>>> So how about keeping the BUG check in dlm_move_lockres_to_recovery_list().
>>>>>>>>>>>> If the locking _ast_ comes back very fast just check ::cancel_pending in dlm_proxy_ast_handler() and clear it.
>>>>>>>>>>>> Then with the logic checking if the lock is on the grant list (in dlmunlock_common(), more or less like your current method)
>>>>>>>>>>>> or not we can easily tell if the cancellation succeeds or not.
>>>>>>>>>>>>
>>>>>>>>>>>> That complies the original dlm design, which I think is better and easier for maintainers to understand.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Changwei
>>>>>>>>>>>>
>>>>>>>>>>>> On 2018/12/6 20:06, wangjian wrote:
>>>>>>>>>>>>> Hi Changwei,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I don't fully agree with your point of view. In my opinion, the lock cancellation process and the lock
>>>>>>>>>>>>> conversion process are asynchronous. We can't guarantee that the lock must be in the convert list
>>>>>>>>>>>>> during the lock conversion process, otherwise this BUG will not happen.
>>>>>>>>>>>>> So, I think this is a meaningless BUG.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Jian
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 12/5/2018 9:49 AM, Changwei Ge wrote:
>>>>>>>>>>>>>> Hi Jian,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I suppose you can't just remove the BUG_ON() check.
>>>>>>>>>>>>>> If you remove it, below code violates the original logic.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> '''
>>>>>>>>>>>>>> 2141                                 dlm_commit_pending_cancel(res, lock);
>>>>>>>>>>>>>> '''
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> What's more important is *locking cancel* must be against a *locking conversion* progress.
>>>>>>>>>>>>>> So it makes sense to check if this lock is on converting list.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> So I have to NACK to this patch.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Changwei
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 2018/12/3 20:23, wangjian wrote:
>>>>>>>>>>>>>>> In the dlm_move_lockres_to_recovery_list function, if the lock
>>>>>>>>>>>>>>> is in the granted queue and cancel_pending is set, it will
>>>>>>>>>>>>>>> encounter a BUG. I think this is a meaningless BUG,
>>>>>>>>>>>>>>> so be prepared to remove it. A scenario that causes
>>>>>>>>>>>>>>> this BUG will be given below.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> At the beginning, Node 1 is the master and has NL lock,
>>>>>>>>>>>>>>> Node 2 has PR lock, Node 3 has PR lock too.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Node 1          Node 2          Node 3
>>>>>>>>>>>>>>>                    want to get EX lock.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>                                    want to get EX lock.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Node 3 hinder
>>>>>>>>>>>>>>> Node 2 to get
>>>>>>>>>>>>>>> EX lock, send
>>>>>>>>>>>>>>> Node 3 a BAST.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>                                    receive BAST from
>>>>>>>>>>>>>>>                                    Node 1. downconvert
>>>>>>>>>>>>>>>                                    thread begin to
>>>>>>>>>>>>>>>                                    cancel PR to EX conversion.
>>>>>>>>>>>>>>>                                    In dlmunlock_common function,
>>>>>>>>>>>>>>>                                    downconvert thread has set
>>>>>>>>>>>>>>>                                    lock->cancel_pending,
>>>>>>>>>>>>>>>                                    but did not enter
>>>>>>>>>>>>>>>                                    dlm_send_remote_unlock_request
>>>>>>>>>>>>>>>                                    function.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>                    Node2 dies because
>>>>>>>>>>>>>>>                    the host is powered down.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> In recovery process,
>>>>>>>>>>>>>>> clean the lock that
>>>>>>>>>>>>>>> related to Node2.
>>>>>>>>>>>>>>> then finish Node 3
>>>>>>>>>>>>>>> PR to EX request.
>>>>>>>>>>>>>>> give Node 3 an AST.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>                                    receive AST from Node 1.
>>>>>>>>>>>>>>>                                    change lock level to EX,
>>>>>>>>>>>>>>>                                    move lock to granted list.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Node1 dies because
>>>>>>>>>>>>>>> the host is powered down.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>                                    In dlm_move_lockres_to_recovery_list
>>>>>>>>>>>>>>>                                    function. the lock is in the
>>>>>>>>>>>>>>>                                    granted queue and cancel_pending
>>>>>>>>>>>>>>>                                    is set. BUG_ON.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> But after clearing this BUG, process will encounter
>>>>>>>>>>>>>>> the second BUG in the ocfs2_unlock_ast function.
>>>>>>>>>>>>>>> Here is a scenario that will cause the second BUG
>>>>>>>>>>>>>>> in ocfs2_unlock_ast as follows:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> At the beginning, Node 1 is the master and has NL lock,
>>>>>>>>>>>>>>> Node 2 has PR lock, Node 3 has PR lock too.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Node 1          Node 2          Node 3
>>>>>>>>>>>>>>>                    want to get EX lock.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>                                    want to get EX lock.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Node 3 hinder
>>>>>>>>>>>>>>> Node 2 to get
>>>>>>>>>>>>>>> EX lock, send
>>>>>>>>>>>>>>> Node 3 a BAST.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>                                    receive BAST from
>>>>>>>>>>>>>>>                                    Node 1. downconvert
>>>>>>>>>>>>>>>                                    thread begin to
>>>>>>>>>>>>>>>                                    cancel PR to EX conversion.
>>>>>>>>>>>>>>>                                    In dlmunlock_common function,
>>>>>>>>>>>>>>>                                    downconvert thread has released
>>>>>>>>>>>>>>>                                    lock->spinlock and res->spinlock,
>>>>>>>>>>>>>>>                                    but did not enter
>>>>>>>>>>>>>>>                                    dlm_send_remote_unlock_request
>>>>>>>>>>>>>>>                                    function.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>                    Node2 dies because
>>>>>>>>>>>>>>>                    the host is powered down.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> In recovery process,
>>>>>>>>>>>>>>> clean the lock that
>>>>>>>>>>>>>>> related to Node2.
>>>>>>>>>>>>>>> then finish Node 3
>>>>>>>>>>>>>>> PR to EX request.
>>>>>>>>>>>>>>> give Node 3 an AST.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>                                    receive AST from Node 1.
>>>>>>>>>>>>>>>                                    change lock level to EX,
>>>>>>>>>>>>>>>                                    move lock to granted list,
>>>>>>>>>>>>>>>                                    set lockres->l_unlock_action
>>>>>>>>>>>>>>>                                    as OCFS2_UNLOCK_INVALID
>>>>>>>>>>>>>>>                                    in ocfs2_locking_ast function.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Node1 dies because
>>>>>>>>>>>>>>> the host is powered down.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>                                    Node 3 realizes that Node 1
>>>>>>>>>>>>>>>                                    is dead, removes Node 1 from
>>>>>>>>>>>>>>>                                    domain_map. The downconvert
>>>>>>>>>>>>>>>                                    thread gets DLM_NORMAL from
>>>>>>>>>>>>>>>                                    dlm_send_remote_unlock_request
>>>>>>>>>>>>>>>                                    and sets *call_ast to 1.
>>>>>>>>>>>>>>>                                    Then the downconvert thread
>>>>>>>>>>>>>>>                                    hits the BUG in ocfs2_unlock_ast.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> To avoid hitting the second BUG, dlmunlock_common should
>>>>>>>>>>>>>>> return DLM_CANCELGRANT if the lock is on the granted list and
>>>>>>>>>>>>>>> the operation is canceled.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Signed-off-by: Jian Wang <wangjian161@huawei.com>
>>>>>>>>>>>>>>> Reviewed-by: Yiwen Jiang <jiangyiwen@huawei.com>
>>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>>         fs/ocfs2/dlm/dlmrecovery.c | 1 -
>>>>>>>>>>>>>>>         fs/ocfs2/dlm/dlmunlock.c   | 5 +++++
>>>>>>>>>>>>>>>         2 files changed, 5 insertions(+), 1 deletion(-)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm/dlmrecovery.c
>>>>>>>>>>>>>>> index 802636d..7489652 100644
>>>>>>>>>>>>>>> --- a/fs/ocfs2/dlm/dlmrecovery.c
>>>>>>>>>>>>>>> +++ b/fs/ocfs2/dlm/dlmrecovery.c
>>>>>>>>>>>>>>> @@ -2134,7 +2134,6 @@ void dlm_move_lockres_to_recovery_list(struct dlm_ctxt *dlm,
>>>>>>>>>>>>>>>         				 * if this had completed successfully
>>>>>>>>>>>>>>>         				 * before sending this lock state to the
>>>>>>>>>>>>>>>         				 * new master */
>>>>>>>>>>>>>>> -				BUG_ON(i != DLM_CONVERTING_LIST);
>>>>>>>>>>>>>>>         				mlog(0, "node died with cancel pending "
>>>>>>>>>>>>>>>         				     "on %.*s. move back to granted list.\n",
>>>>>>>>>>>>>>>         				     res->lockname.len, res->lockname.name);
>>>>>>>>>>>>>>> diff --git a/fs/ocfs2/dlm/dlmunlock.c b/fs/ocfs2/dlm/dlmunlock.c
>>>>>>>>>>>>>>> index 63d701c..505bb6c 100644
>>>>>>>>>>>>>>> --- a/fs/ocfs2/dlm/dlmunlock.c
>>>>>>>>>>>>>>> +++ b/fs/ocfs2/dlm/dlmunlock.c
>>>>>>>>>>>>>>> @@ -183,6 +183,11 @@ static enum dlm_status dlmunlock_common(struct dlm_ctxt *dlm,
>>>>>>>>>>>>>>>         							flags, owner);
>>>>>>>>>>>>>>>         		spin_lock(&res->spinlock);
>>>>>>>>>>>>>>>         		spin_lock(&lock->spinlock);
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>> +		if ((flags & LKM_CANCEL) &&
>>>>>>>>>>>>>>> +				dlm_lock_on_list(&res->granted, lock))
>>>>>>>>>>>>>>> +			status = DLM_CANCELGRANT;
>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>         		/* if the master told us the lock was already granted,
>>>>>>>>>>>>>>>         		 * let the ast handle all of these actions */
>>>>>>>>>>>>>>>         		if (status == DLM_CANCELGRANT) {
>>>>>>>>>>>>>>> -- 
>>>>>>>>>>>>>>> 1.8.3.1
>>>>>>>>>>>>>>>
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> Ocfs2-devel mailing list
>>>>>>>>>>> Ocfs2-devel@oss.oracle.com
>>>>>>>>>>> https://oss.oracle.com/mailman/listinfo/ocfs2-devel
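The whole fix in the quoted patch reduces to one extra state check in dlmunlock_common(). As a toy sketch of that decision (the enum values, list names, and the LKM_CANCEL stand-in below are simplified placeholders, not the real o2dlm definitions):

```c
#include <assert.h>

/* Simplified stand-ins for the real o2dlm types (illustration only). */
enum dlm_status { DLM_NORMAL, DLM_CANCELGRANT };
enum lock_list { BLOCKED_LIST, CONVERTING_LIST, GRANTED_LIST };

#define LKM_CANCEL 0x1

/* Mirrors the check the patch adds after re-taking the spinlocks: a
 * cancel that races with the grant AST finds the lock already on the
 * granted list and must report DLM_CANCELGRANT rather than DLM_NORMAL,
 * so the caller does not run the unlock AST for a lock that was in
 * fact granted. */
static enum dlm_status cancel_status(int flags, enum lock_list list)
{
    if ((flags & LKM_CANCEL) && list == GRANTED_LIST)
        return DLM_CANCELGRANT;
    return DLM_NORMAL;
}
```

A cancel against a lock still on the converting list keeps the normal path; only the already-granted race is downgraded to DLM_CANCELGRANT.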
Changwei Ge Feb. 19, 2019, 2:38 a.m. UTC | #19
Hi Jun,

On 2019/2/19 8:48, piaojun wrote:
> Hi Changwei,
> 
> On 2019/2/18 17:25, Changwei Ge wrote:
>> Hi Jun,
>>
>> On 2019/2/15 17:49, piaojun wrote:
>>> Hi Changwei,
>>>
>>> On 2019/2/15 17:27, Changwei Ge wrote:
>>>> On 2019/2/15 17:20, piaojun wrote:
>>>>> Hi Changwei,
>>>>>
>>>>> On 2019/2/15 15:56, Changwei Ge wrote:
>>>>>> Hi Jun
>>>>>>
>>>>>> I just read the code around unlock/cancel.
>>>>>>
>>>>>> On 2019/2/15 15:35, piaojun wrote:
>>>>>>> Hi Changwei,
>>>>>>>
>>>>>>> On 2019/2/15 15:06, Changwei Ge wrote:
>>>>>>>> Hi Jun,
>>>>>>>>
>>>>>>>> On 2019/2/15 14:51, piaojun wrote:
>>>>>>>>> Hi Changwei,
>>>>>>>>>
>>>>>>>>> On 2019/2/14 18:13, Changwei Ge wrote:
>>>>>>>>>> On 2019/2/14 17:09, piaojun wrote:
>>>>>>>>>>> Hi Changwei,
>>>>>>>>>>>
>>>>>>>>>>> The problem can't be solved completely if we clear ::cancel_pending in
>>>>>>>>>>> dlm_proxy_ast_handler, as the AST can arrive at any time just before
>>>>>>>>>>
>>>>>>>>>> So how about also adding a check here before setting ::cancel_pending in dlmunlock_common(), before invoking dlm_send_remote_unlock_request()?
>>>>>>>>>> If the lock is already on the granted list, just return DLM_CANCELGRANT.
>>>>>>>>>>
>>>>>>>>>> Then a further reference code might look like:
>>>>>>>>>>
>>>>>>>>>> root@ubuntu16:/home/chge/linux[master]# git diff
>>>>>>>>>> diff --git a/fs/ocfs2/dlm/dlmast.c b/fs/ocfs2/dlm/dlmast.c
>>>>>>>>>> index 39831fc..812843b 100644
>>>>>>>>>> --- a/fs/ocfs2/dlm/dlmast.c
>>>>>>>>>> +++ b/fs/ocfs2/dlm/dlmast.c
>>>>>>>>>> @@ -372,8 +372,11 @@ int dlm_proxy_ast_handler(struct o2net_msg *msg, u32 len, void *data,
>>>>>>>>>>               head = &res->converting;
>>>>>>>>>>               lock = NULL;
>>>>>>>>>>               list_for_each_entry(lock, head, list) {
>>>>>>>>>> -               if (lock->ml.cookie == cookie)
>>>>>>>>>> +               if (lock->ml.cookie == cookie) {
>>>>>>>>>> +                       if (lock->cancel_pending)
>>>>>>>>>> +                               lock->cancel_pending = 0;
>>>>>>>>>>                               goto do_ast;
>>>>>>>>>> +               }
>>>>>>>>>>               }
>>>>>>>>>>
>>>>>>>>>>               /* if not on convert, try blocked for ast, granted for bast */
>>>>>>>>>> diff --git a/fs/ocfs2/dlm/dlmunlock.c b/fs/ocfs2/dlm/dlmunlock.c
>>>>>>>>>> index c8e9b70..b4728b5 100644
>>>>>>>>>> --- a/fs/ocfs2/dlm/dlmunlock.c
>>>>>>>>>> +++ b/fs/ocfs2/dlm/dlmunlock.c
>>>>>>>>>> @@ -174,9 +174,14 @@ static enum dlm_status dlmunlock_common(struct dlm_ctxt *dlm,
>>>>>>>>>>               if (!master_node) {
>>>>>>>>>>                       owner = res->owner;
>>>>>>>>>>                       /* drop locks and send message */
>>>>>>>>>> -               if (flags & LKM_CANCEL)
>>>>>>>>>> +               if (flags & LKM_CANCEL) {
>>>>>>>>>> +                       if (dlm_lock_on_list(&res->granted, lock)) {
>>>>>>>>>> +                               status = DLM_CANCELGRANT;
>>>>>>>>>> +                               goto leave;
>>>>>>
>>>>>> I found that the above code should be unnecessary,
>>>>>> as upstream code already takes into consideration that the AST may have come before the cancellation.
>>>>>> In dlm_get_cancel_actions()
>>>>>> 	'''
>>>>>> 	} else if (dlm_lock_on_list(&res->granted, lock)) {
>>>>>> 		/* too late, already granted. */
>>>>>> 		status = DLM_CANCELGRANT;
>>>>>> 		*actions = DLM_UNLOCK_CALL_AST;
>>>>>> 	} else {
>>>>>> 	'''
>>>>>>
>>>>>>>>>
>>>>>>>>> If the master dies and the lockres is then moved to the granted list in
>>>>>>>>> dlm_move_lockres_to_recovery_list, OCFS2_LOCK_BUSY is not cleared.
>>>>>>>>
>>>>>>>> OCFS2_LOCK_BUSY should be cleared in ocfs2_locking_ast() since the previous locking AST has come back (moving the lock to the granted list).
>>>>>>>> That's why we return DLM_CANCELGRANT to the caller to avoid calling the AST. Otherwise ast() would be called twice, which is obviously a BUG.
>>>>>>>
>>>>>>> I mean the master is already dead and the AST won't come, so
>>>>>>> OCFS2_LOCK_BUSY is not cleared. And there are two cases in which the lockres is
>>>>>>> moved to the granted list:
>>>>>>> 1. The AST comes back and OCFS2_LOCK_BUSY is cleared.
>>>>>>> 2. The AST does not come back and OCFS2_LOCK_BUSY remains; the recovery process
>>>>>>>    moves it to the granted list. In this case, we need to call the AST for it.
>>>>>>
>>>>>> For point 2, ocfs2 can handle it, so you still don't have to worry. It's no problem.
>>>>>> In dlmconvert_remote()
>>>>>
>>>>> What I worry about is that OCFS2_LOCK_BUSY won't be cleared if the remote
>>>>> master is dead, as OCFS2_LOCK_BUSY is cleared in two cases:
>>>>> 1. the remote AST comes: clear it in dlm_proxy_ast_handler.
>>>>> 2. the remote AST does not come because the master died: clear it in
>>>>>    o2dlm_unlock_ast_wrapper.
>>
>> If the AST doesn't manage to get back to the requesting node, why must the flag OCFS2_LOCK_BUSY be cleared in o2dlm_unlock_ast_wrapper?
>>
>> Yes, OCFS2_LOCK_BUSY can be cleared in either o2dlm_unlock_ast_wrapper() or o2dlm_lock_ast_wrapper() with the o2cb stack applied.
>>
>> If we return DLM_CANCELGRANT from ocfs2/dlm to dlm, then we must know that the AST has already come back or the master node has moved the lock to the granted list itself, so we clear the flag OCFS2_LOCK_BUSY in o2dlm_lock_ast_wrapper().
>> Otherwise we ascertain that we can stop the current ongoing locking procedure and must clear OCFS2_LOCK_BUSY in o2dlm_unlock_ast_wrapper() (*synchronized*).
>>
>> To summarize, OCFS2_LOCK_BUSY should be cleared either by locking success or by cancellation success.
>>
>> And my approach already checks whether the lock is granted and returns DLM_CANCELGRANT accordingly.
>>
> 
> OCFS2_LOCK_BUSY won't be cleared if DLM_CANCELGRANT is set in
> o2dlm_unlock_ast_wrapper, and that's what I'm concerned about:

But we already *ascertain* that the previous locking request has been *granted* before deciding to return DLM_CANCELGRANT from the cancellation to o2dlm_unlock_ast_wrapper().

If the above condition holds, o2dlm_lock_ast_wrapper() either will be or has already been called, which also clears OCFS2_LOCK_BUSY.

> 
> static void o2dlm_unlock_ast_wrapper(void *astarg, enum dlm_status status)
> {
> 	struct ocfs2_dlm_lksb *lksb = astarg;
> 	int error = dlm_status_to_errno(status);
> 
> 	/*
> 	 * In o2dlm, you can get both the lock_ast() for the lock being
> 	 * granted and the unlock_ast() for the CANCEL failing.  A
> 	 * successful cancel sends DLM_NORMAL here.  If the
> 	 * lock grant happened before the cancel arrived, you get
> 	 * DLM_CANCELGRANT.
> 	 *
> 	 * There's no need for the double-ast.  If we see DLM_CANCELGRANT,
> 	 * we just ignore it.  We expect the lock_ast() to handle the
> 	 * granted lock.
> 	 */
> 	if (status == DLM_CANCELGRANT)
> 		return;
> 
> 	lksb->lksb_conn->cc_proto->lp_unlock_ast(lksb, error);
> }
> 
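The comment quoted above encodes a "no double AST" rule. A toy counter model (hypothetical simplified types, not the real ocfs2 stack glue) shows why ignoring DLM_CANCELGRANT in the unlock wrapper still delivers exactly one callback to the filesystem:

```c
#include <assert.h>

/* Simplified stand-ins (illustration only, not the real ocfs2 types). */
enum dlm_status { DLM_NORMAL, DLM_CANCELGRANT };

struct counters { int lock_asts; int unlock_asts; };

/* o2dlm-style wrapper: a DLM_CANCELGRANT result is ignored because the
 * grant path will deliver (or has delivered) the lock AST for the same
 * lock. */
static void unlock_ast_wrapper(struct counters *c, enum dlm_status st)
{
    if (st == DLM_CANCELGRANT)
        return;
    c->unlock_asts++;
}

static void lock_ast_wrapper(struct counters *c)
{
    c->lock_asts++;
}

/* Cancel raced with grant: the grant AST fires and the cancel reports
 * DLM_CANCELGRANT; exactly one callback reaches the fs layer. */
static int asts_seen_on_cancelgrant(void)
{
    struct counters c = {0, 0};
    lock_ast_wrapper(&c);
    unlock_ast_wrapper(&c, DLM_CANCELGRANT);
    return c.lock_asts + c.unlock_asts;
}
```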
>>>>
>>>> Please don't worry about point 2.
>>>> As in my previous e-mail, after dlm recovery picks up a new master for the corresponding lock resource,
>>>> dlmlock() -> convert will retry and send the request to the new master.
>>>> Eventually this locking (convert) will succeed and ast() will be called for the ocfs2 layer to clear OCFS2_LOCK_BUSY.
>>>
>>> Perhaps you misunderstand my meaning. The AST I meant is for dlmunlock,
>>> not for dlmlock, so it won't cause a retry that sends the request to the new master.
>>
>> If the unlocking AST hasn't come back, dlm_send_remote_unlock_request() returns DLM_NORMAL.
>> The unlocking AST is synchronized with the unlocking request, which means the master won't send back a separate message (of the proxy-AST type).
>> That means the *status* returned from dlm_send_remote_unlock_request() indicates whether the AST for unlocking should be called or not.
>> So how can the AST for unlocking fail to be invoked?
> 
> Sure, the unlocking AST will be invoked, but it does nothing (including
> clearing OCFS2_LOCK_BUSY).

Actually, it should do nothing, since what we want will be done by the locking-grant code path (including clearing OCFS2_LOCK_BUSY).
Please note that we must ensure the locking can be granted, which fails the relevant cancellation.

Thanks,
Changwei

> 
>>
>> You really make me puzzled. :- (((
>>
>> Thanks,
>> Changwei
>>
>>>
>>>>
>>>> Thanks,
>>>> Changwei
>>>>
>>>>>
>>>>> For case 2, if DLM_CANCELGRANT is set, o2dlm_unlock_ast_wrapper just
>>>>> returns and won't clear OCFS2_LOCK_BUSY. So in this case, I think we
>>>>> should call the AST for it again.
>>>>>
>>>>>>
>>>>>>
>>>>>> 341         if (status != DLM_NORMAL) {
>>>>>> 342                 if (status != DLM_NOTQUEUED)
>>>>>> 343                         dlm_error(status);
>>>>>> 344                 dlm_revert_pending_convert(res, lock);
>>>>>> 345         } else if (!lock->convert_pending) {
>>>>>> 346                 mlog(0, "%s: res %.*s, owner died and lock has been moved back "
>>>>>> 347                                 "to granted list, retry convert.\n",
>>>>>> 348                                 dlm->name, res->lockname.len, res->lockname.name);
>>>>>> 349                 status = DLM_RECOVERING;
>>>>>> 350         }
>>>>>>
>>>>>> Thus, dlmlock() will wait until RECOVERY is done.
>>>>>> And for the ocfs2 layer this is transparent; it doesn't have to be aware of what happened in the DLM.
>>>>>
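The dlmconvert_remote() snippet quoted above can be condensed to the following decision (a sketch with stand-in enum values; the revert of the pending convert on an error status is not modeled):

```c
#include <assert.h>

/* Simplified stand-ins for the real dlm status codes (illustration only). */
enum dlm_status { DLM_NORMAL, DLM_NOTQUEUED, DLM_RECOVERING };

/* Condenses the quoted dlmconvert_remote() tail: if the request came back
 * DLM_NORMAL but the lock is no longer marked convert-pending (recovery
 * moved it back to the granted list after the owner died), the convert is
 * reported as DLM_RECOVERING so dlmlock() waits for recovery and retries. */
static enum dlm_status convert_result(enum dlm_status st, int convert_pending)
{
    if (st == DLM_NORMAL && !convert_pending)
        return DLM_RECOVERING;
    return st;
}
```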
>>>>> I do not think the waiting can solve this problem.
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Changwei
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Perhaps we need to call the AST for it again.
>>>>>>>
>>>>>>>>
>>>>>>>>> And this will cause a hang in ocfs2_drop_lock. The reason is
>>>>>>>>> that the unlock AST won't be run when DLM_CANCELGRANT is set. So I think
>>>>>>>> With the above elaboration, you don't have to worry about the hang issue anymore.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Changwei
>>>>>>>>
>>>>>>>>> we need to distinguish all the cases of moving the lockres to the granted list.
>>>>>>>>>
>>>>>>>>>> +                       }
>>>>>>>>>> +
>>>>>>>>>>                               lock->cancel_pending = 1;
>>>>>>>>>> -               else
>>>>>>>>>> +               } else
>>>>>>>>>>                               lock->unlock_pending = 1;
>>>>>>>>>>                       spin_unlock(&lock->spinlock);
>>>>>>>>>>                       spin_unlock(&res->spinlock);
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Changwei
>>>>>>>>>>
>>>>>>>>>>> ::cancel_pending is set. If there is no other better solution,
>>>>>>>>>>> could we accept this patch? This bug is very harmful.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Jun
>>>>>>>>>>>
>>>>>>>>>>> On 2018/12/8 18:05, wangjian wrote:
>>>>>>>>>>>> Hi Changwei,
>>>>>>>>>>>>
>>>>>>>>>>>> I understand your idea. But we should be aware that the cancel-convert process and the
>>>>>>>>>>>> other processes (the AST-receiving process and the recovery process) are asynchronous.
>>>>>>>>>>>> For example, following your idea, we would check whether the lock is on the granted queue in
>>>>>>>>>>>> the dlm_proxy_ast_handler function before calling the dlm_send_remote_unlock_request function,
>>>>>>>>>>>> and then decide whether to clear cancel_pending. But if the AST has not come at that moment,
>>>>>>>>>>>> the check passes and cancel_pending will not be cleared. Then the AST arrives immediately
>>>>>>>>>>>> afterwards, which again leads to the BUG. I personally think that for asynchronous processes we
>>>>>>>>>>>> can't guarantee the execution speed of each process. All we can do is avoid the BUG scenario.
>>>>>>>>>>>>
>>>>>>>>>>>> As for your question ("If you remove the BUG check, why do you still call dlm_commit_pending_cancel()
>>>>>>>>>>>> to move the lock back to the granted list no matter whether it's on the converting list or not?"):
>>>>>>>>>>>> I think we should first check whether the lock is on the granted queue
>>>>>>>>>>>> (at this point, dlm->spinlock and res->spinlock are already held), then decide whether to call
>>>>>>>>>>>> the dlm_commit_pending_cancel function.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Jian
>>>>>>>>>>>>
>>>>>>>>>>>> On 12/7/2018 11:12 AM, Changwei Ge wrote:
>>>>>>>>>>>>> Hi Jian,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I suppose that the situation you described truly exists,
>>>>>>>>>>>>> but the way you fix the issue is not in my favor.
>>>>>>>>>>>>>
>>>>>>>>>>>>> If you remove the BUG check, why do you still call dlm_commit_pending_cancel() to
>>>>>>>>>>>>> move the lock back to the granted list no matter whether it's on the converting list or not?
>>>>>>>>>>>>>
>>>>>>>>>>>>> So how about keeping the BUG check in dlm_move_lockres_to_recovery_list()?
>>>>>>>>>>>>> If the locking _ast_ comes back very fast, just check ::cancel_pending in dlm_proxy_ast_handler() and clear it.
>>>>>>>>>>>>> Then, with the logic checking whether the lock is on the granted list (in dlmunlock_common(), more or less like your current method),
>>>>>>>>>>>>> we can easily tell whether the cancellation succeeded or not.
>>>>>>>>>>>>>
>>>>>>>>>>>>> That complies with the original dlm design, which I think is better and easier for maintainers to understand.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Changwei
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 2018/12/6 20:06, wangjian wrote:
>>>>>>>>>>>>>> Hi Changwei,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I don't fully agree with your point of view. In my opinion, the lock cancellation process and the lock
>>>>>>>>>>>>>> conversion process are asynchronous. We can't guarantee that the lock must be on the converting list
>>>>>>>>>>>>>> during the lock conversion process; otherwise this BUG would not happen.
>>>>>>>>>>>>>> So I think this is a meaningless BUG.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Jian
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 12/5/2018 9:49 AM, Changwei Ge wrote:
>>>>>>>>>>>>>>> Hi Jian,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I suppose you can't just remove the BUG_ON() check.
>>>>>>>>>>>>>>> If you remove it, the code below violates the original logic.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> '''
>>>>>>>>>>>>>>> 2141                                 dlm_commit_pending_cancel(res, lock);
>>>>>>>>>>>>>>> '''
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> What's more important is that a *locking cancel* must be against a *locking conversion* in progress.
>>>>>>>>>>>>>>> So it makes sense to check whether this lock is on the converting list.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> So I have to NACK this patch.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Changwei
piaojun Feb. 19, 2019, 8:26 a.m. UTC | #20
Hi Changwei,

On 2019/2/19 10:38, Changwei Ge wrote:
> Hi Jun,
> 
> On 2019/2/19 8:48, piaojun wrote:
>> Hi Changwei,
>>
>> On 2019/2/18 17:25, Changwei Ge wrote:
>>> Hi Jun,
>>>
>>> On 2019/2/15 17:49, piaojun wrote:
>>>> Hi Changwei,
>>>>
>>>> On 2019/2/15 17:27, Changwei Ge wrote:
>>>>> On 2019/2/15 17:20, piaojun wrote:
>>>>>> Hi Changwei,
>>>>>>
>>>>>> On 2019/2/15 15:56, Changwei Ge wrote:
>>>>>>> Hi Jun
>>>>>>>
>>>>>>> I just read the code around unlock/cancel.
>>>>>>>
>>>>>>> On 2019/2/15 15:35, piaojun wrote:
>>>>>>>> Hi Changwei,
>>>>>>>>
>>>>>>>> On 2019/2/15 15:06, Changwei Ge wrote:
>>>>>>>>> Hi Jun,
>>>>>>>>>
>>>>>>>>> On 2019/2/15 14:51, piaojun wrote:
>>>>>>>>>> Hi Changwei,
>>>>>>>>>>
>>>>>>>>>> On 2019/2/14 18:13, Changwei Ge wrote:
>>>>>>>>>>> On 2019/2/14 17:09, piaojun wrote:
>>>>>>>>>>>> Hi Changwei,
>>>>>>>>>>>>
>>>>>>>>>>>> The problem can't be solved completely if clear ::cancel_pending in
>>>>>>>>>>>> dlm_proxy_ast_handler, as AST will come at anytime just before
>>>>>>>>>>>
>>>>>>>>>>> So how about also add check here bere setting ::cancel_pending in dlmunlock_common() before invoking dlm_send_remote_unlock_request().
>>>>>>>>>>> If already on grant list just return DLM_CANCELGRANT
>>>>>>>>>>>
>>>>>>>>>>> Then a further reference code might look like:
>>>>>>>>>>>
>>>>>>>>>>> root@ubuntu16:/home/chge/linux[master]# git diff
>>>>>>>>>>> diff --git a/fs/ocfs2/dlm/dlmast.c b/fs/ocfs2/dlm/dlmast.c
>>>>>>>>>>> index 39831fc..812843b 100644
>>>>>>>>>>> --- a/fs/ocfs2/dlm/dlmast.c
>>>>>>>>>>> +++ b/fs/ocfs2/dlm/dlmast.c
>>>>>>>>>>> @@ -372,8 +372,11 @@ int dlm_proxy_ast_handler(struct o2net_msg *msg, u32 len, void *data,
>>>>>>>>>>>               head = &res->converting;
>>>>>>>>>>>               lock = NULL;
>>>>>>>>>>>               list_for_each_entry(lock, head, list) {
>>>>>>>>>>> -               if (lock->ml.cookie == cookie)
>>>>>>>>>>> +               if (lock->ml.cookie == cookie) {
>>>>>>>>>>> +                       if (lock->cancel_pending)
>>>>>>>>>>> +                               lock->cancel_pending = 0;
>>>>>>>>>>>                               goto do_ast;
>>>>>>>>>>> +               }
>>>>>>>>>>>               }
>>>>>>>>>>>
>>>>>>>>>>>               /* if not on convert, try blocked for ast, granted for bast */
>>>>>>>>>>> diff --git a/fs/ocfs2/dlm/dlmunlock.c b/fs/ocfs2/dlm/dlmunlock.c
>>>>>>>>>>> index c8e9b70..b4728b5 100644
>>>>>>>>>>> --- a/fs/ocfs2/dlm/dlmunlock.c
>>>>>>>>>>> +++ b/fs/ocfs2/dlm/dlmunlock.c
>>>>>>>>>>> @@ -174,9 +174,14 @@ static enum dlm_status dlmunlock_common(struct dlm_ctxt *dlm,
>>>>>>>>>>>               if (!master_node) {
>>>>>>>>>>>                       owner = res->owner;
>>>>>>>>>>>                       /* drop locks and send message */
>>>>>>>>>>> -               if (flags & LKM_CANCEL)
>>>>>>>>>>> +               if (flags & LKM_CANCEL) {
>>>>>>>>>>> +                       if (dlm_lock_on_list(&res->granted, lock)) {
>>>>>>>>>>> +                               status = DLM_CANCELGRANT;
>>>>>>>>>>> +                               goto leave;
>>>>>>>
>>>>>>> I found that above code should be useless.
>>>>>>> As upstream code already take it into consideration that AST has come before cancellation.
>>>>>>> In dlm_get_cancel_actions()
>>>>>>> 	'''
>>>>>>> 	} else if (dlm_lock_on_list(&res->granted, lock)) {
>>>>>>> 		/* too late, already granted. */
>>>>>>> 		status = DLM_CANCELGRANT;
>>>>>>> 		*actions = DLM_UNLOCK_CALL_AST;
>>>>>>> 	} else {
>>>>>>> 	'''
>>>>>>>
>>>>>>>>>>
>>>>>>>>>> If master dead and then lockres is moved to granted list in
>>>>>>>>>> dlm_move_lockres_to_recovery_list, the OCFS2_LOCK_BUSY is not cleared.
>>>>>>>>>
>>>>>>>>> OCFS2_LOCK_BUSY  should be cleared in ocfs2_locking_ast() since previous locking AST has come back(moving lock to grant list).
>>>>>>>>> That's why we return DLM_CANCELGRANT to caller to avoid calling AST. Otherwise ast() will be called twice, which is obviously a BUG.
>>>>>>>>
>>>>>>>> I mean master is already dead and ast won't come. So the
>>>>>>>> OCFS2_LOCK_BUSY is not cleared. And there are two cases that lockres is
>>>>>>>> moved to grant list:
>>>>>>>> 1. AST comes back and OCFS2_LOCK_BUSY is cleared.
>>>>>>>> 2. AST does not come back and OCFS2_LOCK_BUSY remains, recovery process
>>>>>>>>        move it to grant list. In this case, we need do AST for it.
>>>>>>>
>>>>>>> For point 2, ocfs2 can handle it, so you still don't have to worry. It's no problem.
>>>>>>> In dlmconvert_remote()
>>>>>>
>>>>>> What I worry about is that OCFS2_LOCK_BUSY won't be cleared if remote
>>>>>> master is dead, as OCFS2_LOCK_BUSY is cleared in two cases:
>>>>>> 1. remote AST come, clear it in dlm_proxy_ast_handler.
>>>>>> 2. remote AST does not come when master dead, clear it in
>>>>>>       o2dlm_unlock_ast_wrapper.
>>>
>>> If AST doesn't manage to get back to requested node, why must flag OCFS2_LOCK_BUSY be cleared in o2dlm_unlock_ast_wrapper?
>>>
>>> Yes, OCFS2_LOCK_BUSY can be cleared it either o2dlm_unlock_ast_wrapper() or o2dlm_lock_ast_wrapper() with o2cb stack applied.
>>>
>>> If we return DLM_CANCELGRANT from ocfs2/dlm to dlm, then we must know that AST has ever come back or master node has moved the lock to grant list itself, thus we clear flag OCFS2_LOCK_BUSY in o2dlm_lock_ast_wrapper().
>>> Otherwise we ascertain that we can stop current ongoing locking procedure and must clear OCFS2_LOCK_BUSY in o2dlm_unlock_ast_wrapper() (*synchronized*).
>>>
>>> Let's summarize this, OCFS2_LOCK_BUSY should be cleared whether by locking success or cancellation success.
>>>
>>> And my way already checks whether the lock is granted and then returns DLM_CANCELGRANT or not.
>>>
>>
>> OCFS2_LOCK_BUSY won't be cleared if DLM_CANCELGRANT is set in
>> o2dlm_unlock_ast_wrapper, and that's what I'm concerned about:
> 
> But we already *ascertain* that previous locking request has been *granted* before deciding to return DLM_CANCELGRANT during cancellation to o2dlm_unlock_ast_wrapper().
> 
> If the above condition stands, o2dlm_lock_ast_wrapper() will be or has been called, which also clears OCFS2_LOCK_BUSY.
> 

1. Node1 already has PR lock, and wants to get ex.
2. Node1 receive BAST and do unlock, here OCFS2_LOCK_BUSY is set.
3. Node1 can not receive the AST for unlock as master dead.
4. Then o2dlm_unlock_ast_wrapper will be called rather than o2dlm_lock_ast_wrapper.
5. Actually the *granted* lock request has nothing to do with OCFS2_LOCK_BUSY.
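The flag lifecycle in steps 1-5 can be sketched as a small user-space model (hypothetical code: `model_lockres`, `lock_ast_wrapper`, and `unlock_ast_wrapper` only stand in for the kernel's OCFS2_LOCK_BUSY handling and the two o2dlm AST wrappers, they are not the kernel functions):

```c
#include <assert.h>
#include <stdbool.h>

#define DLM_NORMAL      0
#define DLM_CANCELGRANT 1

/* Hypothetical stand-in for a lock carrying the OCFS2_LOCK_BUSY flag. */
struct model_lockres {
	bool busy;
};

/* Grant path: the locking AST arrives and clears the busy flag,
 * as o2dlm_lock_ast_wrapper() ultimately does. */
static void lock_ast_wrapper(struct model_lockres *l)
{
	l->busy = false;
}

/* Cancel path: mirrors the quoted o2dlm_unlock_ast_wrapper() --
 * DLM_CANCELGRANT is ignored, so the busy flag survives unless the
 * grant path also runs. */
static void unlock_ast_wrapper(struct model_lockres *l, int status)
{
	if (status == DLM_CANCELGRANT)
		return;
	l->busy = false;
}
```

In this model, a cancellation that reports DLM_CANCELGRANT while the grant AST never arrives (the dead-master case in step 3) leaves the busy flag set, which is exactly the hang condition under discussion.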

>>
>> static void o2dlm_unlock_ast_wrapper(void *astarg, enum dlm_status status)
>> {
>> 	struct ocfs2_dlm_lksb *lksb = astarg;
>> 	int error = dlm_status_to_errno(status);
>>
>> 	/*
>> 	 * In o2dlm, you can get both the lock_ast() for the lock being
>> 	 * granted and the unlock_ast() for the CANCEL failing.  A
>> 	 * successful cancel sends DLM_NORMAL here.  If the
>> 	 * lock grant happened before the cancel arrived, you get
>> 	 * DLM_CANCELGRANT.
>> 	 *
>> 	 * There's no need for the double-ast.  If we see DLM_CANCELGRANT,
>> 	 * we just ignore it.  We expect the lock_ast() to handle the
>> 	 * granted lock.
>> 	 */
>> 	if (status == DLM_CANCELGRANT)
>> 		return;
>>
>> 	lksb->lksb_conn->cc_proto->lp_unlock_ast(lksb, error);
>> }
>>
>>>>>
>>>>> Please don't worry about point 2.
>>>>> As in my previous e-mail, after dlm recovery picks up a new master for the corresponding lock resource,
>>>>> dlmlock() -> convert will retry and send the request to the new master.
>>>>> Eventually, this locking(convert) will succeed and ast() will be called for ocfs2 layer to clear OCFS2_LOCK_BUSY.
>>>>
>>>> Perhaps you misunderstand my meaning. The AST I meant is for dlmunlock,
>>>> not for dlmlock, so it won't cause retry sending request to new master.
>>>
>>> If the unlocking AST doesn't come back, then dlm_send_remote_unlock_request() returns NORMAL.
>>> Because the unlocking AST is synchronized with the unlocking request, the master won't send back a separate message (proxy AST type).
>>> That means the *status* returned from dlm_send_remote_unlock_request() indicates whether the AST for unlocking should be called or not.
>>> So how could the AST for unlocking not be invoked?
>>
>> Sure, unlocking AST will be invoked, but do nothing(including clear
>> OCFS2_LOCK_BUSY).
> 
> Actually, it should do nothing since what we want will be done around the locking-grant code logic (including clearing OCFS2_LOCK_BUSY).
> Please note that we must ensure the locking can be granted, which will fail the relevant cancellation.
> 
> Thanks,
> Changwei
> 
>>
>>>
>>> You really make me puzzled. :- (((
>>>
>>> Thanks,
>>> Changwei
>>>
>>>>
>>>>>
>>>>> Thanks,
>>>>> Changwei
>>>>>
>>>>>>
>>>>>> For case 2, if DLM_CANCELGRANT is set, o2dlm_unlock_ast_wrapper just
>>>>>> return and won't clear OCFS2_LOCK_BUSY. So in this case, I think we
>>>>>> should do AST again for it.
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> 341         if (status != DLM_NORMAL) {
>>>>>>> 342                 if (status != DLM_NOTQUEUED)
>>>>>>> 343                         dlm_error(status);
>>>>>>> 344                 dlm_revert_pending_convert(res, lock);
>>>>>>> 345         } else if (!lock->convert_pending) {
>>>>>>> 346                 mlog(0, "%s: res %.*s, owner died and lock has been moved back "
>>>>>>> 347                                 "to granted list, retry convert.\n",
>>>>>>> 348                                 dlm->name, res->lockname.len, res->lockname.name);
>>>>>>> 349                 status = DLM_RECOVERING;
>>>>>>> 350         }
>>>>>>>
>>>>>>> Thus, dlmlock() will wait until RECOVERY is done.
>>>>>>> And for ocfs2 layer, it's apparent, it doesn't have to be aware of what happened to DLM.
>>>>>>
>>>>>> I do not think the waiting can solve this problem.
>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Changwei
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> Perhaps we need to do the AST again for it.
>>>>>>>>
>>>>>>>>>
>>>>>>>>>> And this will cause a stuck problem in ocfs2_drop_lock. The reason is
>>>>>>>>>> that the unlock AST won't be done when DLM_CANCELGRANT is set. So I think
>>>>>>>>>
>>>>>>>>> With above elaboration, you don't have to worry the hang issue anymore.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Changwei
>>>>>>>>>
>>>>>>>>>> we need distinguish all the cases of moving lockres to grant list.
>>>>>>>>>>
>>>>>>>>>>> +                       }
>>>>>>>>>>> +
>>>>>>>>>>>                               lock->cancel_pending = 1;
>>>>>>>>>>> -               else
>>>>>>>>>>> +               } else
>>>>>>>>>>>                               lock->unlock_pending = 1;
>>>>>>>>>>>                       spin_unlock(&lock->spinlock);
>>>>>>>>>>>                       spin_unlock(&res->spinlock);
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Changwei
>>>>>>>>>>>
>>>>>>>>>>>> ::cancel_pending is set. If there is no other better solution,
>>>>>>>>>>>> could we accept this patch? This bug is very harmful.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Jun
>>>>>>>>>>>>
>>>>>>>>>>>> On 2018/12/8 18:05, wangjian wrote:
>>>>>>>>>>>>> Hi Changwei,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I understand your idea. But we should be aware that the cancel_convert process and
>>>>>>>>>>>>> other processes (accepting the AST process, the recovery process) are asynchronous.
>>>>>>>>>>>>> For example, according to your idea, check if the lock is in the grant queue before
>>>>>>>>>>>>> calling the dlm_send_remote_unlock_request function in the dlm_proxy_ast_handler function.
>>>>>>>>>>>>> Then decide whether to clear cancel_pending. But if the AST does not come at this time,
>>>>>>>>>>>>> the check passes and cancel_pending will not be cleared. Then the AST immediately comes over again,
>>>>>>>>>>>>> which also led to a bug. I personally think that for asynchronous processes we can't guarantee
>>>>>>>>>>>>> the speed of execution of each process. All we can do is to avoid the BUG scene.
>>>>>>>>>>>>>
>>>>>>>>>>>>> As for the question you said ("If you remove the BUG check why you still call dlm_commit_pending_cancel()
>>>>>>>>>>>>> to move the lock back to grant on matter it's on converting list or not?").
>>>>>>>>>>>>> I think we should first check if the lock is in the grant queue
>>>>>>>>>>>>> (at this time, dlm->spinlock and res->spinlock have been added), then decide whether to call
>>>>>>>>>>>>> dlm_commit_pending_cancel function.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Jian
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 12/7/2018 11:12 AM, Changwei Ge wrote:
>>>>>>>>>>>>>> Hi Jian,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I suppose that the situation you described truly exists.
>>>>>>>>>>>>>> But the way you fix the issue is not in my favor.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> If you remove the BUG check, why do you still call dlm_commit_pending_cancel() to
>>>>>>>>>>>>>> move the lock back to grant no matter whether it's on the converting list or not?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> So how about keeping the BUG check in dlm_move_lockres_to_recovery_list().
>>>>>>>>>>>>>> If the locking _ast_ comes back very fast just check ::cancel_pending in dlm_proxy_ast_handler() and clear it.
>>>>>>>>>>>>>> Then with the logic checking if the lock is on grant list (in dlmunlock_common() more less like your current method)
>>>>>>>>>>>>>> or not we can easily tell if the cancellation succeeds or not.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> That complies the original dlm design, which I think is better and easier for maintainers to understand.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Changwei
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 2018/12/6 20:06, wangjian wrote:
>>>>>>>>>>>>>>> Hi Changwei,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I don't fully agree with your point of view. In my opinion, the lock cancellation process and the lock
>>>>>>>>>>>>>>> conversion process are asynchronous. We can't guarantee that the lock must be in the convert list
>>>>>>>>>>>>>>> during the lock conversion process, otherwise this BUG will not happen.
>>>>>>>>>>>>>>> So, I think this is a meaningless BUG.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Jian
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 12/5/2018 9:49 AM, Changwei Ge wrote:
>>>>>>>>>>>>>>>> Hi Jian,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I suppose you can't just remove the BUG_ON() check.
>>>>>>>>>>>>>>>> If you remove it, below code violates the original logic.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> '''
>>>>>>>>>>>>>>>> 2141                                 dlm_commit_pending_cancel(res, lock);
>>>>>>>>>>>>>>>> '''
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> What's more important is *locking cancel* must be against a *locking conversion* progress.
>>>>>>>>>>>>>>>> So it makes sense to check if this lock is on converting list.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> So I have to NACK to this patch.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> Changwei
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 2018/12/3 20:23, wangjian wrote:
>>>>>>>>>>>>>>>>> In the dlm_move_lockres_to_recovery_list function, if the lock
>>>>>>>>>>>>>>>>> is in the granted queue and cancel_pending is set, it will
>>>>>>>>>>>>>>>>> encounter a BUG. I think this is a meaningless BUG,
>>>>>>>>>>>>>>>>> so be prepared to remove it. A scenario that causes
>>>>>>>>>>>>>>>>> this BUG will be given below.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> At the beginning, Node 1 is the master and has NL lock,
>>>>>>>>>>>>>>>>> Node 2 has PR lock, Node 3 has PR lock too.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Node 1          Node 2          Node 3
>>>>>>>>>>>>>>>>>                     want to get EX lock.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>                                     want to get EX lock.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Node 3 hinder
>>>>>>>>>>>>>>>>> Node 2 to get
>>>>>>>>>>>>>>>>> EX lock, send
>>>>>>>>>>>>>>>>> Node 3 a BAST.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>                                     receive BAST from
>>>>>>>>>>>>>>>>>                                     Node 1. downconvert
>>>>>>>>>>>>>>>>>                                     thread begin to
>>>>>>>>>>>>>>>>>                                     cancel PR to EX conversion.
>>>>>>>>>>>>>>>>>                                     In dlmunlock_common function,
>>>>>>>>>>>>>>>>>                                     downconvert thread has set
>>>>>>>>>>>>>>>>>                                     lock->cancel_pending,
>>>>>>>>>>>>>>>>>                                     but did not enter
>>>>>>>>>>>>>>>>>                                     dlm_send_remote_unlock_request
>>>>>>>>>>>>>>>>>                                     function.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>                     Node2 dies because
>>>>>>>>>>>>>>>>>                     the host is powered down.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> In recovery process,
>>>>>>>>>>>>>>>>> clean the lock that
>>>>>>>>>>>>>>>>> related to Node2.
>>>>>>>>>>>>>>>>> then finish Node 3
>>>>>>>>>>>>>>>>> PR to EX request.
>>>>>>>>>>>>>>>>> give Node 3 a AST.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>                                     receive AST from Node 1.
>>>>>>>>>>>>>>>>>                                     change lock level to EX,
>>>>>>>>>>>>>>>>>                                     move lock to granted list.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Node1 dies because
>>>>>>>>>>>>>>>>> the host is powered down.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>                                     In dlm_move_lockres_to_recovery_list
>>>>>>>>>>>>>>>>>                                     function. the lock is in the
>>>>>>>>>>>>>>>>>                                     granted queue and cancel_pending
>>>>>>>>>>>>>>>>>                                     is set. BUG_ON.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> But after clearing this BUG, process will encounter
>>>>>>>>>>>>>>>>> the second BUG in the ocfs2_unlock_ast function.
>>>>>>>>>>>>>>>>> Here is a scenario that will cause the second BUG
>>>>>>>>>>>>>>>>> in ocfs2_unlock_ast as follows:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> At the beginning, Node 1 is the master and has NL lock,
>>>>>>>>>>>>>>>>> Node 2 has PR lock, Node 3 has PR lock too.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Node 1          Node 2          Node 3
>>>>>>>>>>>>>>>>>                     want to get EX lock.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>                                     want to get EX lock.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Node 3 hinder
>>>>>>>>>>>>>>>>> Node 2 to get
>>>>>>>>>>>>>>>>> EX lock, send
>>>>>>>>>>>>>>>>> Node 3 a BAST.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>                                     receive BAST from
>>>>>>>>>>>>>>>>>                                     Node 1. downconvert
>>>>>>>>>>>>>>>>>                                     thread begin to
>>>>>>>>>>>>>>>>>                                     cancel PR to EX conversion.
>>>>>>>>>>>>>>>>>                                     In dlmunlock_common function,
>>>>>>>>>>>>>>>>>                                     downconvert thread has released
>>>>>>>>>>>>>>>>>                                     lock->spinlock and res->spinlock,
>>>>>>>>>>>>>>>>>                                     but did not enter
>>>>>>>>>>>>>>>>>                                     dlm_send_remote_unlock_request
>>>>>>>>>>>>>>>>>                                     function.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>                     Node2 dies because
>>>>>>>>>>>>>>>>>                     the host is powered down.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> In recovery process,
>>>>>>>>>>>>>>>>> clean the lock that
>>>>>>>>>>>>>>>>> related to Node2.
>>>>>>>>>>>>>>>>> then finish Node 3
>>>>>>>>>>>>>>>>> PR to EX request.
>>>>>>>>>>>>>>>>> give Node 3 a AST.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>                                     receive AST from Node 1.
>>>>>>>>>>>>>>>>>                                     change lock level to EX,
>>>>>>>>>>>>>>>>>                                     move lock to granted list,
>>>>>>>>>>>>>>>>>                                     set lockres->l_unlock_action
>>>>>>>>>>>>>>>>>                                     as OCFS2_UNLOCK_INVALID
>>>>>>>>>>>>>>>>>                                     in ocfs2_locking_ast function.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Node2 dies because
>>>>>>>>>>>>>>>>> the host is powered down.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>                                     Node 3 realize that Node 1
>>>>>>>>>>>>>>>>>                                     is dead, remove Node 1 from
>>>>>>>>>>>>>>>>>                                     domain_map. downconvert thread
>>>>>>>>>>>>>>>>>                                     get DLM_NORMAL from
>>>>>>>>>>>>>>>>>                                     dlm_send_remote_unlock_request
>>>>>>>>>>>>>>>>>                                     function and set *call_ast as 1.
>>>>>>>>>>>>>>>>>                                     Then downconvert thread meet
>>>>>>>>>>>>>>>>>                                     BUG in ocfs2_unlock_ast function.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> To avoid meet the second BUG, function dlmunlock_common shuold
>>>>>>>>>>>>>>>>> return DLM_CANCELGRANT if the lock is on granted list and
>>>>>>>>>>>>>>>>> the operation is canceled.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Signed-off-by: Jian Wang<wangjian161@huawei.com>
>>>>>>>>>>>>>>>>> Reviewed-by: Yiwen Jiang<jiangyiwen@huawei.com>
>>>>>>>>>>>>>>>>> ---
>>>>>>>>>>>>>>>>>          fs/ocfs2/dlm/dlmrecovery.c | 1 -
>>>>>>>>>>>>>>>>>          fs/ocfs2/dlm/dlmunlock.c   | 5 +++++
>>>>>>>>>>>>>>>>>          2 files changed, 5 insertions(+), 1 deletion(-)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm/dlmrecovery.c
>>>>>>>>>>>>>>>>> index 802636d..7489652 100644
>>>>>>>>>>>>>>>>> --- a/fs/ocfs2/dlm/dlmrecovery.c
>>>>>>>>>>>>>>>>> +++ b/fs/ocfs2/dlm/dlmrecovery.c
>>>>>>>>>>>>>>>>> @@ -2134,7 +2134,6 @@ void dlm_move_lockres_to_recovery_list(struct dlm_ctxt *dlm,
>>>>>>>>>>>>>>>>>          				 * if this had completed successfully
>>>>>>>>>>>>>>>>>          				 * before sending this lock state to the
>>>>>>>>>>>>>>>>>          				 * new master */
>>>>>>>>>>>>>>>>> -				BUG_ON(i != DLM_CONVERTING_LIST);
>>>>>>>>>>>>>>>>>          				mlog(0, "node died with cancel pending "
>>>>>>>>>>>>>>>>>          				     "on %.*s. move back to granted list.\n",
>>>>>>>>>>>>>>>>>          				     res->lockname.len, res->lockname.name);
>>>>>>>>>>>>>>>>> diff --git a/fs/ocfs2/dlm/dlmunlock.c b/fs/ocfs2/dlm/dlmunlock.c
>>>>>>>>>>>>>>>>> index 63d701c..505bb6c 100644
>>>>>>>>>>>>>>>>> --- a/fs/ocfs2/dlm/dlmunlock.c
>>>>>>>>>>>>>>>>> +++ b/fs/ocfs2/dlm/dlmunlock.c
>>>>>>>>>>>>>>>>> @@ -183,6 +183,11 @@ static enum dlm_status dlmunlock_common(struct dlm_ctxt *dlm,
>>>>>>>>>>>>>>>>>          							flags, owner);
>>>>>>>>>>>>>>>>>          		spin_lock(&res->spinlock);
>>>>>>>>>>>>>>>>>          		spin_lock(&lock->spinlock);
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>> +		if ((flags & LKM_CANCEL) &&
>>>>>>>>>>>>>>>>> +				dlm_lock_on_list(&res->granted, lock))
>>>>>>>>>>>>>>>>> +			status = DLM_CANCELGRANT;
>>>>>>>>>>>>>>>>> +
>>>>>>>>>>>>>>>>>          		/* if the master told us the lock was already granted,
>>>>>>>>>>>>>>>>>          		 * let the ast handle all of these actions */
>>>>>>>>>>>>>>>>>          		if (status == DLM_CANCELGRANT) {
>>>>>>>>>>>>>>>>> -- 
>>>>>>>>>>>>>>>>> 1.8.3.1
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> .
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> .
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>> Ocfs2-devel mailing list
>>>>>>>>>>>>> Ocfs2-devel@oss.oracle.com
>>>>>>>>>>>>> https://oss.oracle.com/mailman/listinfo/ocfs2-devel
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>> .
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>> .
>>>>>>>>>
>>>>>>>>
>>>>>>> .
>>>>>>>
>>>>>>
>>>>> .
>>>>>
>>>>
>>> .
>>>
>>
> .
>
Changwei Ge Feb. 21, 2019, 6:46 a.m. UTC | #21
Hi Jun,
Good afternoon.

>>>> If AST doesn't manage to get back to requested node, why must flag OCFS2_LOCK_BUSY be cleared in o2dlm_unlock_ast_wrapper?
>>>>
>>>> Yes, OCFS2_LOCK_BUSY can be cleared it either o2dlm_unlock_ast_wrapper() or o2dlm_lock_ast_wrapper() with o2cb stack applied.
>>>>
>>>> If we return DLM_CANCELGRANT from ocfs2/dlm to dlm, then we must know that AST has ever come back or master node has moved the lock to grant list itself, thus we clear flag OCFS2_LOCK_BUSY in o2dlm_lock_ast_wrapper().
>>>> Otherwise we ascertain that we can stop current ongoing locking procedure and must clear OCFS2_LOCK_BUSY in o2dlm_unlock_ast_wrapper() (*synchronized*).
>>>>
>>>> Let's summarize this, OCFS2_LOCK_BUSY should be cleared whether by locking success or cancellation success.
>>>>
>>>> And my way already check if the lock is granted then return DLM_CANCELGRANT or not.
>>>>
>>>
>>> OCFS2_LOCK_BUSY won't be cleared if DLM_CANCELGRANT is set in
>>> o2dlm_unlock_ast_wrapper, and that's what I'm concerned about:
>>
>> But we already *ascertain* that previous locking request has been *granted* before deciding to return DLM_CANCELGRANT during cancellation to o2dlm_unlock_ast_wrapper().
>>
>> If above condition stands, o2dlm_lock_ast_wrapper() must will be or have been called, which also clears OCFS2_LOCK_BUSY.
>>
> 
> 1. Node1 already has PR lock, and wants to get ex.
Well, a locking up-conversion procedure.

> 2. Node1 receive BAST and do unlock, here OCFS2_LOCK_BUSY is set.
Because there are two concurrent up-conversions that conflict, one of them must be canceled!

> 3. Node1 can not receive the AST for unlock as master dead.
So here you mean the lock can't be granted.

> 4. Then o2dlm_unlock_ast_wrapper will be called rather than o2dlm_lock_ast_wrapper.
Then the cancellation succeeds as the master dies.

> 5. Actually the *granted* lock request has nothing to do with OCFS2_LOCK_BUSY.
Yes, o2dlm_lock_ast_wrapper will not clear OCFS2_LOCK_BUSY.

But my suggestion was not against the above timing sequence.
Did you misunderstand my suggestion?
And the original logic of Jian's patch also returns DLM_CANCELGRANT.

Thanks,
Changwei
piaojun Feb. 22, 2019, 3:15 a.m. UTC | #22
Hi Changwei,

On 2019/2/21 14:46, Changwei Ge wrote:
> Hi jun
> Good afternoon.
> 
>>>>> If AST doesn't manage to get back to requested node, why must flag OCFS2_LOCK_BUSY be cleared in o2dlm_unlock_ast_wrapper?
>>>>>
>>>>> Yes, OCFS2_LOCK_BUSY can be cleared it either o2dlm_unlock_ast_wrapper() or o2dlm_lock_ast_wrapper() with o2cb stack applied.
>>>>>
>>>>> If we return DLM_CANCELGRANT from ocfs2/dlm to dlm, then we must know that AST has ever come back or master node has moved the lock to grant list itself, thus we clear flag OCFS2_LOCK_BUSY in o2dlm_lock_ast_wrapper().
>>>>> Otherwise we ascertain that we can stop current ongoing locking procedure and must clear OCFS2_LOCK_BUSY in o2dlm_unlock_ast_wrapper() (*synchronized*).
>>>>>
>>>>> Let's summarize this, OCFS2_LOCK_BUSY should be cleared whether by locking success or cancellation success.
>>>>>
>>>>> And my way already check if the lock is granted then return DLM_CANCELGRANT or not.
>>>>>
>>>>
>>>> OCFS2_LOCK_BUSY won't be cleared if DLM_CANCELGRANT is set in
>>>> o2dlm_unlock_ast_wrapper, and that's what I'm concerned about:
>>>
>>> But we already *ascertain* that previous locking request has been *granted* before deciding to return DLM_CANCELGRANT during cancellation to o2dlm_unlock_ast_wrapper().
>>>
>>> If above condition stands, o2dlm_lock_ast_wrapper() must will be or have been called, which also clears OCFS2_LOCK_BUSY.
>>>
>>
>> 1. Node1 already has PR lock, and wants to get ex.
> Well, a locking up-conversion procedure.
> 
>> 2. Node1 receive BAST and do unlock, here OCFS2_LOCK_BUSY is set.
> Because there are two concurrent up-conversion, which conflict, so one of them must be canceled!
> 
>> 3. Node1 can not receive the AST for unlock as master dead.
> So here you mean the lock can't be granted.
> 
>> 4. Then o2dlm_unlock_ast_wrapper will be called rather than o2dlm_lock_ast_wrapper.
> Then the cancellation succeeds as the master dies.
> 
>> 5. Actually the *granted* lock request has nothing to do with OCFS2_LOCK_BUSY.
> Yes, o2dlm_lock_ast_wrapper will not clear OCFS2_LOCK_BUSY.
> 
> But my suggestion was not against above timing sequence.
> Did you misunderstand my suggestion?
> And the original logic of Jian's patch also returns DLM_CANCELGRANT.

Yes, Jian's last patch can't solve the problem either, and I think we
should find another solution for it. I'm considering deleting the check
for DLM_CANCELGRANT and clearing OCFS2_LOCK_BUSY in the following process.
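One possible shape of that alternative, as a hypothetical user-space sketch (not a tested kernel change; the names are stand-ins): drop the early return on DLM_CANCELGRANT so the unlock completion path always runs and can clear the busy flag.

```c
#include <assert.h>
#include <stdbool.h>

#define DLM_NORMAL      0
#define DLM_CANCELGRANT 1

struct model_lksb {
	bool busy;	/* stand-in for OCFS2_LOCK_BUSY */
};

/* Variant with the DLM_CANCELGRANT early return removed: the unlock
 * completion always runs, so the busy flag is cleared even when the
 * grant AST never arrives because the master died. */
static void unlock_ast_always(struct model_lksb *lksb, int status)
{
	(void)status;	/* no special-casing of DLM_CANCELGRANT here */
	lksb->busy = false;
}
```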

Thanks,
Jun

> 
> Thanks,
> Changwei
> 
> 
>   
> 
> .
>
Changwei Ge Feb. 22, 2019, 3:32 a.m. UTC | #23
Hi Jun,

On 2019/2/22 11:16, piaojun wrote:
> Hi Changwei,
> 
> On 2019/2/21 14:46, Changwei Ge wrote:
>> Hi jun
>> Good afternoon.
>>
>>>>>> If AST doesn't manage to get back to requested node, why must flag OCFS2_LOCK_BUSY be cleared in o2dlm_unlock_ast_wrapper?
>>>>>>
>>>>>> Yes, OCFS2_LOCK_BUSY can be cleared it either o2dlm_unlock_ast_wrapper() or o2dlm_lock_ast_wrapper() with o2cb stack applied.
>>>>>>
>>>>>> If we return DLM_CANCELGRANT from ocfs2/dlm to dlm, then we must know that AST has ever come back or master node has moved the lock to grant list itself, thus we clear flag OCFS2_LOCK_BUSY in o2dlm_lock_ast_wrapper().
>>>>>> Otherwise we ascertain that we can stop current ongoing locking procedure and must clear OCFS2_LOCK_BUSY in o2dlm_unlock_ast_wrapper() (*synchronized*).
>>>>>>
>>>>>> Let's summarize this, OCFS2_LOCK_BUSY should be cleared whether by locking success or cancellation success.
>>>>>>
>>>>>> And my way already check if the lock is granted then return DLM_CANCELGRANT or not.
>>>>>>
>>>>>
>>>>> OCFS2_LOCK_BUSY won't be cleared if DLM_CANCELGRANT is set in
>>>>> o2dlm_unlock_ast_wrapper, and that's what I'm concerned about:
>>>>
>>>> But we already *ascertain* that previous locking request has been *granted* before deciding to return DLM_CANCELGRANT during cancellation to o2dlm_unlock_ast_wrapper().
>>>>
>>>> If above condition stands, o2dlm_lock_ast_wrapper() must will be or have been called, which also clears OCFS2_LOCK_BUSY.
>>>>
>>>
>>> 1. Node1 already has PR lock, and wants to get ex.
>> Well, a locking up-conversion procedure.
>>
>>> 2. Node1 receive BAST and do unlock, here OCFS2_LOCK_BUSY is set.
>> Because there are two concurrent up-conversion, which conflict, so one of them must be canceled!
>>
>>> 3. Node1 can not receive the AST for unlock as master dead.
>> So here you mean the lock can't be granted.
>>
>>> 4. Then o2dlm_unlock_ast_wrapper will be called rather than o2dlm_lock_ast_wrapper.
>> Then the cancellation succeeds as the master dies.
>>
>>> 5. Actually the *granted* lock request has nothing to do with OCFS2_LOCK_BUSY.
>> Yes, o2dlm_lock_ast_wrapper will not clear OCFS2_LOCK_BUSY.
>>
>> But my suggestion was not against above timing sequence.
>> Did you misunderstand my suggestion?
>> And the original logic of Jian's patch also returns DLM_CANCELGRANT.
> 
> Yes, Jian's last patch can't solve the problem either, and I think we
> should find another solution for it. I'm considering deleting the check
> for DLM_CANCELGRANT, and clear OCFS2_LOCK_BUSY in the following process.

If Jian's patch can't fix the issue either, I am going to ask Andrew to drop this patch.
Hopefully, Jian or you can help post another patch for it. Then we can have another round of discussion.

Thanks,
Changwei

> 
> Thanks,
> Jun
> 
>>
>> Thanks,
>> Changwei
>>
>>
>>    
>>
>> .
>>
>
piaojun Feb. 22, 2019, 3:34 a.m. UTC | #24
Hi Changwei,

On 2019/2/22 11:32, Changwei Ge wrote:
> Hi Jun,
> 
> On 2019/2/22 11:16, piaojun wrote:
>> Hi Changwei,
>>
>> On 2019/2/21 14:46, Changwei Ge wrote:
>>> Hi jun
>>> Good afternoon.
>>>
>>>>>>> If AST doesn't manage to get back to requested node, why must flag OCFS2_LOCK_BUSY be cleared in o2dlm_unlock_ast_wrapper?
>>>>>>>
>>>>>>> Yes, OCFS2_LOCK_BUSY can be cleared it either o2dlm_unlock_ast_wrapper() or o2dlm_lock_ast_wrapper() with o2cb stack applied.
>>>>>>>
>>>>>>> If we return DLM_CANCELGRANT from ocfs2/dlm to dlm, then we must know that AST has ever come back or master node has moved the lock to grant list itself, thus we clear flag OCFS2_LOCK_BUSY in o2dlm_lock_ast_wrapper().
>>>>>>> Otherwise we ascertain that we can stop current ongoing locking procedure and must clear OCFS2_LOCK_BUSY in o2dlm_unlock_ast_wrapper() (*synchronized*).
>>>>>>>
>>>>>>> Let's summarize this, OCFS2_LOCK_BUSY should be cleared whether by locking success or cancellation success.
>>>>>>>
>>>>>>> And my way already check if the lock is granted then return DLM_CANCELGRANT or not.
>>>>>>>
>>>>>>
>>>>>> OCFS2_LOCK_BUSY won't be cleared if DLM_CANCELGRANT is set in
>>>>>> o2dlm_unlock_ast_wrapper, and that's what I'm concerned about:
>>>>>
>>>>> But we already *ascertain* that previous locking request has been *granted* before deciding to return DLM_CANCELGRANT during cancellation to o2dlm_unlock_ast_wrapper().
>>>>>
>>>>> If above condition stands, o2dlm_lock_ast_wrapper() must will be or have been called, which also clears OCFS2_LOCK_BUSY.
>>>>>
>>>>
>>>> 1. Node1 already has PR lock, and wants to get ex.
>>> Well, a locking up-conversion procedure.
>>>
>>>> 2. Node1 receive BAST and do unlock, here OCFS2_LOCK_BUSY is set.
>>> Because there are two concurrent up-conversion, which conflict, so one of them must be canceled!
>>>
>>>> 3. Node1 can not receive the AST for unlock as master dead.
>>> So here you mean the lock can't be granted.
>>>
>>>> 4. Then o2dlm_unlock_ast_wrapper will be called rather than o2dlm_lock_ast_wrapper.
>>> Then the cancellation succeeds as the master dies.
>>>
>>>> 5. Actually the *granted* lock request has nothing to do with OCFS2_LOCK_BUSY.
>>> Yes, o2dlm_lock_ast_wrapper will not clear OCFS2_LOCK_BUSY.
>>>
>>> But my suggestion was not against above timing sequence.
>>> Did you misunderstand my suggestion?
>>> And the original logic of Jian's patch also returns DLM_CANCELGRANT.
>>
>> Yes, Jian's last patch can't solve the problem either, and I think we
>> should find another solution for it. I'm considering deleting the check
>> for DLM_CANCELGRANT, and clear OCFS2_LOCK_BUSY in the following process.
> 
> If Jian's patch can't fix the issue either, I am going to ask Andrew to drop this patch.
> Hopefully, Jian or you can help post another patch for it. Then we can have another round of discussion.

Agreed!

> 
> Thanks,
> Changwei
> 
>>
>> Thanks,
>> Jun
>>
>>>
>>> Thanks,
>>> Changwei
>>>
>>>
>>>    
>>>
>>> .
>>>
>>
> .
>

Patch
diff mbox series

diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm/dlmrecovery.c
index 802636d..7489652 100644
--- a/fs/ocfs2/dlm/dlmrecovery.c
+++ b/fs/ocfs2/dlm/dlmrecovery.c
@@ -2134,7 +2134,6 @@  void dlm_move_lockres_to_recovery_list(struct dlm_ctxt *dlm,
  				 * if this had completed successfully
  				 * before sending this lock state to the
  				 * new master */
-				BUG_ON(i != DLM_CONVERTING_LIST);
  				mlog(0, "node died with cancel pending "
  				     "on %.*s. move back to granted list.\n",
  				     res->lockname.len, res->lockname.name);
diff --git a/fs/ocfs2/dlm/dlmunlock.c b/fs/ocfs2/dlm/dlmunlock.c
index 63d701c..505bb6c 100644
--- a/fs/ocfs2/dlm/dlmunlock.c
+++ b/fs/ocfs2/dlm/dlmunlock.c
@@ -183,6 +183,11 @@  static enum dlm_status dlmunlock_common(struct dlm_ctxt *dlm,
  							flags, owner);
  		spin_lock(&res->spinlock);
  		spin_lock(&lock->spinlock);
+
+		if ((flags & LKM_CANCEL) &&
+				dlm_lock_on_list(&res->granted, lock))
+			status = DLM_CANCELGRANT;
+
  		/* if the master told us the lock was already granted,
  		 * let the ast handle all of these actions */
  		if (status == DLM_CANCELGRANT) {
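The hunk added to dlmunlock_common() can be modeled in isolation as follows (a minimal sketch; `model_unlock_common`, the list enum, and the constant values are stand-ins, not the kernel definitions): if the cancel finds the lock already on the granted list after re-taking the spinlocks, the status becomes DLM_CANCELGRANT, so the existing "master told us the lock was already granted" branch handles it instead of the recovery-time BUG_ON firing later.

```c
#include <assert.h>

#define DLM_NORMAL      0
#define DLM_CANCELGRANT 1
#define LKM_CANCEL      0x04	/* placeholder flag value, not the kernel's */

enum model_list { GRANTED_LIST, CONVERTING_LIST };

struct model_lock {
	enum model_list on_list;
};

/* Models only the added hunk: a cancel that finds the lock already
 * granted reports DLM_CANCELGRANT; any other combination keeps the
 * status unchanged. */
static int model_unlock_common(const struct model_lock *lock, int flags)
{
	int status = DLM_NORMAL;

	if ((flags & LKM_CANCEL) && lock->on_list == GRANTED_LIST)
		status = DLM_CANCELGRANT;

	return status;
}
```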