
ocfs2/dlm: Clean DLM_LKSB_GET_LVB and DLM_LKSB_PUT_LVB when the cancel_pending is set

Message ID bffbe5a5-acb6-652d-eb8a-99fb051e6631@huawei.com (mailing list archive)
State New, archived
Series ocfs2/dlm: Clean DLM_LKSB_GET_LVB and DLM_LKSB_PUT_LVB when the cancel_pending is set

Commit Message

wangjian Dec. 3, 2018, 12:06 p.m. UTC
Function dlm_move_lockres_to_recovery_list should clear
DLM_LKSB_GET_LVB and DLM_LKSB_PUT_LVB when cancel_pending
is set. Otherwise the node may panic in dlm_proxy_ast_handler.

Here is the situation: At the beginning, Node1 is the master
of the lock resource and has NL lock, Node2 has PR lock,
Node3 has PR lock, Node4 has NL lock.

Node1        Node2            Node3            Node4
              convert lock_2 from
              PR to EX.

the mode of lock_3 is
PR, which blocks the
conversion request of
Node2. move lock_2 to
conversion list.

                           convert lock_3 from
                           PR to EX.

move lock_3 to conversion
list. send BAST to Node3.

                           receive BAST from Node1.
                           the downconvert thread executes
                           the convert cancellation.

Node2 dies because
the host is powered down.

                           in the dlmunlock_common function,
                           the downconvert thread sets
                           cancel_pending. At the same
                           time, Node3 realizes that
                           Node1 is dead, so it moves lock_3
                           back to the granted list in the
                           dlm_move_lockres_to_recovery_list
                           function and removes Node1 from
                           the domain_map in the
                           __dlm_hb_node_down function.
                           The downconvert thread then fails
                           to send the lock cancellation
                           request to Node1 and gets
                           DLM_NORMAL back from the
                           dlm_send_remote_unlock_request
                           function.

                                                  becomes the recovery master.

              during the recovery
              process, send
              lock_2 that is
              converting from
              PR to EX to Node4.

                           during the recovery process,
                           send lock_3 from the granted list,
                           still carrying the DLM_LKSB_GET_LVB
                           flag, to Node4. Then the downconvert
                           thread clears the DLM_LKSB_GET_LVB
                           flag in the dlmunlock_common function.

                                                  Node4 finishes recovery.
                                                  the mode of lock_3 is
                                                  PR, which blocks the
                                                  conversion request of
                                                  Node2, so Node4 sends
                                                  a BAST to Node3.

                           receive BAST from Node4.
                           convert lock_3 from PR to NL.

                                                 change the mode of lock_3
                                                 from PR to NL and send
                                                 message to Node3.

                           receive the message from
                           Node4. The message contains the
                           LKM_GET_LVB flag, but
                           lock->lksb->flags no longer
                           contains DLM_LKSB_GET_LVB, so the
                           BUG_ON in dlm_proxy_ast_handler
                           fires.
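
For context, the recovery path referenced in the diagram handles a pending cancel roughly as sketched below (paraphrased from dlm_move_lockres_to_recovery_list() in dlmrecovery.c, not a verbatim quote); dlm_commit_pending_cancel() is where this patch adds the flag clearing:

'''
	/* paraphrased sketch, not verbatim kernel code */
	list_for_each_entry_safe(lock, next, queue, list) {
		if (lock->convert_pending) {
			/* ... roll the in-flight convert back ... */
		} else if (lock->unlock_pending) {
			/* ... let the unlock path finish up ... */
		} else if (lock->cancel_pending) {
			/* the cancel never reached the dead master:
			 * treat it as committed and put the lock
			 * back on the granted list */
			dlm_commit_pending_cancel(res, lock);
		}
	}
'''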

Signed-off-by: Jian Wang <wangjian161@huawei.com>
Reviewed-by: Yiwen Jiang <jiangyiwen@huawei.com>
---
  fs/ocfs2/dlm/dlmunlock.c | 1 +
  1 file changed, 1 insertion(+)
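
For readers who do not want to dig the hunk out of the quoted copy below, the whole change is a single line; the resulting helper looks like this (reconstructed from the diff, so treat it as a sketch of the change rather than the applied tree):

'''
void dlm_commit_pending_cancel(struct dlm_lock_resource *res,
			       struct dlm_lock *lock)
{
	/* the cancel is being committed: put the lock back on the
	 * granted list and forget the in-flight conversion */
	list_move_tail(&lock->list, &res->granted);
	lock->ml.convert_type = LKM_IVMODE;
	/* new: drop any pending LVB request so the local lksb matches
	 * what the (possibly new) master will report in later ASTs */
	lock->lksb->flags &= ~(DLM_LKSB_GET_LVB|DLM_LKSB_PUT_LVB);
}
'''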

Comments

Changwei Ge Dec. 5, 2018, 2:01 a.m. UTC | #1
Hi Jian,

Could you please also share the panic backtrace from dlm_proxy_ast_handler()?
After that I can help review this patch and analyze what's wrong in the DLM.

It would also be better to explain your intention, i.e. why the LVB flags should be cleared,
rather than just giving a long time-sequence diagram.

Moreover, your patches are hard to apply to my tree :(

Thanks,
Changwei


On 2018/12/3 20:08, wangjian wrote:
> [commit message and sequence diagram trimmed; see the original above]
> 
> Signed-off-by: Jian Wang<wangjian161@huawei.com>
> Reviewed-by: Yiwen Jiang<jiangyiwen@huawei.com>
> ---
>   fs/ocfs2/dlm/dlmunlock.c | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/fs/ocfs2/dlm/dlmunlock.c b/fs/ocfs2/dlm/dlmunlock.c
> index 63d701c..6e04fc7 100644
> --- a/fs/ocfs2/dlm/dlmunlock.c
> +++ b/fs/ocfs2/dlm/dlmunlock.c
> @@ -277,6 +277,7 @@ void dlm_commit_pending_cancel(struct dlm_lock_resource *res,
>   {
>   	list_move_tail(&lock->list, &res->granted);
>   	lock->ml.convert_type = LKM_IVMODE;
> +	lock->lksb->flags &= ~(DLM_LKSB_GET_LVB|DLM_LKSB_PUT_LVB);
>   }
>   
>   
> -- 
> 1.8.3.1
>
wangjian Dec. 6, 2018, 11:53 a.m. UTC | #2
Hi Changwei,

Here is the panic information from the BUG in the dlm_proxy_ast_handler function:

[  699.795843] kernel BUG at /home/Euler_compile_env/usr/src/linux-4.18/fs/ocfs2/dlm/dlmast.c:427!
[  699.797525] invalid opcode: 0000 [#1] SMP NOPTI
[  699.798383] CPU: 8 PID: 510 Comm: kworker/u24:1 Kdump: loaded Tainted: G           OE     4.18.0 #1
[  699.800002] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 09/30/2014
[  699.801963] Workqueue: o2net o2net_rx_until_empty [ocfs2_nodemanager]
[  699.803275] RIP: 0010:dlm_proxy_ast_handler+0x738/0x740 [ocfs2_dlm]
[  699.804710] Code: 00 10 48 8d 7c 24 48 48 89 44 24 48 48 c7 c1 f1 35 92 c0 ba 30 01 00 00 48 c7 c6 30 a9 91 c0 31 c0 e8
ac 88 fb ff 0f 0b 0f 0b <0f> 0b 66 0f 1f 44 00 00 0f 1f 44 00 00 55 48 89 e5 41 57 45 89 c7
[  699.808506] RSP: 0018:ffffba64c6f2fd38 EFLAGS: 00010246
[  699.809456] RAX: ffff9f34a9b39148 RBX: ffff9f30b7af4000 RCX: ffff9f34a9b39148
[  699.810698] RDX: 000000000000019e RSI: ffffffffc091a930 RDI: ffffba64c6f2fd80
[  699.811927] RBP: ffff9f2cb7aa3000 R08: ffff9f2cb7b99400 R09: 000000000000001f
[  699.813457] R10: ffff9f34a9249200 R11: ffff9f34af23aa00 R12: 0000000040000000
[  699.814719] R13: ffff9f34a9249210 R14: 0000000000000002 R15: ffff9f34af23aa28
[  699.815984] FS:  0000000000000000(0000) GS:ffff9f32b7c00000(0000) knlGS:0000000000000000
[  699.817417] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  699.818825] CR2: 00007fd772f5a140 CR3: 000000005b00a001 CR4: 00000000001606e0
[  699.820123] Call Trace:
[  699.820658]  o2net_rx_until_empty+0x94b/0xcc0 [ocfs2_nodemanager]
[  699.821848]  process_one_work+0x171/0x370
[  699.822595]  worker_thread+0x49/0x3f0
[  699.823301]  kthread+0xf8/0x130
[  699.823972]  ? max_active_store+0x80/0x80
[  699.824881]  ? kthread_bind+0x10/0x10
[  699.825589]  ret_from_fork+0x35/0x40

The reasons for clearing the LVB flags are as follows. First, the owner of the lock resource
may have died and the lock has already been moved back to the granted queue, so the purpose of
the lock cancellation has been achieved and the LVB flags should be cleared. Second, it fixes this panic.

Thanks,
Jian

On 12/5/2018 10:01 AM, Changwei Ge wrote:
> Hi Jian,
>
> Could you please also share the panic backtrace from dlm_proxy_ast_handler()?
> After that I can help review this patch and analyze what's wrong in the DLM.
>
> It would also be better to explain your intention, i.e. why the LVB flags should be cleared,
> rather than just giving a long time-sequence diagram.
>
> Moreover, your patches are hard to apply to my tree :(
>
> Thanks,
> Changwei
>
>
> On 2018/12/3 20:08, wangjian wrote:
>> [quoted patch trimmed]
Changwei Ge Dec. 7, 2018, 3:40 a.m. UTC | #3
Hi Jian,

Um...
I am a little puzzled after delving into this patch.

Do you mean the BUG check below?
'''
425                 /* if we requested the lvb, fetch it into our lksb now */
426                 if (flags & LKM_GET_LVB) {
427                         BUG_ON(!(lock->lksb->flags & DLM_LKSB_GET_LVB));
428                         memcpy(lock->lksb->lvb, past->lvb, DLM_LVB_LEN);
429                 }

'''

If so, you clear DLM_LKSB_GET_LVB in dlm_commit_pending_cancel(), so how could the LVB bit still be set when dlm_proxy_ast_handler() runs?

Thanks,
Changwei


On 2018/12/6 19:54, wangjian wrote:
> Hi Changwei,
> 
> Here is the panic information from the BUG in the dlm_proxy_ast_handler function:
> 
> [  699.795843] kernel BUG at /home/Euler_compile_env/usr/src/linux-4.18/fs/ocfs2/dlm/dlmast.c:427!
> [  699.797525] invalid opcode: 0000 [#1] SMP NOPTI
> [  699.798383] CPU: 8 PID: 510 Comm: kworker/u24:1 Kdump: loaded Tainted: G           OE     4.18.0 #1
> [  699.800002] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 09/30/2014
> [  699.801963] Workqueue: o2net o2net_rx_until_empty [ocfs2_nodemanager]
> [  699.803275] RIP: 0010:dlm_proxy_ast_handler+0x738/0x740 [ocfs2_dlm]
> [  699.804710] Code: 00 10 48 8d 7c 24 48 48 89 44 24 48 48 c7 c1 f1 35 92 c0 ba 30 01 00 00 48 c7 c6 30 a9 91 c0 31 c0 e8
> ac 88 fb ff 0f 0b 0f 0b <0f> 0b 66 0f 1f 44 00 00 0f 1f 44 00 00 55 48 89 e5 41 57 45 89 c7
> [  699.808506] RSP: 0018:ffffba64c6f2fd38 EFLAGS: 00010246
> [  699.809456] RAX: ffff9f34a9b39148 RBX: ffff9f30b7af4000 RCX: ffff9f34a9b39148
> [  699.810698] RDX: 000000000000019e RSI: ffffffffc091a930 RDI: ffffba64c6f2fd80
> [  699.811927] RBP: ffff9f2cb7aa3000 R08: ffff9f2cb7b99400 R09: 000000000000001f
> [  699.813457] R10: ffff9f34a9249200 R11: ffff9f34af23aa00 R12: 0000000040000000
> [  699.814719] R13: ffff9f34a9249210 R14: 0000000000000002 R15: ffff9f34af23aa28
> [  699.815984] FS:  0000000000000000(0000) GS:ffff9f32b7c00000(0000) knlGS:0000000000000000
> [  699.817417] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  699.818825] CR2: 00007fd772f5a140 CR3: 000000005b00a001 CR4: 00000000001606e0
> [  699.820123] Call Trace:
> [  699.820658]  o2net_rx_until_empty+0x94b/0xcc0 [ocfs2_nodemanager]
> [  699.821848]  process_one_work+0x171/0x370
> [  699.822595]  worker_thread+0x49/0x3f0
> [  699.823301]  kthread+0xf8/0x130
> [  699.823972]  ? max_active_store+0x80/0x80
> [  699.824881]  ? kthread_bind+0x10/0x10
> [  699.825589]  ret_from_fork+0x35/0x40
> 
> The reasons for clearing the LVB flags are as follows. First, the owner of the lock resource
> may have died and the lock has already been moved back to the granted queue, so the purpose of
> the lock cancellation has been achieved and the LVB flags should be cleared. Second, it fixes this panic.
> 
> Thanks,
> Jian
> 
> On 12/5/2018 10:01 AM, Changwei Ge wrote:
>> [earlier quoted messages and patch trimmed]
wangjian Dec. 8, 2018, 7:40 a.m. UTC | #4
Hi Changwei,

By "setting the LVB bit", do you mean the memcpy at line 428?
If so, note that under the scenario and the fix I described,
the if statement at line 426 evaluates to false the last time the AST is processed.
This is indeed a rather complicated abnormal scenario,
and you may not have fully followed it,
so let me describe its key points again.

The master of the lock resource has died. Because Node3 is in the convert-cancel path,
Node3 moves the lock to the granted queue in the dlm_move_lockres_to_recovery_list
function (DLM_LKSB_GET_LVB and DLM_LKSB_PUT_LVB are not cleared there).
The lock information is sent to the new master (Node4) during the recovery process.
Only then does the convert-cancel path clear DLM_LKSB_GET_LVB and DLM_LKSB_PUT_LVB.
The lock_3 state on the new master (Node4) is therefore inconsistent with
the local lock_3 state on Node3, causing a BUG when Node3 receives the next AST.
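
If memory serves, the window exists because the recovery code copies the lksb flags into the lock entry it ships to the new master; roughly (paraphrased from dlm_add_lock_to_array() in dlmrecovery.c, names and details approximate):

'''
	/* approximate sketch of how a lock is packed for the new master */
	ml->type = lock->ml.type;
	ml->convert_type = lock->ml.convert_type;
	ml->list = queue;
	if (lock->lksb) {
		/* DLM_LKSB_GET_LVB is still set here if the downconvert
		 * thread has not finished the failed cancel yet */
		ml->flags = lock->lksb->flags;
	}
'''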

Thanks,
Jian

On 12/7/2018 11:40 AM, Changwei Ge wrote:
> Hi Jian,
>
> Um...
> I am a little puzzled after delving into this patch.
>
> Do you mean the BUG check below?
> '''
> 425                 /* if we requested the lvb, fetch it into our lksb now */
> 426                 if (flags & LKM_GET_LVB) {
> 427                         BUG_ON(!(lock->lksb->flags & DLM_LKSB_GET_LVB));
> 428                         memcpy(lock->lksb->lvb, past->lvb, DLM_LVB_LEN);
> 429                 }
>
> '''
>
> If so, you clear DLM_LKSB_GET_LVB in dlm_commit_pending_cancel(), so how could the LVB bit still be set when dlm_proxy_ast_handler() runs?
>
> Thanks,
> Changwei
>
>
> On 2018/12/6 19:54, wangjian wrote:
>> Hi Changwei,
>>
>> The core information that causes the bug in the dlm_proxy_ast_handler function is as follows.
>>
>> [  699.795843] kernel BUG at /home/Euler_compile_env/usr/src/linux-4.18/fs/ocfs2/dlm/dlmast.c:427!
>> [  699.797525] invalid opcode: 0000 [#1] SMP NOPTI
>> [  699.798383] CPU: 8 PID: 510 Comm: kworker/u24:1 Kdump: loaded Tainted: G           OE     4.18.0 #1
>> [  699.800002] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 09/30/2014
>> [  699.801963] Workqueue: o2net o2net_rx_until_empty [ocfs2_nodemanager]
>> [  699.803275] RIP: 0010:dlm_proxy_ast_handler+0x738/0x740 [ocfs2_dlm]
>> [  699.804710] Code: 00 10 48 8d 7c 24 48 48 89 44 24 48 48 c7 c1 f1 35 92 c0 ba 30 01 00 00 48 c7 c6 30 a9 91 c0 31 c0 e8
>> ac 88 fb ff 0f 0b 0f 0b <0f> 0b 66 0f 1f 44 00 00 0f 1f 44 00 00 55 48 89 e5 41 57 45 89 c7
>> [  699.808506] RSP: 0018:ffffba64c6f2fd38 EFLAGS: 00010246
>> [  699.809456] RAX: ffff9f34a9b39148 RBX: ffff9f30b7af4000 RCX: ffff9f34a9b39148
>> [  699.810698] RDX: 000000000000019e RSI: ffffffffc091a930 RDI: ffffba64c6f2fd80
>> [  699.811927] RBP: ffff9f2cb7aa3000 R08: ffff9f2cb7b99400 R09: 000000000000001f
>> [  699.813457] R10: ffff9f34a9249200 R11: ffff9f34af23aa00 R12: 0000000040000000
>> [  699.814719] R13: ffff9f34a9249210 R14: 0000000000000002 R15: ffff9f34af23aa28
>> [  699.815984] FS:  0000000000000000(0000) GS:ffff9f32b7c00000(0000) knlGS:0000000000000000
>> [  699.817417] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [  699.818825] CR2: 00007fd772f5a140 CR3: 000000005b00a001 CR4: 00000000001606e0
>> [  699.820123] Call Trace:
>> [  699.820658]  o2net_rx_until_empty+0x94b/0xcc0 [ocfs2_nodemanager]
>> [  699.821848]  process_one_work+0x171/0x370
>> [  699.822595]  worker_thread+0x49/0x3f0
>> [  699.823301]  kthread+0xf8/0x130
>> [  699.823972]  ? max_active_store+0x80/0x80
>> [  699.824881]  ? kthread_bind+0x10/0x10
>> [  699.825589]  ret_from_fork+0x35/0x40
>>
>> The reasons for clearing the LVB flag are as follows. First, The owner of the lock resource
>> may have died, the lock has been moved to the grant queue, the purpose of the lock
>> cancellation has been reached, and the LVB flag should be cleared. Second, solve this panic problem.
>>
>> Thanks,
>> Jian
>>
>> On 12/5/2018 10:01 AM, Changwei Ge wrote:
>>> Hi Jian,
>>>
>>> Could your please also share the panic backtrace in dlm_proxy_ast_handler()?
>>> After that I can help to review this patch and analyze what's wrong in DLM.
>>>
>>> It will be better for you to tell your intention why LVB flags should be cleared rather
>>> than just giving a longggg time sequence diagram.
>>>
>>> Moreover, your patches are hard be applied to my tree :(
>>>
>>> Thanks,
>>> Changwei
>>>
>>>
>>> On 2018/12/3 20:08, wangjian wrote:
>>>> Function dlm_move_lockres_to_recovery_list should clean
>>>> DLM_LKSB_GET_LVB and DLM_LKSB_PUT_LVB when the cancel_pending
>>>> is set. Otherwise node may panic in dlm_proxy_ast_handler.
>>>>
>>>> Here is the situation: At the beginning, Node1 is the master
>>>> of the lock resource and has NL lock, Node2 has PR lock,
>>>> Node3 has PR lock, Node4 has NL lock.
>>>>
>>>> Node1        Node2            Node3            Node4
>>>>                 convert lock_2 from
>>>>                 PR to EX.
>>>>
>>>> the mode of lock_3 is
>>>> PR, which blocks the
>>>> conversion request of
>>>> Node2. move lock_2 to
>>>> conversion list.
>>>>
>>>>                              convert lock_3 from
>>>>                              PR to EX.
>>>>
>>>> move lock_3 to conversion
>>>> list. send BAST to Node3.
>>>>
>>>>                              receive BAST from Node1.
>>>>                              downconvert thread execute
>>>>                              canceling convert operation.
>>>>
>>>> Node2 dies because
>>>> the host is powered down.
>>>>
>>>>                              in dlmunlock_common function,
>>>>                              the downconvert thread set
>>>>                              cancel_pending. at the same
>>>>                              time, Node 3 realized that
>>>>                              Node 1 is dead, so move lock_3
>>>>                              back to granted list in
>>>>                              dlm_move_lockres_to_recovery_list
>>>>                              function and remove Node 1 from
>>>>                              the domain_map in
>>>>                              __dlm_hb_node_down function.
>>>>                              then downconvert thread failed
>>>>                              to send the lock cancellation
>>>>                              request to Node1 and return
>>>>                              DLM_NORMAL from
>>>>                              dlm_send_remote_unlock_request
>>>>                              function.
>>>>
>>>>                                                    become recovery master.
>>>>
>>>>                 during the recovery
>>>>                 process, send
>>>>                 lock_2 that is
>>>>                 converting form
>>>>                 PR to EX to Node4.
>>>>
>>>>                              during the recovery process,
>>>>                              send lock_3 in the granted list and
>>>>                              cantain the DLM_LKSB_GET_LVB
>>>>                              flag to Node4. Then downconvert thread
>>>>                              delete DLM_LKSB_GET_LVB flag in
>>>>                              dlmunlock_common function.
>>>>
>>>>                                                    Node4 finish recovery.
>>>>                                                    the mode of lock_3 is
>>>>                                                    PR, which blocks the
>>>>                                                    conversion request of
>>>>                                                    Node2, so send BAST
>>>>                                                    to Node3.
>>>>
>>>>                              receive BAST from Node4.
>>>>                              convert lock_3 from PR to NL.
>>>>
>>>>                                                    change the mode of lock_3
>>>>                                                    from PR to NL and send
>>>>                                                    message to Node3.
>>>>
>>>>                              receive message from
>>>>                              Node4. The message contain
>>>>                              LKM_GET_LVB flag, but the
>>>>                              lock->lksb->flags does not
>>>>                              contain DLM_LKSB_GET_LVB,
>>>>                              BUG_ON in dlm_proxy_ast_handler
>>>>                              function.
>>>>
>>>> Signed-off-by: Jian Wang<wangjian161@huawei.com>
>>>> Reviewed-by: Yiwen Jiang<jiangyiwen@huawei.com>
>>>> ---
>>>>     fs/ocfs2/dlm/dlmunlock.c | 1 +
>>>>     1 file changed, 1 insertion(+)
>>>>
>>>> diff --git a/fs/ocfs2/dlm/dlmunlock.c b/fs/ocfs2/dlm/dlmunlock.c
>>>> index 63d701c..6e04fc7 100644
>>>> --- a/fs/ocfs2/dlm/dlmunlock.c
>>>> +++ b/fs/ocfs2/dlm/dlmunlock.c
>>>> @@ -277,6 +277,7 @@ void dlm_commit_pending_cancel(struct dlm_lock_resource *res,
>>>>     {
>>>>     	list_move_tail(&lock->list, &res->granted);
>>>>     	lock->ml.convert_type = LKM_IVMODE;
>>>> +	lock->lksb->flags &= ~(DLM_LKSB_GET_LVB|DLM_LKSB_PUT_LVB);
>>>>     }
>>>>     
>>>>     
>>>> -- 
>>>> 1.8.3.1
>>>>
>>> .
>>>
> .
>
Changwei Ge Dec. 8, 2018, 8:14 a.m. UTC | #5
On 2018/12/8 15:40, wangjian wrote:
> Hi Changwei,
> 
> By "setting the LVB bit", do you mean the memcpy at line 428?
> If so, note that under the scenario and the fix I described,
> the if statement at line 426 evaluates to false the last time the AST is processed.
> This is indeed a rather complicated abnormal scenario,
> and you may not have fully followed it,
> so let me describe its key points again.
> 
> The master of the lock resource has died. Because Node3 is in the convert-cancel path,
> Node3 moves the lock to the granted queue in the dlm_move_lockres_to_recovery_list
> function (DLM_LKSB_GET_LVB and DLM_LKSB_PUT_LVB are not cleared there).
> The lock information is sent to the new master (Node4) during the recovery process.
> Only then does the convert-cancel path clear DLM_LKSB_GET_LVB and DLM_LKSB_PUT_LVB.
> The lock_3 state on the new master (Node4) is therefore inconsistent with
> the local lock_3 state on Node3, causing a BUG when Node3 receives the next AST.

Aha... now I get your point and see what the problem looks like.

So, because the old master died, Node3 thinks that the conversion cancellation succeeded and thus clears _DLM_LKSB_PUT_LVB_ and _DLM_LKSB_GET_LVB_ in dlmunlock_common().
But before that, the dlm recovery procedure had already sent the lock's lksb to the *new* master, right? Um, so we have a very fast dlm recovery here. :-)
I admit that it is possible.

So actually, we can judge whether the conversion cancellation succeeded or not, as in your other patch, right?
If we already know the conversion cancellation failed, why do we still clear _DLM_LKSB_PUT_LVB_ and _DLM_LKSB_GET_LVB_?

And IMO, in your case, sending _DLM_LKSB_PUT_LVB_ and _DLM_LKSB_GET_LVB_ to the *new* master makes sense, as we can still ask the new master to transfer the LVB back to boost performance.

So I would prefer to merge your two patches addressing these problems, as I think they have the same root cause.

Thanks,
Changwei


> 
> Thanks,
> Jian
> 
> On 12/7/2018 11:40 AM, Changwei Ge wrote:
>> Hi Jian,
>>
>> Um...
>> I am a little puzzled after delving into this patch.
>>
>> Do you mean the BUG check below?
>> '''
>> 425                 /* if we requested the lvb, fetch it into our lksb now */
>> 426                 if (flags & LKM_GET_LVB) {
>> 427                         BUG_ON(!(lock->lksb->flags & DLM_LKSB_GET_LVB));
>> 428                         memcpy(lock->lksb->lvb, past->lvb, DLM_LVB_LEN);
>> 429                 }
>>
>> '''
>>
>> If so, you clear DLM_LKSB_GET_LVB in dlm_commit_pending_cancel() and how could the LVB bit be set in dlm_proxy_ast_handler()?
>>
>> Thanks,
>> Changwei
>>
>>
>> On 2018/12/6 19:54, wangjian wrote:
>>> Hi Changwei,
>>>
>>> The core information that causes the bug in the dlm_proxy_ast_handler function is as follows.
>>>
>>> [  699.795843] kernel BUG at /home/Euler_compile_env/usr/src/linux-4.18/fs/ocfs2/dlm/dlmast.c:427!
>>> [  699.797525] invalid opcode: 0000 [#1] SMP NOPTI
>>> [  699.798383] CPU: 8 PID: 510 Comm: kworker/u24:1 Kdump: loaded Tainted: G           OE     4.18.0 #1
>>> [  699.800002] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 09/30/2014
>>> [  699.801963] Workqueue: o2net o2net_rx_until_empty [ocfs2_nodemanager]
>>> [  699.803275] RIP: 0010:dlm_proxy_ast_handler+0x738/0x740 [ocfs2_dlm]
>>> [  699.804710] Code: 00 10 48 8d 7c 24 48 48 89 44 24 48 48 c7 c1 f1 35 92 c0 ba 30 01 00 00 48 c7 c6 30 a9 91 c0 31 c0 e8
>>> ac 88 fb ff 0f 0b 0f 0b <0f> 0b 66 0f 1f 44 00 00 0f 1f 44 00 00 55 48 89 e5 41 57 45 89 c7
>>> [  699.808506] RSP: 0018:ffffba64c6f2fd38 EFLAGS: 00010246
>>> [  699.809456] RAX: ffff9f34a9b39148 RBX: ffff9f30b7af4000 RCX: ffff9f34a9b39148
>>> [  699.810698] RDX: 000000000000019e RSI: ffffffffc091a930 RDI: ffffba64c6f2fd80
>>> [  699.811927] RBP: ffff9f2cb7aa3000 R08: ffff9f2cb7b99400 R09: 000000000000001f
>>> [  699.813457] R10: ffff9f34a9249200 R11: ffff9f34af23aa00 R12: 0000000040000000
>>> [  699.814719] R13: ffff9f34a9249210 R14: 0000000000000002 R15: ffff9f34af23aa28
>>> [  699.815984] FS:  0000000000000000(0000) GS:ffff9f32b7c00000(0000) knlGS:0000000000000000
>>> [  699.817417] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [  699.818825] CR2: 00007fd772f5a140 CR3: 000000005b00a001 CR4: 00000000001606e0
>>> [  699.820123] Call Trace:
>>> [  699.820658]  o2net_rx_until_empty+0x94b/0xcc0 [ocfs2_nodemanager]
>>> [  699.821848]  process_one_work+0x171/0x370
>>> [  699.822595]  worker_thread+0x49/0x3f0
>>> [  699.823301]  kthread+0xf8/0x130
>>> [  699.823972]  ? max_active_store+0x80/0x80
>>> [  699.824881]  ? kthread_bind+0x10/0x10
>>> [  699.825589]  ret_from_fork+0x35/0x40
>>>
>>> The reasons for clearing the LVB flag are as follows. First, The owner of the lock resource
>>> may have died, the lock has been moved to the grant queue, the purpose of the lock
>>> cancellation has been reached, and the LVB flag should be cleared. Second, solve this panic problem.
>>>
>>> Thanks,
>>> Jian
>>>
>>> On 12/5/2018 10:01 AM, Changwei Ge wrote:
>>>> Hi Jian,
>>>>
>>>> Could your please also share the panic backtrace in dlm_proxy_ast_handler()?
>>>> After that I can help to review this patch and analyze what's wrong in DLM.
>>>>
>>>> It will be better for you to tell your intention why LVB flags should be cleared rather
>>>> than just giving a longggg time sequence diagram.
>>>>
>>>> Moreover, your patches are hard be applied to my tree :(
>>>>
>>>> Thanks,
>>>> Changwei
>>>>
>>>>
>>>> On 2018/12/3 20:08, wangjian wrote:
>>>>> Function dlm_move_lockres_to_recovery_list should clean
>>>>> DLM_LKSB_GET_LVB and DLM_LKSB_PUT_LVB when the cancel_pending
>>>>> is set. Otherwise node may panic in dlm_proxy_ast_handler.
>>>>>
>>>>> Here is the situation: At the beginning, Node1 is the master
>>>>> of the lock resource and has NL lock, Node2 has PR lock,
>>>>> Node3 has PR lock, Node4 has NL lock.
>>>>>
>>>>> Node1        Node2            Node3            Node4
>>>>>                 convert lock_2 from
>>>>>                 PR to EX.
>>>>>
>>>>> the mode of lock_3 is
>>>>> PR, which blocks the
>>>>> conversion request of
>>>>> Node2. move lock_2 to
>>>>> conversion list.
>>>>>
>>>>>                              convert lock_3 from
>>>>>                              PR to EX.
>>>>>
>>>>> move lock_3 to conversion
>>>>> list. send BAST to Node3.
>>>>>
>>>>>                              receive BAST from Node1.
>>>>>                              downconvert thread execute
>>>>>                              canceling convert operation.
>>>>>
>>>>> Node2 dies because
>>>>> the host is powered down.
>>>>>
>>>>>                              in dlmunlock_common function,
>>>>>                              the downconvert thread set
>>>>>                              cancel_pending. at the same
>>>>>                              time, Node 3 realized that
>>>>>                              Node 1 is dead, so move lock_3
>>>>>                              back to granted list in
>>>>>                              dlm_move_lockres_to_recovery_list
>>>>>                              function and remove Node 1 from
>>>>>                              the domain_map in
>>>>>                              __dlm_hb_node_down function.
>>>>>                              then downconvert thread failed
>>>>>                              to send the lock cancellation
>>>>>                              request to Node1 and return
>>>>>                              DLM_NORMAL from
>>>>>                              dlm_send_remote_unlock_request
>>>>>                              function.
>>>>>
>>>>>                                                    become recovery master.
>>>>>
>>>>>                 during the recovery
>>>>>                 process, send
>>>>>                 lock_2 that is
>>>>>                 converting from
>>>>>                 PR to EX to Node4.
>>>>>
>>>>>                              during the recovery process,
>>>>>                              send lock_3 in the granted list and
>>>>>                              contain the DLM_LKSB_GET_LVB
>>>>>                              flag to Node4. Then downconvert thread
>>>>>                              delete DLM_LKSB_GET_LVB flag in
>>>>>                              dlmunlock_common function.
>>>>>
>>>>>                                                    Node4 finish recovery.
>>>>>                                                    the mode of lock_3 is
>>>>>                                                    PR, which blocks the
>>>>>                                                    conversion request of
>>>>>                                                    Node2, so send BAST
>>>>>                                                    to Node3.
>>>>>
>>>>>                              receive BAST from Node4.
>>>>>                              convert lock_3 from PR to NL.
>>>>>
>>>>>                                                    change the mode of lock_3
>>>>>                                                    from PR to NL and send
>>>>>                                                    message to Node3.
>>>>>
>>>>>                              receive message from
>>>>>                              Node4. The message contain
>>>>>                              LKM_GET_LVB flag, but the
>>>>>                              lock->lksb->flags does not
>>>>>                              contain DLM_LKSB_GET_LVB,
>>>>>                              BUG_ON in dlm_proxy_ast_handler
>>>>>                              function.
>>>>>
>>>>> Signed-off-by: Jian Wang<wangjian161@huawei.com>
>>>>> Reviewed-by: Yiwen Jiang<jiangyiwen@huawei.com>
>>>>> ---
>>>>>     fs/ocfs2/dlm/dlmunlock.c | 1 +
>>>>>     1 file changed, 1 insertion(+)
>>>>>
>>>>> diff --git a/fs/ocfs2/dlm/dlmunlock.c b/fs/ocfs2/dlm/dlmunlock.c
>>>>> index 63d701c..6e04fc7 100644
>>>>> --- a/fs/ocfs2/dlm/dlmunlock.c
>>>>> +++ b/fs/ocfs2/dlm/dlmunlock.c
>>>>> @@ -277,6 +277,7 @@ void dlm_commit_pending_cancel(struct dlm_lock_resource *res,
>>>>>     {
>>>>>     	list_move_tail(&lock->list, &res->granted);
>>>>>     	lock->ml.convert_type = LKM_IVMODE;
>>>>> +	lock->lksb->flags &= ~(DLM_LKSB_GET_LVB|DLM_LKSB_PUT_LVB);
>>>>>     }
>>>>>     
>>>>>     
>>>>> -- 
>>>>> 1.8.3.1
>>>>>
>>>> .
>>>>
>> .
>>
wangjian Dec. 8, 2018, 10:07 a.m. UTC | #6
Hi Changwei,

What I meant is not that the conversion cancellation itself succeeded. Consider the dlm_send_remote_unlock_request function
called from dlmunlock_common: if the node is already aware that the master of the lock resource is dead at this point,
dlm_send_remote_unlock_request returns DLM_NORMAL, i.e. the caller treats the cancellation as having succeeded.
dlmunlock_common then clears DLM_LKSB_GET_LVB and DLM_LKSB_PUT_LVB after dlm_send_remote_unlock_request returns.
Look at dlm_send_remote_unlock_request itself: when the dlm_is_host_down function returns true and
the dlm_is_node_dead function also returns true, dlm_send_remote_unlock_request returns DLM_NORMAL.
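
To make this concrete, here is a condensed sketch of that error path, paraphrased from
fs/ocfs2/dlm/dlmunlock.c (around v4.18). It is simplified from memory, not a verbatim copy:

'''
	/* Tail of dlm_send_remote_unlock_request(), condensed: if the
	 * network send fails because the master is down, and the master
	 * is already known to be dead, report the cancel as DLM_NORMAL,
	 * i.e. as if it had succeeded. */
	tmpret = o2net_send_message_vec(DLM_UNLOCK_LOCK_MSG, dlm->key,
					vec, veclen, owner, &status);
	if (tmpret >= 0) {
		ret = status;			/* master replied normally */
	} else if (dlm_is_host_down(tmpret)) {
		if (dlm_is_node_dead(dlm, owner))
			ret = DLM_NORMAL;	/* dead master: treated as success */
		else
			ret = DLM_NOLOCKMGR;
	} else {
		ret = dlm_err_to_dlm_status(tmpret);
	}
	return ret;
'''

So from dlmunlock_common's point of view the cancel "succeeded", which is why the LVB
flags end up being cleared even though the cancel request never reached a living master.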

Also, this problem does not have the same cause as the one my other patch solves. In the scenario
that triggers the other bug (dlm/dlmrecovery.c:2137), the dlm_send_remote_unlock_request function has not been called yet.

Thanks,
Jian

On 12/8/2018 4:14 PM, Changwei Ge wrote:
> On 2018/12/8 15:40, wangjian wrote:
>> Hi Changwei,
>>
>> By "setting the LVB bit", do you mean the memcpy at line 428?
>> If so, you should realize that under the scenario I described, with my fix applied,
>> the if statement at line 426 evaluates to false when the last AST is processed.
>> This is indeed a rather complicated anomaly scenario,
>> and you may not have fully understood the scenario I described above.
>> Below I will describe the key points of this scenario again.
>>
>> The master of the lock resource has died. Because Node3 is in the cancel_convert process,
>> Node3 moves the lock to the grant queue in the dlm_move_lockres_to_recovery_list
>> function (the DLM_LKSB_GET_LVB and DLM_LKSB_PUT_LVB are not cleared).
>> The lock information is sent to the new master (Node4) during the recovery process.
>> Then the cancel_convert process clears DLM_LKSB_GET_LVB and DLM_LKSB_PUT_LVB.
>> The lock_3 information on the new master (Node4) is inconsistent with
>> the local lock_3 information on Node3, causing a BUG to occur when Node3 receives the AST.
> Aha... I got your point and now see what the problem looks like.
>
> So as the old master died, Node3 thinks that the conversion cancellation succeeded and thus clears _DLM_LKSB_PUT_LVB_ and _DLM_LKSB_GET_LVB_ in dlmunlock_common().
> But before that, the dlm recovery procedure had already sent the lksb of the lock to the *new* master, right? Um, so we have a very fast dlm recovery here. :-)
> I admit it is possible.
>
> So actually, we can judge whether the conversion cancellation succeeded or not, just like in your other patch, right?
> If we already know we had a failed conversion cancellation, why do we still clear _DLM_LKSB_PUT_LVB_ and _DLM_LKSB_GET_LVB_ ?
>
> And IMO in your case sending _DLM_LKSB_PUT_LVB_ and _DLM_LKSB_GET_LVB_ to the *new* master makes sense, as we can still ask the new master to transfer the LVB back to boost performance.
>
> So I prefer to merge your two patches addressing the problems and I think they have the same root cause.
>
> Thanks,
> Changwei
>
>
>> Thanks,
>> Jian
>>
>> On 12/7/2018 11:40 AM, Changwei Ge wrote:
>>> Hi Jian,
>>>
>>> Um...
>>> I am a little puzzled after delving into this patch.
>>>
>>> Do you mean the BUG check below?
>>> '''
>>> 425                 /* if we requested the lvb, fetch it into our lksb now */
>>> 426                 if (flags & LKM_GET_LVB) {
>>> 427                         BUG_ON(!(lock->lksb->flags & DLM_LKSB_GET_LVB));
>>> 428                         memcpy(lock->lksb->lvb, past->lvb, DLM_LVB_LEN);
>>> 429                 }
>>>
>>> '''
>>>
>>> If so, you clear DLM_LKSB_GET_LVB in dlm_commit_pending_cancel() and how could the LVB bit be set in dlm_proxy_ast_handler()?
>>>
>>> Thanks,
>>> Changwei
>>>
>>>
>>> On 2018/12/6 19:54, wangjian wrote:
>>>> Hi Changwei,
>>>>
>>>> The core information that causes the bug in the dlm_proxy_ast_handler function is as follows.
>>>>
>>>> [  699.795843] kernel BUG at /home/Euler_compile_env/usr/src/linux-4.18/fs/ocfs2/dlm/dlmast.c:427!
>>>> [  699.797525] invalid opcode: 0000 [#1] SMP NOPTI
>>>> [  699.798383] CPU: 8 PID: 510 Comm: kworker/u24:1 Kdump: loaded Tainted: G           OE     4.18.0 #1
>>>> [  699.800002] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 09/30/2014
>>>> [  699.801963] Workqueue: o2net o2net_rx_until_empty [ocfs2_nodemanager]
>>>> [  699.803275] RIP: 0010:dlm_proxy_ast_handler+0x738/0x740 [ocfs2_dlm]
>>>> [  699.804710] Code: 00 10 48 8d 7c 24 48 48 89 44 24 48 48 c7 c1 f1 35 92 c0 ba 30 01 00 00 48 c7 c6 30 a9 91 c0 31 c0 e8
>>>> ac 88 fb ff 0f 0b 0f 0b <0f> 0b 66 0f 1f 44 00 00 0f 1f 44 00 00 55 48 89 e5 41 57 45 89 c7
>>>> [  699.808506] RSP: 0018:ffffba64c6f2fd38 EFLAGS: 00010246
>>>> [  699.809456] RAX: ffff9f34a9b39148 RBX: ffff9f30b7af4000 RCX: ffff9f34a9b39148
>>>> [  699.810698] RDX: 000000000000019e RSI: ffffffffc091a930 RDI: ffffba64c6f2fd80
>>>> [  699.811927] RBP: ffff9f2cb7aa3000 R08: ffff9f2cb7b99400 R09: 000000000000001f
>>>> [  699.813457] R10: ffff9f34a9249200 R11: ffff9f34af23aa00 R12: 0000000040000000
>>>> [  699.814719] R13: ffff9f34a9249210 R14: 0000000000000002 R15: ffff9f34af23aa28
>>>> [  699.815984] FS:  0000000000000000(0000) GS:ffff9f32b7c00000(0000) knlGS:0000000000000000
>>>> [  699.817417] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>> [  699.818825] CR2: 00007fd772f5a140 CR3: 000000005b00a001 CR4: 00000000001606e0
>>>> [  699.820123] Call Trace:
>>>> [  699.820658]  o2net_rx_until_empty+0x94b/0xcc0 [ocfs2_nodemanager]
>>>> [  699.821848]  process_one_work+0x171/0x370
>>>> [  699.822595]  worker_thread+0x49/0x3f0
>>>> [  699.823301]  kthread+0xf8/0x130
>>>> [  699.823972]  ? max_active_store+0x80/0x80
>>>> [  699.824881]  ? kthread_bind+0x10/0x10
>>>> [  699.825589]  ret_from_fork+0x35/0x40
>>>>
>>>> The reasons for clearing the LVB flags are as follows. First, the owner of the lock resource
>>>> may have died and the lock has been moved back to the grant queue, so the purpose of the lock
>>>> cancellation has already been achieved and the LVB flags should be cleared. Second, it fixes this panic.
>>>>
>>>> Thanks,
>>>> Jian
>>>>
>>>> On 12/5/2018 10:01 AM, Changwei Ge wrote:
>>>>> Hi Jian,
>>>>>
>>>>> Could you please also share the panic backtrace in dlm_proxy_ast_handler()?
>>>>> After that I can help to review this patch and analyze what's wrong in DLM.
>>>>>
>>>>> It would be better for you to explain why the LVB flags should be cleared rather
>>>>> than just giving a longggg time sequence diagram.
>>>>>
>>>>> Moreover, your patches are hard to apply to my tree :(
>>>>>
>>>>> Thanks,
>>>>> Changwei
>>>>>
>>>>>
>>>>> On 2018/12/3 20:08, wangjian wrote:
>>>>>> Function dlm_move_lockres_to_recovery_list should clean
>>>>>> DLM_LKSB_GET_LVB and DLM_LKSB_PUT_LVB when the cancel_pending
>>>>>> is set. Otherwise node may panic in dlm_proxy_ast_handler.
>>>>>>
>>>>>> Here is the situation: At the beginning, Node1 is the master
>>>>>> of the lock resource and has NL lock, Node2 has PR lock,
>>>>>> Node3 has PR lock, Node4 has NL lock.
>>>>>>
>>>>>> Node1        Node2            Node3            Node4
>>>>>>                  convert lock_2 from
>>>>>>                  PR to EX.
>>>>>>
>>>>>> the mode of lock_3 is
>>>>>> PR, which blocks the
>>>>>> conversion request of
>>>>>> Node2. move lock_2 to
>>>>>> conversion list.
>>>>>>
>>>>>>                               convert lock_3 from
>>>>>>                               PR to EX.
>>>>>>
>>>>>> move lock_3 to conversion
>>>>>> list. send BAST to Node3.
>>>>>>
>>>>>>                               receive BAST from Node1.
>>>>>>                               downconvert thread execute
>>>>>>                               canceling convert operation.
>>>>>>
>>>>>> Node2 dies because
>>>>>> the host is powered down.
>>>>>>
>>>>>>                               in dlmunlock_common function,
>>>>>>                               the downconvert thread set
>>>>>>                               cancel_pending. at the same
>>>>>>                               time, Node 3 realized that
>>>>>>                               Node 1 is dead, so move lock_3
>>>>>>                               back to granted list in
>>>>>>                               dlm_move_lockres_to_recovery_list
>>>>>>                               function and remove Node 1 from
>>>>>>                               the domain_map in
>>>>>>                               __dlm_hb_node_down function.
>>>>>>                               then downconvert thread failed
>>>>>>                               to send the lock cancellation
>>>>>>                               request to Node1 and return
>>>>>>                               DLM_NORMAL from
>>>>>>                               dlm_send_remote_unlock_request
>>>>>>                               function.
>>>>>>
>>>>>>                                                     become recovery master.
>>>>>>
>>>>>>                  during the recovery
>>>>>>                  process, send
>>>>>>                  lock_2 that is
>>>>>>                  converting from
>>>>>>                  PR to EX to Node4.
>>>>>>
>>>>>>                               during the recovery process,
>>>>>>                               send lock_3 in the granted list and
>>>>>>                               contain the DLM_LKSB_GET_LVB
>>>>>>                               flag to Node4. Then downconvert thread
>>>>>>                               delete DLM_LKSB_GET_LVB flag in
>>>>>>                               dlmunlock_common function.
>>>>>>
>>>>>>                                                     Node4 finish recovery.
>>>>>>                                                     the mode of lock_3 is
>>>>>>                                                     PR, which blocks the
>>>>>>                                                     conversion request of
>>>>>>                                                     Node2, so send BAST
>>>>>>                                                     to Node3.
>>>>>>
>>>>>>                               receive BAST from Node4.
>>>>>>                               convert lock_3 from PR to NL.
>>>>>>
>>>>>>                                                     change the mode of lock_3
>>>>>>                                                     from PR to NL and send
>>>>>>                                                     message to Node3.
>>>>>>
>>>>>>                               receive message from
>>>>>>                               Node4. The message contain
>>>>>>                               LKM_GET_LVB flag, but the
>>>>>>                               lock->lksb->flags does not
>>>>>>                               contain DLM_LKSB_GET_LVB,
>>>>>>                               BUG_ON in dlm_proxy_ast_handler
>>>>>>                               function.
>>>>>>
>>>>>> Signed-off-by: Jian Wang<wangjian161@huawei.com>
>>>>>> Reviewed-by: Yiwen Jiang<jiangyiwen@huawei.com>
>>>>>> ---
>>>>>>      fs/ocfs2/dlm/dlmunlock.c | 1 +
>>>>>>      1 file changed, 1 insertion(+)
>>>>>>
>>>>>> diff --git a/fs/ocfs2/dlm/dlmunlock.c b/fs/ocfs2/dlm/dlmunlock.c
>>>>>> index 63d701c..6e04fc7 100644
>>>>>> --- a/fs/ocfs2/dlm/dlmunlock.c
>>>>>> +++ b/fs/ocfs2/dlm/dlmunlock.c
>>>>>> @@ -277,6 +277,7 @@ void dlm_commit_pending_cancel(struct dlm_lock_resource *res,
>>>>>>      {
>>>>>>      	list_move_tail(&lock->list, &res->granted);
>>>>>>      	lock->ml.convert_type = LKM_IVMODE;
>>>>>> +	lock->lksb->flags &= ~(DLM_LKSB_GET_LVB|DLM_LKSB_PUT_LVB);
>>>>>>      }
>>>>>>      
>>>>>>      
>>>>>> -- 
>>>>>> 1.8.3.1
>>>>>>
>>>>> .
>>>>>
>>> .
>>>
> .
>
diff mbox series

Patch

diff --git a/fs/ocfs2/dlm/dlmunlock.c b/fs/ocfs2/dlm/dlmunlock.c
index 63d701c..6e04fc7 100644
--- a/fs/ocfs2/dlm/dlmunlock.c
+++ b/fs/ocfs2/dlm/dlmunlock.c
@@ -277,6 +277,7 @@  void dlm_commit_pending_cancel(struct dlm_lock_resource *res,
  {
  	list_move_tail(&lock->list, &res->granted);
  	lock->ml.convert_type = LKM_IVMODE;
+	lock->lksb->flags &= ~(DLM_LKSB_GET_LVB|DLM_LKSB_PUT_LVB);
  }