Message ID: 1501459255.1499.71.camel@haakon3.daterainc.com (mailing list archive)
State: New, archived
Hi Nicholas,

On 31.07.2017 at 02:00, Nicholas A. Bellinger wrote:
> Hi Lukasz,
>
> On Thu, 2017-07-27 at 17:23 +0200, Łukasz Engel wrote:
>> Hello
>>
>> I am experiencing problems using LIO FC target with vmware ESX (6.5).
>>
>> This is the same problem as discussed in
>> https://www.spinics.net/lists/target-devel/msg15670.html
>> I've tried 4.11.x (up to 12) and 4.12.3 (and also some kernels from the 4.9
>> line).
[...]
>> Yes, this is most certainly the same bug the patch here addresses:
>>
>> https://www.spinics.net/lists/target-devel/msg15759.html
>>
>> Also, a smaller version of that just went out with your CC' on it here:
>>
>> qla2xxx: Fix incorrect tcm_qla2xxx_free_cmd use during TMR ABORT (v2)
>> https://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending.git/commit/?id=6bcbb3174caa5f1ccc894f8ae077631659d5a629

After applying both patches (for 6bcbb3174caa5f1ccc894f8ae077631659d5a629,
manual adjusting of the second hunk was required for 4.12.x - the context
after the match is slightly different), there is no BUG, but a hang in a
kworker instead, with all I/O from FC frozen:

Jul 31 16:27:14 teststorage [ 42.897291] qla2xxx [0000:06:00.0]-500a:12: LOOP UP detected (4 Gbps).
Jul 31 16:27:16 teststorage [ 42.959068] qla2xxx [0000:09:00.0]-ffff:11: qla24xx_do_nack_work create sess success ffff881f80e29200
Jul 31 16:27:16 teststorage [ 44.144811] qla2xxx [0000:06:00.0]-ffff:12: qla24xx_do_nack_work create sess success ffff881f80e28400
Jul 31 16:27:16 teststorage [ 44.230770] qla2xxx [0000:06:00.0]-ffff:12: qla24xx_do_nack_work create sess success ffff881f8f575a00
Jul 31 16:27:16 teststorage [ 44.473650] qla2xxx [0000:06:00.0]-ffff:12: qla24xx_do_nack_work create sess success ffff881f8f575600
Jul 31 16:27:16 teststorage [ 44.520821] qla2xxx [0000:06:00.0]-ffff:12: qla24xx_do_nack_work create sess success ffff881f8f574400
Jul 31 16:27:16 teststorage [ 44.538183] qla2xxx [0000:06:00.0]-ffff:12: qla24xx_do_nack_work create sess success ffff881f900f3400
Jul 31 16:27:16 teststorage [ 44.984052] qla2xxx [0000:06:00.0]-ffff:12: qla24xx_do_nack_work create sess success ffff881f8f577a00
Jul 31 16:28:54 teststorage [ 45.042585] qla2xxx [0000:06:00.0]-ffff:12: qla24xx_do_nack_work create sess success ffff881f80e28000
Jul 31 16:28:54 teststorage [ 142.431045] ABORT_TASK: Found referenced qla2xxx task_tag: 1189328
Jul 31 16:28:54 teststorage [ 142.431059] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 1189328
Jul 31 16:28:57 teststorage [ 145.632571] ABORT_TASK: Found referenced qla2xxx task_tag: 1177932
Jul 31 16:28:57 teststorage [ 145.897299] ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag: 1177932
Jul 31 16:29:00 teststorage [ 148.388569] ABORT_TASK: Found referenced qla2xxx task_tag: 1164160
Jul 31 16:32:00 teststorage [ 328.848344] se_cmd waiting CDB: 0x89 tag: 1164160 t_state: 5 se_cmd_flags: 0x01c80109 transport_state: 0x00000033
Jul 31 16:32:41 teststorage [ 369.808484] INFO: task kworker/1:93:2802 blocked for more than 120 seconds.
Jul 31 16:32:41 teststorage [ 369.808493] Not tainted 4.12.3p1 #3
Jul 31 16:32:41 teststorage [ 369.808496] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 31 16:32:41 teststorage [ 369.808499] kworker/1:93 D 0 2802 2 0x00000080
Jul 31 16:32:41 teststorage [ 369.808515] Workqueue: events qlt_free_session_done [qla2xxx]
Jul 31 16:32:41 teststorage [ 369.808518] Call Trace:
Jul 31 16:32:41 teststorage [ 369.808524] __schedule+0x285/0x840
Jul 31 16:32:41 teststorage [ 369.808528] schedule+0x36/0x80
Jul 31 16:32:41 teststorage [ 369.808531] schedule_timeout+0x242/0x2f0
Jul 31 16:32:41 teststorage [ 369.808537] ? ttwu_do_activate+0x77/0x80
Jul 31 16:32:41 teststorage [ 369.808540] wait_for_completion+0x111/0x170
Jul 31 16:32:41 teststorage [ 369.808544] ? wake_up_q+0x80/0x80
Jul 31 16:32:41 teststorage [ 369.808555] target_wait_for_sess_cmds+0x49/0x170 [target_core_mod]
Jul 31 16:32:41 teststorage [ 369.808560] tcm_qla2xxx_free_session+0x45/0x90 [tcm_qla2xxx]
Jul 31 16:32:41 teststorage [ 369.808572] qlt_free_session_done+0x109/0x5a0 [qla2xxx]
Jul 31 16:32:41 teststorage [ 369.808579] ? qlt_unreg_sess+0xe7/0xf0 [qla2xxx]
Jul 31 16:32:41 teststorage [ 369.808583] ? tcm_qla2xxx_put_sess+0x3c/0x50 [tcm_qla2xxx]
Jul 31 16:32:41 teststorage [ 369.808587] process_one_work+0x144/0x370
Jul 31 16:32:41 teststorage [ 369.808591] worker_thread+0x4d/0x3c0
Jul 31 16:32:41 teststorage [ 369.808594] kthread+0x109/0x140
Jul 31 16:32:41 teststorage [ 369.808597] ? rescuer_thread+0x360/0x360
Jul 31 16:32:41 teststorage [ 369.808600] ? kthread_park+0x60/0x60
Jul 31 16:32:41 teststorage [ 369.808720] ret_from_fork+0x25/0x30
Jul 31 16:32:41 teststorage [ 369.809016] NMI backtrace for cpu 3
Jul 31 16:32:41 teststorage [ 369.809446] CPU: 3 PID: 107 Comm: khungtaskd Not tainted 4.12.3p1 #3
Jul 31 16:32:41 teststorage [ 369.809886] Hardware name: ASUS All Series/X99-E WS/USB 3.1, BIOS 3402 11/14/2016
Jul 31 16:32:41 teststorage [ 369.810326] Call Trace:
Jul 31 16:32:41 teststorage [ 369.810764] dump_stack+0x63/0x8e
Jul 31 16:32:41 teststorage [ 369.811190] nmi_cpu_backtrace+0x9b/0xa0
Jul 31 16:32:41 teststorage [ 369.811613] ? irq_force_complete_move+0xf0/0xf0
Jul 31 16:32:41 teststorage [ 369.812036] nmi_trigger_cpumask_backtrace+0x99/0xd0
Jul 31 16:32:41 teststorage [ 369.812461] arch_trigger_cpumask_backtrace+0x19/0x20
Jul 31 16:32:41 teststorage [ 369.812890] watchdog+0x2c7/0x340
Jul 31 16:32:41 teststorage [ 369.813321] kthread+0x109/0x140
Jul 31 16:32:41 teststorage [ 369.813748] ? reset_hung_task_detector+0x20/0x20
Jul 31 16:32:41 teststorage [ 369.814171] ? kthread_park+0x60/0x60
Jul 31 16:32:41 teststorage [ 369.814595] ? kthread_park+0x60/0x60
Jul 31 16:32:41 teststorage [ 369.815013] ret_from_fork+0x25/0x30
Jul 31 16:32:41 teststorage [ 369.815428] Sending NMI from CPU 3 to CPUs 0-2,4-11:
Jul 31 16:32:41 teststorage [ 369.815846] NMI backtrace for cpu 0 skipped: idling at pc 0xffffffff818b5ac6
Jul 31 16:32:41 teststorage [ 369.815946] NMI backtrace for cpu 1 skipped: idling at pc 0xffffffff818b5ac6
Jul 31 16:32:41 teststorage [ 369.816045] NMI backtrace for cpu 2 skipped: idling at pc 0xffffffff818b5ac6
Jul 31 16:32:41 teststorage [ 369.816146] NMI backtrace for cpu 4 skipped: idling at pc 0xffffffff818b5ac6
Jul 31 16:32:41 teststorage [ 369.816247] NMI backtrace for cpu 5 skipped: idling at pc 0xffffffff818b5ac6
Jul 31 16:32:41 teststorage [ 369.816352] NMI backtrace for cpu 6 skipped: idling at pc 0xffffffff818b5ac6
Jul 31 16:32:41 teststorage [ 369.816446] NMI backtrace for cpu 7 skipped: idling at pc 0xffffffff818b5ac6
Jul 31 16:32:41 teststorage [ 369.816547] NMI backtrace for cpu 8 skipped: idling at pc 0xffffffff818b5ac6
Jul 31 16:32:41 teststorage [ 369.816648] NMI backtrace for cpu 9 skipped: idling at pc 0xffffffff818b5ac6
Jul 31 16:32:41 teststorage [ 369.816747] NMI backtrace for cpu 10 skipped: idling at pc 0xffffffff818b5ac6
Jul 31 16:32:41 teststorage [ 369.816851] NMI backtrace for cpu 11 skipped: idling at pc 0xffffffff818b5ac6
Jul 31 16:32:41 teststorage [ 369.816855] Kernel panic - not syncing: hung_task: blocked tasks
Jul 31 16:32:41 teststorage [ 369.816857] CPU: 3 PID: 107 Comm: khungtaskd Not tainted 4.12.3p1 #3
Jul 31 16:32:41 teststorage [ 369.816857] Hardware name: ASUS All Series/X99-E WS/USB 3.1, BIOS 3402 11/14/2016
Jul 31 16:32:41 teststorage [ 369.816857] Call Trace:
Jul 31 16:32:41 teststorage [ 369.816859] dump_stack+0x63/0x8e
Jul 31 16:32:41 teststorage [ 369.816861] panic+0xeb/0x239
Jul 31 16:32:41 teststorage [ 369.816863] watchdog+0x2d5/0x340
Jul 31 16:32:41 teststorage [ 369.816864] kthread+0x109/0x140
Jul 31 16:32:41 teststorage [ 369.816864] ? reset_hung_task_detector+0x20/0x20
Jul 31 16:32:41 teststorage [ 369.816865] ? kthread_park+0x60/0x60
Jul 31 16:32:41 teststorage [ 369.816866] ? kthread_park+0x60/0x60
Jul 31 16:32:41 teststorage [ 369.816866] ret_from_fork+0x25/0x30

8<=====

> Also, some more details on what your backend device storage behind
> target-core looks like would be helpful.
>
> Physical HBAs, device information, and any MD/RAID/LVM/etc on top..
>
I have an Adaptec 7805 HW RAID (6x SAS 600G 15k HDD in RAID0) exporting 1
device to the system (sdc); LVM sits on this device, and an LV (vgtest/test)
is used as the LIO block backstore and is exported to the ESXs.
(For the system I have 2 separate SATA HDDs in software RAID1 + LVM.)

[root@TESTSTORAGE ~]# lsblk
NAME                MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sdb                   8:16   0 465,8G  0 disk
└─md126               9:126  0 465,8G  0 raid1
  ├─md126p3         259:2    0   314G  0 md
  │ ├─vgsystem-swap 253:1    0    64G  0 lvm
  │ ├─vgsystem-root 253:0    0   100G  0 lvm   /
  │ └─vgsystem-home 253:3    0   200G  0 lvm   /home
  ├─md126p1         259:0    0   200M  0 md    /boot/efi
  ├─md126p4         259:3    0 150,6G  0 md
  │ └─vgsystem-root 253:0    0   100G  0 lvm   /
  └─md126p2         259:1    0     1G  0 md    /boot
sdc                   8:32   1   3,3T  0 disk
└─vgtest-test       253:2    0   3,3T  0 lvm
sda                   8:0    0 465,8G  0 disk
└─md126               9:126  0 465,8G  0 raid1
  ├─md126p3         259:2    0   314G  0 md
  │ ├─vgsystem-swap 253:1    0    64G  0 lvm
  │ ├─vgsystem-root 253:0    0   100G  0 lvm   /
  │ └─vgsystem-home 253:3    0   200G  0 lvm   /home
  ├─md126p1         259:0    0   200M  0 md    /boot/efi
  ├─md126p4         259:3    0 150,6G  0 md
  │ └─vgsystem-root 253:0    0   100G  0 lvm   /
  └─md126p2         259:1    0     1G  0 md    /boot

8<=========
Hi Lukasz,

(Adding Pascal CC', as he's been seeing something similar with MD
backend storage)

Thanks a lot for the follow-up. Comments below.

On Mon, 2017-07-31 at 17:15 +0200, Łukasz Engel wrote:
> Hi Nicholas,
>
> On 31.07.2017 at 02:00, Nicholas A. Bellinger wrote:
> > Hi Lukasz,
> >
> > On Thu, 2017-07-27 at 17:23 +0200, Łukasz Engel wrote:
> >> Hello
> >>
> >> I am experiencing problems using LIO FC target with vmware ESX (6.5).
> >>
> >> This is the same problem as discussed in
> >> https://www.spinics.net/lists/target-devel/msg15670.html
> >> I've tried 4.11.x (up to 12) and 4.12.3 (and also some kernels from the 4.9
> >> line).
> [...]
> >> Yes, this is most certainly the same bug the patch here addresses:
> >>
> >> https://www.spinics.net/lists/target-devel/msg15759.html
> >>
> >> Also, a smaller version of that just went out with your CC' on it here:
> >>
> >> qla2xxx: Fix incorrect tcm_qla2xxx_free_cmd use during TMR ABORT (v2)
> >> https://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending.git/commit/?id=6bcbb3174caa5f1ccc894f8ae077631659d5a629
> After applying both patches (for 6bcbb3174caa5f1ccc894f8ae077631659d5a629,
> manual adjusting of the second hunk was required for 4.12.x - the context
> after the match is slightly different), there is no BUG, but a hang in a
> kworker instead, with all I/O from FC frozen:
>
> Jul 31 16:27:14 teststorage [ 42.897291] qla2xxx [0000:06:00.0]-500a:12: LOOP UP detected (4 Gbps).
> Jul 31 16:27:16 teststorage [ 42.959068] qla2xxx [0000:09:00.0]-ffff:11: qla24xx_do_nack_work create sess success ffff881f80e29200
> Jul 31 16:27:16 teststorage [ 44.144811] qla2xxx [0000:06:00.0]-ffff:12: qla24xx_do_nack_work create sess success ffff881f80e28400
> Jul 31 16:27:16 teststorage [ 44.230770] qla2xxx [0000:06:00.0]-ffff:12: qla24xx_do_nack_work create sess success ffff881f8f575a00
> Jul 31 16:27:16 teststorage [ 44.473650] qla2xxx [0000:06:00.0]-ffff:12: qla24xx_do_nack_work create sess success ffff881f8f575600
> Jul 31 16:27:16 teststorage [ 44.520821] qla2xxx [0000:06:00.0]-ffff:12: qla24xx_do_nack_work create sess success ffff881f8f574400
> Jul 31 16:27:16 teststorage [ 44.538183] qla2xxx [0000:06:00.0]-ffff:12: qla24xx_do_nack_work create sess success ffff881f900f3400
> Jul 31 16:27:16 teststorage [ 44.984052] qla2xxx [0000:06:00.0]-ffff:12: qla24xx_do_nack_work create sess success ffff881f8f577a00
> Jul 31 16:28:54 teststorage [ 45.042585] qla2xxx [0000:06:00.0]-ffff:12: qla24xx_do_nack_work create sess success ffff881f80e28000
> Jul 31 16:28:54 teststorage [ 142.431045] ABORT_TASK: Found referenced qla2xxx task_tag: 1189328
> Jul 31 16:28:54 teststorage [ 142.431059] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 1189328
> Jul 31 16:28:57 teststorage [ 145.632571] ABORT_TASK: Found referenced qla2xxx task_tag: 1177932
> Jul 31 16:28:57 teststorage [ 145.897299] ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag: 1177932
> Jul 31 16:29:00 teststorage [ 148.388569] ABORT_TASK: Found referenced qla2xxx task_tag: 1164160
> Jul 31 16:32:00 teststorage [ 328.848344] se_cmd waiting CDB: 0x89 tag: 1164160 t_state: 5 se_cmd_flags: 0x01c80109 transport_state: 0x00000033

Perfect. This is very helpful to understand the state of the I/O triggering
the ESX host ABORTs, and that it blocks waiting for completion from backend
storage.
Here's what the above debug translates to:

CDB: 0x89 = COMPARE_AND_WRITE (e.g. VAAI Atomic Test and Set)
t_state: 5 = TRANSPORT_PROCESSING

se_cmd_flags: 0x01c80109 =

    SCF_SUPPORTED_SAM_OPCODE | SCF_SCSI_DATA_CDB | SCF_SE_LUN_CMD |
    SCF_COMPARE_AND_WRITE | SCF_ACK_KREF | SCF_USE_CPUID | SCF_TASK_ATTR_SET

transport_state: 0x00000033 =

    CMD_T_ABORTED | CMD_T_ACTIVE | CMD_T_SENT | CMD_T_STOP

Note that COMPARE_AND_WRITE, with respect to backend I/O, consists of two
phases. The first phase is a READ and a comparison against the first part of
the received buffer. If the comparison matches, the second phase sends a
WRITE with the second part of the received buffer to complete the update.

So according to the above output, the first phase of COMPARE_AND_WRITE
dispatched the READ into backend storage. The READ did not complete, and
subsequently ESX sent an ABORT_TASK for it (tag: 1164160). The ABORT_TASK
successfully found the I/O still outstanding to backend storage, and blocks
waiting for it to complete..

The debug printing the 'se_cmd waiting ..' message shows every 180 seconds,
so it's clear I/O completion from backend storage is not happening back to
IBLOCK -> target-core for an extended amount of time.

Btw, I've been simulating this same scenario by delaying backend storage
I/O completion, and thus far, once the first phase READ of COMPARE_AND_WRITE
eventually returns back to IBLOCK -> target-core, ABORT_TASK is able to
complete and everything else continues to function as expected.

That said, thus far the evidence is pointing to a backend device issue,
where that first READ phase of the COMPARE_AND_WRITE is (never..?)
completed back to IBLOCK -> target-core. More below on how to verify this.

> Jul 31 16:32:41 teststorage [ 369.808484] INFO: task kworker/1:93:2802 blocked for more than 120 seconds.
> Jul 31 16:32:41 teststorage [ 369.808493] Not tainted 4.12.3p1 #3
> Jul 31 16:32:41 teststorage [ 369.808496] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Jul 31 16:32:41 teststorage [ 369.808499] kworker/1:93 D 0 2802 2 0x00000080
> Jul 31 16:32:41 teststorage [ 369.808515] Workqueue: events qlt_free_session_done [qla2xxx]
> Jul 31 16:32:41 teststorage [ 369.808518] Call Trace:
> Jul 31 16:32:41 teststorage [ 369.808524] __schedule+0x285/0x840
> Jul 31 16:32:41 teststorage [ 369.808528] schedule+0x36/0x80
> Jul 31 16:32:41 teststorage [ 369.808531] schedule_timeout+0x242/0x2f0
> Jul 31 16:32:41 teststorage [ 369.808537] ? ttwu_do_activate+0x77/0x80
> Jul 31 16:32:41 teststorage [ 369.808540] wait_for_completion+0x111/0x170
> Jul 31 16:32:41 teststorage [ 369.808544] ? wake_up_q+0x80/0x80
> Jul 31 16:32:41 teststorage [ 369.808555] target_wait_for_sess_cmds+0x49/0x170 [target_core_mod]
> Jul 31 16:32:41 teststorage [ 369.808560] tcm_qla2xxx_free_session+0x45/0x90 [tcm_qla2xxx]
> Jul 31 16:32:41 teststorage [ 369.808572] qlt_free_session_done+0x109/0x5a0 [qla2xxx]
> Jul 31 16:32:41 teststorage [ 369.808579] ? qlt_unreg_sess+0xe7/0xf0 [qla2xxx]
> Jul 31 16:32:41 teststorage [ 369.808583] ? tcm_qla2xxx_put_sess+0x3c/0x50 [tcm_qla2xxx]
> Jul 31 16:32:41 teststorage [ 369.808587] process_one_work+0x144/0x370
> Jul 31 16:32:41 teststorage [ 369.808591] worker_thread+0x4d/0x3c0
> Jul 31 16:32:41 teststorage [ 369.808594] kthread+0x109/0x140
> Jul 31 16:32:41 teststorage [ 369.808597] ? rescuer_thread+0x360/0x360
> Jul 31 16:32:41 teststorage [ 369.808600] ? kthread_park+0x60/0x60
> Jul 31 16:32:41 teststorage [ 369.808720] ret_from_fork+0x25/0x30
> Jul 31 16:32:41 teststorage [ 369.809016] NMI backtrace for cpu 3
> Jul 31 16:32:41 teststorage [ 369.809446] CPU: 3 PID: 107 Comm: khungtaskd Not tainted 4.12.3p1 #3
> Jul 31 16:32:41 teststorage [ 369.809886] Hardware name: ASUS All Series/X99-E WS/USB 3.1, BIOS 3402 11/14/2016
> Jul 31 16:32:41 teststorage [ 369.810326] Call Trace:
> Jul 31 16:32:41 teststorage [ 369.810764] dump_stack+0x63/0x8e
> Jul 31 16:32:41 teststorage [ 369.811190] nmi_cpu_backtrace+0x9b/0xa0
> Jul 31 16:32:41 teststorage [ 369.811613] ? irq_force_complete_move+0xf0/0xf0
> Jul 31 16:32:41 teststorage [ 369.812036] nmi_trigger_cpumask_backtrace+0x99/0xd0
> Jul 31 16:32:41 teststorage [ 369.812461] arch_trigger_cpumask_backtrace+0x19/0x20
> Jul 31 16:32:41 teststorage [ 369.812890] watchdog+0x2c7/0x340
> Jul 31 16:32:41 teststorage [ 369.813321] kthread+0x109/0x140
> Jul 31 16:32:41 teststorage [ 369.813748] ? reset_hung_task_detector+0x20/0x20
> Jul 31 16:32:41 teststorage [ 369.814171] ? kthread_park+0x60/0x60
> Jul 31 16:32:41 teststorage [ 369.814595] ? kthread_park+0x60/0x60
> Jul 31 16:32:41 teststorage [ 369.815013] ret_from_fork+0x25/0x30
> Jul 31 16:32:41 teststorage [ 369.815428] Sending NMI from CPU 3 to CPUs 0-2,4-11:
> Jul 31 16:32:41 teststorage [ 369.815846] NMI backtrace for cpu 0 skipped: idling at pc 0xffffffff818b5ac6
> Jul 31 16:32:41 teststorage [ 369.815946] NMI backtrace for cpu 1 skipped: idling at pc 0xffffffff818b5ac6
> Jul 31 16:32:41 teststorage [ 369.816045] NMI backtrace for cpu 2 skipped: idling at pc 0xffffffff818b5ac6
> Jul 31 16:32:41 teststorage [ 369.816146] NMI backtrace for cpu 4 skipped: idling at pc 0xffffffff818b5ac6
> Jul 31 16:32:41 teststorage [ 369.816247] NMI backtrace for cpu 5 skipped: idling at pc 0xffffffff818b5ac6
> Jul 31 16:32:41 teststorage [ 369.816352] NMI backtrace for cpu 6 skipped: idling at pc 0xffffffff818b5ac6
> Jul 31 16:32:41 teststorage [ 369.816446] NMI backtrace for cpu 7 skipped: idling at pc 0xffffffff818b5ac6
> Jul 31 16:32:41 teststorage [ 369.816547] NMI backtrace for cpu 8 skipped: idling at pc 0xffffffff818b5ac6
> Jul 31 16:32:41 teststorage [ 369.816648] NMI backtrace for cpu 9 skipped: idling at pc 0xffffffff818b5ac6
> Jul 31 16:32:41 teststorage [ 369.816747] NMI backtrace for cpu 10 skipped: idling at pc 0xffffffff818b5ac6
> Jul 31 16:32:41 teststorage [ 369.816851] NMI backtrace for cpu 11 skipped: idling at pc 0xffffffff818b5ac6
> Jul 31 16:32:41 teststorage [ 369.816855] Kernel panic - not syncing: hung_task: blocked tasks
> Jul 31 16:32:41 teststorage [ 369.816857] CPU: 3 PID: 107 Comm: khungtaskd Not tainted 4.12.3p1 #3
> Jul 31 16:32:41 teststorage [ 369.816857] Hardware name: ASUS All Series/X99-E WS/USB 3.1, BIOS 3402 11/14/2016
> Jul 31 16:32:41 teststorage [ 369.816857] Call Trace:
> Jul 31 16:32:41 teststorage [ 369.816859] dump_stack+0x63/0x8e
> Jul 31 16:32:41 teststorage [ 369.816861] panic+0xeb/0x239
> Jul 31 16:32:41 teststorage [ 369.816863] watchdog+0x2d5/0x340
> Jul 31 16:32:41 teststorage [ 369.816864] kthread+0x109/0x140
> Jul 31 16:32:41 teststorage [ 369.816864] ? reset_hung_task_detector+0x20/0x20
> Jul 31 16:32:41 teststorage [ 369.816865] ? kthread_park+0x60/0x60
> Jul 31 16:32:41 teststorage [ 369.816866] ? kthread_park+0x60/0x60
> Jul 31 16:32:41 teststorage [ 369.816866] ret_from_fork+0x25/0x30
>

Btw, it looks like this system is running with:

CONFIG_BOOTPARAM_HUNG_TASK_PANIC_VALUE=1

which triggers a crash as soon as any hung task is detected.

I'd recommend setting this to '0' for now, to monitor what's going on while
hung tasks are occurring. Also, this can be set at boot time via
hung_task_panic=0.

> 8<=====
>
> > Also, some more details on what your backend device storage behind
> > target-core looks like would be helpful.
> >
> > Physical HBAs, device information, and any MD/RAID/LVM/etc on top..
> >
> I have an Adaptec 7805 HW RAID (6x SAS 600G 15k HDD in RAID0) exporting 1
> device to the system (sdc); LVM sits on this device, and an LV (vgtest/test)
> is used as the LIO block backstore and is exported to the ESXs.
> (For the system I have 2 separate SATA HDDs in software RAID1 + LVM.)
>
> [root@TESTSTORAGE ~]# lsblk
> NAME                MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
> sdb                   8:16   0 465,8G  0 disk
> └─md126               9:126  0 465,8G  0 raid1
>   ├─md126p3         259:2    0   314G  0 md
>   │ ├─vgsystem-swap 253:1    0    64G  0 lvm
>   │ ├─vgsystem-root 253:0    0   100G  0 lvm   /
>   │ └─vgsystem-home 253:3    0   200G  0 lvm   /home
>   ├─md126p1         259:0    0   200M  0 md    /boot/efi
>   ├─md126p4         259:3    0 150,6G  0 md
>   │ └─vgsystem-root 253:0    0   100G  0 lvm   /
>   └─md126p2         259:1    0     1G  0 md    /boot
> sdc                   8:32   1   3,3T  0 disk
> └─vgtest-test       253:2    0   3,3T  0 lvm
> sda                   8:0    0 465,8G  0 disk
> └─md126               9:126  0 465,8G  0 raid1
>   ├─md126p3         259:2    0   314G  0 md
>   │ ├─vgsystem-swap 253:1    0    64G  0 lvm
>   │ ├─vgsystem-root 253:0    0   100G  0 lvm   /
>   │ └─vgsystem-home 253:3    0   200G  0 lvm   /home
>   ├─md126p1         259:0    0   200M  0 md    /boot/efi
>   ├─md126p4         259:3    0 150,6G  0 md
>   │ └─vgsystem-root 253:0    0   100G  0 lvm   /
>   └─md126p2         259:1    0     1G  0 md    /boot
>
> 8<=========
>

OK, since the vgtest-test LVM is backed by a single SCSI device (sdc), there
is a simple test to see if the SCSI device is holding onto I/Os.

Once the hung task triggers, check out:

cat /sys/block/sdc/inflight

This will show how many I/Os are outstanding to the underlying SCSI device.

IIRC, md and lvm devices don't use 'inflight' counters, so this is only
useful for verifying the number of I/Os outstanding to the underlying SCSI
device. So if it's possible, I'd recommend verifying with the raw SCSI
device as a BLOCK export as well, to isolate whether it's something specific
to LVM that's causing I/Os not to be completed.

Thanks again for your detailed bug report.

--
To unsubscribe from this list: send the line "unsubscribe target-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
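The inflight file mentioned above holds two counters, requests in flight for reads and for writes. A minimal sketch of reading and interpreting it; the helper names (`parse_inflight`, `device_inflight`) are hypothetical, not anything from the thread, and only the two-column sysfs format is assumed:

```python
def parse_inflight(text: str) -> tuple[int, int]:
    """Parse the two-column inflight format, e.g. '       0        3'.

    The first column is reads in flight, the second is writes in flight.
    """
    reads, writes = (int(field) for field in text.split())
    return reads, writes


def device_inflight(dev: str) -> tuple[int, int]:
    # e.g. dev = "sdc" reads /sys/block/sdc/inflight
    with open(f"/sys/block/{dev}/inflight") as f:
        return parse_inflight(f.read())


if __name__ == "__main__":
    # "0 0" in the final hung state means no request is outstanding
    # to the SCSI device, i.e. the backend is not holding the I/O.
    print(parse_inflight("       0        0"))  # -> (0, 0)
```

A "0 0" result while 'se_cmd waiting' keeps printing is exactly the observation Łukasz reports later in the thread: the block device has nothing in flight, yet target-core still waits.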
Hi Nicholas,

On 07.08.2017 at 00:28, Nicholas A. Bellinger wrote:
> Hi Lukasz,
>
> (Adding Pascal CC', as he's been seeing something similar with MD
> backend storage)
>
> Thanks a lot for the follow-up. Comments below.
>
> On Mon, 2017-07-31 at 17:15 +0200, Łukasz Engel wrote:
>> Hi Nicholas,
>>
>> On 31.07.2017 at 02:00, Nicholas A. Bellinger wrote:
[...]
>> Jul 31 16:28:54 teststorage [ 142.431045] ABORT_TASK: Found referenced qla2xxx task_tag: 1189328
>> Jul 31 16:28:54 teststorage [ 142.431059] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 1189328
>> Jul 31 16:28:57 teststorage [ 145.632571] ABORT_TASK: Found referenced qla2xxx task_tag: 1177932
>> Jul 31 16:28:57 teststorage [ 145.897299] ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag: 1177932
>> Jul 31 16:29:00 teststorage [ 148.388569] ABORT_TASK: Found referenced qla2xxx task_tag: 1164160
>> Jul 31 16:32:00 teststorage [ 328.848344] se_cmd waiting CDB: 0x89 tag: 1164160 t_state: 5 se_cmd_flags: 0x01c80109 transport_state: 0x00000033
> Perfect. This is very helpful to understand the state of the I/O triggering
> the ESX host ABORTs, and that it blocks waiting for completion from backend
> storage.
>
> Here's what the above debug translates to:
>
> CDB: 0x89 = COMPARE_AND_WRITE (e.g. VAAI Atomic Test and Set)
> t_state: 5 = TRANSPORT_PROCESSING
>
> se_cmd_flags: 0x01c80109 =
>
> SCF_SUPPORTED_SAM_OPCODE | SCF_SCSI_DATA_CDB | SCF_SE_LUN_CMD |
> SCF_COMPARE_AND_WRITE | SCF_ACK_KREF | SCF_USE_CPUID | SCF_TASK_ATTR_SET
>
> transport_state: 0x00000033 =
>
> CMD_T_ABORTED | CMD_T_ACTIVE | CMD_T_SENT | CMD_T_STOP
>
> Note that COMPARE_AND_WRITE, with respect to backend I/O, consists of two
> phases. The first phase is a READ and a comparison against the first part
> of the received buffer. If the comparison matches, the second phase sends
> a WRITE with the second part of the received buffer to complete the update.
>
> So according to the above output, the first phase of COMPARE_AND_WRITE
> dispatched the READ into backend storage. The READ did not complete,
> and subsequently ESX sent an ABORT_TASK for it (tag: 1164160). The
> ABORT_TASK successfully found the I/O still outstanding to backend
> storage, and blocks waiting for it to complete..
>
> The debug printing the 'se_cmd waiting ..' message shows every 180 seconds,
> so it's clear I/O completion from backend storage is not happening back
> to IBLOCK -> target-core for an extended amount of time.
>
> Btw, I've been simulating this same scenario by delaying backend storage
> I/O completion, and thus far, once the first phase READ of
> COMPARE_AND_WRITE eventually returns back to IBLOCK -> target-core,
> ABORT_TASK is able to complete and everything else continues to function
> as expected.
>
> That said, thus far the evidence is pointing to a backend device issue,
> where that first READ phase of the COMPARE_AND_WRITE is (never..?)
> completed back to IBLOCK -> target-core.
>
> More below on how to verify this.
[...]
> Btw, it looks like this system is running with:
>
> CONFIG_BOOTPARAM_HUNG_TASK_PANIC_VALUE=1
>
> which triggers a crash as soon as any hung task is detected.
>
> I'd recommend setting this to '0' for now, to monitor what's going on
> while hung tasks are occurring.
>
> Also, this can be set at boot time via hung_task_panic=0.

Yes, that was planned. Of course, for checking the "after hung-task"
situation I've disabled this now...

>
>>> Also, some more details on what your backend device storage behind
>>> target-core looks like would be helpful.
>>>
>>> Physical HBAs, device information, and any MD/RAID/LVM/etc on top..
>>>
>> I have an Adaptec 7805 HW RAID (6x SAS 600G 15k HDD in RAID0) exporting 1
>> device to the system (sdc); LVM sits on this device, and an LV (vgtest/test)
>> is used as the LIO block backstore and is exported to the ESXs.
>> (For the system I have 2 separate SATA HDDs in software RAID1 + LVM.)
[...]
>> OK, since the vgtest-test LVM is backed by a single SCSI device (sdc),
>> there is a simple test to see if the SCSI device is holding onto I/Os.
>>
>> Once the hung task triggers, check out:
>>
>> cat /sys/block/sdc/inflight
>>
>> This will show how many I/Os are outstanding to the underlying SCSI
>> device.
>>
>> IIRC, md and lvm devices don't use 'inflight' counters, so this is only
>> useful for verifying the number of I/Os outstanding to the underlying
>> SCSI device. So if it's possible, I'd recommend verifying with the raw
>> SCSI device as a BLOCK export as well, to isolate whether it's something
>> specific to LVM that's causing I/Os not to be completed.
>>
>> Thanks again for your detailed bug report.
>>

I re-ran my test without LVM (sdc directly exported as the target's
backstore). The result is the same as with LVM - a kworker blocked waiting
for I/O completion, with an identical "se_cmd waiting" report (excluding
the tag value, of course). (My personal feeling is that it took a little
longer to block than with LVM.)

I checked the inflight counters for sdc - there were some non-zero values
during I/O activity (as expected), and in the final state (when "se_cmd
waiting ..." or the hung task is reported) there are zeros there.

Tomorrow I plan to retest using a different SAS controller which I have on
the shelf (LSI). We will see if it's reproducible in that configuration,
but as Pascal got this on quite a different setup than mine, I expect to
get similar results.

Personally (if I properly understand your description) it looks to me like
in some situations the I/O completion notification for the read part of the
TAS operation "got lost" somewhere in the block/target layer. (And then the
timeout for the operation kicks in on the ESX side, and it sends ABORT_TASK
for this operation.)

(BTW - I saw some non-zero values also for the dm-* inflight counters
during I/O activity, so it looks like LVM currently (4.12) does provide
inflight stats.)

I will let you know about my findings tomorrow.
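The two-phase COMPARE_AND_WRITE semantics discussed back and forth above can be sketched in a few lines. This is an illustration of the semantics only (plain Python over a bytearray, with hypothetical names), not LIO's implementation:

```python
MISCOMPARE = "MISCOMPARE"
GOOD = "GOOD"


def compare_and_write(backend: bytearray, offset: int,
                      verify: bytes, data: bytes) -> str:
    """Two-phase compare-and-write over an in-memory 'backend'."""
    assert len(verify) == len(data)
    # Phase 1: READ from the backend and compare against the verify buffer.
    # In the hang described in the thread, this read never completes, so the
    # command stays in TRANSPORT_PROCESSING and ABORT_TASK blocks on it.
    current = bytes(backend[offset:offset + len(verify)])
    if current != verify:
        return MISCOMPARE  # -> MISCOMPARE check condition + sense data
    # Phase 2: WRITE the new data to complete the atomic update.
    backend[offset:offset + len(data)] = data
    return GOOD


if __name__ == "__main__":
    disk = bytearray(b"old!" + b"\x00" * 4)
    print(compare_and_write(disk, 0, b"old!", b"new!"))  # GOOD, disk updated
    print(compare_and_write(disk, 0, b"old!", b"xxx!"))  # MISCOMPARE now
```

The MISCOMPARE path here corresponds to the "Detected MISCOMPARE ... Send MISCOMPARE check condition and sense" messages that appear in the next email: a failed phase-1 comparison is a normal outcome of VAAI ATS contention, not itself an error in the target.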
Hi Nicholas,

On 08.08.2017 at 23:00, Łukasz Engel wrote:
> Hi Nicholas,
>
[...]
> Tomorrow I plan to retest using a different SAS controller which I have
> on the shelf (LSI).
> We will see if it's reproducible in that configuration, but as Pascal
> got this on quite a different setup than mine, I expect to get similar
> results.
>
> Personally (if I properly understand your description) it looks to me
> like in some situations the I/O completion notification for the read
> part of the TAS operation "got lost" somewhere in the block/target
> layer. (And then the timeout for the operation kicks in on the ESX side,
> and it sends ABORT_TASK for this operation.)
>
> (BTW - I saw some non-zero values also for the dm-* inflight counters
> during I/O activity, so it looks like LVM currently (4.12) does provide
> inflight stats.)
>
> I will let you know about my findings tomorrow.
>

With the LSI controller (HW RAID0 built on the same 6 HDDs), without LVM -
same results:

Aug  9 15:49:32 TESTSTORAGE kernel: Detected MISCOMPARE for addr: ffff881f901e0000 buf: ffff881f83c5c800
Aug  9 15:49:32 TESTSTORAGE kernel: Target/iblock: Send MISCOMPARE check condition and sense
Aug  9 15:49:32 TESTSTORAGE kernel: Detected MISCOMPARE for addr: ffff881f83e57000 buf: ffff881f83c5c800
Aug  9 15:49:32 TESTSTORAGE kernel: Target/iblock: Send MISCOMPARE check condition and sense
Aug  9 15:49:32 TESTSTORAGE kernel: Detected MISCOMPARE for addr: ffff881f7ee51000 buf: ffff881f83c5c800
Aug  9 15:49:32 TESTSTORAGE kernel: Target/iblock: Send MISCOMPARE check condition and sense
Aug  9 15:49:32 TESTSTORAGE kernel: Detected MISCOMPARE for addr: ffff881f81745000 buf: ffff881f83c5c800
Aug  9 15:49:32 TESTSTORAGE kernel: Target/iblock: Send MISCOMPARE check condition and sense
Aug  9 15:49:33 TESTSTORAGE kernel: Detected MISCOMPARE for addr: ffff881f7ef99000 buf: ffff881f83c5c800
Aug  9 15:49:33 TESTSTORAGE kernel: Target/iblock: Send MISCOMPARE check condition and sense
Aug  9 15:49:33 TESTSTORAGE kernel: Detected MISCOMPARE for addr: ffff881f7e56a000 buf: ffff881f83c5c800
Aug  9 15:49:33 TESTSTORAGE kernel: Target/iblock: Send MISCOMPARE check condition and sense
Aug  9 15:49:33 TESTSTORAGE kernel: Detected MISCOMPARE for addr: ffff881f7edf1000 buf: ffff881f83c5c800
Aug  9 15:49:33 TESTSTORAGE kernel: Target/iblock: Send MISCOMPARE check condition and sense
Aug  9 15:49:33 TESTSTORAGE kernel: Detected MISCOMPARE for addr: ffff881f8faaa000 buf: ffff881f83c5c800
Aug  9 15:49:33 TESTSTORAGE kernel: Target/iblock: Send MISCOMPARE check condition and sense
Aug  9 15:49:37 TESTSTORAGE kernel: Detected MISCOMPARE for addr: ffff881f815d1000 buf: ffff881f83c5fa00
Aug  9 15:49:37 TESTSTORAGE kernel: Target/iblock: Send MISCOMPARE check condition and sense
Aug  9 15:49:37 TESTSTORAGE kernel: Detected MISCOMPARE for addr: ffff881f8146e000 buf: ffff881f83c5f200
Aug  9 15:49:37 TESTSTORAGE kernel: Target/iblock: Send MISCOMPARE check condition and sense
Aug  9 15:49:37 TESTSTORAGE kernel: ABORT_TASK: Found referenced qla2xxx task_tag: 1138156
Aug  9 15:49:37 TESTSTORAGE kernel: ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 1138156
Aug  9 15:49:37 TESTSTORAGE kernel: ABORT_TASK: Found referenced qla2xxx task_tag: 1138200
Aug  9 15:49:37 TESTSTORAGE kernel: ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 1138200
Aug  9 15:49:50 TESTSTORAGE kernel: ABORT_TASK: Found referenced qla2xxx task_tag: 1193992
Aug  9 15:49:52 TESTSTORAGE kernel: ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag: 1193992
Aug  9 15:49:52 TESTSTORAGE kernel: ABORT_TASK: Found referenced qla2xxx task_tag: 1194036
Aug  9 15:49:57 TESTSTORAGE kernel: TARGET_CORE[qla2xxx]: Detected NON_EXISTENT_LUN Access for 0x00000001
Aug  9 15:49:57 TESTSTORAGE kernel: TARGET_CORE[qla2xxx]: Detected NON_EXISTENT_LUN Access for 0x00000002
[...] <probably a periodic LUN scan issued by ESX>
Aug  9 15:49:57 TESTSTORAGE kernel: TARGET_CORE[qla2xxx]: Detected NON_EXISTENT_LUN Access for 0x000000ff
Aug  9 15:52:09 TESTSTORAGE kernel: INFO: task kworker/u24:0:3001 blocked for more than 120 seconds.
Aug  9 15:52:09 TESTSTORAGE kernel: Not tainted 4.12.3p1 #3
Aug  9 15:52:09 TESTSTORAGE kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug  9 15:52:09 TESTSTORAGE kernel: kworker/u24:0 D 0 3001 2 0x00000080
Aug  9 15:52:09 TESTSTORAGE kernel: Workqueue: tmr-iblock target_tmr_work [target_core_mod]
Aug  9 15:52:09 TESTSTORAGE kernel: Call Trace:
Aug  9 15:52:09 TESTSTORAGE kernel: __schedule+0x285/0x840
Aug  9 15:52:09 TESTSTORAGE kernel: schedule+0x36/0x80
Aug  9 15:52:09 TESTSTORAGE kernel: schedule_timeout+0x173/0x2f0
Aug  9 15:52:09 TESTSTORAGE kernel: ? call_timer_fn+0x140/0x140
Aug  9 15:52:09 TESTSTORAGE kernel: ? get_work_pool+0x2d/0x40
Aug  9 15:52:09 TESTSTORAGE kernel: wait_for_completion_timeout+0x12c/0x190
Aug  9 15:52:09 TESTSTORAGE kernel: ? wake_up_q+0x80/0x80
Aug  9 15:52:09 TESTSTORAGE kernel: __transport_wait_for_tasks+0xd9/0x1b0 [target_core_mod]
Aug  9 15:52:09 TESTSTORAGE kernel: transport_wait_for_tasks+0x53/0x90 [target_core_mod]
Aug  9 15:52:09 TESTSTORAGE kernel: core_tmr_abort_task+0x10e/0x190 [target_core_mod]
Aug  9 15:52:09 TESTSTORAGE kernel: target_tmr_work+0x111/0x120 [target_core_mod]
Aug  9 15:52:09 TESTSTORAGE kernel: process_one_work+0x144/0x370
Aug  9 15:52:09 TESTSTORAGE kernel: worker_thread+0x4d/0x3c0
Aug  9 15:52:09 TESTSTORAGE kernel: kthread+0x109/0x140
Aug  9 15:52:09 TESTSTORAGE kernel: ? rescuer_thread+0x360/0x360
Aug  9 15:52:09 TESTSTORAGE kernel: ? kthread_park+0x60/0x60
Aug  9 15:52:09 TESTSTORAGE kernel: ret_from_fork+0x25/0x30
Aug  9 15:52:54 TESTSTORAGE kernel: se_cmd waiting CDB: 0x89 tag: 1194036 t_state: 5 se_cmd_flags: 0x01c80109 transport_state: 0x00000033

At this point the inflight counters for sdc are "0 0".

Later there are more "hung task" notifications, but they are probably a
result of the first one...

Aug  9 15:54:12 TESTSTORAGE kernel: INFO: task kworker/1:75:3172 blocked for more than 120 seconds.
Aug 9 15:54:12 TESTSTORAGE kernel: Not tainted 4.12.3p1 #3
Aug 9 15:54:12 TESTSTORAGE kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 9 15:54:12 TESTSTORAGE kernel: kworker/1:75 D 0 3172 2 0x00000080
Aug 9 15:54:12 TESTSTORAGE kernel: Workqueue: events qlt_free_session_done [qla2xxx]
Aug 9 15:54:12 TESTSTORAGE kernel: Call Trace:
Aug 9 15:54:12 TESTSTORAGE kernel: __schedule+0x285/0x840
Aug 9 15:54:12 TESTSTORAGE kernel: schedule+0x36/0x80
Aug 9 15:54:12 TESTSTORAGE kernel: schedule_timeout+0x242/0x2f0
Aug 9 15:54:12 TESTSTORAGE kernel: ? ttwu_do_activate+0x77/0x80
Aug 9 15:54:12 TESTSTORAGE kernel: wait_for_completion+0x111/0x170
Aug 9 15:54:12 TESTSTORAGE kernel: ? wake_up_q+0x80/0x80
Aug 9 15:54:12 TESTSTORAGE kernel: target_wait_for_sess_cmds+0x49/0x170 [target_core_mod]
Aug 9 15:54:12 TESTSTORAGE kernel: tcm_qla2xxx_free_session+0x45/0x90 [tcm_qla2xxx]
Aug 9 15:54:12 TESTSTORAGE kernel: qlt_free_session_done+0x109/0x5a0 [qla2xxx]
Aug 9 15:54:12 TESTSTORAGE kernel: ? qlt_unreg_sess+0xe7/0xf0 [qla2xxx]
Aug 9 15:54:12 TESTSTORAGE kernel: ? tcm_qla2xxx_put_sess+0x3c/0x50 [tcm_qla2xxx]
Aug 9 15:54:12 TESTSTORAGE kernel: process_one_work+0x144/0x370
Aug 9 15:54:12 TESTSTORAGE kernel: worker_thread+0x4d/0x3c0
Aug 9 15:54:12 TESTSTORAGE kernel: kthread+0x109/0x140
Aug 9 15:54:12 TESTSTORAGE kernel: ? rescuer_thread+0x360/0x360
Aug 9 15:54:12 TESTSTORAGE kernel: ? kthread_park+0x60/0x60
Aug 9 15:54:12 TESTSTORAGE kernel: ret_from_fork+0x25/0x30
Aug 9 15:54:12 TESTSTORAGE kernel: INFO: task kworker/1:82:3184 blocked for more than 120 seconds.
Aug 9 15:54:12 TESTSTORAGE kernel: Not tainted 4.12.3p1 #3
Aug 9 15:54:12 TESTSTORAGE kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 9 15:54:12 TESTSTORAGE kernel: kworker/1:82 D 0 3184 2 0x00000080
Aug 9 15:54:12 TESTSTORAGE kernel: Workqueue: events qlt_free_session_done [qla2xxx]
Aug 9 15:54:12 TESTSTORAGE kernel: Call Trace:
Aug 9 15:54:12 TESTSTORAGE kernel: __schedule+0x285/0x840
Aug 9 15:54:12 TESTSTORAGE kernel: schedule+0x36/0x80
Aug 9 15:54:12 TESTSTORAGE kernel: schedule_timeout+0x242/0x2f0
Aug 9 15:54:12 TESTSTORAGE kernel: ? ttwu_do_activate+0x77/0x80
Aug 9 15:54:12 TESTSTORAGE kernel: wait_for_completion+0x111/0x170
Aug 9 15:54:12 TESTSTORAGE kernel: ? wake_up_q+0x80/0x80
Aug 9 15:54:12 TESTSTORAGE kernel: target_wait_for_sess_cmds+0x49/0x170 [target_core_mod]
Aug 9 15:54:12 TESTSTORAGE kernel: tcm_qla2xxx_free_session+0x45/0x90 [tcm_qla2xxx]
Aug 9 15:54:12 TESTSTORAGE kernel: qlt_free_session_done+0x109/0x5a0 [qla2xxx]
Aug 9 15:54:12 TESTSTORAGE kernel: ? qlt_unreg_sess+0xe7/0xf0 [qla2xxx]
Aug 9 15:54:12 TESTSTORAGE kernel: ? tcm_qla2xxx_put_sess+0x3c/0x50 [tcm_qla2xxx]
Aug 9 15:54:12 TESTSTORAGE kernel: process_one_work+0x144/0x370
Aug 9 15:54:12 TESTSTORAGE kernel: worker_thread+0x4d/0x3c0
Aug 9 15:54:12 TESTSTORAGE kernel: kthread+0x109/0x140
Aug 9 15:54:12 TESTSTORAGE kernel: ? rescuer_thread+0x360/0x360
Aug 9 15:54:12 TESTSTORAGE kernel: ? kthread_park+0x60/0x60
Aug 9 15:54:12 TESTSTORAGE kernel: ret_from_fork+0x25/0x30
diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
index 0197635..0bd8c64 100644
--- a/drivers/target/target_core_transport.c
+++ b/drivers/target/target_core_transport.c
@@ -2827,7 +2827,13 @@ void transport_clear_lun_ref(struct se_lun *lun)
 	spin_unlock_irqrestore(&cmd->t_state_lock, *flags);

-	wait_for_completion(&cmd->t_transport_stop_comp);
+	while (!wait_for_completion_timeout(&cmd->t_transport_stop_comp,
+					    180 * HZ)) {
+		pr_warn_ratelimited("se_cmd waiting CDB: 0x%02x tag: %llu t_state: %d"
+			"se_cmd_flags: 0x%08x transport_state: 0x%08x\n",
+			cmd->t_task_cdb[0], cmd->tag, cmd->t_state,
+			cmd->se_cmd_flags, cmd->transport_state);
+	}

 	spin_lock_irqsave(&cmd->t_state_lock, *flags);
 	cmd->transport_state &= ~(CMD_T_ACTIVE | CMD_T_STOP);