
[PATCHv3,2/2] nvme: cancel requests for real

Message ID 20200528153441.3501777-2-kbusch@kernel.org (mailing list archive)
State New, archived
Series [PATCHv3,1/2] blk-mq: export __blk_mq_complete_request

Commit Message

Keith Busch May 28, 2020, 3:34 p.m. UTC
Once the driver decides to cancel requests, the concept of those
requests timing out ceases to exist. Use __blk_mq_complete_request() to
bypass fake timeout error injection so that request reclaim may
proceed.
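
For context, the completion entry point being bypassed looks roughly
like this after patch 1/2 of this series (a paraphrased sketch of the
v5.7-era blk-mq code, not line-accurate):

	bool blk_mq_complete_request(struct request *rq)
	{
		/* fail_io_timeout fault injection hooks in here */
		if (unlikely(blk_should_fake_timeout(rq->q)))
			return false;	/* pretend no completion arrived */
		__blk_mq_complete_request(rq);
		return true;
	}

Calling __blk_mq_complete_request() directly completes the request
unconditionally, which is exactly what cancellation needs.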

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
---
 drivers/nvme/host/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Alan Adamson May 28, 2020, 6:39 p.m. UTC | #1
Still seeing a hang.

Alan


[   92.034813] FAULT_INJECTION: forcing a failure.
                name fail_io_timeout, interval 1, probability 100, space 0, times 1000
[   92.034816] CPU: 54 PID: 0 Comm: swapper/54 Not tainted 5.7.0-rc7+ #2
[   92.034817] Hardware name: Oracle Corporation ORACLE SERVER X6-2/ASM,MOTHERBOARD,1U, BIOS 38050100 08/30/2016
[   92.034818] Call Trace:
[   92.034819]  <IRQ>
[   92.034827]  dump_stack+0x6d/0x9a
[   92.034830]  should_fail.cold.5+0x32/0x42
[   92.034833]  blk_should_fake_timeout+0x26/0x30
[   92.034835]  blk_mq_complete_request+0x15/0x30
[   92.034840]  nvme_irq+0xd9/0x1f0 [nvme]
[   92.034846]  __handle_irq_event_percpu+0x44/0x190
[   92.034848]  handle_irq_event_percpu+0x32/0x80
[   92.034849]  handle_irq_event+0x3b/0x5a
[   92.034851]  handle_edge_irq+0x87/0x190
[   92.034855]  do_IRQ+0x54/0xe0
[   92.034859]  common_interrupt+0xf/0xf
[   92.034859]  </IRQ>
[   92.034865] RIP: 0010:cpuidle_enter_state+0xc1/0x400
[   92.034866] Code: ff e8 e3 41 93 ff 80 7d c7 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 d2 02 00 00 31 ff e8 16 c3 99 ff fb 66 0f 1f 44 00 00 <45> 85 e4 0f 88 3d 02 00 00 49 63 c4 48 8d 14 40 48 8d 0c c5 00 00
[   92.034867] RSP: 0018:ffffae4ecc6d3e40 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffdd
[   92.034869] RAX: ffff9b537fcacc40 RBX: ffffce4ebfc83200 RCX: 000000000000001f
[   92.034869] RDX: 000000156db3b7a5 RSI: 0000000031573862 RDI: 0000000000000000
[   92.034870] RBP: ffffae4ecc6d3e80 R08: 0000000000000002 R09: 000000000002c4c0
[   92.034871] R10: 011c6513a4e6709e R11: ffff9b537fcabb44 R12: 0000000000000002
[   92.034871] R13: ffffffffa374c120 R14: ffffffffa374c208 R15: ffffffffa374c1f0
[   92.034874]  cpuidle_enter+0x2e/0x40
[   92.034877]  call_cpuidle+0x23/0x40
[   92.034878]  do_idle+0x230/0x270
[   92.034880]  cpu_startup_entry+0x1d/0x20
[   92.034885]  start_secondary+0x170/0x1c0
[   92.034889]  secondary_startup_64+0xb6/0xc0
[  122.473526] nvme nvme0: I/O 384 QID 27 timeout, aborting
[  122.473586] nvme nvme0: Abort status: 0x0
[  152.678470] nvme nvme0: I/O 384 QID 27 timeout, reset controller
[  152.697122] blk_update_request: I/O error, dev nvme0n1, sector 0 op 0x0:(READ) flags 0x80700 phys_seg 4 prio class 0
[  152.699687] nvme nvme0: Shutdown timeout set to 10 seconds
[  152.703131] nvme nvme0: 56/0/0 default/read/poll queues
[  152.703290] FAULT_INJECTION: forcing a failure.
                name fail_io_timeout, interval 1, probability 100, space 0, times 999
[  152.703293] CPU: 53 PID: 0 Comm: swapper/53 Not tainted 5.7.0-rc7+ #2
[  152.703293] Hardware name: Oracle Corporation ORACLE SERVER X6-2/ASM,MOTHERBOARD,1U, BIOS 38050100 08/30/2016
[  152.703294] Call Trace:
[  152.703295]  <IRQ>
[  152.703301]  dump_stack+0x6d/0x9a
[  152.703302]  should_fail.cold.5+0x32/0x42
[  152.703305]  blk_should_fake_timeout+0x26/0x30
[  152.703306]  blk_mq_complete_request+0x15/0x30
[  152.703311]  nvme_irq+0xd9/0x1f0 [nvme]
[  152.703316]  __handle_irq_event_percpu+0x44/0x190
[  152.703317]  handle_irq_event_percpu+0x32/0x80
[  152.703319]  handle_irq_event+0x3b/0x5a
[  152.703321]  handle_edge_irq+0x87/0x190
[  152.703323]  do_IRQ+0x54/0xe0
[  152.703326]  common_interrupt+0xf/0xf
[  152.703327]  </IRQ>
[  152.703330] RIP: 0010:cpuidle_enter_state+0xc1/0x400
[  152.703332] Code: ff e8 e3 41 93 ff 80 7d c7 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 d2 02 00 00 31 ff e8 16 c3 99 ff fb 66 0f 1f 44 00 00 <45> 85 e4 0f 88 3d 02 00 00 49 63 c4 48 8d 14 40 48 8d 0c c5 00 00
[  152.703333] RSP: 0018:ffffae4ecc6cbe40 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffdd
[  152.703334] RAX: ffff9b537fc6cc40 RBX: ffffce4ebfc43200 RCX: 000000000000001f
[  152.703334] RDX: 000000238dd33ffa RSI: 0000000031573862 RDI: 0000000000000000
[  152.703335] RBP: ffffae4ecc6cbe80 R08: 0000000000000002 R09: 000000000002c4c0
[  152.703336] R10: 011c653849df4957 R11: ffff9b537fc6bb44 R12: 0000000000000004
[  152.703336] R13: ffffffffa374c120 R14: ffffffffa374c2d8 R15: ffffffffa374c2c0
[  152.703339]  cpuidle_enter+0x2e/0x40
[  152.703342]  call_cpuidle+0x23/0x40
[  152.703343]  do_idle+0x230/0x270
[  152.703344]  cpu_startup_entry+0x1d/0x20
[  152.703348]  start_secondary+0x170/0x1c0
[  152.703352]  secondary_startup_64+0xb6/0xc0
[  182.883552] nvme nvme0: I/O 960 QID 26 timeout, aborting
[  182.883608] nvme nvme0: Abort status: 0x0
[  213.088659] nvme nvme0: I/O 960 QID 26 timeout, reset controller
[  213.109414] nvme nvme0: Shutdown timeout set to 10 seconds
[  213.112785] nvme nvme0: 56/0/0 default/read/poll queues
[  213.112880] FAULT_INJECTION: forcing a failure.
                name fail_io_timeout, interval 1, probability 100, space 0, times 998
[  213.112882] CPU: 53 PID: 0 Comm: swapper/53 Not tainted 5.7.0-rc7+ #2
[  213.112883] Hardware name: Oracle Corporation ORACLE SERVER X6-2/ASM,MOTHERBOARD,1U, BIOS 38050100 08/30/2016
[  213.112883] Call Trace:
[  213.112884]  <IRQ>
[  213.112887]  dump_stack+0x6d/0x9a
[  213.112888]  should_fail.cold.5+0x32/0x42
[  213.112889]  blk_should_fake_timeout+0x26/0x30
[  213.112890]  blk_mq_complete_request+0x15/0x30
[  213.112893]  nvme_irq+0xd9/0x1f0 [nvme]
[  213.112896]  __handle_irq_event_percpu+0x44/0x190
[  213.112897]  handle_irq_event_percpu+0x32/0x80
[  213.112898]  handle_irq_event+0x3b/0x5a
[  213.112900]  handle_edge_irq+0x87/0x190
[  213.112901]  do_IRQ+0x54/0xe0
[  213.112903]  common_interrupt+0xf/0xf
[  213.112903]  </IRQ>
[  213.112905] RIP: 0010:cpuidle_enter_state+0xc1/0x400
[  213.112906] Code: ff e8 e3 41 93 ff 80 7d c7 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 d2 02 00 00 31 ff e8 16 c3 99 ff fb 66 0f 1f 44 00 00 <45> 85 e4 0f 88 3d 02 00 00 49 63 c4 48 8d 14 40 48 8d 0c c5 00 00
[  213.112907] RSP: 0018:ffffae4ecc6cbe40 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffdd
[  213.112908] RAX: ffff9b537fc6cc40 RBX: ffffce4ebfc43200 RCX: 000000000000001f
[  213.112909] RDX: 000000319e847448 RSI: 0000000031573862 RDI: 0000000000000000
[  213.112909] RBP: ffffae4ecc6cbe80 R08: 0000000000000002 R09: 000000000002c4c0
[  213.112910] R10: 011c655cc6d029f4 R11: ffff9b537fc6bb44 R12: 0000000000000002
[  213.112911] R13: ffffffffa374c120 R14: ffffffffa374c208 R15: ffffffffa374c1f0
[  213.112913]  cpuidle_enter+0x2e/0x40
[  213.112915]  call_cpuidle+0x23/0x40
[  213.112916]  do_idle+0x230/0x270
[  213.112917]  cpu_startup_entry+0x1d/0x20
[  213.112919]  start_secondary+0x170/0x1c0
[  213.112920]  secondary_startup_64+0xb6/0xc0
[  243.293996] nvme nvme0: I/O 960 QID 26 timeout, aborting
[  243.294050] nvme nvme0: Abort status: 0x0
[  273.498939] nvme nvme0: I/O 960 QID 26 timeout, reset controller
[  273.509832] nvme nvme0: new error during reset
[  273.519648] nvme nvme0: Shutdown timeout set to 10 seconds
[  273.523310] nvme nvme0: 56/0/0 default/read/poll queues
[  273.523397] FAULT_INJECTION: forcing a failure.
                name fail_io_timeout, interval 1, probability 100, space 0, times 997
[  273.523399] CPU: 53 PID: 0 Comm: swapper/53 Not tainted 5.7.0-rc7+ #2
[  273.523400] Hardware name: Oracle Corporation ORACLE SERVER X6-2/ASM,MOTHERBOARD,1U, BIOS 38050100 08/30/2016
[  273.523400] Call Trace:
[  273.523401]  <IRQ>
[  273.523404]  dump_stack+0x6d/0x9a
[  273.523405]  should_fail.cold.5+0x32/0x42
[  273.523406]  blk_should_fake_timeout+0x26/0x30
[  273.523407]  blk_mq_complete_request+0x15/0x30
[  273.523410]  nvme_irq+0xd9/0x1f0 [nvme]
[  273.523412]  __handle_irq_event_percpu+0x44/0x190
[  273.523413]  handle_irq_event_percpu+0x32/0x80
[  273.523415]  handle_irq_event+0x3b/0x5a
[  273.523416]  handle_edge_irq+0x87/0x190
[  273.523417]  do_IRQ+0x54/0xe0
[  273.523419]  common_interrupt+0xf/0xf
[  273.523420]  </IRQ>
[  273.523422] RIP: 0010:cpuidle_enter_state+0xc1/0x400
[  273.523423] Code: ff e8 e3 41 93 ff 80 7d c7 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 d2 02 00 00 31 ff e8 16 c3 99 ff fb 66 0f 1f 44 00 00 <45> 85 e4 0f 88 3d 02 00 00 49 63 c4 48 8d 14 40 48 8d 0c c5 00 00
[  273.523424] RSP: 0018:ffffae4ecc6cbe40 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffdd
[  273.523425] RAX: ffff9b537fc6cc40 RBX: ffffce4ebfc43200 RCX: 000000000000001f
[  273.523425] RDX: 0000003faf43ca28 RSI: 0000000031573862 RDI: 0000000000000000
[  273.523426] RBP: ffffae4ecc6cbe80 R08: 0000000000000002 R09: 000000000002c4c0
[  273.523426] R10: 011c658143e5b34c R11: ffff9b537fc6bb44 R12: 0000000000000002
[  273.523427] R13: ffffffffa374c120 R14: ffffffffa374c208 R15: ffffffffa374c1f0
[  273.523429]  cpuidle_enter+0x2e/0x40
[  273.523431]  call_cpuidle+0x23/0x40
[  273.523432]  do_idle+0x230/0x270
[  273.523433]  cpu_startup_entry+0x1d/0x20
[  273.523435]  start_secondary+0x170/0x1c0
[  273.523436]  secondary_startup_64+0xb6/0xc0
[  303.704204] nvme nvme0: I/O 960 QID 26 timeout, aborting
[  303.704245] nvme nvme0: Abort status: 0x0
[  333.909140] nvme nvme0: I/O 960 QID 26 timeout, reset controller
[  333.928010] nvme nvme0: new error during reset
[  333.934087] nvme nvme0: Shutdown timeout set to 10 seconds
[  333.937470] nvme nvme0: 56/0/0 default/read/poll queues
[  333.937571] FAULT_INJECTION: forcing a failure.
                name fail_io_timeout, interval 1, probability 100, space 0, times 996
[  333.937588] CPU: 53 PID: 0 Comm: swapper/53 Not tainted 5.7.0-rc7+ #2
[  333.937588] Hardware name: Oracle Corporation ORACLE SERVER X6-2/ASM,MOTHERBOARD,1U, BIOS 38050100 08/30/2016
[  333.937589] Call Trace:
[  333.937590]  <IRQ>
[  333.937592]  dump_stack+0x6d/0x9a
[  333.937593]  should_fail.cold.5+0x32/0x42
[  333.937594]  blk_should_fake_timeout+0x26/0x30
[  333.937595]  blk_mq_complete_request+0x15/0x30
[  333.937598]  nvme_irq+0xd9/0x1f0 [nvme]
[  333.937600]  __handle_irq_event_percpu+0x44/0x190
[  333.937602]  handle_irq_event_percpu+0x32/0x80
[  333.937603]  handle_irq_event+0x3b/0x5a
[  333.937604]  handle_edge_irq+0x87/0x190
[  333.937606]  do_IRQ+0x54/0xe0
[  333.937607]  common_interrupt+0xf/0xf
[  333.937608]  </IRQ>
[  333.937610] RIP: 0010:cpuidle_enter_state+0xc1/0x400
[  333.937611] Code: ff e8 e3 41 93 ff 80 7d c7 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 d2 02 00 00 31 ff e8 16 c3 99 ff fb 66 0f 1f 44 00 00 <45> 85 e4 0f 88 3d 02 00 00 49 63 c4 48 8d 14 40 48 8d 0c c5 00 00
[  333.937612] RSP: 0018:ffffae4ecc6cbe40 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffdd
[  333.937613] RAX: ffff9b537fc6cc40 RBX: ffffce4ebfc43200 RCX: 000000000000001f
[  333.937613] RDX: 0000004dc03adbda RSI: 0000000031573862 RDI: 0000000000000000
[  333.937614] RBP: ffffae4ecc6cbe80 R08: 0000000000000002 R09: 000000000002c4c0
[  333.937615] R10: 011c65a5c18bd252 R11: ffff9b537fc6bb44 R12: 0000000000000002
[  333.937615] R13: ffffffffa374c120 R14: ffffffffa374c208 R15: ffffffffa374c1f0
[  333.937617]  cpuidle_enter+0x2e/0x40
[  333.937619]  call_cpuidle+0x23/0x40
[  333.937620]  do_idle+0x230/0x270
[  333.937621]  cpu_startup_entry+0x1d/0x20
[  333.937623]  start_secondary+0x170/0x1c0
[  333.937624]  secondary_startup_64+0xb6/0xc0
[  364.114200] nvme nvme0: I/O 960 QID 26 timeout, aborting
[  364.114241] nvme nvme0: Abort status: 0x0
[  369.745713] INFO: task kworker/u114:2:1182 blocked for more than 122 seconds.
[  369.745758]       Not tainted 5.7.0-rc7+ #2
[  369.745787] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  369.745808] kworker/u114:2  D    0  1182      2 0x80004000
[  369.745817] Workqueue: nvme-wq nvme_scan_work [nvme_core]
[  369.745818] Call Trace:
[  369.745823]  __schedule+0x2dc/0x710
[  369.745828]  ? __kfifo_to_user_r+0x90/0x90
[  369.745829]  schedule+0x44/0xb0
[  369.745831]  blk_mq_freeze_queue_wait+0x4b/0xb0
[  369.745833]  ? finish_wait+0x80/0x80
[  369.745834]  blk_mq_freeze_queue+0x1a/0x20
[  369.745837]  nvme_update_disk_info+0x62/0x3b0 [nvme_core]
[  369.745839]  __nvme_revalidate_disk+0x8d/0x140 [nvme_core]
[  369.745842]  nvme_revalidate_disk+0xa4/0x140 [nvme_core]
[  369.745843]  ? blk_mq_run_hw_queue+0xba/0x100
[  369.745847]  revalidate_disk+0x2b/0xa0
[  369.745850]  nvme_validate_ns+0x46/0x5b0 [nvme_core]
[  369.745852]  ? __nvme_submit_sync_cmd+0xe0/0x1b0 [nvme_core]
[  369.745855]  nvme_scan_work+0x25a/0x310 [nvme_core]
[  369.745859]  process_one_work+0x1ab/0x380
[  369.745861]  worker_thread+0x37/0x3b0
[  369.745862]  kthread+0x120/0x140
[  369.745863]  ? create_worker+0x1b0/0x1b0
[  369.745864]  ? kthread_park+0x90/0x90
[  369.745867]  ret_from_fork+0x35/0x40


On 5/28/20 8:34 AM, Keith Busch wrote:
> Once the driver decides to cancel requests, the concept of those
> requests timing out ceases to exist. Use __blk_mq_complete_request() to
> bypass fake timeout error injection so that request reclaim may
> proceed.
>
> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
> Signed-off-by: Keith Busch <kbusch@kernel.org>
> ---
>   drivers/nvme/host/core.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index ba860efd250d..f65a0b6cd988 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -310,7 +310,7 @@ bool nvme_cancel_request(struct request *req, void *data, bool reserved)
>   		return true;
>   
>   	nvme_req(req)->status = NVME_SC_HOST_ABORTED_CMD;
> -	blk_mq_complete_request(req);
> +	__blk_mq_complete_request(req);
>   	return true;
>   }
>   EXPORT_SYMBOL_GPL(nvme_cancel_request);
Keith Busch May 28, 2020, 7:11 p.m. UTC | #2
On Thu, May 28, 2020 at 11:39:10AM -0700, Alan Adamson wrote:
> Still seeing a hang.
 
Is it really stuck or are you just not waiting long enough?

The default timeout is 30 seconds, and we wait that long once after
submission, and again after abort for a total of 60 seconds.

But then there's also the retry logic, which by default will retry a
non-permanent error 5 times (a cancelled request normally qualifies for
retries).
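
For reference, the retry gate looks roughly like this in the v5.7-era
drivers/nvme/host/core.c (a paraphrased sketch, not line-accurate):

	static inline bool nvme_req_needs_retry(struct request *req)
	{
		if (blk_noretry_request(req))
			return false;
		if (nvme_req(req)->status & NVME_SC_DNR)
			return false;	/* controller said Do Not Retry */
		if (nvme_req(req)->retries >= nvme_max_retries)
			return false;	/* module parameter, defaults to 5 */
		return true;
	}

NVME_SC_HOST_ABORTED_CMD does not carry the DNR bit, so a cancelled
request passes this check and gets requeued.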

On top of all that, you have to add the time it takes to initialize your
controller. Your controller appears to init quite fast, so that part is
probably negligible for you.

You can expect your fake timeout with 100% probability to take 150
seconds minimum to return a retryable request. Your "stuck" task has
only been waiting 122 seconds.
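
For reference, the FAULT_INJECTION parameters in your log correspond to
the standard fault-injection debugfs attributes. A minimal sketch of a
setup that would produce them, assuming CONFIG_FAIL_IO_TIMEOUT=y and
debugfs mounted at /sys/kernel/debug (the exact commands used are an
assumption):

	# hypothetical reconstruction from the logged parameters
	echo 1    > /sys/kernel/debug/fail_io_timeout/interval
	echo 100  > /sys/kernel/debug/fail_io_timeout/probability
	echo 1000 > /sys/kernel/debug/fail_io_timeout/times
	echo 1    > /sys/block/nvme0n1/io-timeout-fail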
 
> [  182.883552] nvme nvme0: I/O 960 QID 26 timeout, aborting
> [  182.883608] nvme nvme0: Abort status: 0x0
> [  213.088659] nvme nvme0: I/O 960 QID 26 timeout, reset controller
> [  213.109414] nvme nvme0: Shutdown timeout set to 10 seconds
> [  213.112785] nvme nvme0: 56/0/0 default/read/poll queues
> [...]
> [  243.293996] nvme nvme0: I/O 960 QID 26 timeout, aborting
> [  243.294050] nvme nvme0: Abort status: 0x0
> [  273.498939] nvme nvme0: I/O 960 QID 26 timeout, reset controller
> [  273.509832] nvme nvme0: new error during reset
> [...]
> [  303.704204] nvme nvme0: I/O 960 QID 26 timeout, aborting
> [  303.704245] nvme nvme0: Abort status: 0x0
> [  333.909140] nvme nvme0: I/O 960 QID 26 timeout, reset controller
> [  333.928010] nvme nvme0: new error during reset
> [...]
> [  364.114200] nvme nvme0: I/O 960 QID 26 timeout, aborting
> [  364.114241] nvme nvme0: Abort status: 0x0
> [  369.745713] INFO: task kworker/u114:2:1182 blocked for more than 122 seconds.
> [  369.745808] kworker/u114:2  D    0  1182      2 0x80004000
> [  369.745817] Workqueue: nvme-wq nvme_scan_work [nvme_core]
> [...]
Keith Busch May 28, 2020, 7:14 p.m. UTC | #3
On Thu, May 28, 2020 at 12:11:18PM -0700, Keith Busch wrote:
> You can expect your fake timeout with 100% probability to take 150
> seconds minimum to return a retryable request.

Sorry, I should have said 300 seconds. We restart the timer for aborts,
so 5 requeues * 30 seconds * 2 timer starts = 300 seconds.
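
Spelled out per request, assuming the default 30-second io_timeout:

	submit             t =  0s
	timeout -> abort   t = 30s   (abort restarts the timer)
	timeout -> reset   t = 60s   (request cancelled and requeued)
	repeated for 5 retries: ~300s before the request fails for real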
Alan Adamson May 28, 2020, 8:21 p.m. UTC | #4
It did come back.  Just took a little longer.

Alan

[  394.319378] nvme nvme0: I/O 960 QID 26 timeout, reset controller
[  394.334228] nvme nvme0: new error during reset
[  394.340125] nvme nvme0: Shutdown timeout set to 10 seconds
[  394.343452] nvme nvme0: 56/0/0 default/read/poll queues
[  394.343557] FAULT_INJECTION: forcing a failure.
                name fail_io_timeout, interval 1, probability 100, space 0, times 995
[  394.343559] CPU: 53 PID: 0 Comm: swapper/53 Not tainted 5.7.0-rc7+ #2
[  394.343559] Hardware name: Oracle Corporation ORACLE SERVER X6-2/ASM,MOTHERBOARD,1U, BIOS 38050100 08/30/2016
[  394.343560] Call Trace:
[  394.343561]  <IRQ>
[  394.343563]  dump_stack+0x6d/0x9a
[  394.343564]  should_fail.cold.5+0x32/0x42
[  394.343566]  blk_should_fake_timeout+0x26/0x30
[  394.343566]  blk_mq_complete_request+0x15/0x30
[  394.343569]  nvme_irq+0xd9/0x1f0 [nvme]
[  394.343571]  __handle_irq_event_percpu+0x44/0x190
[  394.343573]  handle_irq_event_percpu+0x32/0x80
[  394.343574]  handle_irq_event+0x3b/0x5a
[  394.343575]  handle_edge_irq+0x87/0x190
[  394.343577]  do_IRQ+0x54/0xe0
[  394.343578]  common_interrupt+0xf/0xf
[  394.343579]  </IRQ>
[  394.343581] RIP: 0010:cpuidle_enter_state+0xc1/0x400
[  394.343582] Code: ff e8 e3 41 93 ff 80 7d c7 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 d2 02 00 00 31 ff e8 16 c3 99 ff fb 66 0f 1f 44 00 00 <45> 85 e4 0f 88 3d 02 00 00 49 63 c4 48 8d 14 40 48 8d 0c c5 00 00
[  394.343583] RSP: 0018:ffffae4ecc6cbe40 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffdd
[  394.343584] RAX: ffff9b537fc6cc40 RBX: ffffce4ebfc43200 RCX: 000000000000001f
[  394.343585] RDX: 0000005bd0b51f04 RSI: 0000000031573862 RDI: 0000000000000000
[  394.343585] RBP: ffffae4ecc6cbe80 R08: 0000000000000002 R09: 000000000002c4c0
[  394.343586] R10: 011c65ca3dee2b0c R11: ffff9b537fc6bb44 R12: 0000000000000002
[  394.343586] R13: ffffffffa374c120 R14: ffffffffa374c208 R15: ffffffffa374c1f0
[  394.343588]  cpuidle_enter+0x2e/0x40
[  394.343590]  call_cpuidle+0x23/0x40
[  394.343591]  do_idle+0x230/0x270
[  394.343592]  cpu_startup_entry+0x1d/0x20
[  394.343594]  start_secondary+0x170/0x1c0
[  394.343596]  secondary_startup_64+0xb6/0xc0
[  424.524341] nvme nvme0: I/O 960 QID 26 timeout, aborting
[  424.524389] nvme nvme0: Abort status: 0x0
[  454.729622] nvme nvme0: I/O 960 QID 26 timeout, reset controller
[  454.740442] nvme nvme0: new error during reset
[  454.750403] nvme nvme0: Shutdown timeout set to 10 seconds
[  454.753508] nvme nvme0: 56/0/0 default/read/poll queues
[  454.753596] FAULT_INJECTION: forcing a failure.
                name fail_io_timeout, interval 1, probability 100, space 0, times 994
[  454.753598] CPU: 53 PID: 0 Comm: swapper/53 Not tainted 5.7.0-rc7+ #2
[  454.753599] Hardware name: Oracle Corporation ORACLE SERVER X6-2/ASM,MOTHERBOARD,1U, BIOS 38050100 08/30/2016
[  454.753599] Call Trace:
[  454.753600]  <IRQ>
[  454.753603]  dump_stack+0x6d/0x9a
[  454.753604]  should_fail.cold.5+0x32/0x42
[  454.753605]  blk_should_fake_timeout+0x26/0x30
[  454.753606]  blk_mq_complete_request+0x15/0x30
[  454.753609]  nvme_irq+0xd9/0x1f0 [nvme]
[  454.753612]  __handle_irq_event_percpu+0x44/0x190
[  454.753613]  handle_irq_event_percpu+0x32/0x80
[  454.753615]  handle_irq_event+0x3b/0x5a
[  454.753616]  handle_edge_irq+0x87/0x190
[  454.753617]  do_IRQ+0x54/0xe0
[  454.753619]  common_interrupt+0xf/0xf
[  454.753620]  </IRQ>
[  454.753622] RIP: 0010:cpuidle_enter_state+0xc1/0x400
[  454.753623] Code: ff e8 e3 41 93 ff 80 7d c7 00 74 17 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 d2 02 00 00 31 ff e8 16 c3 99 ff fb 66 0f 1f 44 00 00 <45> 85 e4 0f 88 3d 02 00 00 49 63 c4 48 8d 14 40 48 8d 0c c5 00 00
[  454.753624] RSP: 0018:ffffae4ecc6cbe40 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffdd
[  454.753625] RAX: ffff9b537fc6cc40 RBX: ffffce4ebfc43200 RCX: 000000000000001f
[  454.753626] RDX: 00000069e16d295b RSI: 0000000031573862 RDI: 0000000000000000
[  454.753626] RBP: ffffae4ecc6cbe80 R08: 0000000000000002 R09: 000000000002c4c0
[  454.753627] R10: 011c65eebaf0c79e R11: ffff9b537fc6bb44 R12: 0000000000000002
[  454.753627] R13: ffffffffa374c120 R14: ffffffffa374c208 R15: ffffffffa374c1f0
[  454.753630]  cpuidle_enter+0x2e/0x40
[  454.753632]  call_cpuidle+0x23/0x40
[  454.753633]  do_idle+0x230/0x270
[  454.753634]  cpu_startup_entry+0x1d/0x20
[  454.753636]  start_secondary+0x170/0x1c0
[  454.753637]  secondary_startup_64+0xb6/0xc0
[  484.934587] nvme nvme0: I/O 960 QID 26 timeout, aborting
[  484.934641] nvme nvme0: Abort status: 0x0
[  492.613861] INFO: task kworker/u114:2:1182 blocked for more than 245 seconds.
[  492.613884]       Not tainted 5.7.0-rc7+ #2
[  492.613912] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  492.613948] kworker/u114:2  D    0  1182      2 0x80004000
[  492.613953] Workqueue: nvme-wq nvme_scan_work [nvme_core]
[  492.613954] Call Trace:
[  492.613956]  __schedule+0x2dc/0x710
[  492.613958]  ? __kfifo_to_user_r+0x90/0x90
[  492.613960]  schedule+0x44/0xb0
[  492.613961]  blk_mq_freeze_queue_wait+0x4b/0xb0
[  492.613962]  ? finish_wait+0x80/0x80
[  492.613963]  blk_mq_freeze_queue+0x1a/0x20
[  492.613966]  nvme_update_disk_info+0x62/0x3b0 [nvme_core]
[  492.613968]  __nvme_revalidate_disk+0x8d/0x140 [nvme_core]
[  492.613970]  nvme_revalidate_disk+0xa4/0x140 [nvme_core]
[  492.613971]  ? blk_mq_run_hw_queue+0xba/0x100
[  492.613973]  revalidate_disk+0x2b/0xa0
[  492.613975]  nvme_validate_ns+0x46/0x5b0 [nvme_core]
[  492.613978]  ? __nvme_submit_sync_cmd+0xe0/0x1b0 [nvme_core]
[  492.613980]  nvme_scan_work+0x25a/0x310 [nvme_core]
[  492.613983]  process_one_work+0x1ab/0x380
[  492.613984]  worker_thread+0x37/0x3b0
[  492.613985]  kthread+0x120/0x140
[  492.613990]  ? create_worker+0x1b0/0x1b0
[  492.613992]  ? kthread_park+0x90/0x90
[  492.614011]  ret_from_fork+0x35/0x40
[  515.139879] nvme nvme0: I/O 960 QID 26 timeout, reset controller
[  515.150634] nvme nvme0: new error during reset
[  515.158310] blk_update_request: I/O error, dev nvme0n1, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[  515.158346] Buffer I/O error on dev nvme0n1, logical block 0, async page read
[  515.160800] nvme nvme0: Shutdown timeout set to 10 seconds
[  515.163954] nvme nvme0: 56/0/0 default/read/poll queues



On 5/28/20 12:14 PM, Keith Busch wrote:
> On Thu, May 28, 2020 at 12:11:18PM -0700, Keith Busch wrote:
>> You can expect your fake timeout with 100% probability to take 150
>> seconds minimum to return a retryable request.
> Sorry, I should have said 300 seconds. We restart the timer for aborts,
> so 5 requeues * 30 seconds * 2 timer starts = 300 seconds.
Ming Lei May 29, 2020, 3:39 a.m. UTC | #5
On Fri, May 29, 2020 at 4:19 AM Alan Adamson <alan.adamson@oracle.com> wrote:
>
> It did come back.  Just took a little longer.
>
> Alan
>
> [  394.319378] nvme nvme0: I/O 960 QID 26 timeout, reset controller
> [  394.334228] nvme nvme0: new error during reset
> [  394.340125] nvme nvme0: Shutdown timeout set to 10 seconds
> [  394.343452] nvme nvme0: 56/0/0 default/read/poll queues
> [  394.343557] FAULT_INJECTION: forcing a failure.
>                 name fail_io_timeout, interval 1, probability 100, space 0, times 995
> [...]
> [  424.524341] nvme nvme0: I/O 960 QID 26 timeout, aborting
> [  424.524389] nvme nvme0: Abort status: 0x0
> [  454.729622] nvme nvme0: I/O 960 QID 26 timeout, reset controller
> [  454.740442] nvme nvme0: new error during reset
> [...]
> [  484.934587] nvme nvme0: I/O 960 QID 26 timeout, aborting
> [  484.934641] nvme nvme0: Abort status: 0x0
> [  492.613861] INFO: task kworker/u114:2:1182 blocked for more than 245 seconds.
> [  492.613948] kworker/u114:2  D    0  1182      2 0x80004000
> [  492.613953] Workqueue: nvme-wq nvme_scan_work [nvme_core]
> [...]
> [  515.139879] nvme nvme0: I/O 960 QID 26 timeout, reset controller
> [  515.150634] nvme nvme0: new error during reset

It looks like your kernel has the '[PATCH 0/3] blk-mq/nvme: improve
nvme-pci reset handler' series applied, and it seems to work well.

> [  515.158310] blk_update_request: I/O error, dev nvme0n1, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
> [  515.158346] Buffer I/O error on dev nvme0n1, logical block 0, async page read
> [  515.160800] nvme nvme0: Shutdown timeout set to 10 seconds
> [  515.163954] nvme nvme0: 56/0/0 default/read/poll queues

The controller is reset back to a normal state even though a timeout is
triggered during the reset.

The hang in the scan work context just means that blk_mq_freeze_queue()
waits for too long (245 sec) because the I/O request times out and is
retried 5 times, and that may not be avoided if you fail all I/O
requests 1000 times.
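
For reference, the freeze path looks roughly like this in the v5.7-era
block/blk-mq.c (a paraphrased sketch, not line-accurate):

	void blk_mq_freeze_queue(struct request_queue *q)
	{
		blk_freeze_queue_start(q);	/* stop new requests entering */
		blk_mq_freeze_queue_wait(q);
	}

	void blk_mq_freeze_queue_wait(struct request_queue *q)
	{
		/*
		 * Sleeps until q_usage_counter drains to zero, i.e. until
		 * every outstanding request, including ones cycling through
		 * timeout, abort and retry, has completed.
		 */
		wait_event(q->mq_freeze_wq,
			   percpu_ref_is_zero(&q->q_usage_counter));
	}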

As I suggested to you in another thread, if you stop failing I/O
requests after injecting failures for some time, the scan work will run
to completion and everything will be fine. That said, NVMe's error
handling becomes better after applying the '[PATCH 0/3] blk-mq/nvme:
improve nvme-pci reset handler' patches.

Thanks,
Ming Lei
Ming Lei May 29, 2020, 3:43 a.m. UTC | #6
On Thu, May 28, 2020 at 11:35 PM Keith Busch <kbusch@kernel.org> wrote:
>
> Once the driver decides to cancel requests, the concept of those
> requests timing out ceases to exist. Use __blk_mq_complete_request() to
> bypass fake timeout error injection so that request reclaim may
> proceed.
>
> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
> Signed-off-by: Keith Busch <kbusch@kernel.org>
> ---
>  drivers/nvme/host/core.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index ba860efd250d..f65a0b6cd988 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -310,7 +310,7 @@ bool nvme_cancel_request(struct request *req, void *data, bool reserved)
>                 return true;
>
>         nvme_req(req)->status = NVME_SC_HOST_ABORTED_CMD;
> -       blk_mq_complete_request(req);
> +       __blk_mq_complete_request(req);
>         return true;
>  }
>  EXPORT_SYMBOL_GPL(nvme_cancel_request);

Looks reasonable,

Reviewed-by: Ming Lei <ming.lei@redhat.com>
Keith Busch May 29, 2020, 1:22 p.m. UTC | #7
On Fri, May 29, 2020 at 11:39:46AM +0800, Ming Lei wrote:
> On Fri, May 29, 2020 at 4:19 AM Alan Adamson <alan.adamson@oracle.com> wrote:
> > That said, NVMe's error handling becomes better after applying the
> > '[PATCH 0/3] blk-mq/nvme: improve nvme-pci reset handler' patches.

I think that's a bit debatable. Alan is synthesizing a truly broken
controller. The current code will abandon this controller after about 30
seconds. Your series will reset that broken controller indefinitely.
Which of those options is better?

I think degrading to an admin-only mode at some point would be preferable.
Ming Lei May 29, 2020, 10:23 p.m. UTC | #8
On Fri, May 29, 2020 at 9:22 PM Keith Busch <kbusch@kernel.org> wrote:
>
> On Fri, May 29, 2020 at 11:39:46AM +0800, Ming Lei wrote:
> > On Fri, May 29, 2020 at 4:19 AM Alan Adamson <alan.adamson@oracle.com> wrote:
> > That said, NVMe's error handling becomes better after applying the
> > '[PATCH 0/3] blk-mq/nvme: improve nvme-pci reset handler' patches.
>
> I think that's a bit debatable. Alan is synthesizing a truly broken
> controller. The current code will abandon this controller after about 30

I'm not sure it can be thought of as a truly broken controller. When
waiting on nvme_wait_freeze() during reset, the controller is already
back in a normal state. There is still a chance to trigger a timeout
via some occasional event, just like any other timeout, which isn't
special enough for us to have to kill the controller.

> seconds. Your series will reset that broken controller indefinitely.
> Which of those options is better?

Removing the controller is very horrible, because it basically becomes
a brick, together with data loss. We should retry enough before killing
the controller.

My series doesn't reset indefinitely, since every request is only
retried a limited number of times (default is 5); at the least, I/O
requests should be given the chance to retry the claimed number of
times.

>
> I think degrading to an admin-only mode at some point would be preferable.

If the timeout event is occasional, this approach gives up too early
without retrying the claimed number of times, and then people may
complain about data loss.


Thanks,
Ming Lei
Keith Busch May 29, 2020, 10:32 p.m. UTC | #9
On Sat, May 30, 2020 at 06:23:08AM +0800, Ming Lei wrote:
> On Fri, May 29, 2020 at 9:22 PM Keith Busch <kbusch@kernel.org> wrote:
> > seconds. Your series will reset that broken controller indefinitely.
> > Which of those options is better?
> 
> Removing the controller is very horrible, because it basically
> becomes a brick, together with data loss. We should retry enough
> before killing the controller.
> 
> My series doesn't reset indefinitely, since every request is only
> retried a limited number of times (default is 5); at the least, I/O
> requests should be given the chance to retry the claimed number of
> times.

Once the 5th retry is abandoned for all IO in the scheduled scan_work,
the reset will succeed and schedule scan_work, which will revalidate
disks, which will send new IO, which will time out, then reset and
repeat...
Ming Lei May 29, 2020, 10:46 p.m. UTC | #10
On Sat, May 30, 2020 at 6:32 AM Keith Busch <kbusch@kernel.org> wrote:
>
> On Sat, May 30, 2020 at 06:23:08AM +0800, Ming Lei wrote:
> > On Fri, May 29, 2020 at 9:22 PM Keith Busch <kbusch@kernel.org> wrote:
> > > seconds. Your series will reset that broken controller indefinitely.
> > > Which of those options is better?
> >
> > Removing the controller is very horrible, because it basically
> > becomes a brick, together with data loss. We should retry enough
> > before killing the controller.
> >
> > My series doesn't reset indefinitely, since every request is only
> > retried a limited number of times (default is 5); at the least, I/O
> > requests should be given the chance to retry the claimed number of
> > times.
>
> Once the 5th retry is abandoned for all IO in the scheduled scan_work,
> the reset will succeed and schedule scan_work, which will revalidate
> disks, which will send new IO, which will time out, then reset and
> repeat...

First, we can recognize this situation easily during reset and give up
after we have retried the claimed number of times; I will do that in
V2.

Second, I'm not sure revalidate will send new I/O, since all previous
I/Os have failed.

Thanks,
Ming Lei

Patch

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index ba860efd250d..f65a0b6cd988 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -310,7 +310,7 @@  bool nvme_cancel_request(struct request *req, void *data, bool reserved)
 		return true;
 
 	nvme_req(req)->status = NVME_SC_HOST_ABORTED_CMD;
-	blk_mq_complete_request(req);
+	__blk_mq_complete_request(req);
 	return true;
 }
 EXPORT_SYMBOL_GPL(nvme_cancel_request);