mbox series

[FIXES/IMPROVEMENTS,0/7] block, bfq: preserve control, boost throughput, fix bugs

Message ID 20210619140948.98712-1-paolo.valente@linaro.org (mailing list archive)
Headers show
Series block, bfq: preserve control, boost throughput, fix bugs | expand

Message

Paolo Valente June 19, 2021, 2:09 p.m. UTC
Hi Jens,
this series contains an already proposed patch by Luca, plus six new
patches. The goals of these patches are summarized in the subject of
this cover letter. I'm including Luca's patch here, because it enabled
the actual use of stable merge, and, as such, triggered an otherwise
silent bug. This series contains also the fix for that bug ("block,
bfq: avoid delayed merge of async queues"), tested by Holger [1].

Thanks,
Paolo

[1] https://lkml.org/lkml/2021/5/18/384

Luca Mariotti (1):
  block, bfq: fix delayed stable merge check

Paolo Valente (5):
  block, bfq: let also stably merged queues enjoy weight raising
  block, bfq: consider also creation time in delayed stable merge
  block, bfq: avoid delayed merge of async queues
  block, bfq: check waker only for queues with no in-flight I/O
  block, bfq: reset waker pointer with shared queues

Pietro Pedroni (1):
  block, bfq: boost throughput by extending queue-merging times

 block/bfq-iosched.c | 68 +++++++++++++++++++++++++++++++++++----------
 1 file changed, 53 insertions(+), 15 deletions(-)

--
2.20.1

Comments

Jens Axboe June 21, 2021, 4:08 p.m. UTC | #1
On 6/19/21 8:09 AM, Paolo Valente wrote:
> Hi Jens,
> this series contains an already proposed patch by Luca, plus six new
> patches. The goals of these patches are summarized in the subject of
> this cover letter. I'm including Luca's patch here, because it enabled
> the actual use of stable merge, and, as such, triggered an otherwise
> silent bug. This series contains also the fix for that bug ("block,
> bfq: avoid delayed merge of async queues"), tested by Holger [1].

Applied, thanks.
Oleksandr Natalenko June 21, 2021, 7:55 p.m. UTC | #2
Hello.

On sobota 19. června 2021 16:09:41 CEST Paolo Valente wrote:
> Hi Jens,
> this series contains an already proposed patch by Luca, plus six new
> patches. The goals of these patches are summarized in the subject of
> this cover letter. I'm including Luca's patch here, because it enabled
> the actual use of stable merge, and, as such, triggered an otherwise
> silent bug. This series contains also the fix for that bug ("block,
> bfq: avoid delayed merge of async queues"), tested by Holger [1].
> 
> Thanks,
> Paolo
> 
> [1] https://lkml.org/lkml/2021/5/18/384
> 
> Luca Mariotti (1):
>   block, bfq: fix delayed stable merge check
> 
> Paolo Valente (5):
>   block, bfq: let also stably merged queues enjoy weight raising
>   block, bfq: consider also creation time in delayed stable merge
>   block, bfq: avoid delayed merge of async queues
>   block, bfq: check waker only for queues with no in-flight I/O
>   block, bfq: reset waker pointer with shared queues
> 
> Pietro Pedroni (1):
>   block, bfq: boost throughput by extending queue-merging times
> 
>  block/bfq-iosched.c | 68 +++++++++++++++++++++++++++++++++++----------
>  1 file changed, 53 insertions(+), 15 deletions(-)
> 
> --
> 2.20.1

Not sure everything goes fine here. After applying this series on top of the 
latest stable 5.12 kernel I got this:

```
[16730.963248] kernel BUG at block/elevator.c:236!
[16730.963254] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[16730.963257] CPU: 11 PID: 109170 Comm: kworker/u64:5 Tainted: G        W         
5.12.0-pf7 #1
[16730.963260] Hardware name: ASUS System Product Name/Pro WS X570-ACE, BIOS 
3601 05/26/2021
[16730.963263] Workqueue: dm-thin do_worker [dm_thin_pool]
[16730.963270] RIP: 0010:elv_rqhash_find+0xcc/0xd0
[16730.963274] Code: 41 89 f0 81 e2 00 40 06 00 41 81 e0 1a 00 04 00 44 09 c2 
75 a9 be 09 00 00 00 c4 e2 4b f7 50 28 48 03 50 30 48 39 fa 75 c6 c3 <0f> 0b 
66 90 0f 1f 44 00 00 41 56 41 55 41 54 55 53 48 8b 47 68 48
[16730.963276] RSP: 0018:ffffa558d13b7af8 EFLAGS: 00010046
[16730.963279] RAX: ffff8a0007782d00 RBX: ffff8a0014b93000 RCX: ffffa558d13b7b78
[16730.963281] RDX: ffff8a0014b93000 RSI: 0000000000063082 RDI: 000000001e0fdc00
[16730.963283] RBP: ffff8a000731c770 R08: ffff8a000731c770 R09: fffffff0ffffddfb
[16730.963284] R10: 0000000000000000 R11: 0000000000000400 R12: ffff8a0330365c00
[16730.963286] R13: ffffa558d13b7b30 R14: 0000000000000000 R15: ffff8a0212fc4000
[16730.963288] FS:  0000000000000000(0000) GS:ffff8a070ecc0000(0000) knlGS:
0000000000000000
[16730.963290] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[16730.963292] CR2: 00007f1514d90f4c CR3: 0000000315952000 CR4: 
0000000000350ee0
[16730.963294] Call Trace:
[16730.963297]  elv_merge+0x96/0x120
[16730.963300]  blk_mq_sched_try_merge+0x3e/0x370
[16730.963303]  bfq_bio_merge+0xd3/0x130
[16730.963306]  blk_mq_submit_bio+0x11e/0x6c0
[16730.963309]  submit_bio_noacct+0x457/0x530
[16730.963312]  raid10_unplug+0x13f/0x1a0 [raid10]
[16730.963316]  blk_flush_plug_list+0xa9/0x110
[16730.963319]  blk_finish_plug+0x21/0x30
[16730.963322]  process_prepared_discard_passdown_pt1+0x204/0x2d0 
[dm_thin_pool]
[16730.963327]  do_worker+0x18e/0xce0 [dm_thin_pool]
[16730.963335]  process_one_work+0x217/0x3e0
[16730.963338]  worker_thread+0x4d/0x470
[16730.963343]  kthread+0x182/0x1b0
[16730.963349]  ret_from_fork+0x22/0x30
…
[16730.963419] ---[ end trace dd7e037f2028257b ]---
[16730.963524] RIP: 0010:elv_rqhash_find+0xcc/0xd0
…
[16730.963547] note: kworker/u64:5[109170] exited with preempt_count 1
[16747.948467] [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for 
fences timed out!
```

Which is:

```
229 struct request *elv_rqhash_find(struct request_queue *q, sector_t offset)
230 {
…
235     hash_for_each_possible_safe(e->hash, rq, next, hash, offset) {
236         BUG_ON(!ELV_ON_HASH(rq));
…
```

Yes, I carry some extra patches besides this series (the list is against v5.12 
GA):

```
block, bfq: reset waker pointer with shared queues
block, bfq: check waker only for queues with no in-flight I/O
block, bfq: avoid delayed merge of async queues
block, bfq: boost throughput by extending queue-merging times
block, bfq: consider also creation time in delayed stable merge
block, bfq: fix delayed stable merge check
block, bfq: let also stably merged queues enjoy weight raising
block: Do not pull requests from the scheduler when we cannot dispatch them
blk: Fix lock inversion between ioc lock and bfqd lock
bfq: Remove merged request already in bfq_requests_merged()
block, bfq: avoid circular stable merges
bfq: remove unnecessary BFQ_DEFAULT_GRP_IOPRIO
bfq: reset entity->prio_changed in bfq_init_entity()
bfq: optimize the calculation of bfq_weight_to_ioprio()
bfq: remove unnecessary initialization logic
bfq: keep the minimun bandwidth for CLASS_BE
bfq: limit the IO depth of CLASS_IDLE to 1
bfq: convert the type of bfq_group.bfqd to bfq_data*
bfq: introduce bfq_entity_to_bfqg helper method
bfq/mq-deadline: remove redundant check for passthrough request
blk-mq: bypass IO scheduler's limit_depth for passthrough request
block,bfq: fix the timeout calculation in bfq_bfqq_charge_time
block, bfq: merge bursts of newly-created queues
block, bfq: keep shared queues out of the waker mechanism
block, bfq: fix weight-raising resume with !low_latency
block, bfq: make shared queues inherit wakers
block, bfq: put reqs of waker and woken in dispatch list
block, bfq: always inject I/O of queues blocked by wakers
```

but nothing from there triggered this for quite some time.

Paolo, what do you think?
Piotr Górski June 21, 2021, 8:03 p.m. UTC | #3
I have tested this with myself and this error does not occur with me
and I have not noticed any regressions. I have applied almost exactly
the same patches as Oleksandr.

pon., 21 cze 2021 o 21:55 Oleksandr Natalenko
<oleksandr@natalenko.name> napisał(a):
>
> Hello.
>
> On sobota 19. června 2021 16:09:41 CEST Paolo Valente wrote:
> > Hi Jens,
> > this series contains an already proposed patch by Luca, plus six new
> > patches. The goals of these patches are summarized in the subject of
> > this cover letter. I'm including Luca's patch here, because it enabled
> > the actual use of stable merge, and, as such, triggered an otherwise
> > silent bug. This series contains also the fix for that bug ("block,
> > bfq: avoid delayed merge of async queues"), tested by Holger [1].
> >
> > Thanks,
> > Paolo
> >
> > [1] https://lkml.org/lkml/2021/5/18/384
> >
> > Luca Mariotti (1):
> >   block, bfq: fix delayed stable merge check
> >
> > Paolo Valente (5):
> >   block, bfq: let also stably merged queues enjoy weight raising
> >   block, bfq: consider also creation time in delayed stable merge
> >   block, bfq: avoid delayed merge of async queues
> >   block, bfq: check waker only for queues with no in-flight I/O
> >   block, bfq: reset waker pointer with shared queues
> >
> > Pietro Pedroni (1):
> >   block, bfq: boost throughput by extending queue-merging times
> >
> >  block/bfq-iosched.c | 68 +++++++++++++++++++++++++++++++++++----------
> >  1 file changed, 53 insertions(+), 15 deletions(-)
> >
> > --
> > 2.20.1
>
> Not sure everything goes fine here. After applying this series on top of the
> latest stable 5.12 kernel I got this:
>
> ```
> [16730.963248] kernel BUG at block/elevator.c:236!
> [16730.963254] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
> [16730.963257] CPU: 11 PID: 109170 Comm: kworker/u64:5 Tainted: G        W
> 5.12.0-pf7 #1
> [16730.963260] Hardware name: ASUS System Product Name/Pro WS X570-ACE, BIOS
> 3601 05/26/2021
> [16730.963263] Workqueue: dm-thin do_worker [dm_thin_pool]
> [16730.963270] RIP: 0010:elv_rqhash_find+0xcc/0xd0
> [16730.963274] Code: 41 89 f0 81 e2 00 40 06 00 41 81 e0 1a 00 04 00 44 09 c2
> 75 a9 be 09 00 00 00 c4 e2 4b f7 50 28 48 03 50 30 48 39 fa 75 c6 c3 <0f> 0b
> 66 90 0f 1f 44 00 00 41 56 41 55 41 54 55 53 48 8b 47 68 48
> [16730.963276] RSP: 0018:ffffa558d13b7af8 EFLAGS: 00010046
> [16730.963279] RAX: ffff8a0007782d00 RBX: ffff8a0014b93000 RCX: ffffa558d13b7b78
> [16730.963281] RDX: ffff8a0014b93000 RSI: 0000000000063082 RDI: 000000001e0fdc00
> [16730.963283] RBP: ffff8a000731c770 R08: ffff8a000731c770 R09: fffffff0ffffddfb
> [16730.963284] R10: 0000000000000000 R11: 0000000000000400 R12: ffff8a0330365c00
> [16730.963286] R13: ffffa558d13b7b30 R14: 0000000000000000 R15: ffff8a0212fc4000
> [16730.963288] FS:  0000000000000000(0000) GS:ffff8a070ecc0000(0000) knlGS:
> 0000000000000000
> [16730.963290] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [16730.963292] CR2: 00007f1514d90f4c CR3: 0000000315952000 CR4:
> 0000000000350ee0
> [16730.963294] Call Trace:
> [16730.963297]  elv_merge+0x96/0x120
> [16730.963300]  blk_mq_sched_try_merge+0x3e/0x370
> [16730.963303]  bfq_bio_merge+0xd3/0x130
> [16730.963306]  blk_mq_submit_bio+0x11e/0x6c0
> [16730.963309]  submit_bio_noacct+0x457/0x530
> [16730.963312]  raid10_unplug+0x13f/0x1a0 [raid10]
> [16730.963316]  blk_flush_plug_list+0xa9/0x110
> [16730.963319]  blk_finish_plug+0x21/0x30
> [16730.963322]  process_prepared_discard_passdown_pt1+0x204/0x2d0
> [dm_thin_pool]
> [16730.963327]  do_worker+0x18e/0xce0 [dm_thin_pool]
> [16730.963335]  process_one_work+0x217/0x3e0
> [16730.963338]  worker_thread+0x4d/0x470
> [16730.963343]  kthread+0x182/0x1b0
> [16730.963349]  ret_from_fork+0x22/0x30
> …
> [16730.963419] ---[ end trace dd7e037f2028257b ]---
> [16730.963524] RIP: 0010:elv_rqhash_find+0xcc/0xd0
> …
> [16730.963547] note: kworker/u64:5[109170] exited with preempt_count 1
> [16747.948467] [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for
> fences timed out!
> ```
>
> Which is:
>
> ```
> 229 struct request *elv_rqhash_find(struct request_queue *q, sector_t offset)
> 230 {
> …
> 235     hash_for_each_possible_safe(e->hash, rq, next, hash, offset) {
> 236         BUG_ON(!ELV_ON_HASH(rq));
> …
> ```
>
> Yes, I carry some extra patches besides this series (the list is against v5.12
> GA):
>
> ```
> block, bfq: reset waker pointer with shared queues
> block, bfq: check waker only for queues with no in-flight I/O
> block, bfq: avoid delayed merge of async queues
> block, bfq: boost throughput by extending queue-merging times
> block, bfq: consider also creation time in delayed stable merge
> block, bfq: fix delayed stable merge check
> block, bfq: let also stably merged queues enjoy weight raising
> block: Do not pull requests from the scheduler when we cannot dispatch them
> blk: Fix lock inversion between ioc lock and bfqd lock
> bfq: Remove merged request already in bfq_requests_merged()
> block, bfq: avoid circular stable merges
> bfq: remove unnecessary BFQ_DEFAULT_GRP_IOPRIO
> bfq: reset entity->prio_changed in bfq_init_entity()
> bfq: optimize the calculation of bfq_weight_to_ioprio()
> bfq: remove unnecessary initialization logic
> bfq: keep the minimun bandwidth for CLASS_BE
> bfq: limit the IO depth of CLASS_IDLE to 1
> bfq: convert the type of bfq_group.bfqd to bfq_data*
> bfq: introduce bfq_entity_to_bfqg helper method
> bfq/mq-deadline: remove redundant check for passthrough request
> blk-mq: bypass IO scheduler's limit_depth for passthrough request
> block,bfq: fix the timeout calculation in bfq_bfqq_charge_time
> block, bfq: merge bursts of newly-created queues
> block, bfq: keep shared queues out of the waker mechanism
> block, bfq: fix weight-raising resume with !low_latency
> block, bfq: make shared queues inherit wakers
> block, bfq: put reqs of waker and woken in dispatch list
> block, bfq: always inject I/O of queues blocked by wakers
> ```
>
> but nothing from there triggered this for quite some time.
>
> Paolo, what do you think?
>
> --
> Oleksandr Natalenko (post-factum)
>
>
Paolo Valente June 22, 2021, 7:08 a.m. UTC | #4
Hi,
CCing also Jan and Khazhy, because in your commit log I see also the
commit on bfq_requests_merged().

Is this OOPS reproducible for you?

Thanks,
Paolo


> Il giorno 21 giu 2021, alle ore 21:55, Oleksandr Natalenko <oleksandr@natalenko.name> ha scritto:
> 
> Hello.
> 
> On sobota 19. června 2021 16:09:41 CEST Paolo Valente wrote:
>> Hi Jens,
>> this series contains an already proposed patch by Luca, plus six new
>> patches. The goals of these patches are summarized in the subject of
>> this cover letter. I'm including Luca's patch here, because it enabled
>> the actual use of stable merge, and, as such, triggered an otherwise
>> silent bug. This series contains also the fix for that bug ("block,
>> bfq: avoid delayed merge of async queues"), tested by Holger [1].
>> 
>> Thanks,
>> Paolo
>> 
>> [1] https://lkml.org/lkml/2021/5/18/384
>> 
>> Luca Mariotti (1):
>>  block, bfq: fix delayed stable merge check
>> 
>> Paolo Valente (5):
>>  block, bfq: let also stably merged queues enjoy weight raising
>>  block, bfq: consider also creation time in delayed stable merge
>>  block, bfq: avoid delayed merge of async queues
>>  block, bfq: check waker only for queues with no in-flight I/O
>>  block, bfq: reset waker pointer with shared queues
>> 
>> Pietro Pedroni (1):
>>  block, bfq: boost throughput by extending queue-merging times
>> 
>> block/bfq-iosched.c | 68 +++++++++++++++++++++++++++++++++++----------
>> 1 file changed, 53 insertions(+), 15 deletions(-)
>> 
>> --
>> 2.20.1
> 
> Not sure everything goes fine here. After applying this series on top of the 
> latest stable 5.12 kernel I got this:
> 
> ```
> [16730.963248] kernel BUG at block/elevator.c:236!
> [16730.963254] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
> [16730.963257] CPU: 11 PID: 109170 Comm: kworker/u64:5 Tainted: G        W         
> 5.12.0-pf7 #1
> [16730.963260] Hardware name: ASUS System Product Name/Pro WS X570-ACE, BIOS 
> 3601 05/26/2021
> [16730.963263] Workqueue: dm-thin do_worker [dm_thin_pool]
> [16730.963270] RIP: 0010:elv_rqhash_find+0xcc/0xd0
> [16730.963274] Code: 41 89 f0 81 e2 00 40 06 00 41 81 e0 1a 00 04 00 44 09 c2 
> 75 a9 be 09 00 00 00 c4 e2 4b f7 50 28 48 03 50 30 48 39 fa 75 c6 c3 <0f> 0b 
> 66 90 0f 1f 44 00 00 41 56 41 55 41 54 55 53 48 8b 47 68 48
> [16730.963276] RSP: 0018:ffffa558d13b7af8 EFLAGS: 00010046
> [16730.963279] RAX: ffff8a0007782d00 RBX: ffff8a0014b93000 RCX: ffffa558d13b7b78
> [16730.963281] RDX: ffff8a0014b93000 RSI: 0000000000063082 RDI: 000000001e0fdc00
> [16730.963283] RBP: ffff8a000731c770 R08: ffff8a000731c770 R09: fffffff0ffffddfb
> [16730.963284] R10: 0000000000000000 R11: 0000000000000400 R12: ffff8a0330365c00
> [16730.963286] R13: ffffa558d13b7b30 R14: 0000000000000000 R15: ffff8a0212fc4000
> [16730.963288] FS:  0000000000000000(0000) GS:ffff8a070ecc0000(0000) knlGS:
> 0000000000000000
> [16730.963290] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [16730.963292] CR2: 00007f1514d90f4c CR3: 0000000315952000 CR4: 
> 0000000000350ee0
> [16730.963294] Call Trace:
> [16730.963297]  elv_merge+0x96/0x120
> [16730.963300]  blk_mq_sched_try_merge+0x3e/0x370
> [16730.963303]  bfq_bio_merge+0xd3/0x130
> [16730.963306]  blk_mq_submit_bio+0x11e/0x6c0
> [16730.963309]  submit_bio_noacct+0x457/0x530
> [16730.963312]  raid10_unplug+0x13f/0x1a0 [raid10]
> [16730.963316]  blk_flush_plug_list+0xa9/0x110
> [16730.963319]  blk_finish_plug+0x21/0x30
> [16730.963322]  process_prepared_discard_passdown_pt1+0x204/0x2d0 
> [dm_thin_pool]
> [16730.963327]  do_worker+0x18e/0xce0 [dm_thin_pool]
> [16730.963335]  process_one_work+0x217/0x3e0
> [16730.963338]  worker_thread+0x4d/0x470
> [16730.963343]  kthread+0x182/0x1b0
> [16730.963349]  ret_from_fork+0x22/0x30
> …
> [16730.963419] ---[ end trace dd7e037f2028257b ]---
> [16730.963524] RIP: 0010:elv_rqhash_find+0xcc/0xd0
> …
> [16730.963547] note: kworker/u64:5[109170] exited with preempt_count 1
> [16747.948467] [drm:amdgpu_dm_atomic_commit_tail [amdgpu]] *ERROR* Waiting for 
> fences timed out!
> ```
> 
> Which is:
> 
> ```
> 229 struct request *elv_rqhash_find(struct request_queue *q, sector_t offset)
> 230 {
> …
> 235     hash_for_each_possible_safe(e->hash, rq, next, hash, offset) {
> 236         BUG_ON(!ELV_ON_HASH(rq));
> …
> ```
> 
> Yes, I carry some extra patches besides this series (the list is against v5.12 
> GA):
> 
> ```
> block, bfq: reset waker pointer with shared queues
> block, bfq: check waker only for queues with no in-flight I/O
> block, bfq: avoid delayed merge of async queues
> block, bfq: boost throughput by extending queue-merging times
> block, bfq: consider also creation time in delayed stable merge
> block, bfq: fix delayed stable merge check
> block, bfq: let also stably merged queues enjoy weight raising
> block: Do not pull requests from the scheduler when we cannot dispatch them
> blk: Fix lock inversion between ioc lock and bfqd lock
> bfq: Remove merged request already in bfq_requests_merged()
> block, bfq: avoid circular stable merges
> bfq: remove unnecessary BFQ_DEFAULT_GRP_IOPRIO
> bfq: reset entity->prio_changed in bfq_init_entity()
> bfq: optimize the calculation of bfq_weight_to_ioprio()
> bfq: remove unnecessary initialization logic
> bfq: keep the minimun bandwidth for CLASS_BE
> bfq: limit the IO depth of CLASS_IDLE to 1
> bfq: convert the type of bfq_group.bfqd to bfq_data*
> bfq: introduce bfq_entity_to_bfqg helper method
> bfq/mq-deadline: remove redundant check for passthrough request
> blk-mq: bypass IO scheduler's limit_depth for passthrough request
> block,bfq: fix the timeout calculation in bfq_bfqq_charge_time
> block, bfq: merge bursts of newly-created queues
> block, bfq: keep shared queues out of the waker mechanism
> block, bfq: fix weight-raising resume with !low_latency
> block, bfq: make shared queues inherit wakers
> block, bfq: put reqs of waker and woken in dispatch list
> block, bfq: always inject I/O of queues blocked by wakers
> ```
> 
> but nothing from there triggered this for quite some time.
> 
> Paolo, what do you think?
> 
> -- 
> Oleksandr Natalenko (post-factum)
> 
>
Oleksandr Natalenko June 22, 2021, 7:35 a.m. UTC | #5
Hello.

On úterý 22. června 2021 9:08:43 CEST Paolo Valente wrote:
> CCing also Jan and Khazhy, because in your commit log I see also the
> commit on bfq_requests_merged().
> 
> Is this OOPS reproducible for you?

No, I haven't found a reproducer, at least yet. It took half a day of uptime 
to hit this, so might not be that easy.
Jan Kara June 22, 2021, 4:29 p.m. UTC | #6
On Tue 22-06-21 09:35:05, Oleksandr Natalenko wrote:
> Hello.
> 
> On úterý 22. června 2021 9:08:43 CEST Paolo Valente wrote:
> > CCing also Jan and Khazhy, because in your commit log I see also the
> > commit on bfq_requests_merged().
> > 
> > Is this OOPS reproducible for you?
> 
> No, I haven't found a reproducer, at least yet. It took half a day of uptime 
> to hit this, so might not be that easy.

Hum, if you can acquire a crash dump it would be the easiest I guess. We'd
need to find out more about the request we crash on - whether it's
otherwise valid, in what state it is etc...

								Honza
Oleksandr Natalenko June 22, 2021, 5:26 p.m. UTC | #7
Ahoj.

On úterý 22. června 2021 18:29:48 CEST Jan Kara wrote:
> On Tue 22-06-21 09:35:05, Oleksandr Natalenko wrote:
> > On úterý 22. června 2021 9:08:43 CEST Paolo Valente wrote:
> > > CCing also Jan and Khazhy, because in your commit log I see also the
> > > commit on bfq_requests_merged().
> > > 
> > > Is this OOPS reproducible for you?
> > 
> > No, I haven't found a reproducer, at least yet. It took half a day of
> > uptime to hit this, so might not be that easy.
> 
> Hum, if you can acquire a crash dump it would be the easiest I guess. We'd
> need to find out more about the request we crash on - whether it's
> otherwise valid, in what state it is etc...

Yes, I understand that. Once (and if) I reproduce it reliably, of course.

Thanks.
Oleksandr Natalenko July 2, 2021, 10:07 p.m. UTC | #8
Hello.

On úterý 22. června 2021 18:29:48 CEST Jan Kara wrote:
> On Tue 22-06-21 09:35:05, Oleksandr Natalenko wrote:
> > On úterý 22. června 2021 9:08:43 CEST Paolo Valente wrote:
> > > CCing also Jan and Khazhy, because in your commit log I see also the
> > > commit on bfq_requests_merged().
> > > 
> > > Is this OOPS reproducible for you?
> > 
> > No, I haven't found a reproducer, at least yet. It took half a day of
> > uptime to hit this, so might not be that easy.
> 
> Hum, if you can acquire a crash dump it would be the easiest I guess. We'd
> need to find out more about the request we crash on - whether it's
> otherwise valid, in what state it is etc...

Still have no reliable reproducer and no vmcore, however I'm running v5.13 
with the following patches applied on top of it:

```
blk: Fix lock inversion between ioc lock and bfqd lock
bfq: Remove merged request already in bfq_requests_merged()
block: Remove unnecessary elevator operation checks
block: Do not pull requests from the scheduler when we cannot dispatch them
block, bfq: reset waker pointer with shared queues
block, bfq: check waker only for queues with no in-flight I/O
block, bfq: avoid delayed merge of async queues
block, bfq: boost throughput by extending queue-merging times
block, bfq: consider also creation time in delayed stable merge
block, bfq: fix delayed stable merge check
block, bfq: let also stably merged queues enjoy weight raising
```

and just got the following crash:

```
[60313.522570] ------------[ cut here ]------------
[60313.522579] WARNING: CPU: 20 PID: 388 at arch/x86/include/asm/kfence.h:44 
kfence_protect_page+0x39/0xc0
[60313.522586] Modules linked in: sctp ip6_udp_tunnel udp_tunnel uinput 
netconsole blocklayoutdriver rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver 
nfs lockd grace sunrpc fscache netfs rfcomm nft_ct nf_conntrack nf_defrag_ipv6 
nf_defrag_ipv4 cmac algif_hash algif_skcipher nf_tables af_alg bnep tun 
nfnetlink nls_iso8859_1 intel_rapl_msr vfat intel_rapl_common iwlmvm fat 
mac80211 edac_mce_amd libarc4 btusb eeepc_wmi btrtl asus_wmi iwlwifi btbcm 
snd_usb_audio sparse_keymap kvm_amd video wmi_bmof mxm_wmi btintel uvcvideo 
snd_hda_codec_realtek videobuf2_vmalloc videobuf2_memops snd_usbmidi_lib kvm 
snd_hda_codec_generic bluetooth videobuf2_v4l2 ledtrig_audio 
snd_hda_codec_hdmi joydev snd_rawmidi ecdh_generic irqbypass ecc snd_hda_intel 
mousedev pl2303 cfg80211 snd_seq_device videobuf2_common crc16 rapl k10temp 
snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec igb r8169 sp5100_tco 
snd_hda_core realtek i2c_piix4 ipmi_devintf dca snd_hwdep mdio_devres rfkill 
snd_pcm libphy ipmi_msghandler wmi
[60313.522630]  pinctrl_amd mac_hid acpi_cpufreq tcp_bbr2 vhost_vsock 
vmw_vsock_virtio_transport_common vhost vhost_iotlb vsock v4l2loopback 
videodev mc snd_hrtimer snd_timer snd soundcore nct6775 hwmon_vid crypto_user 
fuse ip_tables x_tables xfs dm_thin_pool dm_persistent_data dm_bio_prison 
dm_bufio libcrc32c crc32c_generic dm_crypt cbc encrypted_keys trusted 
asn1_encoder tee hid_logitech_hidpp hid_logitech_dj usbhid dm_mod 
crct10dif_pclmul crc32_pclmul crc32c_intel raid10 ghash_clmulni_intel 
aesni_intel md_mod crypto_simd cryptd amdgpu ccp xhci_pci xhci_pci_renesas 
tpm_crb tpm_tis tpm_tis_core tpm rng_core drm_ttm_helper ttm gpu_sched 
i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec 
drm agpgart
[60313.522665] CPU: 20 PID: 388 Comm: kworker/20:1H Tainted: G        W         
5.13.0-pf2 #1
[60313.522668] Hardware name: ASUS System Product Name/Pro WS X570-ACE, BIOS 
3601 05/26/2021
[60313.522671] Workqueue: kblockd blk_mq_run_work_fn
[60313.522675] RIP: 0010:kfence_protect_page+0x39/0xc0
[60313.522679] Code: 04 65 48 8b 04 25 28 00 00 00 48 89 44 24 08 31 c0 c7 44 
24 04 00 00 00 00 e8 83 20 d5 ff 48 85 c0 74 07 83 7c 24 04 01 74 06 <0f> 0b 31 
c0 eb 4c 48 8b 38 48 89 c2 84 db 75 59 48 89 f8 0f 1f 40
[60313.522682] RSP: 0018:ffffb559c0affb28 EFLAGS: 00010046
[60313.522684] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffb559c0affb2c
[60313.522687] RDX: ffffb559c0affb2c RSI: 0000000000000000 RDI: 0000000000000000
[60313.522690] RBP: 0000000000000000 R08: 0000000000000000 R09: 
0000000000000000
[60313.522692] R10: 0000000000000000 R11: 0000000000000000 R12: 
0000000000000002
[60313.522694] R13: ffffb559c0affc28 R14: 00000000c0affc01 R15: 0000000000000000
[60313.522696] FS:  0000000000000000(0000) GS:ffff8cf44ef00000(0000) knlGS:
0000000000000000
[60313.522698] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[60313.522700] CR2: 0000000000000120 CR3: 000000013ebce000 CR4: 
0000000000350ee0
[60313.522702] Call Trace:
[60313.522707]  kfence_handle_page_fault+0xa6/0x280
[60313.522710]  page_fault_oops+0x9d/0x2d0
[60313.522714]  exc_page_fault+0x78/0x180
[60313.522718]  asm_exc_page_fault+0x1e/0x30
[60313.522721] RIP: 0010:bfq_dispatch_request+0x4c3/0x1280
[60313.522725] Code: 4c 89 e7 e8 ef da ff ff 4c 89 ff 89 c6 e8 75 64 00 00 4c 39 
bb a0 00 00 00 0f 84 86 04 00 00 49 8b 84 24 90 00 00 00 48 8b 33 <ff> 80 20 01 
00 00 48 89 34 24 48 8b 46 08 4c 8b 58 08 4c 89 5c 24
[60313.522727] RSP: 0018:ffffb559c0affcd0 EFLAGS: 00010046
[60313.522729] RAX: 0000000000000000 RBX: ffff8ced4d6a1000 RCX: 0000000000000000
[60313.522731] RDX: 0000000000000000 RSI: ffff8ced4ad90000 RDI: ffff8ced52fc9f40
[60313.522733] RBP: 0000000000000000 R08: 0000000000000001 R09: 
0000000000000000
[60313.522735] R10: 000000000000003f R11: 0000000000000000 R12: ffff8cf20e5a5400
[60313.522737] R13: ffff8cf0e7e91c70 R14: ffff8ced4d6a1420 R15: ffff8cf0e7e91c70
[60313.522741]  ? mod_delayed_work_on+0x71/0xe0
[60313.522745]  ? __sbitmap_get_word+0x30/0x80
[60313.522748]  __blk_mq_do_dispatch_sched+0x218/0x320
[60313.522752]  __blk_mq_sched_dispatch_requests+0x107/0x150
[60313.522755]  blk_mq_sched_dispatch_requests+0x2f/0x60
[60313.522758]  blk_mq_run_work_fn+0x43/0xc0
[60313.522761]  process_one_work+0x24e/0x430
[60313.522765]  worker_thread+0x54/0x4d0
[60313.522767]  ? process_one_work+0x430/0x430
[60313.522770]  kthread+0x182/0x1b0
[60313.522773]  ? __kthread_init_worker+0x50/0x50
[60313.522776]  ret_from_fork+0x22/0x30
[60313.522781] ---[ end trace 55ef262e614b59af ]---
[60313.522786] ------------[ cut here ]------------
[60313.522787] WARNING: CPU: 20 PID: 388 at mm/kfence/core.c:135 
kfence_handle_page_fault+0xaa/0x280
[60313.522791] Modules linked in: sctp ip6_udp_tunnel udp_tunnel uinput 
netconsole blocklayoutdriver rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver 
nfs lockd grace sunrpc fscache netfs rfcomm nft_ct nf_conntrack nf_defrag_ipv6 
nf_defrag_ipv4 cmac algif_hash algif_skcipher nf_tables af_alg bnep tun 
nfnetlink nls_iso8859_1 intel_rapl_msr vfat intel_rapl_common iwlmvm fat 
mac80211 edac_mce_amd libarc4 btusb eeepc_wmi btrtl asus_wmi iwlwifi btbcm 
snd_usb_audio sparse_keymap kvm_amd video wmi_bmof mxm_wmi btintel uvcvideo 
snd_hda_codec_realtek videobuf2_vmalloc videobuf2_memops snd_usbmidi_lib kvm 
snd_hda_codec_generic bluetooth videobuf2_v4l2 ledtrig_audio 
snd_hda_codec_hdmi joydev snd_rawmidi ecdh_generic irqbypass ecc snd_hda_intel 
mousedev pl2303 cfg80211 snd_seq_device videobuf2_common crc16 rapl k10temp 
snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec igb r8169 sp5100_tco 
snd_hda_core realtek i2c_piix4 ipmi_devintf dca snd_hwdep mdio_devres rfkill 
snd_pcm libphy ipmi_msghandler wmi
[60313.522817]  pinctrl_amd mac_hid acpi_cpufreq tcp_bbr2 vhost_vsock 
vmw_vsock_virtio_transport_common vhost vhost_iotlb vsock v4l2loopback 
videodev mc snd_hrtimer snd_timer snd soundcore nct6775 hwmon_vid crypto_user 
fuse ip_tables x_tables xfs dm_thin_pool dm_persistent_data dm_bio_prison 
dm_bufio libcrc32c crc32c_generic dm_crypt cbc encrypted_keys trusted 
asn1_encoder tee hid_logitech_hidpp hid_logitech_dj usbhid dm_mod 
crct10dif_pclmul crc32_pclmul crc32c_intel raid10 ghash_clmulni_intel 
aesni_intel md_mod crypto_simd cryptd amdgpu ccp xhci_pci xhci_pci_renesas 
tpm_crb tpm_tis tpm_tis_core tpm rng_core drm_ttm_helper ttm gpu_sched 
i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec 
drm agpgart
[60313.522840] CPU: 20 PID: 388 Comm: kworker/20:1H Tainted: G        W         
5.13.0-pf2 #1
[60313.522843] Hardware name: ASUS System Product Name/Pro WS X570-ACE, BIOS 
3601 05/26/2021
[60313.522845] Workqueue: kblockd blk_mq_run_work_fn
[60313.522848] RIP: 0010:kfence_handle_page_fault+0xaa/0x280
[60313.522851] Code: 0f 86 d4 00 00 00 0f b6 f3 41 b8 03 00 00 00 31 c9 4c 89 
ea 48 89 ef e8 e4 05 00 00 31 f6 4c 89 ff e8 6a f5 ff ff 84 c0 75 8d <0f> 0b c6 
05 7d fd 6b 01 00 45 31 f6 e9 7c ff ff ff 48 8b 0d 36 a0
[60313.522853] RSP: 0018:ffffb559c0affb50 EFLAGS: 00010046
[60313.522855] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffb559c0affb2c
[60313.522857] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 
0000000000000000
[60313.522859] RBP: 0000000000000120 R08: 0000000000000000 R09: 
0000000000000000
[60313.522860] R10: 0000000000000000 R11: 0000000000000000 R12: 
0000000000000002
[60313.522862] R13: ffffb559c0affc28 R14: 00000000c0affc01 R15: 0000000000000000
[60313.522864] FS:  0000000000000000(0000) GS:ffff8cf44ef00000(0000) knlGS:
0000000000000000
[60313.522866] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[60313.522868] CR2: 0000000000000120 CR3: 000000013ebce000 CR4: 
0000000000350ee0
[60313.522870] Call Trace:
[60313.522872]  page_fault_oops+0x9d/0x2d0
[60313.522875]  exc_page_fault+0x78/0x180
[60313.522878]  asm_exc_page_fault+0x1e/0x30
[60313.522880] RIP: 0010:bfq_dispatch_request+0x4c3/0x1280
[60313.522883] Code: 4c 89 e7 e8 ef da ff ff 4c 89 ff 89 c6 e8 75 64 00 00 4c 39 
bb a0 00 00 00 0f 84 86 04 00 00 49 8b 84 24 90 00 00 00 48 8b 33 <ff> 80 20 01 
00 00 48 89 34 24 48 8b 46 08 4c 8b 58 08 4c 89 5c 24
[60313.522885] RSP: 0018:ffffb559c0affcd0 EFLAGS: 00010046
[60313.522887] RAX: 0000000000000000 RBX: ffff8ced4d6a1000 RCX: 0000000000000000
[60313.522889] RDX: 0000000000000000 RSI: ffff8ced4ad90000 RDI: ffff8ced52fc9f40
[60313.522890] RBP: 0000000000000000 R08: 0000000000000001 R09: 
0000000000000000
[60313.522892] R10: 000000000000003f R11: 0000000000000000 R12: ffff8cf20e5a5400
[60313.523148] R13: ffff8cf0e7e91c70 R14: ffff8ced4d6a1420 R15: ffff8cf0e7e91c70
[60313.523150]  ? mod_delayed_work_on+0x71/0xe0
[60313.523153]  ? __sbitmap_get_word+0x30/0x80
[60313.523157]  __blk_mq_do_dispatch_sched+0x218/0x320
[60313.523161]  __blk_mq_sched_dispatch_requests+0x107/0x150
[60313.523165]  blk_mq_sched_dispatch_requests+0x2f/0x60
[60313.523167]  blk_mq_run_work_fn+0x43/0xc0
[60313.523170]  process_one_work+0x24e/0x430
[60313.523173]  worker_thread+0x54/0x4d0
[60313.523176]  ? process_one_work+0x430/0x430
[60313.523178]  kthread+0x182/0x1b0
[60313.523181]  ? __kthread_init_worker+0x50/0x50
[60313.523183]  ret_from_fork+0x22/0x30
[60313.523187] ---[ end trace 55ef262e614b59b0 ]---
[60313.523189] BUG: kernel NULL pointer dereference, address: 0000000000000120
[60313.523191] #PF: supervisor write access in kernel mode
[60313.523193] #PF: error_code(0x0002) - not-present page
[60313.523195] PGD 0 P4D 0 
[60313.523197] Oops: 0002 [#1] PREEMPT SMP NOPTI
[60313.523200] CPU: 20 PID: 388 Comm: kworker/20:1H Tainted: G        W         
5.13.0-pf2 #1
[60313.523202] Hardware name: ASUS System Product Name/Pro WS X570-ACE, BIOS 
3601 05/26/2021
[60313.523204] Workqueue: kblockd blk_mq_run_work_fn
[60313.523207] RIP: 0010:bfq_dispatch_request+0x4c3/0x1280
[60313.523210] Code: 4c 89 e7 e8 ef da ff ff 4c 89 ff 89 c6 e8 75 64 00 00 4c 39 
bb a0 00 00 00 0f 84 86 04 00 00 49 8b 84 24 90 00 00 00 48 8b 33 <ff> 80 20 01 
00 00 48 89 34 24 48 8b 46 08 4c 8b 58 08 4c 89 5c 24
[60313.523213] RSP: 0018:ffffb559c0affcd0 EFLAGS: 00010046
[60313.523215] RAX: 0000000000000000 RBX: ffff8ced4d6a1000 RCX: 0000000000000000
[60313.523216] RDX: 0000000000000000 RSI: ffff8ced4ad90000 RDI: ffff8ced52fc9f40
[60313.523218] RBP: 0000000000000000 R08: 0000000000000001 R09: 
0000000000000000
[60313.523220] R10: 000000000000003f R11: 0000000000000000 R12: ffff8cf20e5a5400
[60313.523221] R13: ffff8cf0e7e91c70 R14: ffff8ced4d6a1420 R15: ffff8cf0e7e91c70
[60313.523223] FS:  0000000000000000(0000) GS:ffff8cf44ef00000(0000) knlGS:
0000000000000000
[60313.523225] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[60313.523227] CR2: 0000000000000120 CR3: 000000013ebce000 CR4: 
0000000000350ee0
[60313.523229] Call Trace:
[60313.523231]  ? mod_delayed_work_on+0x71/0xe0
[60313.523233]  ? __sbitmap_get_word+0x30/0x80
[60313.523237]  __blk_mq_do_dispatch_sched+0x218/0x320
[60313.523240]  __blk_mq_sched_dispatch_requests+0x107/0x150
[60313.523243]  blk_mq_sched_dispatch_requests+0x2f/0x60
[60313.523246]  blk_mq_run_work_fn+0x43/0xc0
[60313.523249]  process_one_work+0x24e/0x430
[60313.523251]  worker_thread+0x54/0x4d0
[60313.523254]  ? process_one_work+0x430/0x430
[60313.523257]  kthread+0x182/0x1b0
[60313.523259]  ? __kthread_init_worker+0x50/0x50
[60313.523261]  ret_from_fork+0x22/0x30
[60313.523265] Modules linked in: sctp ip6_udp_tunnel udp_tunnel uinput 
netconsole blocklayoutdriver rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver 
nfs lockd grace sunrpc fscache netfs rfcomm nft_ct nf_conntrack nf_defrag_ipv6 
nf_defrag_ipv4 cmac algif_hash algif_skcipher nf_tables af_alg bnep tun 
nfnetlink nls_iso8859_1 intel_rapl_msr vfat intel_rapl_common iwlmvm fat 
mac80211 edac_mce_amd libarc4 btusb eeepc_wmi btrtl asus_wmi iwlwifi btbcm 
snd_usb_audio sparse_keymap kvm_amd video wmi_bmof mxm_wmi btintel uvcvideo 
snd_hda_codec_realtek videobuf2_vmalloc videobuf2_memops snd_usbmidi_lib kvm 
snd_hda_codec_generic bluetooth videobuf2_v4l2 ledtrig_audio 
snd_hda_codec_hdmi joydev snd_rawmidi ecdh_generic irqbypass ecc snd_hda_intel 
mousedev pl2303 cfg80211 snd_seq_device videobuf2_common crc16 rapl k10temp 
snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec igb r8169 sp5100_tco 
snd_hda_core realtek i2c_piix4 ipmi_devintf dca snd_hwdep mdio_devres rfkill 
snd_pcm libphy ipmi_msghandler wmi
[60313.523290]  pinctrl_amd mac_hid acpi_cpufreq tcp_bbr2 vhost_vsock 
vmw_vsock_virtio_transport_common vhost vhost_iotlb vsock v4l2loopback 
videodev mc snd_hrtimer snd_timer snd soundcore nct6775 hwmon_vid crypto_user 
fuse ip_tables x_tables xfs dm_thin_pool dm_persistent_data dm_bio_prison 
dm_bufio libcrc32c crc32c_generic dm_crypt cbc encrypted_keys trusted 
asn1_encoder tee hid_logitech_hidpp hid_logitech_dj usbhid dm_mod 
crct10dif_pclmul crc32_pclmul crc32c_intel raid10 ghash_clmulni_intel 
aesni_intel md_mod crypto_simd cryptd amdgpu ccp xhci_pci xhci_pci_renesas 
tpm_crb tpm_tis tpm_tis_core tpm rng_core drm_ttm_helper ttm gpu_sched 
i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec 
drm agpgart
[60313.523314] CR2: 0000000000000120
[60313.523316] ---[ end trace 55ef262e614b59b1 ]---
[60313.523318] RIP: 0010:bfq_dispatch_request+0x4c3/0x1280
[60313.523321] Code: 4c 89 e7 e8 ef da ff ff 4c 89 ff 89 c6 e8 75 64 00 00 4c 39 
bb a0 00 00 00 0f 84 86 04 00 00 49 8b 84 24 90 00 00 00 48 8b 33 <ff> 80 20 01 
00 00 48 89 34 24 48 8b 46 08 4c 8b 58 08 4c 89 5c 24
[60313.523323] RSP: 0018:ffffb559c0affcd0 EFLAGS: 00010046
[60313.523325] RAX: 0000000000000000 RBX: ffff8ced4d6a1000 RCX: 0000000000000000
[60313.523326] RDX: 0000000000000000 RSI: ffff8ced4ad90000 RDI: ffff8ced52fc9f40
[60313.523328] RBP: 0000000000000000 R08: 0000000000000001 R09: 
0000000000000000
[60313.523330] R10: 000000000000003f R11: 0000000000000000 R12: ffff8cf20e5a5400
[60313.523332] R13: ffff8cf0e7e91c70 R14: ffff8ced4d6a1420 R15: ffff8cf0e7e91c70
[60313.523334] FS:  0000000000000000(0000) GS:ffff8cf44ef00000(0000) knlGS:
0000000000000000
[60313.523336] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[60313.523338] CR2: 0000000000000120 CR3: 000000013ebce000 CR4: 
0000000000350ee0
[60313.523339] note: kworker/20:1H[388] exited with preempt_count 1
```
Oleksandr Natalenko Aug. 2, 2021, 8:40 p.m. UTC | #9
Hello.

On sobota 3. července 2021 0:07:53 CEST Oleksandr Natalenko wrote:
> On úterý 22. června 2021 18:29:48 CEST Jan Kara wrote:
> > On Tue 22-06-21 09:35:05, Oleksandr Natalenko wrote:
> > > On úterý 22. června 2021 9:08:43 CEST Paolo Valente wrote:
> > > > CCing also Jan and Khazhy, because in your commit log I see also the
> > > > commit on bfq_requests_merged().
> > > > 
> > > > Is this OOPS reproducible for you?
> > > 
> > > No, I haven't found a reproducer, at least yet. It took half a day of
> > > uptime to hit this, so might not be that easy.
> > 
> > Hum, if you can acquire a crash dump it would be the easiest I guess. We'd
> > need to find out more about the request we crash on - whether it's
> > otherwise valid, in what state it is etc...
> 
> Still have no reliable reproducer and no vmcore, however I'm running v5.13
> with the following patches applied on top of it:
> 
> ```
> blk: Fix lock inversion between ioc lock and bfqd lock
> bfq: Remove merged request already in bfq_requests_merged()
> block: Remove unnecessary elevator operation checks
> block: Do not pull requests from the scheduler when we cannot dispatch them
> block, bfq: reset waker pointer with shared queues
> block, bfq: check waker only for queues with no in-flight I/O
> block, bfq: avoid delayed merge of async queues
> block, bfq: boost throughput by extending queue-merging times
> block, bfq: consider also creation time in delayed stable merge
> block, bfq: fix delayed stable merge check
> block, bfq: let also stably merged queues enjoy weight raising
> ```
> 
> and just got the following crash:
> 
> ```
> [60313.522570] ------------[ cut here ]------------
> [60313.522579] WARNING: CPU: 20 PID: 388 at arch/x86/include/asm/kfence.h:44
> kfence_protect_page+0x39/0xc0
> [60313.522586] Modules linked in: sctp ip6_udp_tunnel udp_tunnel uinput
> netconsole blocklayoutdriver rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver
> nfs lockd grace sunrpc fscache netfs rfcomm nft_ct nf_conntrack
> nf_defrag_ipv6 nf_defrag_ipv4 cmac algif_hash algif_skcipher nf_tables
> af_alg bnep tun nfnetlink nls_iso8859_1 intel_rapl_msr vfat
> intel_rapl_common iwlmvm fat mac80211 edac_mce_amd libarc4 btusb eeepc_wmi
> btrtl asus_wmi iwlwifi btbcm snd_usb_audio sparse_keymap kvm_amd video
> wmi_bmof mxm_wmi btintel uvcvideo snd_hda_codec_realtek videobuf2_vmalloc
> videobuf2_memops snd_usbmidi_lib kvm snd_hda_codec_generic bluetooth
> videobuf2_v4l2 ledtrig_audio
> snd_hda_codec_hdmi joydev snd_rawmidi ecdh_generic irqbypass ecc
> snd_hda_intel mousedev pl2303 cfg80211 snd_seq_device videobuf2_common
> crc16 rapl k10temp snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec igb
> r8169 sp5100_tco snd_hda_core realtek i2c_piix4 ipmi_devintf dca snd_hwdep
> mdio_devres rfkill snd_pcm libphy ipmi_msghandler wmi
> [60313.522630]  pinctrl_amd mac_hid acpi_cpufreq tcp_bbr2 vhost_vsock
> vmw_vsock_virtio_transport_common vhost vhost_iotlb vsock v4l2loopback
> videodev mc snd_hrtimer snd_timer snd soundcore nct6775 hwmon_vid
> crypto_user fuse ip_tables x_tables xfs dm_thin_pool dm_persistent_data
> dm_bio_prison dm_bufio libcrc32c crc32c_generic dm_crypt cbc encrypted_keys
> trusted asn1_encoder tee hid_logitech_hidpp hid_logitech_dj usbhid dm_mod
> crct10dif_pclmul crc32_pclmul crc32c_intel raid10 ghash_clmulni_intel
> aesni_intel md_mod crypto_simd cryptd amdgpu ccp xhci_pci xhci_pci_renesas
> tpm_crb tpm_tis tpm_tis_core tpm rng_core drm_ttm_helper ttm gpu_sched
> i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops
> cec drm agpgart
> [60313.522665] CPU: 20 PID: 388 Comm: kworker/20:1H Tainted: G        W
> 5.13.0-pf2 #1
> [60313.522668] Hardware name: ASUS System Product Name/Pro WS X570-ACE, BIOS
> 3601 05/26/2021
> [60313.522671] Workqueue: kblockd blk_mq_run_work_fn
> [60313.522675] RIP: 0010:kfence_protect_page+0x39/0xc0
> [60313.522679] Code: 04 65 48 8b 04 25 28 00 00 00 48 89 44 24 08 31 c0 c7
> 44 24 04 00 00 00 00 e8 83 20 d5 ff 48 85 c0 74 07 83 7c 24 04 01 74 06
> <0f> 0b 31 c0 eb 4c 48 8b 38 48 89 c2 84 db 75 59 48 89 f8 0f 1f 40
> [60313.522682] RSP: 0018:ffffb559c0affb28 EFLAGS: 00010046
> [60313.522684] RAX: 0000000000000000 RBX: 0000000000000000 RCX:
> ffffb559c0affb2c [60313.522687] RDX: ffffb559c0affb2c RSI: 0000000000000000
> RDI: 0000000000000000 [60313.522690] RBP: 0000000000000000 R08:
> 0000000000000000 R09:
> 0000000000000000
> [60313.522692] R10: 0000000000000000 R11: 0000000000000000 R12:
> 0000000000000002
> [60313.522694] R13: ffffb559c0affc28 R14: 00000000c0affc01 R15:
> 0000000000000000 [60313.522696] FS:  0000000000000000(0000)
> GS:ffff8cf44ef00000(0000) knlGS: 0000000000000000
> [60313.522698] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [60313.522700] CR2: 0000000000000120 CR3: 000000013ebce000 CR4:
> 0000000000350ee0
> [60313.522702] Call Trace:
> [60313.522707]  kfence_handle_page_fault+0xa6/0x280
> [60313.522710]  page_fault_oops+0x9d/0x2d0
> [60313.522714]  exc_page_fault+0x78/0x180
> [60313.522718]  asm_exc_page_fault+0x1e/0x30
> [60313.522721] RIP: 0010:bfq_dispatch_request+0x4c3/0x1280
> [60313.522725] Code: 4c 89 e7 e8 ef da ff ff 4c 89 ff 89 c6 e8 75 64 00 00
> 4c 39 bb a0 00 00 00 0f 84 86 04 00 00 49 8b 84 24 90 00 00 00 48 8b 33
> <ff> 80 20 01 00 00 48 89 34 24 48 8b 46 08 4c 8b 58 08 4c 89 5c 24
> [60313.522727] RSP: 0018:ffffb559c0affcd0 EFLAGS: 00010046
> [60313.522729] RAX: 0000000000000000 RBX: ffff8ced4d6a1000 RCX:
> 0000000000000000 [60313.522731] RDX: 0000000000000000 RSI: ffff8ced4ad90000
> RDI: ffff8ced52fc9f40 [60313.522733] RBP: 0000000000000000 R08:
> 0000000000000001 R09:
> 0000000000000000
> [60313.522735] R10: 000000000000003f R11: 0000000000000000 R12:
> ffff8cf20e5a5400 [60313.522737] R13: ffff8cf0e7e91c70 R14: ffff8ced4d6a1420
> R15: ffff8cf0e7e91c70 [60313.522741]  ? mod_delayed_work_on+0x71/0xe0
> [60313.522745]  ? __sbitmap_get_word+0x30/0x80
> [60313.522748]  __blk_mq_do_dispatch_sched+0x218/0x320
> [60313.522752]  __blk_mq_sched_dispatch_requests+0x107/0x150
> [60313.522755]  blk_mq_sched_dispatch_requests+0x2f/0x60
> [60313.522758]  blk_mq_run_work_fn+0x43/0xc0
> [60313.522761]  process_one_work+0x24e/0x430
> [60313.522765]  worker_thread+0x54/0x4d0
> [60313.522767]  ? process_one_work+0x430/0x430
> [60313.522770]  kthread+0x182/0x1b0
> [60313.522773]  ? __kthread_init_worker+0x50/0x50
> [60313.522776]  ret_from_fork+0x22/0x30
> [60313.522781] ---[ end trace 55ef262e614b59af ]---
> [60313.522786] ------------[ cut here ]------------
> [60313.522787] WARNING: CPU: 20 PID: 388 at mm/kfence/core.c:135
> kfence_handle_page_fault+0xaa/0x280
> [60313.522791] Modules linked in: sctp ip6_udp_tunnel udp_tunnel uinput
> netconsole blocklayoutdriver rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver
> nfs lockd grace sunrpc fscache netfs rfcomm nft_ct nf_conntrack
> nf_defrag_ipv6 nf_defrag_ipv4 cmac algif_hash algif_skcipher nf_tables
> af_alg bnep tun nfnetlink nls_iso8859_1 intel_rapl_msr vfat
> intel_rapl_common iwlmvm fat mac80211 edac_mce_amd libarc4 btusb eeepc_wmi
> btrtl asus_wmi iwlwifi btbcm snd_usb_audio sparse_keymap kvm_amd video
> wmi_bmof mxm_wmi btintel uvcvideo snd_hda_codec_realtek videobuf2_vmalloc
> videobuf2_memops snd_usbmidi_lib kvm snd_hda_codec_generic bluetooth
> videobuf2_v4l2 ledtrig_audio
> snd_hda_codec_hdmi joydev snd_rawmidi ecdh_generic irqbypass ecc
> snd_hda_intel mousedev pl2303 cfg80211 snd_seq_device videobuf2_common
> crc16 rapl k10temp snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec igb
> r8169 sp5100_tco snd_hda_core realtek i2c_piix4 ipmi_devintf dca snd_hwdep
> mdio_devres rfkill snd_pcm libphy ipmi_msghandler wmi
> [60313.522817]  pinctrl_amd mac_hid acpi_cpufreq tcp_bbr2 vhost_vsock
> vmw_vsock_virtio_transport_common vhost vhost_iotlb vsock v4l2loopback
> videodev mc snd_hrtimer snd_timer snd soundcore nct6775 hwmon_vid
> crypto_user fuse ip_tables x_tables xfs dm_thin_pool dm_persistent_data
> dm_bio_prison dm_bufio libcrc32c crc32c_generic dm_crypt cbc encrypted_keys
> trusted asn1_encoder tee hid_logitech_hidpp hid_logitech_dj usbhid dm_mod
> crct10dif_pclmul crc32_pclmul crc32c_intel raid10 ghash_clmulni_intel
> aesni_intel md_mod crypto_simd cryptd amdgpu ccp xhci_pci xhci_pci_renesas
> tpm_crb tpm_tis tpm_tis_core tpm rng_core drm_ttm_helper ttm gpu_sched
> i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops
> cec drm agpgart
> [60313.522840] CPU: 20 PID: 388 Comm: kworker/20:1H Tainted: G        W
> 5.13.0-pf2 #1
> [60313.522843] Hardware name: ASUS System Product Name/Pro WS X570-ACE, BIOS
> 3601 05/26/2021
> [60313.522845] Workqueue: kblockd blk_mq_run_work_fn
> [60313.522848] RIP: 0010:kfence_handle_page_fault+0xaa/0x280
> [60313.522851] Code: 0f 86 d4 00 00 00 0f b6 f3 41 b8 03 00 00 00 31 c9 4c
> 89 ea 48 89 ef e8 e4 05 00 00 31 f6 4c 89 ff e8 6a f5 ff ff 84 c0 75 8d
> <0f> 0b c6 05 7d fd 6b 01 00 45 31 f6 e9 7c ff ff ff 48 8b 0d 36 a0
> [60313.522853] RSP: 0018:ffffb559c0affb50 EFLAGS: 00010046
> [60313.522855] RAX: 0000000000000000 RBX: 0000000000000000 RCX:
> ffffb559c0affb2c [60313.522857] RDX: 0000000000000000 RSI: 0000000000000000
> RDI:
> 0000000000000000
> [60313.522859] RBP: 0000000000000120 R08: 0000000000000000 R09:
> 0000000000000000
> [60313.522860] R10: 0000000000000000 R11: 0000000000000000 R12:
> 0000000000000002
> [60313.522862] R13: ffffb559c0affc28 R14: 00000000c0affc01 R15:
> 0000000000000000 [60313.522864] FS:  0000000000000000(0000)
> GS:ffff8cf44ef00000(0000) knlGS: 0000000000000000
> [60313.522866] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [60313.522868] CR2: 0000000000000120 CR3: 000000013ebce000 CR4:
> 0000000000350ee0
> [60313.522870] Call Trace:
> [60313.522872]  page_fault_oops+0x9d/0x2d0
> [60313.522875]  exc_page_fault+0x78/0x180
> [60313.522878]  asm_exc_page_fault+0x1e/0x30
> [60313.522880] RIP: 0010:bfq_dispatch_request+0x4c3/0x1280
> [60313.522883] Code: 4c 89 e7 e8 ef da ff ff 4c 89 ff 89 c6 e8 75 64 00 00
> 4c 39 bb a0 00 00 00 0f 84 86 04 00 00 49 8b 84 24 90 00 00 00 48 8b 33
> <ff> 80 20 01 00 00 48 89 34 24 48 8b 46 08 4c 8b 58 08 4c 89 5c 24
> [60313.522885] RSP: 0018:ffffb559c0affcd0 EFLAGS: 00010046
> [60313.522887] RAX: 0000000000000000 RBX: ffff8ced4d6a1000 RCX:
> 0000000000000000 [60313.522889] RDX: 0000000000000000 RSI: ffff8ced4ad90000
> RDI: ffff8ced52fc9f40 [60313.522890] RBP: 0000000000000000 R08:
> 0000000000000001 R09:
> 0000000000000000
> [60313.522892] R10: 000000000000003f R11: 0000000000000000 R12:
> ffff8cf20e5a5400 [60313.523148] R13: ffff8cf0e7e91c70 R14: ffff8ced4d6a1420
> R15: ffff8cf0e7e91c70 [60313.523150]  ? mod_delayed_work_on+0x71/0xe0
> [60313.523153]  ? __sbitmap_get_word+0x30/0x80
> [60313.523157]  __blk_mq_do_dispatch_sched+0x218/0x320
> [60313.523161]  __blk_mq_sched_dispatch_requests+0x107/0x150
> [60313.523165]  blk_mq_sched_dispatch_requests+0x2f/0x60
> [60313.523167]  blk_mq_run_work_fn+0x43/0xc0
> [60313.523170]  process_one_work+0x24e/0x430
> [60313.523173]  worker_thread+0x54/0x4d0
> [60313.523176]  ? process_one_work+0x430/0x430
> [60313.523178]  kthread+0x182/0x1b0
> [60313.523181]  ? __kthread_init_worker+0x50/0x50
> [60313.523183]  ret_from_fork+0x22/0x30
> [60313.523187] ---[ end trace 55ef262e614b59b0 ]---
> [60313.523189] BUG: kernel NULL pointer dereference, address:
> 0000000000000120 [60313.523191] #PF: supervisor write access in kernel mode
> [60313.523193] #PF: error_code(0x0002) - not-present page
> [60313.523195] PGD 0 P4D 0
> [60313.523197] Oops: 0002 [#1] PREEMPT SMP NOPTI
> [60313.523200] CPU: 20 PID: 388 Comm: kworker/20:1H Tainted: G        W
> 5.13.0-pf2 #1
> [60313.523202] Hardware name: ASUS System Product Name/Pro WS X570-ACE, BIOS
> 3601 05/26/2021
> [60313.523204] Workqueue: kblockd blk_mq_run_work_fn
> [60313.523207] RIP: 0010:bfq_dispatch_request+0x4c3/0x1280
> [60313.523210] Code: 4c 89 e7 e8 ef da ff ff 4c 89 ff 89 c6 e8 75 64 00 00
> 4c 39 bb a0 00 00 00 0f 84 86 04 00 00 49 8b 84 24 90 00 00 00 48 8b 33
> <ff> 80 20 01 00 00 48 89 34 24 48 8b 46 08 4c 8b 58 08 4c 89 5c 24
> [60313.523213] RSP: 0018:ffffb559c0affcd0 EFLAGS: 00010046
> [60313.523215] RAX: 0000000000000000 RBX: ffff8ced4d6a1000 RCX:
> 0000000000000000 [60313.523216] RDX: 0000000000000000 RSI: ffff8ced4ad90000
> RDI: ffff8ced52fc9f40 [60313.523218] RBP: 0000000000000000 R08:
> 0000000000000001 R09:
> 0000000000000000
> [60313.523220] R10: 000000000000003f R11: 0000000000000000 R12:
> ffff8cf20e5a5400 [60313.523221] R13: ffff8cf0e7e91c70 R14: ffff8ced4d6a1420
> R15: ffff8cf0e7e91c70 [60313.523223] FS:  0000000000000000(0000)
> GS:ffff8cf44ef00000(0000) knlGS: 0000000000000000
> [60313.523225] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [60313.523227] CR2: 0000000000000120 CR3: 000000013ebce000 CR4:
> 0000000000350ee0
> [60313.523229] Call Trace:
> [60313.523231]  ? mod_delayed_work_on+0x71/0xe0
> [60313.523233]  ? __sbitmap_get_word+0x30/0x80
> [60313.523237]  __blk_mq_do_dispatch_sched+0x218/0x320
> [60313.523240]  __blk_mq_sched_dispatch_requests+0x107/0x150
> [60313.523243]  blk_mq_sched_dispatch_requests+0x2f/0x60
> [60313.523246]  blk_mq_run_work_fn+0x43/0xc0
> [60313.523249]  process_one_work+0x24e/0x430
> [60313.523251]  worker_thread+0x54/0x4d0
> [60313.523254]  ? process_one_work+0x430/0x430
> [60313.523257]  kthread+0x182/0x1b0
> [60313.523259]  ? __kthread_init_worker+0x50/0x50
> [60313.523261]  ret_from_fork+0x22/0x30
> [60313.523265] Modules linked in: sctp ip6_udp_tunnel udp_tunnel uinput
> netconsole blocklayoutdriver rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver
> nfs lockd grace sunrpc fscache netfs rfcomm nft_ct nf_conntrack
> nf_defrag_ipv6 nf_defrag_ipv4 cmac algif_hash algif_skcipher nf_tables
> af_alg bnep tun nfnetlink nls_iso8859_1 intel_rapl_msr vfat
> intel_rapl_common iwlmvm fat mac80211 edac_mce_amd libarc4 btusb eeepc_wmi
> btrtl asus_wmi iwlwifi btbcm snd_usb_audio sparse_keymap kvm_amd video
> wmi_bmof mxm_wmi btintel uvcvideo snd_hda_codec_realtek videobuf2_vmalloc
> videobuf2_memops snd_usbmidi_lib kvm snd_hda_codec_generic bluetooth
> videobuf2_v4l2 ledtrig_audio
> snd_hda_codec_hdmi joydev snd_rawmidi ecdh_generic irqbypass ecc
> snd_hda_intel mousedev pl2303 cfg80211 snd_seq_device videobuf2_common
> crc16 rapl k10temp snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec igb
> r8169 sp5100_tco snd_hda_core realtek i2c_piix4 ipmi_devintf dca snd_hwdep
> mdio_devres rfkill snd_pcm libphy ipmi_msghandler wmi
> [60313.523290]  pinctrl_amd mac_hid acpi_cpufreq tcp_bbr2 vhost_vsock
> vmw_vsock_virtio_transport_common vhost vhost_iotlb vsock v4l2loopback
> videodev mc snd_hrtimer snd_timer snd soundcore nct6775 hwmon_vid
> crypto_user fuse ip_tables x_tables xfs dm_thin_pool dm_persistent_data
> dm_bio_prison dm_bufio libcrc32c crc32c_generic dm_crypt cbc encrypted_keys
> trusted asn1_encoder tee hid_logitech_hidpp hid_logitech_dj usbhid dm_mod
> crct10dif_pclmul crc32_pclmul crc32c_intel raid10 ghash_clmulni_intel
> aesni_intel md_mod crypto_simd cryptd amdgpu ccp xhci_pci xhci_pci_renesas
> tpm_crb tpm_tis tpm_tis_core tpm rng_core drm_ttm_helper ttm gpu_sched
> i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops
> cec drm agpgart
> [60313.523314] CR2: 0000000000000120
> [60313.523316] ---[ end trace 55ef262e614b59b1 ]---
> [60313.523318] RIP: 0010:bfq_dispatch_request+0x4c3/0x1280
> [60313.523321] Code: 4c 89 e7 e8 ef da ff ff 4c 89 ff 89 c6 e8 75 64 00 00
> 4c 39 bb a0 00 00 00 0f 84 86 04 00 00 49 8b 84 24 90 00 00 00 48 8b 33
> <ff> 80 20 01 00 00 48 89 34 24 48 8b 46 08 4c 8b 58 08 4c 89 5c 24
> [60313.523323] RSP: 0018:ffffb559c0affcd0 EFLAGS: 00010046
> [60313.523325] RAX: 0000000000000000 RBX: ffff8ced4d6a1000 RCX:
> 0000000000000000 [60313.523326] RDX: 0000000000000000 RSI: ffff8ced4ad90000
> RDI: ffff8ced52fc9f40 [60313.523328] RBP: 0000000000000000 R08:
> 0000000000000001 R09:
> 0000000000000000
> [60313.523330] R10: 000000000000003f R11: 0000000000000000 R12:
> ffff8cf20e5a5400 [60313.523332] R13: ffff8cf0e7e91c70 R14: ffff8ced4d6a1420
> R15: ffff8cf0e7e91c70 [60313.523334] FS:  0000000000000000(0000)
> GS:ffff8cf44ef00000(0000) knlGS: 0000000000000000
> [60313.523336] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [60313.523338] CR2: 0000000000000120 CR3: 000000013ebce000 CR4:
> 0000000000350ee0
> [60313.523339] note: kworker/20:1H[388] exited with preempt_count 1
> ```

This is just to let you know that I'm running v5.13.7 without any extra 
patches under block/ applied, and the issue is not reproducible.

I'll probably defer investigating this till v5.14 unless it is fixed there 
already.
Jan Kara Aug. 3, 2021, 10:45 a.m. UTC | #10
On Mon 02-08-21 22:40:26, Oleksandr Natalenko wrote:
> Hello.
> 
> On sobota 3. července 2021 0:07:53 CEST Oleksandr Natalenko wrote:
> > On úterý 22. června 2021 18:29:48 CEST Jan Kara wrote:
> > > On Tue 22-06-21 09:35:05, Oleksandr Natalenko wrote:
> > > > On úterý 22. června 2021 9:08:43 CEST Paolo Valente wrote:
> > > > > CCing also Jan and Khazhy, because in your commit log I see also the
> > > > > commit on bfq_requests_merged().
> > > > > 
> > > > > Is this OOPS reproducible for you?
> > > > 
> > > > No, I haven't found a reproducer, at least yet. It took half a day of
> > > > uptime to hit this, so might not be that easy.
> > > 
> > > Hum, if you can acquire a crash dump it would be the easiest I guess. We'd
> > > need to find out more about the request we crash on - whether it's
> > > otherwise valid, in what state it is etc...
> > 
> > Still have no reliable reproducer and no vmcore, however I'm running v5.13
> > with the following patches applied on top of it:
> > 
> > ```
> > blk: Fix lock inversion between ioc lock and bfqd lock
> > bfq: Remove merged request already in bfq_requests_merged()
> > block: Remove unnecessary elevator operation checks
> > block: Do not pull requests from the scheduler when we cannot dispatch them
> > block, bfq: reset waker pointer with shared queues
> > block, bfq: check waker only for queues with no in-flight I/O
> > block, bfq: avoid delayed merge of async queues
> > block, bfq: boost throughput by extending queue-merging times
> > block, bfq: consider also creation time in delayed stable merge
> > block, bfq: fix delayed stable merge check
> > block, bfq: let also stably merged queues enjoy weight raising
> > ```
> > 
> > and just got the following crash:
> > 
> > ```
> > [60313.522570] ------------[ cut here ]------------
> > [60313.522579] WARNING: CPU: 20 PID: 388 at arch/x86/include/asm/kfence.h:44
> > kfence_protect_page+0x39/0xc0

<snip>

> > [60313.522665] CPU: 20 PID: 388 Comm: kworker/20:1H Tainted: G        W
> > 5.13.0-pf2 #1
> > [60313.522668] Hardware name: ASUS System Product Name/Pro WS X570-ACE, BIOS
> > 3601 05/26/2021
> > [60313.522671] Workqueue: kblockd blk_mq_run_work_fn
> > [60313.522675] RIP: 0010:kfence_protect_page+0x39/0xc0
> > [60313.522679] Code: 04 65 48 8b 04 25 28 00 00 00 48 89 44 24 08 31 c0 c7
> > 44 24 04 00 00 00 00 e8 83 20 d5 ff 48 85 c0 74 07 83 7c 24 04 01 74 06
> > <0f> 0b 31 c0 eb 4c 48 8b 38 48 89 c2 84 db 75 59 48 89 f8 0f 1f 40
> > [60313.522682] RSP: 0018:ffffb559c0affb28 EFLAGS: 00010046
> > [60313.522684] RAX: 0000000000000000 RBX: 0000000000000000 RCX:
> > ffffb559c0affb2c [60313.522687] RDX: ffffb559c0affb2c RSI: 0000000000000000
> > RDI: 0000000000000000 [60313.522690] RBP: 0000000000000000 R08:
> > 0000000000000000 R09:
> > 0000000000000000
> > [60313.522692] R10: 0000000000000000 R11: 0000000000000000 R12:
> > 0000000000000002
> > [60313.522694] R13: ffffb559c0affc28 R14: 00000000c0affc01 R15:
> > 0000000000000000 [60313.522696] FS:  0000000000000000(0000)
> > GS:ffff8cf44ef00000(0000) knlGS: 0000000000000000
> > [60313.522698] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [60313.522700] CR2: 0000000000000120 CR3: 000000013ebce000 CR4:
> > 0000000000350ee0
> > [60313.522702] Call Trace:
> > [60313.522707]  kfence_handle_page_fault+0xa6/0x280
> > [60313.522710]  page_fault_oops+0x9d/0x2d0
> > [60313.522714]  exc_page_fault+0x78/0x180
> > [60313.522718]  asm_exc_page_fault+0x1e/0x30
> > [60313.522721] RIP: 0010:bfq_dispatch_request+0x4c3/0x1280
> > [60313.522725] Code: 4c 89 e7 e8 ef da ff ff 4c 89 ff 89 c6 e8 75 64 00 00
> > 4c 39 bb a0 00 00 00 0f 84 86 04 00 00 49 8b 84 24 90 00 00 00 48 8b 33
> > <ff> 80 20 01 00 00 48 89 34 24 48 8b 46 08 4c 8b 58 08 4c 89 5c 24
> > [60313.522727] RSP: 0018:ffffb559c0affcd0 EFLAGS: 00010046
> > [60313.522729] RAX: 0000000000000000 RBX: ffff8ced4d6a1000 RCX:
> > 0000000000000000 [60313.522731] RDX: 0000000000000000 RSI: ffff8ced4ad90000
> > RDI: ffff8ced52fc9f40 [60313.522733] RBP: 0000000000000000 R08:
> > 0000000000000001 R09:
> > 0000000000000000
> > [60313.522735] R10: 000000000000003f R11: 0000000000000000 R12:
> > ffff8cf20e5a5400 [60313.522737] R13: ffff8cf0e7e91c70 R14: ffff8ced4d6a1420
> > R15: ffff8cf0e7e91c70 [60313.522741]  ? mod_delayed_work_on+0x71/0xe0
> > [60313.522745]  ? __sbitmap_get_word+0x30/0x80
> > [60313.522748]  __blk_mq_do_dispatch_sched+0x218/0x320
> > [60313.522752]  __blk_mq_sched_dispatch_requests+0x107/0x150
> > [60313.522755]  blk_mq_sched_dispatch_requests+0x2f/0x60
> > [60313.522758]  blk_mq_run_work_fn+0x43/0xc0
> > [60313.522761]  process_one_work+0x24e/0x430
> > [60313.522765]  worker_thread+0x54/0x4d0
> > [60313.522767]  ? process_one_work+0x430/0x430
> > [60313.522770]  kthread+0x182/0x1b0
> > [60313.522773]  ? __kthread_init_worker+0x50/0x50
> > [60313.522776]  ret_from_fork+0x22/0x30
> > [60313.522781] ---[ end trace 55ef262e614b59af ]---

<snip>

> This is just to let you know that I'm running v5.13.7 without any extra 
> patches under block/ applied, and the issue is not reproducible.
> 
> I'll probably defer investigating this till v5.14 unless it is fixed there 
> already.

Thanks for info! I've looked some more into this. It appears the trapping
instruction is:

static void bfq_dispatch_remove(struct request_queue *q, struct request *rq)
{
        struct bfq_queue *bfqq = RQ_BFQQ(rq);

        /*
         * For consistency, the next instruction should have been
         * executed after removing the request from the queue and
         * dispatching it.  We execute instead this instruction before
         * bfq_remove_request() (and hence introduce a temporary
         * inconsistency), for efficiency.  In fact, should this
         * dispatch occur for a non in-service bfqq, this anticipated
         * increment prevents two counters related to bfqq->dispatched
         * from risking to be, first, uselessly decremented, and then
         * incremented again when the (new) value of bfqq->dispatched
         * happens to be taken into account.
         */
>>>     bfqq->dispatched++;
        bfq_update_peak_rate(q->elevator->elevator_data, rq);

        bfq_remove_request(q, rq);
}

The bfqq is in RAX and apparently is NULL. This function was called from
bfq_dispatch_rq_from_bfqq() and the above can only happen if bfqq->next_rq
was pointing to a request that was no longer attached to any bfq queue
(maybe rq was being freed?). But at this point I have no idea how this
could have possibly happened with current codebase. Can you maybe test with
5.14-rc4 to see whether current upstream has the issue?

								Honza