Message ID | 20180410230240.27709-1-bart.vanassche@wdc.com (mailing list archive) |
---|---|
State | New, archived |

On 4/10/18 5:02 PM, Bart Van Assche wrote:
> Because blkcg_exit_queue() is now called from inside blk_cleanup_queue()
> it is no longer safe to access cgroup information during or after the
> blk_cleanup_queue() call. Hence protect the generic_make_request_checks()
> call with blk_queue_enter() / blk_queue_exit().

Looks good, applied.

On Tue, Apr 10, 2018 at 05:02:40PM -0600, Bart Van Assche wrote:
> Because blkcg_exit_queue() is now called from inside blk_cleanup_queue()
> it is no longer safe to access cgroup information during or after the
> blk_cleanup_queue() call. Hence protect the generic_make_request_checks()
> call with blk_queue_enter() / blk_queue_exit().

I think the problem is that blkcg does weird things from
blk_cleanup_queue. I'd rather fix that root cause than working around it.

On 04/12/18 00:27, Christoph Hellwig wrote:
> On Tue, Apr 10, 2018 at 05:02:40PM -0600, Bart Van Assche wrote:
>> Because blkcg_exit_queue() is now called from inside blk_cleanup_queue()
>> it is no longer safe to access cgroup information during or after the
>> blk_cleanup_queue() call. Hence protect the generic_make_request_checks()
>> call with blk_queue_enter() / blk_queue_exit().
>
> I think the problem is that blkcg does weird things from
> blk_cleanup_queue. I'd rather fix that root cause than working around it.

Hello Christoph,

Can you clarify your comment? generic_make_request_checks() calls
blkcg_bio_issue_check() and that function in turn calls blkg_lookup() and
other blkcg functions. Hence this patch that avoids that blkcg code is
called concurrently with removal of a request queue from blkcg.

Thanks,

Bart.
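
The guard described in the reply above can be illustrated with a small userspace model. This is only a sketch, not kernel code: `struct model_queue` and every `model_*` function are hypothetical stand-ins, and the model assumes that the cleanup path marks the queue dying and tears down cgroup state only after all references have been dropped (here it returns `false` where the kernel would wait).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a request queue; not a kernel type. */
struct model_queue {
	int usage;        /* models the queue usage counter */
	bool dying;       /* models the "queue is dying" state */
	int *cgroup_data; /* models per-queue blkcg state; NULL once torn down */
};

/* Models blk_queue_enter(): refuse new users once the queue is dying. */
static int model_queue_enter(struct model_queue *q)
{
	if (q->dying)
		return -1;
	q->usage++;
	return 0;
}

/* Models blk_queue_exit(): drop the reference taken by model_queue_enter(). */
static void model_queue_exit(struct model_queue *q)
{
	assert(q->usage > 0);
	q->usage--;
}

/*
 * Models the cleanup path (assumption: cgroup state is torn down only
 * after all users have left). Returns false while users remain.
 */
static bool model_cleanup_queue(struct model_queue *q)
{
	q->dying = true;
	if (q->usage > 0)
		return false;
	q->cgroup_data = NULL;
	return true;
}

/*
 * Models the submission path: cgroup data is read only between enter
 * and exit. Returns the value read, or -1 if the queue could not be entered.
 */
static int model_submit(struct model_queue *q)
{
	int seen;

	if (model_queue_enter(q) < 0)
		return -1;          /* the bio would be failed here */
	seen = *q->cgroup_data;     /* cleanup cannot have torn this down yet */
	model_queue_exit(q);
	return seen;
}
```

In this model a submitter that has entered the queue keeps `cgroup_data` alive until it exits, and a submitter that arrives after cleanup fails in `model_queue_enter()` without touching `cgroup_data` at all.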

On 18/4/11 07:02, Bart Van Assche wrote:
> Because blkcg_exit_queue() is now called from inside blk_cleanup_queue()
> it is no longer safe to access cgroup information during or after the
> blk_cleanup_queue() call. Hence protect the generic_make_request_checks()
> call with blk_queue_enter() / blk_queue_exit().
>
> Reported-by: Ming Lei <ming.lei@redhat.com>
> Fixes: a063057d7c73 ("block: Fix a race between request queue removal and the block cgroup controller")
> Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
> Cc: Ming Lei <ming.lei@redhat.com>
> Cc: Joseph Qi <joseph.qi@linux.alibaba.com>

I've tested using the following steps:
1) start a fio job with buffered write;
2) then remove the scsi device that fio writes to:
echo "scsi remove-single-device ${dev}" > /proc/scsi/scsi

After applying this patch, the reported oops has gone.

Tested-by: Joseph Qi <joseph.qi@linux.alibaba.com>

> ---
>
> Changes compared to v2: converted two ternary expressions into if-statements.
>
> Changes compared to v1: guarded the blk_queue_exit() inside the loop with "if (q)".
>
>  block/blk-core.c | 35 +++++++++++++++++++++++++++++------
>  1 file changed, 29 insertions(+), 6 deletions(-)
>
> diff --git a/block/blk-core.c b/block/blk-core.c
> index 34e2f2227fd9..39308e874ffa 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -2386,8 +2386,20 @@ blk_qc_t generic_make_request(struct bio *bio)
>  	 * yet.
>  	 */
>  	struct bio_list bio_list_on_stack[2];
> +	blk_mq_req_flags_t flags = 0;
> +	struct request_queue *q = bio->bi_disk->queue;
>  	blk_qc_t ret = BLK_QC_T_NONE;
>
> +	if (bio->bi_opf & REQ_NOWAIT)
> +		flags = BLK_MQ_REQ_NOWAIT;
> +	if (blk_queue_enter(q, flags) < 0) {
> +		if (!blk_queue_dying(q) && (bio->bi_opf & REQ_NOWAIT))
> +			bio_wouldblock_error(bio);
> +		else
> +			bio_io_error(bio);
> +		return ret;
> +	}
> +
>  	if (!generic_make_request_checks(bio))
>  		goto out;
>
> @@ -2424,11 +2436,22 @@ blk_qc_t generic_make_request(struct bio *bio)
>  	bio_list_init(&bio_list_on_stack[0]);
>  	current->bio_list = bio_list_on_stack;
>  	do {
> -		struct request_queue *q = bio->bi_disk->queue;
> -		blk_mq_req_flags_t flags = bio->bi_opf & REQ_NOWAIT ?
> -			BLK_MQ_REQ_NOWAIT : 0;
> +		bool enter_succeeded = true;
> +
> +		if (unlikely(q != bio->bi_disk->queue)) {
> +			if (q)
> +				blk_queue_exit(q);
> +			q = bio->bi_disk->queue;
> +			flags = 0;
> +			if (bio->bi_opf & REQ_NOWAIT)
> +				flags = BLK_MQ_REQ_NOWAIT;
> +			if (blk_queue_enter(q, flags) < 0) {
> +				enter_succeeded = false;
> +				q = NULL;
> +			}
> +		}
>
> -		if (likely(blk_queue_enter(q, flags) == 0)) {
> +		if (enter_succeeded) {
>  			struct bio_list lower, same;
>
>  			/* Create a fresh bio_list for all subordinate requests */
> @@ -2436,8 +2459,6 @@ blk_qc_t generic_make_request(struct bio *bio)
>  			bio_list_init(&bio_list_on_stack[0]);
>  			ret = q->make_request_fn(q, bio);
>
> -			blk_queue_exit(q);
> -
>  			/* sort new bios into those for a lower level
>  			 * and those for the same level
>  			 */
> @@ -2464,6 +2485,8 @@ blk_qc_t generic_make_request(struct bio *bio)
>  	current->bio_list = NULL; /* deactivate */
>
>  out:
> +	if (q)
> +		blk_queue_exit(q);
>  	return ret;
>  }
>  EXPORT_SYMBOL(generic_make_request);
>

diff --git a/block/blk-core.c b/block/blk-core.c
index 34e2f2227fd9..39308e874ffa 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -2386,8 +2386,20 @@ blk_qc_t generic_make_request(struct bio *bio)
 	 * yet.
 	 */
 	struct bio_list bio_list_on_stack[2];
+	blk_mq_req_flags_t flags = 0;
+	struct request_queue *q = bio->bi_disk->queue;
 	blk_qc_t ret = BLK_QC_T_NONE;
 
+	if (bio->bi_opf & REQ_NOWAIT)
+		flags = BLK_MQ_REQ_NOWAIT;
+	if (blk_queue_enter(q, flags) < 0) {
+		if (!blk_queue_dying(q) && (bio->bi_opf & REQ_NOWAIT))
+			bio_wouldblock_error(bio);
+		else
+			bio_io_error(bio);
+		return ret;
+	}
+
 	if (!generic_make_request_checks(bio))
 		goto out;
 
@@ -2424,11 +2436,22 @@ blk_qc_t generic_make_request(struct bio *bio)
 	bio_list_init(&bio_list_on_stack[0]);
 	current->bio_list = bio_list_on_stack;
 	do {
-		struct request_queue *q = bio->bi_disk->queue;
-		blk_mq_req_flags_t flags = bio->bi_opf & REQ_NOWAIT ?
-			BLK_MQ_REQ_NOWAIT : 0;
+		bool enter_succeeded = true;
+
+		if (unlikely(q != bio->bi_disk->queue)) {
+			if (q)
+				blk_queue_exit(q);
+			q = bio->bi_disk->queue;
+			flags = 0;
+			if (bio->bi_opf & REQ_NOWAIT)
+				flags = BLK_MQ_REQ_NOWAIT;
+			if (blk_queue_enter(q, flags) < 0) {
+				enter_succeeded = false;
+				q = NULL;
+			}
+		}
 
-		if (likely(blk_queue_enter(q, flags) == 0)) {
+		if (enter_succeeded) {
 			struct bio_list lower, same;
 
 			/* Create a fresh bio_list for all subordinate requests */
@@ -2436,8 +2459,6 @@ blk_qc_t generic_make_request(struct bio *bio)
 			bio_list_init(&bio_list_on_stack[0]);
 			ret = q->make_request_fn(q, bio);
 
-			blk_queue_exit(q);
-
 			/* sort new bios into those for a lower level
 			 * and those for the same level
 			 */
@@ -2464,6 +2485,8 @@ blk_qc_t generic_make_request(struct bio *bio)
 	current->bio_list = NULL; /* deactivate */
 
 out:
+	if (q)
+		blk_queue_exit(q);
 	return ret;
 }
 EXPORT_SYMBOL(generic_make_request);

Because blkcg_exit_queue() is now called from inside blk_cleanup_queue()
it is no longer safe to access cgroup information during or after the
blk_cleanup_queue() call. Hence protect the generic_make_request_checks()
call with blk_queue_enter() / blk_queue_exit().

Reported-by: Ming Lei <ming.lei@redhat.com>
Fixes: a063057d7c73 ("block: Fix a race between request queue removal and the block cgroup controller")
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Joseph Qi <joseph.qi@linux.alibaba.com>
---

Changes compared to v2: converted two ternary expressions into if-statements.

Changes compared to v1: guarded the blk_queue_exit() inside the loop with "if (q)".

 block/blk-core.c | 35 +++++++++++++++++++++++++++++------
 1 file changed, 29 insertions(+), 6 deletions(-)
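
The reference handling that the changelog refers to (keep the queue entered across bios, switch only when the target queue changes, and guard the exit with "if (q)") can be modeled in userspace roughly as follows. This is a simplified sketch under stated assumptions: `struct ref_queue`, `ref_enter()`, `ref_exit()` and `process_bios()` are hypothetical names, bios are reduced to an array of target-queue pointers, and a bio whose queue cannot be entered is simply skipped rather than completed with an error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a request queue; not a kernel type. */
struct ref_queue {
	int usage;   /* outstanding references */
	bool dying;  /* entering a dying queue fails */
	int enters;  /* number of successful enters, for inspection */
};

/* Models blk_queue_enter(): fails once the queue is dying. */
static int ref_enter(struct ref_queue *q)
{
	if (q->dying)
		return -1;
	q->usage++;
	q->enters++;
	return 0;
}

/* Models blk_queue_exit(): drops one reference. */
static void ref_exit(struct ref_queue *q)
{
	assert(q->usage > 0);
	q->usage--;
}

/*
 * Models the loop structure of the patched function: targets[i] is the
 * queue that bio i is aimed at. Returns how many bios were dispatched.
 */
static size_t process_bios(struct ref_queue **targets, size_t n)
{
	struct ref_queue *q;
	size_t dispatched = 0;
	size_t i;

	if (n == 0)
		return 0;

	/* Enter the first queue up front, before any checks would run. */
	q = targets[0];
	if (ref_enter(q) < 0)
		return 0;

	for (i = 0; i < n; i++) {
		bool enter_succeeded = true;

		/* Only swap references when the target queue changes. */
		if (q != targets[i]) {
			if (q)
				ref_exit(q);
			q = targets[i];
			if (ref_enter(q) < 0) {
				enter_succeeded = false;
				q = NULL; /* nothing is held, so nothing to exit later */
			}
		}

		if (enter_succeeded)
			dispatched++;
	}

	/* Mirrors the "if (q)" guard at the out: label. */
	if (q)
		ref_exit(q);
	return dispatched;
}
```

Setting `q` to `NULL` after a failed enter is what makes the later `if (q)` checks necessary: without them the model would drop a reference it never took.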