[2/2] block: only return started requests from blk_mq_tag_to_rq()
diff mbox series

Message ID 20200619140159.141905-1-hare@suse.de
State New
Headers show
Series
  • Untitled series #305619
Related show

Commit Message

Hannes Reinecke June 19, 2020, 2:01 p.m. UTC
blk_mq_tag_to_rq() is used from within the driver to map a tag
to a request. As such it should only return requests which are
already started (ie passed to the driver); otherwise the driver
might trip over requests which it has never seen and random
crashes will occur.

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 block/blk-mq.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

Comments

Hannes Reinecke June 19, 2020, 2:09 p.m. UTC | #1
On 6/19/20 4:01 PM, Hannes Reinecke wrote:
> blk_mq_tag_to_rq() is used from within the driver to map a tag
> to a request. As such it should only return requests which are
> already started (ie passed to the driver); otherwise the driver
> might trip over requests which it has never seen and random
> crashes will occur.
> 
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> ---
>   block/blk-mq.c | 6 +++++-
>   1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 4f57d27bfa73..f02d18113f9e 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -815,9 +815,13 @@ EXPORT_SYMBOL(blk_mq_delay_kick_requeue_list);
>   
>   struct request *blk_mq_tag_to_rq(struct blk_mq_tags *tags, unsigned int tag)
>   {
> +	struct request *rq;
> +
>   	if (tag < tags->nr_tags) {
>   		prefetch(tags->rqs[tag]);
> -		return tags->rqs[tag];
> +		rq = tags->rqs[tag];
> +		if (blk_mq_request_started(rq))
> +			return rq;
>   	}
>   
>   	return NULL;
> 
This becomes particularly obnoxious for SCSI drivers using 
scsi_host_find_tag() for cleaning up stale commands (ie drivers like 
qla4xxx, fnic, and snic).
All other drivers use it from the completion routine, so one can expect 
a valid (and started) tag here. So for those it shouldn't matter.

But still, if there are objections I could look at fixing it within the 
SCSI stack; although that would most likely mean I'll have to implement 
the above patch as an additional function.

Cheers,

Hannes
Jens Axboe June 19, 2020, 6:38 p.m. UTC | #2
On 6/19/20 8:09 AM, Hannes Reinecke wrote:
> On 6/19/20 4:01 PM, Hannes Reinecke wrote:
>> blk_mq_tag_to_rq() is used from within the driver to map a tag
>> to a request. As such it should only return requests which are
>> already started (ie passed to the driver); otherwise the driver
>> might trip over requests which it has never seen and random
>> crashes will occur.
>>
>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>> ---
>>   block/blk-mq.c | 6 +++++-
>>   1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/block/blk-mq.c b/block/blk-mq.c
>> index 4f57d27bfa73..f02d18113f9e 100644
>> --- a/block/blk-mq.c
>> +++ b/block/blk-mq.c
>> @@ -815,9 +815,13 @@ EXPORT_SYMBOL(blk_mq_delay_kick_requeue_list);
>>   
>>   struct request *blk_mq_tag_to_rq(struct blk_mq_tags *tags, unsigned int tag)
>>   {
>> +	struct request *rq;
>> +
>>   	if (tag < tags->nr_tags) {
>>   		prefetch(tags->rqs[tag]);
>> -		return tags->rqs[tag];
>> +		rq = tags->rqs[tag];
>> +		if (blk_mq_request_started(rq))
>> +			return rq;
>>   	}
>>   
>>   	return NULL;
>>
> This becomes particularly obnoxious for SCSI drivers using 
> scsi_host_find_tag() for cleaning up stale commands (ie drivers like 
> qla4xxx, fnic, and snic).
> All other drivers use it from the completion routine, so one can expect 
> a valid (and started) tag here. So for those it shouldn't matter.
> 
> But still, if there are objections I could look at fixing it within the 
> SCSI stack; although that would most likely mean I'll have to implement 
> the above patch as an additional function.

The helper does exactly what it should, return a request associated
with a tag. Either add the logic to the caller, or provide a new
helper that does what you need. I'd be inclined to just add it to
the caller that needs it.
Bart Van Assche June 19, 2020, 9:59 p.m. UTC | #3
On 2020-06-19 07:09, Hannes Reinecke wrote:
> On 6/19/20 4:01 PM, Hannes Reinecke wrote:
>> blk_mq_tag_to_rq() is used from within the driver to map a tag
>> to a request. As such it should only return requests which are
>> already started (ie passed to the driver); otherwise the driver
>> might trip over requests which it has never seen and random
>> crashes will occur.
>>
>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>> ---
>>   block/blk-mq.c | 6 +++++-
>>   1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/block/blk-mq.c b/block/blk-mq.c
>> index 4f57d27bfa73..f02d18113f9e 100644
>> --- a/block/blk-mq.c
>> +++ b/block/blk-mq.c
>> @@ -815,9 +815,13 @@ EXPORT_SYMBOL(blk_mq_delay_kick_requeue_list);
>>     struct request *blk_mq_tag_to_rq(struct blk_mq_tags *tags,
>> unsigned int tag)
>>   {
>> +    struct request *rq;
>> +
>>       if (tag < tags->nr_tags) {
>>           prefetch(tags->rqs[tag]);
>> -        return tags->rqs[tag];
>> +        rq = tags->rqs[tag];
>> +        if (blk_mq_request_started(rq))
>> +            return rq;
>>       }
>>         return NULL;
>>
> This becomes particularly obnoxious for SCSI drivers using
> scsi_host_find_tag() for cleaning up stale commands (ie drivers like
> qla4xxx, fnic, and snic).
> All other drivers use it from the completion routine, so one can expect
> a valid (and started) tag here. So for those it shouldn't matter.
> 
> But still, if there are objections I could look at fixing it within the
> SCSI stack; although that would most likely mean I'll have to implement
> the above patch as an additional function.

Hi Hannes,

Will the above patch make the fast path of every block driver slightly
slower?

Shouldn't SCSI drivers (and other block drivers) use
blk_mq_tagset_busy_iter() to clean up stale commands instead of
iterating over all tags and calling blk_mq_tag_to_rq() directly?

Thanks,

Bart.
Ming Lei June 19, 2020, 11:49 p.m. UTC | #4
On Fri, Jun 19, 2020 at 12:38:01PM -0600, Jens Axboe wrote:
> On 6/19/20 8:09 AM, Hannes Reinecke wrote:
> > On 6/19/20 4:01 PM, Hannes Reinecke wrote:
> >> blk_mq_tag_to_rq() is used from within the driver to map a tag
> >> to a request. As such it should only return requests which are
> >> already started (ie passed to the driver); otherwise the driver
> >> might trip over requests which it has never seen and random
> >> crashes will occur.
> >>
> >> Signed-off-by: Hannes Reinecke <hare@suse.de>
> >> ---
> >>   block/blk-mq.c | 6 +++++-
> >>   1 file changed, 5 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/block/blk-mq.c b/block/blk-mq.c
> >> index 4f57d27bfa73..f02d18113f9e 100644
> >> --- a/block/blk-mq.c
> >> +++ b/block/blk-mq.c
> >> @@ -815,9 +815,13 @@ EXPORT_SYMBOL(blk_mq_delay_kick_requeue_list);
> >>   
> >>   struct request *blk_mq_tag_to_rq(struct blk_mq_tags *tags, unsigned int tag)
> >>   {
> >> +	struct request *rq;
> >> +
> >>   	if (tag < tags->nr_tags) {
> >>   		prefetch(tags->rqs[tag]);
> >> -		return tags->rqs[tag];
> >> +		rq = tags->rqs[tag];
> >> +		if (blk_mq_request_started(rq))
> >> +			return rq;
> >>   	}
> >>   
> >>   	return NULL;
> >>
> > This becomes particularly obnoxious for SCSI drivers using 
> > scsi_host_find_tag() for cleaning up stale commands (ie drivers like 
> > qla4xxx, fnic, and snic).
> > All other drivers use it from the completion routine, so one can expect 
> > a valid (and started) tag here. So for those it shouldn't matter.
> > 
> > But still, if there are objections I could look at fixing it within the 
> > SCSI stack; although that would most likely mean I'll have to implement 
> > the above patch as an additional function.
> 
> The helper does exactly what it should, return a request associated
> with a tag. Either add the logic to the caller, or provide a new
> helper that does what you need. I'd be inclined to just add it to
> the caller that needs it.

I agree, even though the idea is good for most of driver usage since driver
should only care started request.

It may cause kernel oops in some corner case, such as blk_mq_poll_hybrid(),
blk_poll() may see one not-started request, one example is nvme_execute_rq_polled().

Thanks,
Ming
Hannes Reinecke June 22, 2020, 2:13 p.m. UTC | #5
On 6/19/20 11:59 PM, Bart Van Assche wrote:
> On 2020-06-19 07:09, Hannes Reinecke wrote:
>> On 6/19/20 4:01 PM, Hannes Reinecke wrote:
>>> blk_mq_tag_to_rq() is used from within the driver to map a tag
>>> to a request. As such it should only return requests which are
>>> already started (ie passed to the driver); otherwise the driver
>>> might trip over requests which it has never seen and random
>>> crashes will occur.
>>>
>>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>>> ---
>>>    block/blk-mq.c | 6 +++++-
>>>    1 file changed, 5 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/block/blk-mq.c b/block/blk-mq.c
>>> index 4f57d27bfa73..f02d18113f9e 100644
>>> --- a/block/blk-mq.c
>>> +++ b/block/blk-mq.c
>>> @@ -815,9 +815,13 @@ EXPORT_SYMBOL(blk_mq_delay_kick_requeue_list);
>>>      struct request *blk_mq_tag_to_rq(struct blk_mq_tags *tags,
>>> unsigned int tag)
>>>    {
>>> +    struct request *rq;
>>> +
>>>        if (tag < tags->nr_tags) {
>>>            prefetch(tags->rqs[tag]);
>>> -        return tags->rqs[tag];
>>> +        rq = tags->rqs[tag];
>>> +        if (blk_mq_request_started(rq))
>>> +            return rq;
>>>        }
>>>          return NULL;
>>>
>> This becomes particularly obnoxious for SCSI drivers using
>> scsi_host_find_tag() for cleaning up stale commands (ie drivers like
>> qla4xxx, fnic, and snic).
>> All other drivers use it from the completion routine, so one can expect
>> a valid (and started) tag here. So for those it shouldn't matter.
>>
>> But still, if there are objections I could look at fixing it within the
>> SCSI stack; although that would most likely mean I'll have to implement
>> the above patch as an additional function.
> 
> Hi Hannes,
> 
> Will the above patch make the fast path of every block driver slightly
> slower?
> 
> Shouldn't SCSI drivers (and other block drivers) use
> blk_mq_tagset_busy_iter() to clean up stale commands instead of
> iterating over all tags and calling blk_mq_tag_to_rq() directly?
> 
You can only iterate over all tags with blk_mq_tag_to_rq() if requests 
are identified with tags, ie if there is a 1:1 mapping between tags and 
internal commands.
Quite some drivers have their internal housekeeping (like hpsa), or do 
not track all commands via tags (eg fnic or csiostor).
For those the block layer iterator will not work as designed.

I'm currently preparing a patchset to clean that up (cf my patchset 
'reserved tags for SCSI'), but that will probably take some time until 
it'll be accepted.

And even then some drivers have to rely on scsi_host_find_tag() in eg 
TMF completions to figure out if the command for which the TMF was sent
is still active.

But we can move the check into the drivers if you are worried about 
performance impacts; it's a bit lame if you ask me, but if that's the 
way it should be handled, fine by me.

Cheers,

Hannes

Patch
diff mbox series

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 4f57d27bfa73..f02d18113f9e 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -815,9 +815,13 @@  EXPORT_SYMBOL(blk_mq_delay_kick_requeue_list);
 
 struct request *blk_mq_tag_to_rq(struct blk_mq_tags *tags, unsigned int tag)
 {
+	struct request *rq;
+
 	if (tag < tags->nr_tags) {
 		prefetch(tags->rqs[tag]);
-		return tags->rqs[tag];
+		rq = tags->rqs[tag];
+		if (blk_mq_request_started(rq))
+			return rq;
 	}
 
 	return NULL;