diff mbox series

null_blk: execute complete callback for fake timeout request

Message ID 20230312123556.12298-1-akinobu.mita@gmail.com (mailing list archive)
State New, archived
Headers show
Series null_blk: execute complete callback for fake timeout request | expand

Commit Message

Akinobu Mita March 12, 2023, 12:35 p.m. UTC
When injecting a fake timeout into the null_blk driver by fail_io_timeout,
the request timeout handler doen't execute blk_mq_complete_request(), so
the complete callback will never be executed for that timed out request.

The null_blk driver also has a driver-specific fake timeout mechanism and
does not have the problem that occur when using the generic one.
Fix the problem by doing similar to what the driver-specific one does.

Fixes: de3510e52b0a ("null_blk: fix command timeout completion handling")
Cc: Damien Le Moal <damien.lemoal@wdc.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
---
 drivers/block/null_blk/main.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Comments

Damien Le Moal March 13, 2023, 1 a.m. UTC | #1
On 3/12/23 21:35, Akinobu Mita wrote:
> When injecting a fake timeout into the null_blk driver by fail_io_timeout,
> the request timeout handler doen't execute blk_mq_complete_request(), so
> the complete callback will never be executed for that timed out request.
> 
> The null_blk driver also has a driver-specific fake timeout mechanism and
> does not have the problem that occur when using the generic one.
> Fix the problem by doing similar to what the driver-specific one does.
> 
> Fixes: de3510e52b0a ("null_blk: fix command timeout completion handling")
> Cc: Damien Le Moal <damien.lemoal@wdc.com>
> Cc: Jens Axboe <axboe@kernel.dk>
> Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
> ---
>  drivers/block/null_blk/main.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/block/null_blk/main.c b/drivers/block/null_blk/main.c
> index 4c601ca9552a..69250b3cfecd 100644
> --- a/drivers/block/null_blk/main.c
> +++ b/drivers/block/null_blk/main.c
> @@ -1413,7 +1413,9 @@ static inline void nullb_complete_cmd(struct nullb_cmd *cmd)
>  	case NULL_IRQ_SOFTIRQ:
>  		switch (cmd->nq->dev->queue_mode) {
>  		case NULL_Q_MQ:
> -			if (likely(!blk_should_fake_timeout(cmd->rq->q)))
> +			if (unlikely(blk_should_fake_timeout(cmd->rq->q)))
> +				cmd->fake_timeout = true;
> +			else
>  				blk_mq_complete_request(cmd->rq);
>  			break;
>  		case NULL_Q_BIO:

I have not checked, but does this work ?

diff --git a/drivers/block/null_blk/main.c b/drivers/block/null_blk/main.c
index 4c601ca9552a..52d689aa3171 100644
--- a/drivers/block/null_blk/main.c
+++ b/drivers/block/null_blk/main.c
@@ -1413,7 +1413,7 @@ static inline void nullb_complete_cmd(struct nullb_cmd *cmd)
        case NULL_IRQ_SOFTIRQ:
                switch (cmd->nq->dev->queue_mode) {
                case NULL_Q_MQ:
-                       if (likely(!blk_should_fake_timeout(cmd->rq->q)))
+                       if (!cmd->fake_timeout)
                                blk_mq_complete_request(cmd->rq);
                        break;
                case NULL_Q_BIO:
@@ -1675,7 +1675,8 @@ static blk_status_t null_queue_rq(struct blk_mq_hw_ctx *hctx,
        cmd->rq = bd->rq;
        cmd->error = BLK_STS_OK;
        cmd->nq = nq;
-       cmd->fake_timeout = should_timeout_request(bd->rq);
+       cmd->fake_timeout = should_timeout_request(bd->rq) ||
+               blk_should_fake_timeout(bd->rq->q);

        blk_mq_start_request(bd->rq);


It is I think cleaner as it unifies the internal fake timeout and
blk_should_fake_timeout().
Akinobu Mita March 13, 2023, 4:39 p.m. UTC | #2
2023年3月13日(月) 10:00 Damien Le Moal <damien.lemoal@opensource.wdc.com>:
>
> On 3/12/23 21:35, Akinobu Mita wrote:
> > When injecting a fake timeout into the null_blk driver by fail_io_timeout,
> > the request timeout handler doen't execute blk_mq_complete_request(), so
> > the complete callback will never be executed for that timed out request.
> >
> > The null_blk driver also has a driver-specific fake timeout mechanism and
> > does not have the problem that occur when using the generic one.
> > Fix the problem by doing similar to what the driver-specific one does.
> >
> > Fixes: de3510e52b0a ("null_blk: fix command timeout completion handling")
> > Cc: Damien Le Moal <damien.lemoal@wdc.com>
> > Cc: Jens Axboe <axboe@kernel.dk>
> > Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
> > ---
> >  drivers/block/null_blk/main.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/block/null_blk/main.c b/drivers/block/null_blk/main.c
> > index 4c601ca9552a..69250b3cfecd 100644
> > --- a/drivers/block/null_blk/main.c
> > +++ b/drivers/block/null_blk/main.c
> > @@ -1413,7 +1413,9 @@ static inline void nullb_complete_cmd(struct nullb_cmd *cmd)
> >       case NULL_IRQ_SOFTIRQ:
> >               switch (cmd->nq->dev->queue_mode) {
> >               case NULL_Q_MQ:
> > -                     if (likely(!blk_should_fake_timeout(cmd->rq->q)))
> > +                     if (unlikely(blk_should_fake_timeout(cmd->rq->q)))
> > +                             cmd->fake_timeout = true;
> > +                     else
> >                               blk_mq_complete_request(cmd->rq);
> >                       break;
> >               case NULL_Q_BIO:
>
> I have not checked, but does this work ?

Yes it worked.

Tested-by: Akinobu Mita <akinobu.mita@gmail.com>

> diff --git a/drivers/block/null_blk/main.c b/drivers/block/null_blk/main.c
> index 4c601ca9552a..52d689aa3171 100644
> --- a/drivers/block/null_blk/main.c
> +++ b/drivers/block/null_blk/main.c
> @@ -1413,7 +1413,7 @@ static inline void nullb_complete_cmd(struct nullb_cmd *cmd)
>         case NULL_IRQ_SOFTIRQ:
>                 switch (cmd->nq->dev->queue_mode) {
>                 case NULL_Q_MQ:
> -                       if (likely(!blk_should_fake_timeout(cmd->rq->q)))
> +                       if (!cmd->fake_timeout)
>                                 blk_mq_complete_request(cmd->rq);
>                         break;
>                 case NULL_Q_BIO:
> @@ -1675,7 +1675,8 @@ static blk_status_t null_queue_rq(struct blk_mq_hw_ctx *hctx,
>         cmd->rq = bd->rq;
>         cmd->error = BLK_STS_OK;
>         cmd->nq = nq;
> -       cmd->fake_timeout = should_timeout_request(bd->rq);
> +       cmd->fake_timeout = should_timeout_request(bd->rq) ||
> +               blk_should_fake_timeout(bd->rq->q);
>
>         blk_mq_start_request(bd->rq);
>
>
> It is I think cleaner as it unifies the internal fake timeout and
> blk_should_fake_timeout().

I also prefer this one.
Keith Busch March 13, 2023, 4:50 p.m. UTC | #3
On Mon, Mar 13, 2023 at 10:00:30AM +0900, Damien Le Moal wrote:
> On 3/12/23 21:35, Akinobu Mita wrote:
> 
> diff --git a/drivers/block/null_blk/main.c b/drivers/block/null_blk/main.c
> index 4c601ca9552a..52d689aa3171 100644
> --- a/drivers/block/null_blk/main.c
> +++ b/drivers/block/null_blk/main.c
> @@ -1413,7 +1413,7 @@ static inline void nullb_complete_cmd(struct nullb_cmd *cmd)
>         case NULL_IRQ_SOFTIRQ:
>                 switch (cmd->nq->dev->queue_mode) {
>                 case NULL_Q_MQ:
> -                       if (likely(!blk_should_fake_timeout(cmd->rq->q)))
> +                       if (!cmd->fake_timeout)
>                                 blk_mq_complete_request(cmd->rq);

I think you can remove the fake_timeout check from here now since this function
is never called when it's true.

>                         break;
>                 case NULL_Q_BIO:
> @@ -1675,7 +1675,8 @@ static blk_status_t null_queue_rq(struct blk_mq_hw_ctx *hctx,
>         cmd->rq = bd->rq;
>         cmd->error = BLK_STS_OK;
>         cmd->nq = nq;
> -       cmd->fake_timeout = should_timeout_request(bd->rq);
> +       cmd->fake_timeout = should_timeout_request(bd->rq) ||
> +               blk_should_fake_timeout(bd->rq->q);
> 
>         blk_mq_start_request(bd->rq);
Damien Le Moal March 14, 2023, 3:56 a.m. UTC | #4
On 3/14/23 01:50, Keith Busch wrote:
> On Mon, Mar 13, 2023 at 10:00:30AM +0900, Damien Le Moal wrote:
>> On 3/12/23 21:35, Akinobu Mita wrote:
>>
>> diff --git a/drivers/block/null_blk/main.c b/drivers/block/null_blk/main.c
>> index 4c601ca9552a..52d689aa3171 100644
>> --- a/drivers/block/null_blk/main.c
>> +++ b/drivers/block/null_blk/main.c
>> @@ -1413,7 +1413,7 @@ static inline void nullb_complete_cmd(struct nullb_cmd *cmd)
>>         case NULL_IRQ_SOFTIRQ:
>>                 switch (cmd->nq->dev->queue_mode) {
>>                 case NULL_Q_MQ:
>> -                       if (likely(!blk_should_fake_timeout(cmd->rq->q)))
>> +                       if (!cmd->fake_timeout)
>>                                 blk_mq_complete_request(cmd->rq);
> 
> I think you can remove the fake_timeout check from here now since this function
> is never called when it's true.

Right. Sending a proper patch with that.
diff mbox series

Patch

diff --git a/drivers/block/null_blk/main.c b/drivers/block/null_blk/main.c
index 4c601ca9552a..69250b3cfecd 100644
--- a/drivers/block/null_blk/main.c
+++ b/drivers/block/null_blk/main.c
@@ -1413,7 +1413,9 @@  static inline void nullb_complete_cmd(struct nullb_cmd *cmd)
 	case NULL_IRQ_SOFTIRQ:
 		switch (cmd->nq->dev->queue_mode) {
 		case NULL_Q_MQ:
-			if (likely(!blk_should_fake_timeout(cmd->rq->q)))
+			if (unlikely(blk_should_fake_timeout(cmd->rq->q)))
+				cmd->fake_timeout = true;
+			else
 				blk_mq_complete_request(cmd->rq);
 			break;
 		case NULL_Q_BIO: