
blk-mq: Return invalid cookie if bio was split

Message ID 20160927172536.53a1315f@tom-ThinkPad-T450 (mailing list archive)
State New, archived

Commit Message

Ming Lei Sept. 27, 2016, 9:25 a.m. UTC
On Mon, 26 Sep 2016 19:00:30 -0400
Keith Busch <keith.busch@intel.com> wrote:

> The only user of polling requires its original request be completed in
> its entirety before continuing execution. If the bio needs to be split
> and chained for any reason, the direct IO path would have waited for just
> that split portion to complete, leading to potential data corruption if
> the remaining transfer has not yet completed.

The issue looks a bit tricky because there is no per-bio place for holding
the cookie, and generic_make_request() only returns the cookie for the
last bio in the current bio list, so maybe we need the following patch too.

The merge case also seems to need handling.

---


> 
> This patch has blk-mq return an invalid cookie if a bio requires splitting
> so that polling does not occur.
> 
> Signed-off-by: Keith Busch <keith.busch@intel.com>
> ---
>  block/blk-mq.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index c207fa9..6385985 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -1311,6 +1311,7 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
>  	unsigned int request_count = 0;
>  	struct blk_plug *plug;
>  	struct request *same_queue_rq = NULL;
> +	struct bio *orig = bio;
>  	blk_qc_t cookie;
>  
>  	blk_queue_bounce(q, &bio);
> @@ -1389,7 +1390,7 @@ run_queue:
>  	}
>  	blk_mq_put_ctx(data.ctx);
>  done:
> -	return cookie;
> +	return bio == orig ? cookie : BLK_QC_T_NONE;
>  }
>  
>  /*
> @@ -1404,6 +1405,7 @@ static blk_qc_t blk_sq_make_request(struct request_queue *q, struct bio *bio)
>  	unsigned int request_count = 0;
>  	struct blk_map_ctx data;
>  	struct request *rq;
> +	struct bio *orig = bio;
>  	blk_qc_t cookie;
>  
>  	blk_queue_bounce(q, &bio);
> @@ -1467,7 +1469,7 @@ run_queue:
>  	}
>  
>  	blk_mq_put_ctx(data.ctx);
> -	return cookie;
> +	return bio == orig ? cookie : BLK_QC_T_NONE;
>  }
>  
>  /*

--
To unsubscribe from this list: send the line "unsubscribe linux-block" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Keith Busch Sept. 27, 2016, 6:24 p.m. UTC | #1
On Tue, Sep 27, 2016 at 05:25:36PM +0800, Ming Lei wrote:
> On Mon, 26 Sep 2016 19:00:30 -0400
> Keith Busch <keith.busch@intel.com> wrote:
> 
> > The only user of polling requires its original request be completed in
> > its entirety before continuing execution. If the bio needs to be split
> > and chained for any reason, the direct IO path would have waited for just
> > that split portion to complete, leading to potential data corruption if
> > the remaining transfer has not yet completed.
> 
> The issue looks a bit tricky because there is no per-bio place for holding
> the cookie, and generic_make_request() only returns the cookie for the
> last bio in the current bio list, so maybe we need the following patch too.
> 
> The merge case also seems to need handling.

Yah, I think you're right. I'll try to work out a test case that more
readily exposes the problem so I can better verify the fix.
Keith Busch Oct. 3, 2016, 10 p.m. UTC | #2
On Tue, Sep 27, 2016 at 05:25:36PM +0800, Ming Lei wrote:
> On Mon, 26 Sep 2016 19:00:30 -0400
> Keith Busch <keith.busch@intel.com> wrote:
> 
> > The only user of polling requires its original request be completed in
> > its entirety before continuing execution. If the bio needs to be split
> > and chained for any reason, the direct IO path would have waited for just
> > that split portion to complete, leading to potential data corruption if
> > the remaining transfer has not yet completed.
> 
> The issue looks a bit tricky because there is no per-bio place for holding
> the cookie, and generic_make_request() only returns the cookie for the
> last bio in the current bio list, so maybe we need the following patch too.

I'm looking more into this, and I can't see why we're returning a cookie
to poll on in the first place. blk_poll is only invoked when we could have
called io_schedule, so we expect the task state to be set to TASK_RUNNING
when all the work completes. So why do we need to poll for a specific
tag instead of polling until the task state is set back to running?

I've tried this out and it seems to work fine, and it should fix any
issues from split IO requests. It also helps direct IO polling, which
can have a list of bios but can only save one cookie.
Ming Lei Oct. 5, 2016, 3:19 a.m. UTC | #3
On Tue, Oct 4, 2016 at 6:00 AM, Keith Busch <keith.busch@intel.com> wrote:
> On Tue, Sep 27, 2016 at 05:25:36PM +0800, Ming Lei wrote:
>> On Mon, 26 Sep 2016 19:00:30 -0400
>> Keith Busch <keith.busch@intel.com> wrote:
>>
>> > The only user of polling requires its original request be completed in
>> > its entirety before continuing execution. If the bio needs to be split
>> > and chained for any reason, the direct IO path would have waited for just
>> > that split portion to complete, leading to potential data corruption if
>> > the remaining transfer has not yet completed.
>>
>> The issue looks a bit tricky because there is no per-bio place for holding
>> the cookie, and generic_make_request() only returns the cookie for the
>> last bio in the current bio list, so maybe we need the following patch too.
>
> I'm looking more into this, and I can't see why we're returning a cookie
> to poll on in the first place. blk_poll is only invoked when we could have
> called io_schedule, so we expect the task state gets set to TASK_RUNNING
> when all the work completes. So why do we need to poll for a specific
> tag instead of polling until task state is set back to running?

But .poll() needs to check whether the specific request has completed,
so that blk_poll() can set 'current' to TASK_RUNNING once it has.

blk_poll():
                ...
                ret = q->mq_ops->poll(hctx, blk_qc_t_to_tag(cookie));
                if (ret > 0) {
                        hctx->poll_success++;
                        set_current_state(TASK_RUNNING);
                        return true;
                }
                ...


>
> I've tried this out and it seems to work fine, and should fix any issues
> from split IO requests. It also helps direct IO polling, since it can
> have a list of bios, but can only save one cookie.

I'd be glad to take a look at the patch once you post it.

Thanks,
Ming Lei

Patch

diff --git a/block/blk-core.c b/block/blk-core.c
index 14d7c0740dc0..f1ab547173f8 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1996,6 +1996,8 @@  blk_qc_t generic_make_request(struct bio *bio)
 {
 	struct bio_list bio_list_on_stack;
 	blk_qc_t ret = BLK_QC_T_NONE;
+	blk_qc_t lret;
+	const struct bio *orig = bio;
 
 	if (!generic_make_request_checks(bio))
 		goto out;
@@ -2036,7 +2038,9 @@  blk_qc_t generic_make_request(struct bio *bio)
 		struct request_queue *q = bdev_get_queue(bio->bi_bdev);
 
 		if (likely(blk_queue_enter(q, false) == 0)) {
-			ret = q->make_request_fn(q, bio);
+			lret = q->make_request_fn(q, bio);
+			if (bio == orig)
+				ret = lret;
 
 			blk_queue_exit(q);