[3/4] dm: fix missing bio_split() pattern code in __split_and_process_bio()

Message ID 20190119180506.1300-4-snitzer@redhat.com (mailing list archive)
State New, archived
Series dm: fix various issues with bio splitting code

Commit Message

Mike Snitzer Jan. 19, 2019, 6:05 p.m. UTC
Use the same BIO_QUEUE_ENTERED pattern that was established by commit
cd4a4ae4683dc ("block: don't use blocking queue entered for recursive
bio submits") by setting BIO_QUEUE_ENTERED after bio_split() and before
recursing via generic_make_request().

Also add trace_block_split() because it provides useful context about
bio splits in blktrace.

Depends-on: cd4a4ae4683dc ("block: don't use blocking queue entered for recursive bio submits")
Fixes: 18a25da84354 ("dm: ensure bio submission follows a depth-first tree walk")
Cc: stable@vger.kernel.org # 4.16+
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
 drivers/md/dm.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Ming Lei Jan. 21, 2019, 3:21 a.m. UTC | #1
On Sat, Jan 19, 2019 at 01:05:05PM -0500, Mike Snitzer wrote:
> Use the same BIO_QUEUE_ENTERED pattern that was established by commit
> cd4a4ae4683dc ("block: don't use blocking queue entered for recursive
> bio submits") by setting BIO_QUEUE_ENTERED after bio_split() and before
> recursing via generic_make_request().
> 
> Also add trace_block_split() because it provides useful context about
> bio splits in blktrace.
> 
> Depends-on: cd4a4ae4683dc ("block: don't use blocking queue entered for recursive bio submits")
> Fixes: 18a25da84354 ("dm: ensure bio submission follows a depth-first tree walk")
> Cc: stable@vger.kernel.org # 4.16+
> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> ---
>  drivers/md/dm.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> index fbadda68e23b..6e29c2d99b99 100644
> --- a/drivers/md/dm.c
> +++ b/drivers/md/dm.c
> @@ -1654,7 +1654,9 @@ static blk_qc_t __split_and_process_bio(struct mapped_device *md,
>  						   sectors[op_stat_group(bio_op(bio))], ci.sector_count);
>  				part_stat_unlock();
>  
> +				bio_set_flag(bio, BIO_QUEUE_ENTERED);
>  				bio_chain(b, bio);
> +				trace_block_split(md->queue, b, bio->bi_iter.bi_sector);
>  				ret = generic_make_request(bio);
>  				break;
>  			}

In theory, BIO_QUEUE_ENTERED is only required when __split_and_process_bio() is
called from generic_make_request(). However, it may also be called from dm_wq_work(),
and setting the flag there might cause trouble for operations on q->q_usage_counter.

Thanks,
Ming
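
The concern above can be illustrated with a small userspace model. This is
only a sketch under stated assumptions: model_queue, model_bio and
model_enter() are hypothetical names, "usage" stands in for
q->q_usage_counter, and "frozen" stands in for a queue that is being frozen
or torn down. It is not the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only; these are not the kernel definitions. */
struct model_queue {
	int usage;	/* stands in for q->q_usage_counter */
	bool frozen;	/* queue is being frozen or torn down */
};

struct model_bio {
	bool queue_entered;	/* stands in for BIO_QUEUE_ENTERED */
};

/*
 * Model of the queue-entry step of bio submission.
 * Returns true if a reference was taken, false if the submitter would
 * have had to wait (or fail) because the queue is frozen.
 */
static bool model_enter(struct model_queue *q, const struct model_bio *bio)
{
	assert(q && bio);

	if (bio->queue_entered) {
		/* Flagged bio: assume the caller already holds a reference. */
		q->usage++;
		return true;
	}
	if (q->frozen)
		return false;
	q->usage++;
	return true;
}
```

In the recursive case the caller already holds a reference, so the flagged
path is what avoids waiting on a freeze that can never complete. If the same
flag is set on a path where no reference is held (the dm_wq_work() case in
this model), the counter is bumped on a frozen queue, which is one reading of
the trouble described above.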
NeilBrown Jan. 21, 2019, 4:39 a.m. UTC | #2
On Sat, Jan 19 2019, Mike Snitzer wrote:

> Use the same BIO_QUEUE_ENTERED pattern that was established by commit
> cd4a4ae4683dc ("block: don't use blocking queue entered for recursive
> bio submits") by setting BIO_QUEUE_ENTERED after bio_split() and before
> recursing via generic_make_request().
>
> Also add trace_block_split() because it provides useful context about
> bio splits in blktrace.
>
> Depends-on: cd4a4ae4683dc ("block: don't use blocking queue entered for recursive bio submits")
> Fixes: 18a25da84354 ("dm: ensure bio submission follows a depth-first tree walk")
> Cc: stable@vger.kernel.org # 4.16+
> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> ---
>  drivers/md/dm.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> index fbadda68e23b..6e29c2d99b99 100644
> --- a/drivers/md/dm.c
> +++ b/drivers/md/dm.c
> @@ -1654,7 +1654,9 @@ static blk_qc_t __split_and_process_bio(struct mapped_device *md,
>  						   sectors[op_stat_group(bio_op(bio))], ci.sector_count);
>  				part_stat_unlock();
>  
> +				bio_set_flag(bio, BIO_QUEUE_ENTERED);
>  				bio_chain(b, bio);
> +				trace_block_split(md->queue, b, bio->bi_iter.bi_sector);
>  				ret = generic_make_request(bio);
>  				break;
>  			}
Thanks Mike...

If I understand this correctly, then we need to make the same change for
all other callers of bio_split(), except blk_queue_split().
Maybe we should just set the flag and do the trace in bio_split().
Do you see any harm in doing it that way (in the next merge window; I
don't suggest you change this patch)?

Thanks,
NeilBrown


> -- 
> 2.15.0
>
> --
> dm-devel mailing list
> dm-devel@redhat.com
> https://www.redhat.com/mailman/listinfo/dm-devel
Mike Snitzer Jan. 21, 2019, 4:02 p.m. UTC | #3
On Sun, Jan 20 2019 at 10:21P -0500,
Ming Lei <ming.lei@redhat.com> wrote:

> On Sat, Jan 19, 2019 at 01:05:05PM -0500, Mike Snitzer wrote:
> > Use the same BIO_QUEUE_ENTERED pattern that was established by commit
> > cd4a4ae4683dc ("block: don't use blocking queue entered for recursive
> > bio submits") by setting BIO_QUEUE_ENTERED after bio_split() and before
> > recursing via generic_make_request().
> > 
> > Also add trace_block_split() because it provides useful context about
> > bio splits in blktrace.
> > 
> > Depends-on: cd4a4ae4683dc ("block: don't use blocking queue entered for recursive bio submits")
> > Fixes: 18a25da84354 ("dm: ensure bio submission follows a depth-first tree walk")
> > Cc: stable@vger.kernel.org # 4.16+
> > Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> > ---
> >  drivers/md/dm.c | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> > index fbadda68e23b..6e29c2d99b99 100644
> > --- a/drivers/md/dm.c
> > +++ b/drivers/md/dm.c
> > @@ -1654,7 +1654,9 @@ static blk_qc_t __split_and_process_bio(struct mapped_device *md,
> >  						   sectors[op_stat_group(bio_op(bio))], ci.sector_count);
> >  				part_stat_unlock();
> >  
> > +				bio_set_flag(bio, BIO_QUEUE_ENTERED);
> >  				bio_chain(b, bio);
> > +				trace_block_split(md->queue, b, bio->bi_iter.bi_sector);
> >  				ret = generic_make_request(bio);
> >  				break;
> >  			}
> 
> In theory, BIO_QUEUE_ENTERED is only required when __split_and_process_bio() is
> called from generic_make_request(). However, it may also be called from dm_wq_work(),
> and setting the flag there might cause trouble for operations on q->q_usage_counter.

Good point, I've tweaked this patch to clear BIO_QUEUE_ENTERED in
dm_make_request().

And to Neil's point: yes, these changes really do need to be made
common since it appears all bio_split() callers do go on to call
generic_make_request().

Anyway, here is the updated patch that is now staged in linux-next:

From: Mike Snitzer <snitzer@redhat.com>
Date: Fri, 18 Jan 2019 01:21:11 -0500
Subject: [PATCH v2] dm: fix missing bio_split() pattern code in __split_and_process_bio()

Use the same BIO_QUEUE_ENTERED pattern that was established by commit
cd4a4ae4683dc ("block: don't use blocking queue entered for recursive
bio submits") by setting BIO_QUEUE_ENTERED after bio_split() and before
recursing via generic_make_request().

Also add trace_block_split() because it provides useful context about
bio splits in blktrace.

Depends-on: cd4a4ae4683dc ("block: don't use blocking queue entered for recursive bio submits")
Fixes: 18a25da84354 ("dm: ensure bio submission follows a depth-first tree walk")
Cc: stable@vger.kernel.org # 4.16+
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
 drivers/md/dm.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index fbadda68e23b..25884f833a32 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1654,7 +1654,9 @@ static blk_qc_t __split_and_process_bio(struct mapped_device *md,
 						   sectors[op_stat_group(bio_op(bio))], ci.sector_count);
 				part_stat_unlock();
 
+				bio_set_flag(bio, BIO_QUEUE_ENTERED);
 				bio_chain(b, bio);
+				trace_block_split(md->queue, b, bio->bi_iter.bi_sector);
 				ret = generic_make_request(bio);
 				break;
 			}
@@ -1734,6 +1736,13 @@ static blk_qc_t dm_make_request(struct request_queue *q, struct bio *bio)
 
 	map = dm_get_live_table(md, &srcu_idx);
 
+	/*
+	 * Clear the bio-reentered-generic_make_request() flag,
+	 * will be set again as needed if bio needs to be split.
+	 */
+	if (bio_flagged(bio, BIO_QUEUE_ENTERED))
+		bio_clear_flag(bio, BIO_QUEUE_ENTERED);
+
 	/* if we're suspended, we have to queue this io for later */
 	if (unlikely(test_bit(DMF_BLOCK_IO_FOR_SUSPEND, &md->flags))) {
 		dm_put_live_table(md, srcu_idx);
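
The v2 pattern above (clear the flag on entry, set it again only if a split
happens) can be sketched as a small userspace model. All names here
(model_bio, MODEL_QUEUE_ENTERED, model_make_request_entry(), model_split())
are hypothetical and exist only for illustration; the split is reduced to
sector arithmetic.

```c
#include <assert.h>

#define MODEL_QUEUE_ENTERED (1u << 0)	/* stands in for BIO_QUEUE_ENTERED */

/* Userspace model only; not the kernel's struct bio. */
struct model_bio {
	unsigned int flags;
	unsigned int sectors;	/* sectors still to be processed */
};

/* Entry from the layer above: drop a possibly stale flag. */
static void model_make_request_entry(struct model_bio *bio)
{
	bio->flags &= ~MODEL_QUEUE_ENTERED;
}

/*
 * Process at most max sectors now. If the bio is larger than that, the
 * remainder stays in *bio and is flagged before it would be resubmitted.
 * Returns the number of sectors handled in this pass.
 */
static unsigned int model_split(struct model_bio *bio, unsigned int max)
{
	assert(max > 0);

	if (bio->sectors <= max)
		return bio->sectors;	/* no split, flag stays clear */

	bio->sectors -= max;
	bio->flags |= MODEL_QUEUE_ENTERED;
	return max;
}
```

In this model a bio that arrives with a leftover flag and does not need
splitting ends up unflagged, while the remainder of a split bio is flagged
just before resubmission.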
Ming Lei Jan. 22, 2019, 2:46 a.m. UTC | #4
On Mon, Jan 21, 2019 at 11:02:04AM -0500, Mike Snitzer wrote:
> On Sun, Jan 20 2019 at 10:21P -0500,
> Ming Lei <ming.lei@redhat.com> wrote:
> 
> > On Sat, Jan 19, 2019 at 01:05:05PM -0500, Mike Snitzer wrote:
> > > Use the same BIO_QUEUE_ENTERED pattern that was established by commit
> > > cd4a4ae4683dc ("block: don't use blocking queue entered for recursive
> > > bio submits") by setting BIO_QUEUE_ENTERED after bio_split() and before
> > > recursing via generic_make_request().
> > > 
> > > Also add trace_block_split() because it provides useful context about
> > > bio splits in blktrace.
> > > 
> > > Depends-on: cd4a4ae4683dc ("block: don't use blocking queue entered for recursive bio submits")
> > > Fixes: 18a25da84354 ("dm: ensure bio submission follows a depth-first tree walk")
> > > Cc: stable@vger.kernel.org # 4.16+
> > > Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> > > ---
> > >  drivers/md/dm.c | 2 ++
> > >  1 file changed, 2 insertions(+)
> > > 
> > > diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> > > index fbadda68e23b..6e29c2d99b99 100644
> > > --- a/drivers/md/dm.c
> > > +++ b/drivers/md/dm.c
> > > @@ -1654,7 +1654,9 @@ static blk_qc_t __split_and_process_bio(struct mapped_device *md,
> > >  						   sectors[op_stat_group(bio_op(bio))], ci.sector_count);
> > >  				part_stat_unlock();
> > >  
> > > +				bio_set_flag(bio, BIO_QUEUE_ENTERED);
> > >  				bio_chain(b, bio);
> > > +				trace_block_split(md->queue, b, bio->bi_iter.bi_sector);
> > >  				ret = generic_make_request(bio);
> > >  				break;
> > >  			}
> > 
> > > In theory, BIO_QUEUE_ENTERED is only required when __split_and_process_bio() is
> > > called from generic_make_request(). However, it may also be called from dm_wq_work(),
> > > and setting the flag there might cause trouble for operations on q->q_usage_counter.
> 
> Good point, I've tweaked this patch to clear BIO_QUEUE_ENTERED in
> dm_make_request().
> 
> > And to Neil's point: yes, these changes really do need to be made
> common since it appears all bio_split() callers do go on to call
> generic_make_request().
> 
> Anyway, here is the updated patch that is now staged in linux-next:
> 
> From: Mike Snitzer <snitzer@redhat.com>
> Date: Fri, 18 Jan 2019 01:21:11 -0500
> Subject: [PATCH v2] dm: fix missing bio_split() pattern code in __split_and_process_bio()
> 
> Use the same BIO_QUEUE_ENTERED pattern that was established by commit
> cd4a4ae4683dc ("block: don't use blocking queue entered for recursive
> bio submits") by setting BIO_QUEUE_ENTERED after bio_split() and before
> recursing via generic_make_request().
> 
> Also add trace_block_split() because it provides useful context about
> bio splits in blktrace.
> 
> Depends-on: cd4a4ae4683dc ("block: don't use blocking queue entered for recursive bio submits")
> Fixes: 18a25da84354 ("dm: ensure bio submission follows a depth-first tree walk")
> Cc: stable@vger.kernel.org # 4.16+
> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> ---
>  drivers/md/dm.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> index fbadda68e23b..25884f833a32 100644
> --- a/drivers/md/dm.c
> +++ b/drivers/md/dm.c
> @@ -1654,7 +1654,9 @@ static blk_qc_t __split_and_process_bio(struct mapped_device *md,
>  						   sectors[op_stat_group(bio_op(bio))], ci.sector_count);
>  				part_stat_unlock();
>  
> +				bio_set_flag(bio, BIO_QUEUE_ENTERED);
>  				bio_chain(b, bio);
> +				trace_block_split(md->queue, b, bio->bi_iter.bi_sector);
>  				ret = generic_make_request(bio);
>  				break;
>  			}
> @@ -1734,6 +1736,13 @@ static blk_qc_t dm_make_request(struct request_queue *q, struct bio *bio)
>  
>  	map = dm_get_live_table(md, &srcu_idx);
>  
> +	/*
> +	 * Clear the bio-reentered-generic_make_request() flag,
> +	 * will be set again as needed if bio needs to be split.
> +	 */
> +	if (bio_flagged(bio, BIO_QUEUE_ENTERED))
> +		bio_clear_flag(bio, BIO_QUEUE_ENTERED);
> +
>  	/* if we're suspended, we have to queue this io for later */
>  	if (unlikely(test_bit(DMF_BLOCK_IO_FOR_SUSPEND, &md->flags))) {
>  		dm_put_live_table(md, srcu_idx);
> -- 
> 2.15.0
> 

Hi Mike,

I'd suggest fixing this kind of issue in the following way, so that we
can avoid touching this flag from drivers:

diff --git a/block/blk-core.c b/block/blk-core.c
index 3c5f61ceeb67..e70103560ac2 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1024,6 +1024,8 @@ blk_qc_t generic_make_request(struct bio *bio)
 		else
 			bio_io_error(bio);
 		return ret;
+	} else {
+		bio_set_flag(bio, BIO_QUEUE_ENTERED);
 	}
 
 	if (!generic_make_request_checks(bio))
@@ -1074,6 +1076,8 @@ blk_qc_t generic_make_request(struct bio *bio)
 			if (blk_queue_enter(q, flags) < 0) {
 				enter_succeeded = false;
 				q = NULL;
+			} else {
+				bio_set_flag(bio, BIO_QUEUE_ENTERED);
 			}
 		}
 
diff --git a/block/blk-merge.c b/block/blk-merge.c
index b990853f6de7..8777e286bd3f 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -339,16 +339,6 @@ void blk_queue_split(struct request_queue *q, struct bio **bio)
 		/* there isn't chance to merge the splitted bio */
 		split->bi_opf |= REQ_NOMERGE;
 
-		/*
-		 * Since we're recursing into make_request here, ensure
-		 * that we mark this bio as already having entered the queue.
-		 * If not, and the queue is going away, we can get stuck
-		 * forever on waiting for the queue reference to drop. But
-		 * that will never happen, as we're already holding a
-		 * reference to it.
-		 */
-		bio_set_flag(*bio, BIO_QUEUE_ENTERED);
-
 		bio_chain(split, *bio);
 		trace_block_split(q, split, (*bio)->bi_iter.bi_sector);
 		generic_make_request(*bio);

Thanks,
Ming
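
The idea in the proposal above, letting the submission path set the flag
itself once the queue has been entered, can be sketched as a userspace model.
The names model_queue, model_bio and model_submit() are hypothetical, and
waiting on a frozen queue is reduced to returning false.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only; these are not the kernel definitions. */
struct model_queue {
	int usage;	/* stands in for q->q_usage_counter */
	bool frozen;	/* queue is being frozen or torn down */
};

struct model_bio {
	bool queue_entered;	/* stands in for BIO_QUEUE_ENTERED */
};

/*
 * Model of a submission path that marks the bio centrally: the flag is
 * set on the first successful entry, so a later resubmission of the same
 * bio (for example the remainder after a split) takes the non-waiting path
 * without any driver having touched the flag.
 */
static bool model_submit(struct model_queue *q, struct model_bio *bio)
{
	assert(q && bio);

	if (bio->queue_entered) {
		q->usage++;	/* already entered once, do not wait */
		return true;
	}
	if (q->frozen)
		return false;	/* would wait or fail here */

	q->usage++;
	bio->queue_entered = true;	/* set by the core, not the driver */
	return true;
}
```

The model does not drop references on completion and does not cover a bio
that moves to a different queue while still carrying the flag.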
Mike Snitzer Jan. 22, 2019, 3:17 a.m. UTC | #5
On Mon, Jan 21 2019 at  9:46pm -0500,
Ming Lei <ming.lei@redhat.com> wrote:

> On Mon, Jan 21, 2019 at 11:02:04AM -0500, Mike Snitzer wrote:
> > On Sun, Jan 20 2019 at 10:21P -0500,
> > Ming Lei <ming.lei@redhat.com> wrote:
> > 
> > > On Sat, Jan 19, 2019 at 01:05:05PM -0500, Mike Snitzer wrote:
> > > > Use the same BIO_QUEUE_ENTERED pattern that was established by commit
> > > > cd4a4ae4683dc ("block: don't use blocking queue entered for recursive
> > > > bio submits") by setting BIO_QUEUE_ENTERED after bio_split() and before
> > > > recursing via generic_make_request().
> > > > 
> > > > Also add trace_block_split() because it provides useful context about
> > > > bio splits in blktrace.
> > > > 
> > > > Depends-on: cd4a4ae4683dc ("block: don't use blocking queue entered for recursive bio submits")
> > > > Fixes: 18a25da84354 ("dm: ensure bio submission follows a depth-first tree walk")
> > > > Cc: stable@vger.kernel.org # 4.16+
> > > > Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> > > > ---
> > > >  drivers/md/dm.c | 2 ++
> > > >  1 file changed, 2 insertions(+)
> > > > 
> > > > diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> > > > index fbadda68e23b..6e29c2d99b99 100644
> > > > --- a/drivers/md/dm.c
> > > > +++ b/drivers/md/dm.c
> > > > @@ -1654,7 +1654,9 @@ static blk_qc_t __split_and_process_bio(struct mapped_device *md,
> > > >  						   sectors[op_stat_group(bio_op(bio))], ci.sector_count);
> > > >  				part_stat_unlock();
> > > >  
> > > > +				bio_set_flag(bio, BIO_QUEUE_ENTERED);
> > > >  				bio_chain(b, bio);
> > > > +				trace_block_split(md->queue, b, bio->bi_iter.bi_sector);
> > > >  				ret = generic_make_request(bio);
> > > >  				break;
> > > >  			}
> > > 
> > > > In theory, BIO_QUEUE_ENTERED is only required when __split_and_process_bio() is
> > > > called from generic_make_request(). However, it may also be called from dm_wq_work(),
> > > > and setting the flag there might cause trouble for operations on q->q_usage_counter.
> > 
> > Good point, I've tweaked this patch to clear BIO_QUEUE_ENTERED in
> > dm_make_request().
> > 
> > > And to Neil's point: yes, these changes really do need to be made
> > common since it appears all bio_split() callers do go on to call
> > generic_make_request().
> > 
> > Anyway, here is the updated patch that is now staged in linux-next:
> > 
> > From: Mike Snitzer <snitzer@redhat.com>
> > Date: Fri, 18 Jan 2019 01:21:11 -0500
> > Subject: [PATCH v2] dm: fix missing bio_split() pattern code in __split_and_process_bio()
> > 
> > Use the same BIO_QUEUE_ENTERED pattern that was established by commit
> > cd4a4ae4683dc ("block: don't use blocking queue entered for recursive
> > bio submits") by setting BIO_QUEUE_ENTERED after bio_split() and before
> > recursing via generic_make_request().
> > 
> > Also add trace_block_split() because it provides useful context about
> > bio splits in blktrace.
> > 
> > Depends-on: cd4a4ae4683dc ("block: don't use blocking queue entered for recursive bio submits")
> > Fixes: 18a25da84354 ("dm: ensure bio submission follows a depth-first tree walk")
> > Cc: stable@vger.kernel.org # 4.16+
> > Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> > ---
> >  drivers/md/dm.c | 9 +++++++++
> >  1 file changed, 9 insertions(+)
> > 
> > diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> > index fbadda68e23b..25884f833a32 100644
> > --- a/drivers/md/dm.c
> > +++ b/drivers/md/dm.c
> > @@ -1654,7 +1654,9 @@ static blk_qc_t __split_and_process_bio(struct mapped_device *md,
> >  						   sectors[op_stat_group(bio_op(bio))], ci.sector_count);
> >  				part_stat_unlock();
> >  
> > +				bio_set_flag(bio, BIO_QUEUE_ENTERED);
> >  				bio_chain(b, bio);
> > +				trace_block_split(md->queue, b, bio->bi_iter.bi_sector);
> >  				ret = generic_make_request(bio);
> >  				break;
> >  			}
> > @@ -1734,6 +1736,13 @@ static blk_qc_t dm_make_request(struct request_queue *q, struct bio *bio)
> >  
> >  	map = dm_get_live_table(md, &srcu_idx);
> >  
> > +	/*
> > +	 * Clear the bio-reentered-generic_make_request() flag,
> > +	 * will be set again as needed if bio needs to be split.
> > +	 */
> > +	if (bio_flagged(bio, BIO_QUEUE_ENTERED))
> > +		bio_clear_flag(bio, BIO_QUEUE_ENTERED);
> > +
> >  	/* if we're suspended, we have to queue this io for later */
> >  	if (unlikely(test_bit(DMF_BLOCK_IO_FOR_SUSPEND, &md->flags))) {
> >  		dm_put_live_table(md, srcu_idx);
> > -- 
> > 2.15.0
> > 
> 
> Hi Mike,
> 
> I'd suggest fixing this kind of issue in the following way, so that we
> can avoid touching this flag from drivers:
> 
> diff --git a/block/blk-core.c b/block/blk-core.c
> index 3c5f61ceeb67..e70103560ac2 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -1024,6 +1024,8 @@ blk_qc_t generic_make_request(struct bio *bio)
>  		else
>  			bio_io_error(bio);
>  		return ret;
> +	} else {
> +		bio_set_flag(bio, BIO_QUEUE_ENTERED);
>  	}
>  
>  	if (!generic_make_request_checks(bio))
> @@ -1074,6 +1076,8 @@ blk_qc_t generic_make_request(struct bio *bio)
>  			if (blk_queue_enter(q, flags) < 0) {
>  				enter_succeeded = false;
>  				q = NULL;
> +			} else {
> +				bio_set_flag(bio, BIO_QUEUE_ENTERED);
>  			}
>  		}
>  
> diff --git a/block/blk-merge.c b/block/blk-merge.c
> index b990853f6de7..8777e286bd3f 100644
> --- a/block/blk-merge.c
> +++ b/block/blk-merge.c
> @@ -339,16 +339,6 @@ void blk_queue_split(struct request_queue *q, struct bio **bio)
>  		/* there isn't chance to merge the splitted bio */
>  		split->bi_opf |= REQ_NOMERGE;
>  
> -		/*
> -		 * Since we're recursing into make_request here, ensure
> -		 * that we mark this bio as already having entered the queue.
> -		 * If not, and the queue is going away, we can get stuck
> -		 * forever on waiting for the queue reference to drop. But
> -		 * that will never happen, as we're already holding a
> -		 * reference to it.
> -		 */
> -		bio_set_flag(*bio, BIO_QUEUE_ENTERED);
> -
>  		bio_chain(split, *bio);
>  		trace_block_split(q, split, (*bio)->bi_iter.bi_sector);
>  		generic_make_request(*bio);
> 

Not opposed to this.

Thanks,
Mike
Mike Snitzer Jan. 22, 2019, 3:35 a.m. UTC | #6
On Mon, Jan 21 2019 at 10:17pm -0500,
Mike Snitzer <snitzer@redhat.com> wrote:

> On Mon, Jan 21 2019 at  9:46pm -0500,
> Ming Lei <ming.lei@redhat.com> wrote:
> 
> > On Mon, Jan 21, 2019 at 11:02:04AM -0500, Mike Snitzer wrote:
> > > On Sun, Jan 20 2019 at 10:21P -0500,
> > > Ming Lei <ming.lei@redhat.com> wrote:
> > > 
> > > > On Sat, Jan 19, 2019 at 01:05:05PM -0500, Mike Snitzer wrote:
> > > > > Use the same BIO_QUEUE_ENTERED pattern that was established by commit
> > > > > cd4a4ae4683dc ("block: don't use blocking queue entered for recursive
> > > > > bio submits") by setting BIO_QUEUE_ENTERED after bio_split() and before
> > > > > recursing via generic_make_request().
> > > > > 
> > > > > Also add trace_block_split() because it provides useful context about
> > > > > bio splits in blktrace.
> > > > > 
> > > > > Depends-on: cd4a4ae4683dc ("block: don't use blocking queue entered for recursive bio submits")
> > > > > Fixes: 18a25da84354 ("dm: ensure bio submission follows a depth-first tree walk")
> > > > > Cc: stable@vger.kernel.org # 4.16+
> > > > > Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> > > > > ---
> > > > >  drivers/md/dm.c | 2 ++
> > > > >  1 file changed, 2 insertions(+)
> > > > > 
> > > > > diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> > > > > index fbadda68e23b..6e29c2d99b99 100644
> > > > > --- a/drivers/md/dm.c
> > > > > +++ b/drivers/md/dm.c
> > > > > @@ -1654,7 +1654,9 @@ static blk_qc_t __split_and_process_bio(struct mapped_device *md,
> > > > >  						   sectors[op_stat_group(bio_op(bio))], ci.sector_count);
> > > > >  				part_stat_unlock();
> > > > >  
> > > > > +				bio_set_flag(bio, BIO_QUEUE_ENTERED);
> > > > >  				bio_chain(b, bio);
> > > > > +				trace_block_split(md->queue, b, bio->bi_iter.bi_sector);
> > > > >  				ret = generic_make_request(bio);
> > > > >  				break;
> > > > >  			}
> > > > 
> > > > > In theory, BIO_QUEUE_ENTERED is only required when __split_and_process_bio() is
> > > > > called from generic_make_request(). However, it may also be called from dm_wq_work(),
> > > > > and setting the flag there might cause trouble for operations on q->q_usage_counter.
> > > 
> > > Good point, I've tweaked this patch to clear BIO_QUEUE_ENTERED in
> > > dm_make_request().
> > > 
> > > > And to Neil's point: yes, these changes really do need to be made
> > > common since it appears all bio_split() callers do go on to call
> > > generic_make_request().
> > > 
> > > Anyway, here is the updated patch that is now staged in linux-next:
> > > 
> > > From: Mike Snitzer <snitzer@redhat.com>
> > > Date: Fri, 18 Jan 2019 01:21:11 -0500
> > > Subject: [PATCH v2] dm: fix missing bio_split() pattern code in __split_and_process_bio()
> > > 
> > > Use the same BIO_QUEUE_ENTERED pattern that was established by commit
> > > cd4a4ae4683dc ("block: don't use blocking queue entered for recursive
> > > bio submits") by setting BIO_QUEUE_ENTERED after bio_split() and before
> > > recursing via generic_make_request().
> > > 
> > > Also add trace_block_split() because it provides useful context about
> > > bio splits in blktrace.
> > > 
> > > Depends-on: cd4a4ae4683dc ("block: don't use blocking queue entered for recursive bio submits")
> > > Fixes: 18a25da84354 ("dm: ensure bio submission follows a depth-first tree walk")
> > > Cc: stable@vger.kernel.org # 4.16+
> > > Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> > > ---
> > >  drivers/md/dm.c | 9 +++++++++
> > >  1 file changed, 9 insertions(+)
> > > 
> > > diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> > > index fbadda68e23b..25884f833a32 100644
> > > --- a/drivers/md/dm.c
> > > +++ b/drivers/md/dm.c
> > > @@ -1654,7 +1654,9 @@ static blk_qc_t __split_and_process_bio(struct mapped_device *md,
> > >  						   sectors[op_stat_group(bio_op(bio))], ci.sector_count);
> > >  				part_stat_unlock();
> > >  
> > > +				bio_set_flag(bio, BIO_QUEUE_ENTERED);
> > >  				bio_chain(b, bio);
> > > +				trace_block_split(md->queue, b, bio->bi_iter.bi_sector);
> > >  				ret = generic_make_request(bio);
> > >  				break;
> > >  			}
> > > @@ -1734,6 +1736,13 @@ static blk_qc_t dm_make_request(struct request_queue *q, struct bio *bio)
> > >  
> > >  	map = dm_get_live_table(md, &srcu_idx);
> > >  
> > > +	/*
> > > +	 * Clear the bio-reentered-generic_make_request() flag,
> > > +	 * will be set again as needed if bio needs to be split.
> > > +	 */
> > > +	if (bio_flagged(bio, BIO_QUEUE_ENTERED))
> > > +		bio_clear_flag(bio, BIO_QUEUE_ENTERED);
> > > +
> > >  	/* if we're suspended, we have to queue this io for later */
> > >  	if (unlikely(test_bit(DMF_BLOCK_IO_FOR_SUSPEND, &md->flags))) {
> > >  		dm_put_live_table(md, srcu_idx);
> > > -- 
> > > 2.15.0
> > > 
> > 
> > Hi Mike,
> > 
> > I'd suggest fixing this kind of issue in the following way, so that we
> > can avoid touching this flag from drivers:
> > 
> > diff --git a/block/blk-core.c b/block/blk-core.c
> > index 3c5f61ceeb67..e70103560ac2 100644
> > --- a/block/blk-core.c
> > +++ b/block/blk-core.c
> > @@ -1024,6 +1024,8 @@ blk_qc_t generic_make_request(struct bio *bio)
> >  		else
> >  			bio_io_error(bio);
> >  		return ret;
> > +	} else {
> > +		bio_set_flag(bio, BIO_QUEUE_ENTERED);
> >  	}
> >  
> >  	if (!generic_make_request_checks(bio))
> > @@ -1074,6 +1076,8 @@ blk_qc_t generic_make_request(struct bio *bio)
> >  			if (blk_queue_enter(q, flags) < 0) {
> >  				enter_succeeded = false;
> >  				q = NULL;
> > +			} else {
> > +				bio_set_flag(bio, BIO_QUEUE_ENTERED);
> >  			}
> >  		}
> >  
> > diff --git a/block/blk-merge.c b/block/blk-merge.c
> > index b990853f6de7..8777e286bd3f 100644
> > --- a/block/blk-merge.c
> > +++ b/block/blk-merge.c
> > @@ -339,16 +339,6 @@ void blk_queue_split(struct request_queue *q, struct bio **bio)
> >  		/* there isn't chance to merge the splitted bio */
> >  		split->bi_opf |= REQ_NOMERGE;
> >  
> > -		/*
> > -		 * Since we're recursing into make_request here, ensure
> > -		 * that we mark this bio as already having entered the queue.
> > -		 * If not, and the queue is going away, we can get stuck
> > -		 * forever on waiting for the queue reference to drop. But
> > -		 * that will never happen, as we're already holding a
> > -		 * reference to it.
> > -		 */
> > -		bio_set_flag(*bio, BIO_QUEUE_ENTERED);
> > -
> >  		bio_chain(split, *bio);
> >  		trace_block_split(q, split, (*bio)->bi_iter.bi_sector);
> >  		generic_make_request(*bio);
> > 
> 
> Not opposed to this.

But thinking further: when you have a stack of cascading
q->make_request_fn it could easily be that work done at the next layer
down ends up causing the bio to recurse to generic_make_request(), but not
directly (e.g. dm_wq_work)... yet BIO_QUEUE_ENTERED will still be set
when it really isn't appropriate.

Getting too cute with setting bio flags but not clearing them on
different device boundaries could render the flags useless (or worse:
incorrect).

I'm not out to engage in a focused audit/churn in this area that
becomes a slippery slope during the rest of 5.0-rcX.

That is why I was going for a local DM change for 5.0 and, in parallel,
work on the more generic fixes for 5.1.

So I'm back to preferring that...

But if you, Jens or others feel strongly about it I'm open to discuss it
further.

I think we need to set REQ_NOMERGE on the split too (like
blk_queue_split() is doing).  Again, a comprehensive cleanup and
consolidation of the bio_split+generic_make_request pattern is needed.  MD
has a lot of it, DM has it, and then there is blk_queue_split().
Basically blk_queue_split()'s bio_split+bio_chain+generic_make_request
and all the flags that get set in between should be factored out for all
to use.

Mike
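
The factoring-out suggested above might look roughly like the following
userspace sketch. It is an assumption about the shape of such a helper, not
an existing kernel function: model_split_and_prepare() and the MODEL_* flags
are hypothetical, a single flags word stands in for what the kernel keeps in
separate bio fields, and tracing and the final resubmission are left to the
caller.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_QUEUE_ENTERED	(1u << 0)	/* stands in for BIO_QUEUE_ENTERED */
#define MODEL_NOMERGE		(1u << 1)	/* stands in for REQ_NOMERGE */

/* Userspace model only; not the kernel's struct bio. */
struct model_bio {
	unsigned int sector;		/* start sector */
	unsigned int nr_sectors;	/* length in sectors */
	unsigned int flags;
	struct model_bio *parent;	/* set when chained to another bio */
};

/*
 * Hypothetical common helper: carve front_sectors off the front of *bio
 * into *split, mark the split as not mergeable, chain it to the remainder,
 * and flag the remainder as having already entered the queue.
 * Returns 1 if a split happened, 0 if the bio was left untouched.
 */
static int model_split_and_prepare(struct model_bio *bio,
				   unsigned int front_sectors,
				   struct model_bio *split)
{
	assert(bio && split);

	if (front_sectors == 0 || front_sectors >= bio->nr_sectors)
		return 0;

	split->sector = bio->sector;
	split->nr_sectors = front_sectors;
	split->flags = MODEL_NOMERGE;
	split->parent = bio;			/* models bio_chain(split, bio) */

	bio->sector += front_sectors;
	bio->nr_sectors -= front_sectors;
	bio->flags |= MODEL_QUEUE_ENTERED;	/* remainder will be resubmitted */
	return 1;
}
```

With one helper of this shape, every caller that splits and resubmits would
get the same flag handling instead of repeating it per driver.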
Ming Lei Jan. 22, 2019, 3:49 a.m. UTC | #7
On Mon, Jan 21, 2019 at 10:35:11PM -0500, Mike Snitzer wrote:
> On Mon, Jan 21 2019 at 10:17pm -0500,
> Mike Snitzer <snitzer@redhat.com> wrote:
> 
> > On Mon, Jan 21 2019 at  9:46pm -0500,
> > Ming Lei <ming.lei@redhat.com> wrote:
> > 
> > > On Mon, Jan 21, 2019 at 11:02:04AM -0500, Mike Snitzer wrote:
> > > > On Sun, Jan 20 2019 at 10:21P -0500,
> > > > Ming Lei <ming.lei@redhat.com> wrote:
> > > > 
> > > > > On Sat, Jan 19, 2019 at 01:05:05PM -0500, Mike Snitzer wrote:
> > > > > > Use the same BIO_QUEUE_ENTERED pattern that was established by commit
> > > > > > cd4a4ae4683dc ("block: don't use blocking queue entered for recursive
> > > > > > bio submits") by setting BIO_QUEUE_ENTERED after bio_split() and before
> > > > > > recursing via generic_make_request().
> > > > > > 
> > > > > > Also add trace_block_split() because it provides useful context about
> > > > > > bio splits in blktrace.
> > > > > > 
> > > > > > Depends-on: cd4a4ae4683dc ("block: don't use blocking queue entered for recursive bio submits")
> > > > > > Fixes: 18a25da84354 ("dm: ensure bio submission follows a depth-first tree walk")
> > > > > > Cc: stable@vger.kernel.org # 4.16+
> > > > > > Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> > > > > > ---
> > > > > >  drivers/md/dm.c | 2 ++
> > > > > >  1 file changed, 2 insertions(+)
> > > > > > 
> > > > > > diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> > > > > > index fbadda68e23b..6e29c2d99b99 100644
> > > > > > --- a/drivers/md/dm.c
> > > > > > +++ b/drivers/md/dm.c
> > > > > > @@ -1654,7 +1654,9 @@ static blk_qc_t __split_and_process_bio(struct mapped_device *md,
> > > > > >  						   sectors[op_stat_group(bio_op(bio))], ci.sector_count);
> > > > > >  				part_stat_unlock();
> > > > > >  
> > > > > > +				bio_set_flag(bio, BIO_QUEUE_ENTERED);
> > > > > >  				bio_chain(b, bio);
> > > > > > +				trace_block_split(md->queue, b, bio->bi_iter.bi_sector);
> > > > > >  				ret = generic_make_request(bio);
> > > > > >  				break;
> > > > > >  			}
> > > > > 
> > > > > > In theory, BIO_QUEUE_ENTERED is only required when __split_and_process_bio() is
> > > > > > called from generic_make_request(). However, it may also be called from dm_wq_work(),
> > > > > > and setting the flag there might cause trouble for operations on q->q_usage_counter.
> > > > 
> > > > Good point, I've tweaked this patch to clear BIO_QUEUE_ENTERED in
> > > > dm_make_request().
> > > > 
> > > > > And to Neil's point: yes, these changes really do need to be made
> > > > common since it appears all bio_split() callers do go on to call
> > > > generic_make_request().
> > > > 
> > > > Anyway, here is the updated patch that is now staged in linux-next:
> > > > 
> > > > From: Mike Snitzer <snitzer@redhat.com>
> > > > Date: Fri, 18 Jan 2019 01:21:11 -0500
> > > > Subject: [PATCH v2] dm: fix missing bio_split() pattern code in __split_and_process_bio()
> > > > 
> > > > Use the same BIO_QUEUE_ENTERED pattern that was established by commit
> > > > cd4a4ae4683dc ("block: don't use blocking queue entered for recursive
> > > > bio submits") by setting BIO_QUEUE_ENTERED after bio_split() and before
> > > > recursing via generic_make_request().
> > > > 
> > > > Also add trace_block_split() because it provides useful context about
> > > > bio splits in blktrace.
> > > > 
> > > > Depends-on: cd4a4ae4683dc ("block: don't use blocking queue entered for recursive bio submits")
> > > > Fixes: 18a25da84354 ("dm: ensure bio submission follows a depth-first tree walk")
> > > > Cc: stable@vger.kernel.org # 4.16+
> > > > Signed-off-by: Mike Snitzer <snitzer@redhat.com>
> > > > ---
> > > >  drivers/md/dm.c | 9 +++++++++
> > > >  1 file changed, 9 insertions(+)
> > > > 
> > > > diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> > > > index fbadda68e23b..25884f833a32 100644
> > > > --- a/drivers/md/dm.c
> > > > +++ b/drivers/md/dm.c
> > > > @@ -1654,7 +1654,9 @@ static blk_qc_t __split_and_process_bio(struct mapped_device *md,
> > > >  						   sectors[op_stat_group(bio_op(bio))], ci.sector_count);
> > > >  				part_stat_unlock();
> > > >  
> > > > +				bio_set_flag(bio, BIO_QUEUE_ENTERED);
> > > >  				bio_chain(b, bio);
> > > > +				trace_block_split(md->queue, b, bio->bi_iter.bi_sector);
> > > >  				ret = generic_make_request(bio);
> > > >  				break;
> > > >  			}
> > > > @@ -1734,6 +1736,13 @@ static blk_qc_t dm_make_request(struct request_queue *q, struct bio *bio)
> > > >  
> > > >  	map = dm_get_live_table(md, &srcu_idx);
> > > >  
> > > > +	/*
> > > > +	 * Clear the bio-reentered-generic_make_request() flag,
> > > > +	 * will be set again as needed if bio needs to be split.
> > > > +	 */
> > > > +	if (bio_flagged(bio, BIO_QUEUE_ENTERED))
> > > > +		bio_clear_flag(bio, BIO_QUEUE_ENTERED);
> > > > +
> > > >  	/* if we're suspended, we have to queue this io for later */
> > > >  	if (unlikely(test_bit(DMF_BLOCK_IO_FOR_SUSPEND, &md->flags))) {
> > > >  		dm_put_live_table(md, srcu_idx);
> > > > -- 
> > > > 2.15.0
> > > > 
> > > 
> > > Hi Mike,
> > > 
> > > I'd suggest fixing this kind of issue in the following way, so that
> > > we can avoid touching this flag from drivers:
> > > 
> > > diff --git a/block/blk-core.c b/block/blk-core.c
> > > index 3c5f61ceeb67..e70103560ac2 100644
> > > --- a/block/blk-core.c
> > > +++ b/block/blk-core.c
> > > @@ -1024,6 +1024,8 @@ blk_qc_t generic_make_request(struct bio *bio)
> > >  		else
> > >  			bio_io_error(bio);
> > >  		return ret;
> > > +	} else {
> > > +		bio_set_flag(bio, BIO_QUEUE_ENTERED);
> > >  	}
> > >  
> > >  	if (!generic_make_request_checks(bio))
> > > @@ -1074,6 +1076,8 @@ blk_qc_t generic_make_request(struct bio *bio)
> > >  			if (blk_queue_enter(q, flags) < 0) {
> > >  				enter_succeeded = false;
> > >  				q = NULL;
> > > +			} else {
> > > +				bio_set_flag(bio, BIO_QUEUE_ENTERED);
> > >  			}
> > >  		}
> > >  
> > > diff --git a/block/blk-merge.c b/block/blk-merge.c
> > > index b990853f6de7..8777e286bd3f 100644
> > > --- a/block/blk-merge.c
> > > +++ b/block/blk-merge.c
> > > @@ -339,16 +339,6 @@ void blk_queue_split(struct request_queue *q, struct bio **bio)
> > >  		/* there isn't chance to merge the splitted bio */
> > >  		split->bi_opf |= REQ_NOMERGE;
> > >  
> > > -		/*
> > > -		 * Since we're recursing into make_request here, ensure
> > > -		 * that we mark this bio as already having entered the queue.
> > > -		 * If not, and the queue is going away, we can get stuck
> > > -		 * forever on waiting for the queue reference to drop. But
> > > -		 * that will never happen, as we're already holding a
> > > -		 * reference to it.
> > > -		 */
> > > -		bio_set_flag(*bio, BIO_QUEUE_ENTERED);
> > > -
> > >  		bio_chain(split, *bio);
> > >  		trace_block_split(q, split, (*bio)->bi_iter.bi_sector);
> > >  		generic_make_request(*bio);
> > > 
> > 
> > Not opposed to this.
> 
> But thinking further: when you have a stack of cascading
> q->make_request_fn it could easily be that work done at the next layer
> down ends up causing the bio to recurse to generic_make_request() but not
> directly (e.g. dm_wq_work)... yet BIO_QUEUE_ENTERED will still be set
> when it really isn't appropriate.

That is true. In theory, we need a per-queue stack variable to record
whether the queue usage counter is held. But that is quite hard to do in
the kernel because we don't have a stack variable allocator; otherwise
this issue could be solved cleanly and simply.

> 
> Getting too cute with setting bio flags but not clearing them on
> different device boundaries could render the flags useless (or worse:
> incorrect).

How about clearing the flag just following q->make_request_fn() in
generic_make_request()?
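
To make the idea concrete, here is a toy userspace model (not kernel
code). All toy_* names are made up for illustration, and it ignores the
current->bio_list handling entirely. It only shows that a flag left set
after the callback returns would follow the bio into a later submission
to a different queue, while clearing it avoids that:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins; these are not the kernel's real types. */
struct toy_bio {
	bool queue_entered;	/* models BIO_QUEUE_ENTERED */
};

struct toy_queue {
	int live_enters;	/* entries that skipped the blocking path */
	int blocking_enters;	/* entries that took the normal path */
};

/*
 * Models one pass through generic_make_request() with the proposed
 * change: the flag is set once the queue has been entered and, if
 * clear_after is true, cleared once the driver callback has returned.
 */
static void toy_submit(struct toy_queue *q, struct toy_bio *bio,
		       bool clear_after)
{
	assert(q != NULL && bio != NULL);

	if (bio->queue_entered)
		q->live_enters++;	/* caller claims a reference is held */
	else
		q->blocking_enters++;	/* may wait if the queue is frozen */

	bio->queue_entered = true;

	/* the driver's make_request_fn would run here */

	if (clear_after)
		bio->queue_entered = false;
}

/*
 * Models a bio that goes through one queue and is handed to a second
 * queue later, outside the first queue's make_request_fn. Returns how
 * many times the second queue was entered via the live path.
 */
static int toy_deferred_resubmit(bool clear_after)
{
	struct toy_queue upper = { 0, 0 };
	struct toy_queue lower = { 0, 0 };
	struct toy_bio bio = { false };

	toy_submit(&upper, &bio, clear_after);
	toy_submit(&lower, &bio, clear_after);
	return lower.live_enters;
}
```

Without the clear, the second queue is entered through the live path
even though no reference on it is held; with the clear, it goes through
the normal path.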

> 
> I'm not out for engaging in a focused audit/churn in this area that
> becomes a slippery slope during the rest of 5.0-rcX.
> 
> That is why I was going for a local DM change for 5.0 and, in parallel,
> work on the more generic fixes for 5.1.
> 
> So I'm back to preferring that...
> 
> But if you, Jens or others feel strongly about it I'm open to discuss it
> further.

One concern is that if this flag starts to be used by drivers, sooner or
later it may be difficult to maintain.

> 
> Think we need to set REQ_NOMERGE in the split too (like
> blk_queue_split() is doing).  Again, a comprehensive cleanup and
> consolidation of bio_split+generic_make_request pattern is needed.  MD
> has a lot of it, DM has it, and then there is blk_queue_split().
> Basically blk_queue_split()'s bio_split+bio_chain+generic_make_request
> and all the flags that get set in between should be factored out for all
> to use.

Sounds like a good topic, and I am interested in it. Maybe you can submit
an LSF/MM proposal :-)
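
For illustration, a userspace toy of what such a shared helper might
bundle together. This is a sketch only: the toy_* names and the struct
layout are invented here and are not kernel API. It mirrors the steps
blk_queue_split() performs between bio_split() and
generic_make_request():

```c
#include <assert.h>
#include <stddef.h>

enum {
	TOY_NOMERGE       = 1u << 0,	/* models REQ_NOMERGE (in opf) */
	TOY_QUEUE_ENTERED = 1u << 1,	/* models BIO_QUEUE_ENTERED (in flags) */
};

/* Toy stand-in for struct bio; field names are hypothetical. */
struct toy_bio {
	unsigned long sector;	/* first sector */
	unsigned int sectors;	/* length in sectors */
	unsigned int opf;	/* models bi_opf */
	unsigned int flags;	/* models bi_flags */
	struct toy_bio *parent;	/* models the bio_chain() link */
};

/*
 * Hypothetical shared helper: carve the first 'sectors' off 'bio' into
 * 'split', apply the flags, and chain the two. The caller would then
 * resubmit 'bio' (the remainder) and go on to process 'split'.
 */
static void toy_split_for_resubmit(struct toy_bio *bio, struct toy_bio *split,
				   unsigned int sectors)
{
	assert(bio != NULL && split != NULL);
	assert(sectors > 0 && sectors < bio->sectors);

	/* like bio_split(): front part goes to 'split', 'bio' is advanced */
	split->sector = bio->sector;
	split->sectors = sectors;
	split->opf = bio->opf | TOY_NOMERGE;	/* split part is not merged */
	split->flags = 0;
	bio->sector += sectors;
	bio->sectors -= sectors;

	/* the remainder re-enters a queue we already hold a reference on */
	bio->flags |= TOY_QUEUE_ENTERED;

	/* like bio_chain(split, bio): 'bio' is the parent of 'split' */
	split->parent = bio;
}
```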


Thanks,
Ming

Patch

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index fbadda68e23b..6e29c2d99b99 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1654,7 +1654,9 @@  static blk_qc_t __split_and_process_bio(struct mapped_device *md,
 						   sectors[op_stat_group(bio_op(bio))], ci.sector_count);
 				part_stat_unlock();
 
+				bio_set_flag(bio, BIO_QUEUE_ENTERED);
 				bio_chain(b, bio);
+				trace_block_split(md->queue, b, bio->bi_iter.bi_sector);
 				ret = generic_make_request(bio);
 				break;
 			}