[v5,01/11] block: make generic_make_request handle arbitrarily sized bios

Message ID 1439919178.28022.3.camel@ssi (mailing list archive)
State Superseded, archived
Delegated to: Mike Snitzer

Commit Message

Ming Lin Aug. 18, 2015, 5:32 p.m. UTC
On Tue, 2015-08-18 at 10:45 -0400, Mike Snitzer wrote:
> On Tue, Aug 18 2015 at  3:04am -0400,
> Ming Lin <mlin@kernel.org> wrote:
> 
> > On Mon, Aug 17, 2015 at 10:09 PM, Ming Lin <mlin@kernel.org> wrote:
> > > On Mon, Aug 10, 2015 at 8:02 AM, Mike Snitzer <snitzer@redhat.com> wrote:
> > >> p.s. I'll be working with Joe Thornber on optimizing DM (particularly
> > >> dm-thinp and dm-cache) once this patchset is included upstream.  You'll
> > >> see I've already added a couple WIP dm-thinp patches ontop.
> > >
> > > Hi Mike,
> > >
> > > Just to avoid duplicated work.
> > > Are you going to work on the dm-thinp/dm-cache discard rewritten?
> > 
> > Seems dm-stripe discard also needs rewrite.
> 
> Can you elaborate on what you feel needs re-writing in these targets?

dm-stripe also requires the discard size to be a multiple of the chunk size.
See the output of the debug patch below for a 4G discard.

root@bee:~# blkdiscard -o 0 -l 4294967296 /dev/striped_vol_group/striped_logical_volume

root@bee:~# dmesg |grep DEBUG
[   13.110224] DEBUG: discard ignored: stripe chunk size 128K bytes, bio size 512 bytes
[   13.113723] DEBUG: discard ignored: stripe chunk size 128K bytes, bio size 512 bytes
[   13.117098] DEBUG: discard ignored: stripe chunk size 128K bytes, bio size 512 bytes
[   13.120424] DEBUG: discard ignored: stripe chunk size 128K bytes, bio size 512 bytes
[   13.123800] DEBUG: discard ignored: stripe chunk size 128K bytes, bio size 512 bytes
[   13.127027] DEBUG: discard ignored: stripe chunk size 128K bytes, bio size 512 bytes
[   13.130161] DEBUG: discard ignored: stripe chunk size 128K bytes, bio size 512 bytes


> 
> This is the basic initial cleanup I had in mind for dm-thinp:
> http://git.kernel.org/cgit/linux/kernel/git/snitzer/linux.git/commit/?h=dm-4.4&id=cb0aca0a6bfad6b7f7146dde776f374082a73db6
> 
> A much more involved refactoring of the dm-cache and dm-thinp targets to
> eliminate the need for splitting will involve bio-prison range locking
> and a new metadata format for both targets to express ranges as opposed
> to blocks.  This line of work is on Joe's radar but it is much further
> out given the associated on-disk metadata format change.
> 
> That aside, I do need to look at DM core to see how we can do things
> differently so that block core's bio_split() et al is doing the
> splitting rather than DM core having a role.
> 
> I'd prefer to be the one working these DM changes.  But if you have
> ideas of how things should be cleaned up I'd be happy to consider them.
> 
> Thanks,
> Mike


--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

Comments

Mike Snitzer Aug. 18, 2015, 7:59 p.m. UTC | #1
On Tue, Aug 18 2015 at  1:32pm -0400,
Ming Lin <mlin@kernel.org> wrote:

> On Tue, 2015-08-18 at 10:45 -0400, Mike Snitzer wrote:
> > On Tue, Aug 18 2015 at  3:04am -0400,
> > Ming Lin <mlin@kernel.org> wrote:
> > 
> > > On Mon, Aug 17, 2015 at 10:09 PM, Ming Lin <mlin@kernel.org> wrote:
> > > > On Mon, Aug 10, 2015 at 8:02 AM, Mike Snitzer <snitzer@redhat.com> wrote:
> > > >> p.s. I'll be working with Joe Thornber on optimizing DM (particularly
> > > >> dm-thinp and dm-cache) once this patchset is included upstream.  You'll
> > > >> see I've already added a couple WIP dm-thinp patches ontop.
> > > >
> > > > Hi Mike,
> > > >
> > > > Just to avoid duplicated work.
> > > > Are you going to work on the dm-thinp/dm-cache discard rewritten?
> > > 
> > > Seems dm-stripe discard also needs rewrite.
> > 
> > Can you elaborate on what you feel needs re-writing in these targets?
> 
> dm-stripe also require discard size to be a multiple of chunk size.
> See output of below debug patch for 4G discard.
> 
> root@bee:~# blkdiscard -o 0 -l 4294967296 /dev/striped_vol_group/striped_logical_volume
> 
> root@bee:~# dmesg |grep DEBUG
> [   13.110224] DEBUG: discard ignored: stripe chunk size 128K bytes, bio size 512 bytes
> [   13.113723] DEBUG: discard ignored: stripe chunk size 128K bytes, bio size 512 bytes
> [   13.117098] DEBUG: discard ignored: stripe chunk size 128K bytes, bio size 512 bytes
> [   13.120424] DEBUG: discard ignored: stripe chunk size 128K bytes, bio size 512 bytes
> [   13.123800] DEBUG: discard ignored: stripe chunk size 128K bytes, bio size 512 bytes
> [   13.127027] DEBUG: discard ignored: stripe chunk size 128K bytes, bio size 512 bytes
> [   13.130161] DEBUG: discard ignored: stripe chunk size 128K bytes, bio size 512 bytes
> 
> diff --git a/block/blk-lib.c b/block/blk-lib.c
> index bd40292..1cab2ba 100644
> --- a/block/blk-lib.c
> +++ b/block/blk-lib.c
> @@ -82,7 +82,7 @@ int blkdev_issue_discard(struct block_device *bdev, sector_t sector,
>  			break;
>  		}
>  
> -		req_sects = min_t(sector_t, nr_sects, MAX_BIO_SECTORS);
> +		req_sects = min_t(sector_t, nr_sects, UINT_MAX>>9);
>  		end_sect = sector + req_sects;
>  
>  		bio->bi_iter.bi_sector = sector;
> diff --git a/drivers/md/dm-stripe.c b/drivers/md/dm-stripe.c
> index 484029d..a288bc2 100644
> --- a/drivers/md/dm-stripe.c
> +++ b/drivers/md/dm-stripe.c
> @@ -273,6 +273,8 @@ static int stripe_map_range(struct stripe_c *sc, struct bio *bio,
>  		return DM_MAPIO_REMAPPED;
>  	} else {
>  		/* The range doesn't map to the target stripe */
> +		printk("DEBUG: discard ignored: stripe chunk size %dK bytes, bio size %d bytes\n",
> +			sc->chunk_size>>1, bio->bi_iter.bi_size);
>  		bio_endio(bio);
>  		return DM_MAPIO_SUBMITTED;
>  	}

This is expected.  If a discard is only 512 bytes and the chunk size is
128K then every discard will only ever hit one stripe.

So each discard will have N - 1 "discard ignored" messages (when N is #
of stripes in the dm-stripe device).  So in your test device I'd assume
you have 8 stripes.

Basically your debugging looks like it is _very_ prone to false
positives here.  The dm-stripe code is working as expected.

Ming Lin Aug. 18, 2015, 9:16 p.m. UTC | #2
On Tue, 2015-08-18 at 15:59 -0400, Mike Snitzer wrote:
> On Tue, Aug 18 2015 at  1:32pm -0400,
> Ming Lin <mlin@kernel.org> wrote:
> 
> > On Tue, 2015-08-18 at 10:45 -0400, Mike Snitzer wrote:
> > > On Tue, Aug 18 2015 at  3:04am -0400,
> > > Ming Lin <mlin@kernel.org> wrote:
> > > 
> > > > On Mon, Aug 17, 2015 at 10:09 PM, Ming Lin <mlin@kernel.org> wrote:
> > > > > On Mon, Aug 10, 2015 at 8:02 AM, Mike Snitzer <snitzer@redhat.com> wrote:
> > > > >> p.s. I'll be working with Joe Thornber on optimizing DM (particularly
> > > > >> dm-thinp and dm-cache) once this patchset is included upstream.  You'll
> > > > >> see I've already added a couple WIP dm-thinp patches ontop.
> > > > >
> > > > > Hi Mike,
> > > > >
> > > > > Just to avoid duplicated work.
> > > > > Are you going to work on the dm-thinp/dm-cache discard rewritten?
> > > > 
> > > > Seems dm-stripe discard also needs rewrite.
> > > 
> > > Can you elaborate on what you feel needs re-writing in these targets?
> > 
> > dm-stripe also require discard size to be a multiple of chunk size.
> > See output of below debug patch for 4G discard.
> > 
> > root@bee:~# blkdiscard -o 0 -l 4294967296 /dev/striped_vol_group/striped_logical_volume
> > 
> > root@bee:~# dmesg |grep DEBUG
> > [   13.110224] DEBUG: discard ignored: stripe chunk size 128K bytes, bio size 512 bytes
> > [   13.113723] DEBUG: discard ignored: stripe chunk size 128K bytes, bio size 512 bytes
> > [   13.117098] DEBUG: discard ignored: stripe chunk size 128K bytes, bio size 512 bytes
> > [   13.120424] DEBUG: discard ignored: stripe chunk size 128K bytes, bio size 512 bytes
> > [   13.123800] DEBUG: discard ignored: stripe chunk size 128K bytes, bio size 512 bytes
> > [   13.127027] DEBUG: discard ignored: stripe chunk size 128K bytes, bio size 512 bytes
> > [   13.130161] DEBUG: discard ignored: stripe chunk size 128K bytes, bio size 512 bytes
> > 
> > diff --git a/block/blk-lib.c b/block/blk-lib.c
> > index bd40292..1cab2ba 100644
> > --- a/block/blk-lib.c
> > +++ b/block/blk-lib.c
> > @@ -82,7 +82,7 @@ int blkdev_issue_discard(struct block_device *bdev, sector_t sector,
> >  			break;
> >  		}
> >  
> > -		req_sects = min_t(sector_t, nr_sects, MAX_BIO_SECTORS);
> > +		req_sects = min_t(sector_t, nr_sects, UINT_MAX>>9);
> >  		end_sect = sector + req_sects;
> >  
> >  		bio->bi_iter.bi_sector = sector;
> > diff --git a/drivers/md/dm-stripe.c b/drivers/md/dm-stripe.c
> > index 484029d..a288bc2 100644
> > --- a/drivers/md/dm-stripe.c
> > +++ b/drivers/md/dm-stripe.c
> > @@ -273,6 +273,8 @@ static int stripe_map_range(struct stripe_c *sc, struct bio *bio,
> >  		return DM_MAPIO_REMAPPED;
> >  	} else {
> >  		/* The range doesn't map to the target stripe */
> > +		printk("DEBUG: discard ignored: stripe chunk size %dK bytes, bio size %d bytes\n",
> > +			sc->chunk_size>>1, bio->bi_iter.bi_size);
> >  		bio_endio(bio);
> >  		return DM_MAPIO_SUBMITTED;
> >  	}
> 
> This is expected.  If a discard is only 512 bytes and the chunk size is
> 128K then every discard will only ever hit one stripe.

The discard was actually 4G bytes.
# blkdiscard -o 0 \
      -l 4294967296 /dev/striped_vol_group/striped_logical_volume

In the above debug patch, I changed MAX_BIO_SECTORS to UINT_MAX>>9
to show the problem.

The 512-byte bio comes from blkdev_issue_discard() splitting the 4G bytes
into (UINT_MAX>>9) sectors + 1 sector.
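
The split arithmetic above can be checked with a small userspace sketch.
This is only a model of the min_t() cap in the blkdev_issue_discard() loop,
not the kernel code: the helper names are hypothetical, it ignores any
granularity or alignment adjustment, and it assumes 512-byte sectors and a
32-bit unsigned int.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: sectors carried by the next bio under a given cap. */
static uint64_t next_discard_sectors(uint64_t nr_sects, uint64_t cap_sects)
{
	assert(cap_sects > 0);
	return nr_sects < cap_sects ? nr_sects : cap_sects;
}

/*
 * Hypothetical helper: split a discard of nr_sects sectors into bios of at
 * most cap_sects sectors each. Stores each bio size (in sectors) in sizes[]
 * and returns how many bios were produced (at most max).
 */
static size_t split_discard(uint64_t nr_sects, uint64_t cap_sects,
			    uint64_t *sizes, size_t max)
{
	size_t n = 0;

	while (nr_sects && n < max) {
		uint64_t req_sects = next_discard_sectors(nr_sects, cap_sects);

		sizes[n++] = req_sects;
		nr_sects -= req_sects;
	}
	return n;
}
```

With nr_sects = 4294967296 >> 9 = 8388608 and a cap of UINT_MAX >> 9 =
8388607, the model yields one bio of 8388607 sectors followed by one bio of
1 sector (512 bytes). With a 2G cap (assumed here to be (1U << 31) >> 9 =
4194304 sectors) the same discard splits into two equal 2G bios.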

> 
> So each discard will have N - 1 "discard ignored" messages (when N is #
> of stripes in the dm-stripe device).  So in your test device I'd assume
> you have 8 stripes.

Yes.

> 
> Basically your debugging looks like it is _very_ prone to false
> positives here.  The dm-stripe code is working as expected.

With the current 2G cap in blkdev_issue_discard(), dm-stripe works OK.
But if in the future we change it to UINT_MAX, then dm-stripe discard will
have the same problem as dm-thinp/dm-cache.


Mike Snitzer Aug. 18, 2015, 9:22 p.m. UTC | #3
On Tue, Aug 18 2015 at  5:16pm -0400,
Ming Lin <mlin@kernel.org> wrote:

> On Tue, 2015-08-18 at 15:59 -0400, Mike Snitzer wrote:
> > On Tue, Aug 18 2015 at  1:32pm -0400,
> > Ming Lin <mlin@kernel.org> wrote:
> > 
> > > On Tue, 2015-08-18 at 10:45 -0400, Mike Snitzer wrote:
> > > > On Tue, Aug 18 2015 at  3:04am -0400,
> > > > Ming Lin <mlin@kernel.org> wrote:
> > > > 
> > > > > On Mon, Aug 17, 2015 at 10:09 PM, Ming Lin <mlin@kernel.org> wrote:
> > > > > > On Mon, Aug 10, 2015 at 8:02 AM, Mike Snitzer <snitzer@redhat.com> wrote:
> > > > > >> p.s. I'll be working with Joe Thornber on optimizing DM (particularly
> > > > > >> dm-thinp and dm-cache) once this patchset is included upstream.  You'll
> > > > > >> see I've already added a couple WIP dm-thinp patches ontop.
> > > > > >
> > > > > > Hi Mike,
> > > > > >
> > > > > > Just to avoid duplicated work.
> > > > > > Are you going to work on the dm-thinp/dm-cache discard rewritten?
> > > > > 
> > > > > Seems dm-stripe discard also needs rewrite.
> > > > 
> > > > Can you elaborate on what you feel needs re-writing in these targets?
> > > 
> > > dm-stripe also require discard size to be a multiple of chunk size.
> > > See output of below debug patch for 4G discard.
> > > 
> > > root@bee:~# blkdiscard -o 0 -l 4294967296 /dev/striped_vol_group/striped_logical_volume
> > > 
> > > root@bee:~# dmesg |grep DEBUG
> > > [   13.110224] DEBUG: discard ignored: stripe chunk size 128K bytes, bio size 512 bytes
> > > [   13.113723] DEBUG: discard ignored: stripe chunk size 128K bytes, bio size 512 bytes
> > > [   13.117098] DEBUG: discard ignored: stripe chunk size 128K bytes, bio size 512 bytes
> > > [   13.120424] DEBUG: discard ignored: stripe chunk size 128K bytes, bio size 512 bytes
> > > [   13.123800] DEBUG: discard ignored: stripe chunk size 128K bytes, bio size 512 bytes
> > > [   13.127027] DEBUG: discard ignored: stripe chunk size 128K bytes, bio size 512 bytes
> > > [   13.130161] DEBUG: discard ignored: stripe chunk size 128K bytes, bio size 512 bytes
> > > 
> > > diff --git a/block/blk-lib.c b/block/blk-lib.c
> > > index bd40292..1cab2ba 100644
> > > --- a/block/blk-lib.c
> > > +++ b/block/blk-lib.c
> > > @@ -82,7 +82,7 @@ int blkdev_issue_discard(struct block_device *bdev, sector_t sector,
> > >  			break;
> > >  		}
> > >  
> > > -		req_sects = min_t(sector_t, nr_sects, MAX_BIO_SECTORS);
> > > +		req_sects = min_t(sector_t, nr_sects, UINT_MAX>>9);
> > >  		end_sect = sector + req_sects;
> > >  
> > >  		bio->bi_iter.bi_sector = sector;
> > > diff --git a/drivers/md/dm-stripe.c b/drivers/md/dm-stripe.c
> > > index 484029d..a288bc2 100644
> > > --- a/drivers/md/dm-stripe.c
> > > +++ b/drivers/md/dm-stripe.c
> > > @@ -273,6 +273,8 @@ static int stripe_map_range(struct stripe_c *sc, struct bio *bio,
> > >  		return DM_MAPIO_REMAPPED;
> > >  	} else {
> > >  		/* The range doesn't map to the target stripe */
> > > +		printk("DEBUG: discard ignored: stripe chunk size %dK bytes, bio size %d bytes\n",
> > > +			sc->chunk_size>>1, bio->bi_iter.bi_size);
> > >  		bio_endio(bio);
> > >  		return DM_MAPIO_SUBMITTED;
> > >  	}
> > 
> > This is expected.  If a discard is only 512 bytes and the chunk size is
> > 128K then every discard will only ever hit one stripe.
> 
> The discard was actually 4G bytes.
> # blkdiscard -o 0 \
>       -l 4294967296 /dev/striped_vol_group/striped_logical_volume
> 
> In the above debug patch, I changed MAX_BIO_SECTORS to UINT_MAX>>9
> to show the problem.
> 
> The 512 bytes comes from blkdev_issue_discard() split the 4G bytes to
> (UINT_MAX>>9) sectors + 1 sector.
> 
> > 
> > So each discard will have N - 1 "discard ignored" messages (when N is #
> > of stripes in the dm-stripe device).  So in your test device I'd assume
> > you have 8 stripes.
> 
> Yes.
> 
> > 
> > Basically your debugging looks like it is _very_ prone to false
> > positives here.  The dm-stripe code is working as expected.
> 
> With current 2G cap in blkdev_issue_discard(), dm-stripe works OK.
> But if in future we change it to UINT_MAX, then dm-stripe discard will
> have problem as dm-thinp/dm-cache.

No, you're still missing my point.  dm-stripe isn't dropping the partial
discard completely.  It is just that the discard only applies to one of
the 8 stripes in your test.

With the 2G cap it just so happens that each discard hits each stripe.
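
The point above can be illustrated with a small userspace model of striping.
This is a sketch, not the dm-stripe implementation: the function names are
hypothetical, and it simply assumes that logical chunk i lives on stripe
i % N, with the chunk size (128K = 256 sectors) and stripe count (8) taken
from the test setup in this thread.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model: how many sectors of the logical range
 * [start, start + len) land on stripe 'target', assuming logical chunk i
 * is stored on stripe i % stripes.
 */
static uint64_t sectors_on_stripe(uint64_t start, uint64_t len,
				  uint32_t chunk_sects, uint32_t stripes,
				  uint32_t target)
{
	uint64_t end = start + len;
	uint64_t hit = 0;
	uint64_t pos = start;

	assert(chunk_sects > 0 && stripes > 0 && target < stripes);

	while (pos < end) {
		uint64_t chunk = pos / chunk_sects;
		uint64_t chunk_end = (chunk + 1) * chunk_sects;
		uint64_t stop = chunk_end < end ? chunk_end : end;

		if (chunk % stripes == target)
			hit += stop - pos;
		pos = stop;
	}
	return hit;
}

/* Number of stripes that receive nothing from the range ("discard ignored"). */
static uint32_t stripes_ignored(uint64_t start, uint64_t len,
				uint32_t chunk_sects, uint32_t stripes)
{
	uint32_t ignored = 0;

	for (uint32_t s = 0; s < stripes; s++)
		if (sectors_on_stripe(start, len, chunk_sects, stripes, s) == 0)
			ignored++;
	return ignored;
}
```

Under these assumptions the trailing 1-sector bio at sector 8388607 falls in
chunk 32767, which maps to a single stripe, so the other 7 stripes report
"discard ignored". That matches the 7 DEBUG lines in the dmesg output. A 2G
bio starting at sector 0 covers 16384 chunks and touches all 8 stripes, so
nothing is ignored.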

Patch

diff --git a/block/blk-lib.c b/block/blk-lib.c
index bd40292..1cab2ba 100644
--- a/block/blk-lib.c
+++ b/block/blk-lib.c
@@ -82,7 +82,7 @@  int blkdev_issue_discard(struct block_device *bdev, sector_t sector,
 			break;
 		}
 
-		req_sects = min_t(sector_t, nr_sects, MAX_BIO_SECTORS);
+		req_sects = min_t(sector_t, nr_sects, UINT_MAX>>9);
 		end_sect = sector + req_sects;
 
 		bio->bi_iter.bi_sector = sector;
diff --git a/drivers/md/dm-stripe.c b/drivers/md/dm-stripe.c
index 484029d..a288bc2 100644
--- a/drivers/md/dm-stripe.c
+++ b/drivers/md/dm-stripe.c
@@ -273,6 +273,8 @@  static int stripe_map_range(struct stripe_c *sc, struct bio *bio,
 		return DM_MAPIO_REMAPPED;
 	} else {
 		/* The range doesn't map to the target stripe */
+		printk("DEBUG: discard ignored: stripe chunk size %dK bytes, bio size %d bytes\n",
+			sc->chunk_size>>1, bio->bi_iter.bi_size);
 		bio_endio(bio);
 		return DM_MAPIO_SUBMITTED;
 	}