diff mbox

[2/1] dm: do not allocate any mempools for blk-mq request-based DM

Message ID 20150428010303.GB31039@redhat.com (mailing list archive)
State Rejected, archived
Delegated to: Mike Snitzer
Headers show

Commit Message

Mike Snitzer April 28, 2015, 1:03 a.m. UTC
Do not allocate the io_pool mempool for blk-mq request-based DM
(DM_TYPE_MQ_REQUEST_BASED) in dm_alloc_rq_mempools().

Also refine __bind_mempools() to have more precise awareness of which
mempools each type of DM device uses -- avoids mempool churn when
reloading DM tables (particularly for DM_TYPE_REQUEST_BASED).

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
 drivers/md/dm-table.c |  4 +--
 drivers/md/dm.c       | 69 ++++++++++++++++++++++++++++-----------------------
 2 files changed, 40 insertions(+), 33 deletions(-)

Comments

Christoph Hellwig April 28, 2015, 6:28 a.m. UTC | #1
On Mon, Apr 27, 2015 at 09:03:04PM -0400, Mike Snitzer wrote:
> Do not allocate the io_pool mempool for blk-mq request-based DM
> (DM_TYPE_MQ_REQUEST_BASED) in dm_alloc_rq_mempools().
> 
> Also refine __bind_mempools() to have more precise awareness of which
> mempools each type of DM device uses -- avoids mempool churn when
> reloading DM tables (particularly for DM_TYPE_REQUEST_BASED).

Btw, I looked into this code and didn't dare to touch it before I
understood how we deal with the case of dm-mpath using blk-mq while the
low-level driver does not, or the other way around.

As far as I can see we'd need the request mempool as long as the
low-level driver does not use blk-mq, independent of what dm-mpath
itself uses.

The code doesn't seem to handle this right, but as mentioned I'm not
100% sure and need to dive deeper into this.  Is there a place enforcing
dm-mpath is using the same request type as the underlying devices?

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Mike Snitzer April 28, 2015, 10:22 a.m. UTC | #2
On Tue, Apr 28 2015 at  2:28am -0400,
Christoph Hellwig <hch@lst.de> wrote:

> On Mon, Apr 27, 2015 at 09:03:04PM -0400, Mike Snitzer wrote:
> > Do not allocate the io_pool mempool for blk-mq request-based DM
> > (DM_TYPE_MQ_REQUEST_BASED) in dm_alloc_rq_mempools().
> > 
> > Also refine __bind_mempools() to have more precise awareness of which
> > mempools each type of DM device uses -- avoids mempool churn when
> > reloading DM tables (particularly for DM_TYPE_REQUEST_BASED).
> 
> Btw, I looked into this code and didn't dare to touch it before I
> understood how we deal with the case of dm-mpath using blk-mq while the
> low-level driver does not, or the other way around.

Current code does deal with:
1) blk-mq queue on non-blk-mq underlying devices
2) blk-mq queue on blk-mq underlying devices
3) non-blk-mq queue on non-blk-mq underlying devices
4) non-blk-mq queue on blk-mq underlying devices

But all these permutations definitely make the code less
approachable and maintainable.

> As far as I can see we'd need the request mempool as long as the
> low-level driver does not use blk-mq, independent of what dm-mpath
> itself uses.
> 
> The code doesn't seem to handle this right, but as mentioned I'm not
> 100% sure and need to dive deeper into this.  Is there a place enforcing
> dm-mpath is using the same request type as the underlying devices?

It is dm-table.c:dm_table_set_type() that determines whether a given DM
table load references underlying devices that are all blk-mq or not.
Based on what it finds, it establishes the type as either
DM_TYPE_REQUEST_BASED or DM_TYPE_MQ_REQUEST_BASED.

DM_TYPE_REQUEST_BASED will make use of io_pool, whereas
DM_TYPE_MQ_REQUEST_BASED won't.  There is an awkward duality where
use_blk_mq governs which mempools are created based on filter_md_type().
filter_md_type() is needed because the table's type is really only
concerned with the underlying devices' types -- not the top-level DM
queue.  So things are awkward until we establish the top-level queue
(then we just make use of q->mq_ops like normal).
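The duality described above can be modeled in a few lines. The following
is an illustrative userspace sketch, not kernel code: the enum values and
struct mapped_device are mocked down to the one field filter_md_type()
actually reads (the real definitions live in the device-mapper headers).

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for the kernel's DM table types. */
enum dm_type {
	DM_TYPE_BIO_BASED,
	DM_TYPE_REQUEST_BASED,    /* request-based, legacy request_fn path */
	DM_TYPE_MQ_REQUEST_BASED, /* request-based, blk-mq path */
};

/* Mocked mapped_device: only the field filter_md_type() looks at. */
struct mapped_device {
	bool use_blk_mq; /* is the top-level DM queue blk-mq? */
};

/*
 * Mirrors filter_md_type() from the patch: the table's type reflects
 * the underlying devices, so for request-based tables the top-level
 * queue's choice (use_blk_mq) decides which filtered type applies.
 */
static enum dm_type filter_md_type(enum dm_type type,
				   struct mapped_device *md)
{
	if (type == DM_TYPE_BIO_BASED)
		return type;

	return md->use_blk_mq ? DM_TYPE_MQ_REQUEST_BASED
			      : DM_TYPE_REQUEST_BASED;
}
```

The point of the filter is visible in the last return: for any
request-based table, the filtered type is driven by the top-level queue,
not by what dm_table_set_type() derived from the underlying devices.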

It is dm.c:dm_setup_md_queue() that establishes the top-level DM queue
based on the type.

How the DM queue's requests are cloned depends on the underlying devices
(e.g. blk-mq vs. not).

A top-level blk-mq DM device uses the md's type (DM_TYPE_REQUEST_BASED vs
DM_TYPE_MQ_REQUEST_BASED) to decide how to clone the request.
DM_TYPE_REQUEST_BASED implies the legacy path.
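The mempool ownership this reply describes can be tabulated per filtered
type. The sketch below is a userspace model going by the "already has
necessary mempools" checks in the patched __bind_mempools() (the bioset
entry for the request-based case is inferred from the unconditional
md->bs = p->bs assignment, so treat it as an assumption):

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for the kernel's DM table types. */
enum dm_type {
	DM_TYPE_BIO_BASED,
	DM_TYPE_REQUEST_BASED,
	DM_TYPE_MQ_REQUEST_BASED,
};

/* Which pools a mapped_device of each filtered type holds. */
struct pools {
	bool io_pool; /* dm_io / dm_rq_target_io mempool */
	bool rq_pool; /* cloned-request mempool (legacy request path only) */
	bool bs;      /* bioset for clone bios */
};

static struct pools pools_for(enum dm_type filtered_type)
{
	switch (filtered_type) {
	case DM_TYPE_BIO_BASED:
		/* __bind_mempools() checks md->bs && md->io_pool */
		return (struct pools){ .io_pool = true, .bs = true };
	case DM_TYPE_REQUEST_BASED:
		/* __bind_mempools() checks md->rq_pool && md->io_pool;
		 * the bioset is also carried over (assumption, see above) */
		return (struct pools){ .io_pool = true, .rq_pool = true,
				       .bs = true };
	default:
		/* DM_TYPE_MQ_REQUEST_BASED: no mempools at all,
		 * which is the whole point of this patch */
		return (struct pools){ 0 };
	}
}
```

The empty row for DM_TYPE_MQ_REQUEST_BASED is what lets the patched
dm_alloc_rq_mempools() return early without allocating anything.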

There is still some awkward branching in the IO path, but it ended up
being surprisingly manageable.  The only concern I have is: in the
long run, will all this duality in the code compromise maintainability?
But hopefully we can kill the old request_fn path in block core (if/when
blk-mq IO scheduling lands) and the DM code can be cleaned up
accordingly.
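One side note on the patch below: the mempool allocators now return
ERR_PTR(-ENOMEM) on failure while keeping NULL as a legitimate "no
mempools needed" result for blk-mq DM, which is why
dm_table_alloc_md_mempools() switches from a NULL check to IS_ERR().
A minimal userspace re-implementation of the kernel's err.h helpers
shows why the two results never collide:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Userspace re-implementation of the kernel's ERR_PTR/IS_ERR/PTR_ERR
 * helpers (include/linux/err.h): small negative errno values are
 * encoded in the top page of the address space, so they can never be
 * confused with NULL or with a real allocation.
 */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}
```

IS_ERR(NULL) is false, so the new "return NULL; /* No mempools needed */"
path passes the IS_ERR() check in dm_table_alloc_md_mempools(), while a
failed allocation propagates -ENOMEM to the caller.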


Patch

diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
index 3662b2e..2000fea 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -962,8 +962,8 @@  static int dm_table_alloc_md_mempools(struct dm_table *t, struct mapped_device *
 		return -EINVAL;
 	}
 
-	if (!t->mempools)
-		return -ENOMEM;
+	if (IS_ERR(t->mempools))
+		return PTR_ERR(t->mempools);
 
 	return 0;
 }
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 77ed371..1c0bc67 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -2309,39 +2309,52 @@  static void free_dev(struct mapped_device *md)
 	kfree(md);
 }
 
+static unsigned filter_md_type(unsigned type, struct mapped_device *md)
+{
+	if (type == DM_TYPE_BIO_BASED)
+		return type;
+
+	return !md->use_blk_mq ? DM_TYPE_REQUEST_BASED : DM_TYPE_MQ_REQUEST_BASED;
+}
+
 static void __bind_mempools(struct mapped_device *md, struct dm_table *t)
 {
 	struct dm_md_mempools *p = dm_table_get_md_mempools(t);
 
-	if (md->bs) {
-		/* The md already has necessary mempools. */
-		if (dm_table_get_type(t) == DM_TYPE_BIO_BASED) {
+	switch (filter_md_type(dm_table_get_type(t), md)) {
+	case DM_TYPE_BIO_BASED:
+		if (md->bs && md->io_pool) {
 			/*
+			 * This bio-based md already has necessary mempools.
 			 * Reload bioset because front_pad may have changed
 			 * because a different table was loaded.
 			 */
 			bioset_free(md->bs);
 			md->bs = p->bs;
 			p->bs = NULL;
+			goto out;
 		}
-		/*
-		 * There's no need to reload with request-based dm
-		 * because the size of front_pad doesn't change.
-		 * Note for future: If you are to reload bioset,
-		 * prep-ed requests in the queue may refer
-		 * to bio from the old bioset, so you must walk
-		 * through the queue to unprep.
-		 */
-		goto out;
+		break;
+	case DM_TYPE_REQUEST_BASED:
+		if (md->rq_pool && md->io_pool)
+			/*
+			 * This request-based md already has necessary mempools.
+			 */
+			goto out;
+		break;
+	case DM_TYPE_MQ_REQUEST_BASED:
+		BUG_ON(p); /* No mempools needed */
+		return;
 	}
 
+	BUG_ON(!p || md->io_pool || md->rq_pool || md->bs);
+
 	md->io_pool = p->io_pool;
 	p->io_pool = NULL;
 	md->rq_pool = p->rq_pool;
 	p->rq_pool = NULL;
 	md->bs = p->bs;
 	p->bs = NULL;
-
 out:
 	/* mempool bind completed, no longer need any mempools in the table */
 	dm_table_free_md_mempools(t);
@@ -2721,14 +2734,6 @@  out_tag_set:
 	return err;
 }
 
-static unsigned filter_md_type(unsigned type, struct mapped_device *md)
-{
-	if (type == DM_TYPE_BIO_BASED)
-		return type;
-
-	return !md->use_blk_mq ? DM_TYPE_REQUEST_BASED : DM_TYPE_MQ_REQUEST_BASED;
-}
-
 /*
  * Setup the DM device's queue based on md's type
  */
@@ -3450,7 +3455,7 @@  struct dm_md_mempools *dm_alloc_bio_mempools(unsigned integrity,
 
 	pools = kzalloc(sizeof(*pools), GFP_KERNEL);
 	if (!pools)
-		return NULL;
+		return ERR_PTR(-ENOMEM);
 
 	front_pad = roundup(per_bio_data_size, __alignof__(struct dm_target_io)) +
 		offsetof(struct dm_target_io, clone);
@@ -3469,24 +3474,26 @@  struct dm_md_mempools *dm_alloc_bio_mempools(unsigned integrity,
 	return pools;
 out:
 	dm_free_md_mempools(pools);
-	return NULL;
+	return ERR_PTR(-ENOMEM);
 }
 
 struct dm_md_mempools *dm_alloc_rq_mempools(struct mapped_device *md,
 					    unsigned type)
 {
-	unsigned int pool_size = dm_get_reserved_rq_based_ios();
+	unsigned int pool_size;
 	struct dm_md_mempools *pools;
 
+	if (filter_md_type(type, md) == DM_TYPE_MQ_REQUEST_BASED)
+		return NULL; /* No mempools needed */
+
+	pool_size = dm_get_reserved_rq_based_ios();
 	pools = kzalloc(sizeof(*pools), GFP_KERNEL);
 	if (!pools)
-		return NULL;
+		return ERR_PTR(-ENOMEM);
 
-	if (filter_md_type(type, md) == DM_TYPE_REQUEST_BASED) {
-		pools->rq_pool = mempool_create_slab_pool(pool_size, _rq_cache);
-		if (!pools->rq_pool)
-			goto out;
-	}
+	pools->rq_pool = mempool_create_slab_pool(pool_size, _rq_cache);
+	if (!pools->rq_pool)
+		goto out;
 
 	pools->io_pool = mempool_create_slab_pool(pool_size, _rq_tio_cache);
 	if (!pools->io_pool)
@@ -3495,7 +3502,7 @@  struct dm_md_mempools *dm_alloc_rq_mempools(struct mapped_device *md,
 	return pools;
 out:
 	dm_free_md_mempools(pools);
-	return NULL;
+	return ERR_PTR(-ENOMEM);
 }
 
 void dm_free_md_mempools(struct dm_md_mempools *pools)