Patchwork [v2,05/13] dm-mpath: Make it easier to analyze requeuing behavior

login
register
mail settings
Submitter Bart Van Assche
Date April 27, 2017, 5:11 p.m.
Message ID <20170427171126.26814-6-bart.vanassche@sandisk.com>
Download mbox | patch
Permalink /patch/9703245/
State New
Headers show

Comments

Bart Van Assche - April 27, 2017, 5:11 p.m.
When debugging the dm-mpath driver it is important to know what
decisions have been taken with regard to requeuing. Hence this
patch that adds pr_debug() statements that report what decisions
have been taken.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Cc: Christoph Hellwig <hch@lst.de>
---
 drivers/md/dm-mpath.c | 21 ++++++++++++++++++---
 drivers/md/dm-rq.c    |  5 ++++-
 2 files changed, 22 insertions(+), 4 deletions(-)
Mike Snitzer - April 27, 2017, 7:29 p.m.
On Thu, Apr 27 2017 at  1:11P -0400,
Bart Van Assche <bart.vanassche@sandisk.com> wrote:

> When debugging the dm-mpath driver it is important to know what
> decisions have been taken with regard to requeuing. Hence this
> patch that adds pr_debug() statements that report what decisions
> have been taken.
> 
> Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
> Reviewed-by: Hannes Reinecke <hare@suse.com>
> Cc: Christoph Hellwig <hch@lst.de>

I've not used pr_debug() before.. I generally sprinkle debugging when
I'm actively chasing an issue and then throw it away once I sort the
problem out.

Documentation/process/coding-style.rst says:
"Coming up with good debugging messages can be quite a challenge; and once
you have them, they can be a huge help for remote troubleshooting.  However
debug message printing is handled differently than printing other non-debug
messages.  While the other pr_XXX() functions print unconditionally,
pr_debug() does not; it is compiled out by default, unless either DEBUG is
defined or CONFIG_DYNAMIC_DEBUG is set."

So I assume you're leveraging DYNAMIC_DEBUG.

Anyway, I'm not liking the idea of making this debugging part of the
mpath code.  But if there is a convincing argument for it please
elaborate.

Are you finding that things are going wrong on production systems and
enabling pr_debug() in these paths would have, or has, saved you?

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Bart Van Assche - April 27, 2017, 7:57 p.m.
On Thu, 2017-04-27 at 15:29 -0400, Mike Snitzer wrote:
> Documentation/process/coding-style.rst says:
> "Coming up with good debugging messages can be quite a challenge; and once
> you have them, they can be a huge help for remote troubleshooting.  However
> debug message printing is handled differently than printing other non-debug
> messages.  While the other pr_XXX() functions print unconditionally,
> pr_debug() does not; it is compiled out by default, unless either DEBUG is
> defined or CONFIG_DYNAMIC_DEBUG is set."
> 
> So I assume you're leveraging DYNAMIC_DEBUG.
> 
> Anyway, I'm not liking the idea of making this debugging part of the
> mpath code.  But if there is a convincing argument for it please
> elaborate.
> 
> Are you finding that things are going wrong on production systems and
> enabling pr_debug() in these paths would have, or has, saved you?

Hello Mike,

If the dm-mpath driver is busy with requeuing requests in a loop today there
is no way to figure out why that continuous requeuing happens. The pr_debug()
statements introduced by this patch provide an easy way to figure out on
development systems why the requeuing happens. Please note that there is no
guarantee on production systems that CONFIG_DYNAMIC_DEBUG=y.

Bart.

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel

Patch

diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
index d739688246a0..fc3e7f028110 100644
--- a/drivers/md/dm-mpath.c
+++ b/drivers/md/dm-mpath.c
@@ -501,8 +501,10 @@  static int multipath_clone_and_map(struct dm_target *ti, struct request *rq,
 		pgpath = choose_pgpath(m, nr_bytes);
 
 	if (!pgpath) {
-		if (must_push_back_rq(m))
+		if (must_push_back_rq(m)) {
+			pr_debug("no path - requeueing\n");
 			return DM_MAPIO_DELAY_REQUEUE;
+		}
 		return -EIO;	/* Failed */
 	} else if (test_bit(MPATHF_QUEUE_IO, &m->flags) ||
 		   test_bit(MPATHF_PG_INIT_REQUIRED, &m->flags)) {
@@ -1425,17 +1427,23 @@  static void pg_init_done(void *data, int errors)
 		/* Activations of other paths are still on going */
 		goto out;
 
+	pr_debug("pg_init_in_progress = %d\n",
+		 atomic_read(&m->pg_init_in_progress));
+
 	if (test_bit(MPATHF_PG_INIT_REQUIRED, &m->flags)) {
 		if (delay_retry)
 			set_bit(MPATHF_PG_INIT_DELAY_RETRY, &m->flags);
 		else
 			clear_bit(MPATHF_PG_INIT_DELAY_RETRY, &m->flags);
 
-		if (__pg_init_all_paths(m))
+		if (__pg_init_all_paths(m)) {
+			pr_debug("__pg_init_all_paths() reported that initialization is ongoing\n");
 			goto out;
+		}
 	}
 	clear_bit(MPATHF_QUEUE_IO, &m->flags);
 
+	pr_debug("Processing queued I/O list\n");
 	process_queued_io_list(m);
 
 	/*
@@ -1932,8 +1940,11 @@  static int multipath_busy(struct dm_target *ti)
 	struct pgpath *pgpath;
 
 	/* pg_init in progress */
-	if (atomic_read(&m->pg_init_in_progress))
+	if (atomic_read(&m->pg_init_in_progress)) {
+		pr_debug("pg_init_in_progress = %d\n",
+			 atomic_read(&m->pg_init_in_progress));
 		return true;
+	}
 
 	/* no paths available, for blk-mq: rely on IO mapping to delay requeue */
 	if (!atomic_read(&m->nr_valid_paths) && test_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags))
@@ -1980,6 +1991,10 @@  static int multipath_busy(struct dm_target *ti)
 		busy = false;
 	}
 
+
+	if (busy)
+		pr_debug("all active paths are busy\n");
+
 	return busy;
 }
 
diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
index c53debdcd7dc..ba5694be55a4 100644
--- a/drivers/md/dm-rq.c
+++ b/drivers/md/dm-rq.c
@@ -737,8 +737,10 @@  static int dm_mq_queue_rq(struct blk_mq_hw_ctx *hctx,
 		dm_put_live_table(md, srcu_idx);
 	}
 
-	if (ti->type->busy && ti->type->busy(ti))
+	if (ti->type->busy && ti->type->busy(ti)) {
+		pr_debug("ti->type->busy()\n");
 		return BLK_MQ_RQ_QUEUE_BUSY;
+	}
 
 	dm_start_request(md, rq);
 
@@ -756,6 +758,7 @@  static int dm_mq_queue_rq(struct blk_mq_hw_ctx *hctx,
 		rq_end_stats(md, rq);
 		rq_completed(md, rq_data_dir(rq), false);
 		blk_mq_delay_run_hw_queue(hctx, 500/*ms*/);
+		pr_debug("DM_MAPIO_REQUEUE\n");
 		return BLK_MQ_RQ_QUEUE_BUSY;
 	}