diff mbox

dm-mq and end_clone_request()

Message ID 20160726011607.GA77078@redhat.com (mailing list archive)
State Not Applicable, archived
Headers show

Commit Message

Mike Snitzer July 26, 2016, 1:16 a.m. UTC
On Mon, Jul 25 2016 at  6:00P -0400,
Bart Van Assche <bart.vanassche@sandisk.com> wrote:

> On 07/25/2016 02:23 PM, Mike Snitzer wrote:
> >So I'd be curious to know if your debugging has enabled you to identify
> >exactly where in the dm-mapth.c code the -EIO return is being
> >established.  do_end_io() is the likely candidate -- but again the
> >__must_push_back() check should prevent it and DM_ENDIO_REQUEUE should
> >be returned.
> 
> Hello Mike,
> 
> Thanks for looking further into this. The pr_info() statement that I had
> added in the following code block in __multipath_map() fired what told me
> that the following code block triggered the -EIO return:
> 
> 	if (!pgpath) {
> 		if (!must_push_back(m))
> 			r = -EIO;	/* Failed */
> 		pr_info("%s(): (a) returning %d\n", __func__, r);
> 		return r;
> 	}
> 
> From the system log:
> 
> kernel: mpath 254:0: queue_if_no_path 1 -> 0
> kernel: __multipath_map(): (a) returning -5
> 
> The code that I had added in queue_if_no_path() is as follows:
> 
> 	old = test_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags);
>         [ ... ]
> 	pr_info("mpath %s: queue_if_no_path %d -> %d\n",
> 		dm_device_name(dm_table_get_md(m->ti->table)), old,
> 		queue_if_no_path);

Hi Bart,

Please try this patch to see if it fixes your issue, thanks.

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Bart Van Assche July 26, 2016, 10:51 p.m. UTC | #1
On 07/25/2016 06:15 PM, Mike Snitzer wrote:
> Please try this patch to see if it fixes your issue, thanks.
> 
> diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
> index 52baf8a..287caa7 100644
> --- a/drivers/md/dm-mpath.c
> +++ b/drivers/md/dm-mpath.c
> @@ -433,10 +433,17 @@ failed:
>   */
>  static int must_push_back(struct multipath *m)
>  {
> -	return (test_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags) ||
> -		((test_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags) !=
> -		  test_bit(MPATHF_SAVED_QUEUE_IF_NO_PATH, &m->flags)) &&
> -		 dm_noflush_suspending(m->ti)));
> +	bool r;
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&m->lock, flags);
> +	r = (test_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags) ||
> +	     ((test_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags) !=
> +	       test_bit(MPATHF_SAVED_QUEUE_IF_NO_PATH, &m->flags)) &&
> +	      dm_noflush_suspending(m->ti)));
> +	spin_unlock_irqrestore(&m->lock, flags);
> +
> +	return r;
>  }
>  
>  /*
 
Hello Mike,

Thank you for having made this patch available. Unfortunately even
with this patch applied I still see fio reporting I/O errors and the
following text still appears in the system log immediately before the
I/O errors are reported (due to debug statements I added in the device
mapper; mpath 254:0 and mpathbe refer to the same dm device):

Jul 26 15:40:37 ion-dev-ib-ini kernel: mpath 254:0: queue_if_no_path 0 -> 1
Jul 26 15:40:37 ion-dev-ib-ini kernel: executing DM ioctl DEV_SUSPEND on mpathbe
Jul 26 15:40:37 ion-dev-ib-ini kernel: mpath 254:0: queue_if_no_path 1 -> 0

BTW, I have not yet been able to determine which user space code
triggers the DEV_SUSPEND ioctl. A condlog() call I had added just
above dm_simplecmd_flush(DM_DEVICE_SUSPEND, mapname, 0) call in
multipathd did not produce any output.

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Mike Snitzer July 27, 2016, 2:08 p.m. UTC | #2
On Tue, Jul 26 2016 at  6:51pm -0400,
Bart Van Assche <bart.vanassche@sandisk.com> wrote:

> On 07/25/2016 06:15 PM, Mike Snitzer wrote:
> > Please try this patch to see if it fixes your issue, thanks.
> > 
> > diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
> > index 52baf8a..287caa7 100644
> > --- a/drivers/md/dm-mpath.c
> > +++ b/drivers/md/dm-mpath.c
> > @@ -433,10 +433,17 @@ failed:
> >   */
> >  static int must_push_back(struct multipath *m)
> >  {
> > -	return (test_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags) ||
> > -		((test_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags) !=
> > -		  test_bit(MPATHF_SAVED_QUEUE_IF_NO_PATH, &m->flags)) &&
> > -		 dm_noflush_suspending(m->ti)));
> > +	bool r;
> > +	unsigned long flags;
> > +
> > +	spin_lock_irqsave(&m->lock, flags);
> > +	r = (test_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags) ||
> > +	     ((test_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags) !=
> > +	       test_bit(MPATHF_SAVED_QUEUE_IF_NO_PATH, &m->flags)) &&
> > +	      dm_noflush_suspending(m->ti)));
> > +	spin_unlock_irqrestore(&m->lock, flags);
> > +
> > +	return r;
> >  }
> >  
> >  /*
>  
> Hello Mike,
> 
> Thank you for having made this patch available. Unfortunately even
> with this patch applied I still see fio reporting I/O errors and the
> following text still appears in the system log immediately before the
> I/O errors are reported (due to debug statements I added in the device
> mapper; mpath 254:0 and mpathbe refer to the same dm device):
> 
> Jul 26 15:40:37 ion-dev-ib-ini kernel: mpath 254:0: queue_if_no_path 0 -> 1
> Jul 26 15:40:37 ion-dev-ib-ini kernel: executing DM ioctl DEV_SUSPEND on mpathbe
> Jul 26 15:40:37 ion-dev-ib-ini kernel: mpath 254:0: queue_if_no_path 1 -> 0

This is all as expected.  Only question I have: is
dm_noflush_suspending() false? -- I assume so given must_push_back() is
returning false.

I'm struggling to appreciate why must_push_back() is only true if
noflush is used.  Regardless of which type, if there are no paths and
queue_if_no_path was configured (implied by current != saved) then we
shouldn't be returning -EIO back up the stack.

> BTW, I have not yet been able to determine which user space code
> triggers the DEV_SUSPEND ioctl. A condlog() call I had added just
> above dm_simplecmd_flush(DM_DEVICE_SUSPEND, mapname, 0) call in
> multipathd did not produce any output.

I need to dig into the multipath-tools userspace code more to be able to
answer.  But I've cc'd Ben Marzinski explicitly to get his insight.

Curious if multipath-tools _always_ use the noflush variant of suspend?
If not then we're setting ourselves up to return -EIO when we shouldn't.

Mike
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Benjamin Marzinski July 27, 2016, 3:52 p.m. UTC | #3
On Wed, Jul 27, 2016 at 10:08:28AM -0400, Mike Snitzer wrote:
> On Tue, Jul 26 2016 at  6:51pm -0400,
> Bart Van Assche <bart.vanassche@sandisk.com> wrote:
> 
> > On 07/25/2016 06:15 PM, Mike Snitzer wrote:
> > > Please try this patch to see if it fixes your issue, thanks.
> > > 
> > > diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
> > > index 52baf8a..287caa7 100644
> > > --- a/drivers/md/dm-mpath.c
> > > +++ b/drivers/md/dm-mpath.c
> > > @@ -433,10 +433,17 @@ failed:
> > >   */
> > >  static int must_push_back(struct multipath *m)
> > >  {
> > > -	return (test_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags) ||
> > > -		((test_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags) !=
> > > -		  test_bit(MPATHF_SAVED_QUEUE_IF_NO_PATH, &m->flags)) &&
> > > -		 dm_noflush_suspending(m->ti)));
> > > +	bool r;
> > > +	unsigned long flags;
> > > +
> > > +	spin_lock_irqsave(&m->lock, flags);
> > > +	r = (test_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags) ||
> > > +	     ((test_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags) !=
> > > +	       test_bit(MPATHF_SAVED_QUEUE_IF_NO_PATH, &m->flags)) &&
> > > +	      dm_noflush_suspending(m->ti)));
> > > +	spin_unlock_irqrestore(&m->lock, flags);
> > > +
> > > +	return r;
> > >  }
> > >  
> > >  /*
> >  
> > Hello Mike,
> > 
> > Thank you for having made this patch available. Unfortunately even
> > with this patch applied I still see fio reporting I/O errors and the
> > following text still appears in the system log immediately before the
> > I/O errors are reported (due to debug statements I added in the device
> > mapper; mpath 254:0 and mpathbe refer to the same dm device):
> > 
> > Jul 26 15:40:37 ion-dev-ib-ini kernel: mpath 254:0: queue_if_no_path 0 -> 1
> > Jul 26 15:40:37 ion-dev-ib-ini kernel: executing DM ioctl DEV_SUSPEND on mpathbe
> > Jul 26 15:40:37 ion-dev-ib-ini kernel: mpath 254:0: queue_if_no_path 1 -> 0
> 
> This is all as expected.  Only question I have: is
> dm_noflush_suspending() false? -- I assume so given must_push_back() is
> returning false.
> 
> I'm struggling to appreciate why must_push_back() is only true if
> noflush is used.  Regardless of which type, if there are no paths and
> queue_if_no_path was configured (implied by current != saved) then we
> shouldn't be returning -EIO back up the stack.
> 
> > BTW, I have not yet been able to determine which user space code
> > triggers the DEV_SUSPEND ioctl. A condlog() call I had added just
> > above dm_simplecmd_flush(DM_DEVICE_SUSPEND, mapname, 0) call in
> > multipathd did not produce any output.

if you look in drivers/md/dm-ioctl.c at do_resume(), device mapper
internally does a suspend when you call resume with a new table loaded.
That's when these suspends are happening.

In the userspace tools, this happens in the DM_DEVICE_RESUME calls after
dm_addmap_reload(), which loads the new table.  These all happen in the
domap() function.
 
> I need to dig into the multipath-tools userspace code more to be able to
> answer.  But I've cc'd Ben Marzinski explicitly to get his insight.
> 
> Curious if multipath-tools _always_ use the noflush variant of suspend?
> If not then we're setting ourselves up to return -EIO when we shouldn't.

There is only one time when we don't use noflush. That's when we resize
the table, and that's because we could otherwise have IOs that are past
the end of the device. It's been a known issue for a while now that you
cannot resize a multipath device with no working paths. Or (to say it in
a way that doesn't assume that people are stupid) if you lose all of
your paths while resizing a multipath device, IOs will fail. I've never
heard of anyone pushing back on that limitation.

-Ben

> Mike
> 
> --
> dm-devel mailing list
> dm-devel@redhat.com
> https://www.redhat.com/mailman/listinfo/dm-devel
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bart Van Assche July 27, 2016, 7:06 p.m. UTC | #4
On 07/27/2016 08:52 AM, Benjamin Marzinski wrote:
> if you look in drivers/md/dm-ioctl.c at do_resume(), device mapper
> internally does a suspend when you call resume with a new table loaded.
> That's when these suspends are happening.
>
> In the userspace tools, this happens in the DM_DEVICE_RESUME calls after
> dm_addmap_reload(), which loads the new table.  These all happen in the
> domap() function.

Thanks Ben for chiming in. As far as I can see md->map is only used in 
the I/O submission path but not in the I/O completion path. So why to 
suspend and resume I/O before activating the new map? Do you think it 
would be safe to activate the new map without suspending and resuming I/O?

Thanks,

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Mike Snitzer July 27, 2016, 7:54 p.m. UTC | #5
On Wed, Jul 27 2016 at  3:06pm -0400,
Bart Van Assche <bart.vanassche@sandisk.com> wrote:

> On 07/27/2016 08:52 AM, Benjamin Marzinski wrote:
> >if you look in drivers/md/dm-ioctl.c at do_resume(), device mapper
> >internally does a suspend when you call resume with a new table loaded.
> >That's when these suspends are happening.
> >
> >In the userspace tools, this happens in the DM_DEVICE_RESUME calls after
> >dm_addmap_reload(), which loads the new table.  These all happen in the
> >domap() function.

multipathd's only call to dm_addmap_reload() with flush = true is the
ACT_RESIZE case in do_map().

> Thanks Ben for chiming in. As far as I can see md->map is only used
> in the I/O submission path but not in the I/O completion path. So
> why to suspend and resume I/O before activating the new map? Do you
> think it would be safe to activate the new map without suspending
> and resuming I/O?

That is the DM state machine.  Before a new table can be swapped in, via
resume, the old map needs to first be suspended.

I appreciate you just hunting for _why_ on this but questioning this
aspect of userspace <-> kernel ioctl interface between multipathd and
dm-mpath isn't productive.

I'll focus on reviewing 4.7's flag management (if there is anywhere that
queue_if_no_path and the saved_ variant are managed without holding the
lock).  Basically the 4.7 changes that reduced/dropped holding the
m->lock in so many places could have allowed for some race that is
causing must_push_back() to return false (in 4.7) when it previously
return true (<= 4.6).
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
index 52baf8a..287caa7 100644
--- a/drivers/md/dm-mpath.c
+++ b/drivers/md/dm-mpath.c
@@ -433,10 +433,17 @@  failed:
  */
 static int must_push_back(struct multipath *m)
 {
-	return (test_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags) ||
-		((test_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags) !=
-		  test_bit(MPATHF_SAVED_QUEUE_IF_NO_PATH, &m->flags)) &&
-		 dm_noflush_suspending(m->ti)));
+	bool r;
+	unsigned long flags;
+
+	spin_lock_irqsave(&m->lock, flags);
+	r = (test_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags) ||
+	     ((test_bit(MPATHF_QUEUE_IF_NO_PATH, &m->flags) !=
+	       test_bit(MPATHF_SAVED_QUEUE_IF_NO_PATH, &m->flags)) &&
+	      dm_noflush_suspending(m->ti)));
+	spin_unlock_irqrestore(&m->lock, flags);
+
+	return r;
 }
 
 /*